First, you did a nice job asking your question: you included the error, the command that threw it, and a snippet of your data.

Here is a reproducible example, though, which I think will help since you can run the code and play with it yourself:

```
n <- 10
p <- 3
set.seed(123)
df <- data.frame(matrix(sample(4, n * p, TRUE), nrow = n, dimnames = list(NULL, c("y", "x1", "x2"))))
df[sample(n, 5), "x2"] <- NA
df
#> y x1 x2
#> 1 3 4 NA
#> 2 3 2 4
#> 3 3 2 NA
#> 4 2 1 NA
#> 5 3 2 1
#> 6 2 3 NA
#> 7 2 4 4
#> 8 2 1 2
#> 9 3 3 NA
#> 10 1 3 2
```

So, we've made some data with `NA` values. Let's see what happens when we try to produce models from the data using a variety of `na.action` choices.

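It's worth noting that the default is `na.omit` (more precisely, whatever `getOption("na.action")` returns, which is `"na.omit"` unless you have changed it), so you should have a defensible reason for choosing something else before you do.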
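If you want to check what the default handling actually did, you can ask the fitted model. This is only a minimal sketch: `df_demo` and `m_default` are names I made up, the values are typed in by hand from the printed data above so the snippet runs on its own, and it assumes you have not changed `options(na.action = ...)` in your session.

```r
# Same values as the printed data frame above, typed in by hand
df_demo <- data.frame(
  y  = c(3, 3, 3, 2, 3, 2, 2, 2, 3, 1),
  x1 = c(4, 2, 2, 1, 2, 3, 4, 1, 3, 3),
  x2 = c(NA, 4, NA, NA, 1, NA, 4, 2, NA, 2)
)

getOption("na.action")   # the session-wide default that lm() falls back on

m_default <- lm(y ~ x1 + x2, df_demo)
nobs(m_default)          # number of rows actually used in the fit
na.action(m_default)     # which rows were dropped because of NA values
```

Comparing `nobs(m_default)` with `nrow(df_demo)` is a quick way to notice that rows were silently dropped.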

```
(m0 <- lm(y ~ x1 + x2, df))
#>
#> Call:
#> lm(formula = y ~ x1 + x2, data = df)
#>
#> Coefficients:
#> (Intercept) x1 x2
#> 2.5811 -0.3784 0.2027
```

```
(m1 <- lm(y ~ x1 + x2, df, na.action = "na.omit"))
#>
#> Call:
#> lm(formula = y ~ x1 + x2, data = df, na.action = "na.omit")
#>
#> Coefficients:
#> (Intercept) x1 x2
#> 2.5811 -0.3784 0.2027
```

```
(m2 <- lm(y ~ x1 + x2, df, na.action = "na.exclude"))
#>
#> Call:
#> lm(formula = y ~ x1 + x2, data = df, na.action = "na.exclude")
#>
#> Coefficients:
#> (Intercept) x1 x2
#> 2.5811 -0.3784 0.2027
```

You'll notice the three results are the same because, again, `"na.omit"` is the default and `"na.exclude"` fits the model on the same complete rows, though it does a better job of keeping track of what happened, as we'll see next.

```
fitted(m1)
#> 2 5 7 8 10
#> 2.635135 2.027027 1.878378 2.608108 1.851351
fitted(m2)
#> 1 2 3 4 5 6 7 8
#> NA 2.635135 NA NA 2.027027 NA 1.878378 2.608108
#> 9 10
#> NA 1.851351
```

You can see here that `"na.exclude"` kept track of which observations were problematic and returns `NA`s for their fitted values (and for their residuals as well).
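One practical consequence, sketched below, is that the padded output from `"na.exclude"` has one value per row of the original data, so it can be attached straight back onto the data frame. The names `df_demo`, `m_omit` and `m_exclude` are made up for this illustration, and the data are typed in by hand from the printed output above.

```r
# Same values as the printed data frame above, typed in by hand
df_demo <- data.frame(
  y  = c(3, 3, 3, 2, 3, 2, 2, 2, 3, 1),
  x1 = c(4, 2, 2, 1, 2, 3, 4, 1, 3, 3),
  x2 = c(NA, 4, NA, NA, 1, NA, 4, 2, NA, 2)
)

m_omit    <- lm(y ~ x1 + x2, df_demo, na.action = na.omit)
m_exclude <- lm(y ~ x1 + x2, df_demo, na.action = na.exclude)

all.equal(coef(m_omit), coef(m_exclude))  # the fit itself is the same

length(residuals(m_omit))     # one value per complete row
length(residuals(m_exclude))  # one value per row of df_demo

# The padded vectors line up with the data row by row
df_demo$fit   <- fitted(m_exclude)
df_demo$resid <- residuals(m_exclude)
df_demo
```

Doing the same assignment with `fitted(m_omit)` would fail, because that vector is shorter than the data frame.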

The rest produce errors for different reasons. The first, `"na.fail"`, is hopefully not too hard to understand: it tells R to throw an error if the data contains any `NA` values.

```
m3 <- lm(y ~ x1 + x2, df, na.action = "na.fail")
#> Error in na.fail.default(structure(list(y = c(3L, 3L, 3L, 2L, 3L, 2L, : missing values in object
```
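If you do want that strict behaviour, for example as a guard in a script, one option is to catch the error and then look at where the missing values are. This is just a sketch under my own assumptions: `df_demo` and `msg` are made-up names, and the data are typed in by hand from the printed output above.

```r
# Same values as the printed data frame above, typed in by hand
df_demo <- data.frame(
  y  = c(3, 3, 3, 2, 3, 2, 2, 2, 3, 1),
  x1 = c(4, 2, 2, 1, 2, 3, 4, 1, 3, 3),
  x2 = c(NA, 4, NA, NA, 1, NA, 4, 2, NA, 2)
)

# Try the strict fit; keep the error message instead of stopping the script
msg <- tryCatch(
  {
    lm(y ~ x1 + x2, df_demo, na.action = na.fail)
    "no missing values"
  },
  error = function(e) conditionMessage(e)
)
msg

colSums(is.na(df_demo))  # number of NA values in each column
```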

Finally, choosing `NULL`, I believe, has the same effect as choosing `"na.pass"`, which causes `lm()` to simply do what you asked, no more, no less: it does no pre-processing on the data and happily throws an error when the computations fail because of the `NA`s.

```
m4 <- lm(y ~ x1 + x2, df, na.action = NULL)
#> Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...): NA/NaN/Inf in 'x'
m5 <- lm(y ~ x1 + x2, df, na.action = "na.pass")
#> Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...): NA/NaN/Inf in 'x'
```
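To see the "no pre-processing" part for yourself, you can look at the model frame, which is the data `lm()` builds before it does any fitting. A small sketch, with made-up names (`df_demo`, `mf_omit`, `mf_pass`, `mf_null`) and the data typed in by hand from the printed output above:

```r
# Same values as the printed data frame above, typed in by hand
df_demo <- data.frame(
  y  = c(3, 3, 3, 2, 3, 2, 2, 2, 3, 1),
  x1 = c(4, 2, 2, 1, 2, 3, 4, 1, 3, 3),
  x2 = c(NA, 4, NA, NA, 1, NA, 4, 2, NA, 2)
)

mf_omit <- model.frame(y ~ x1 + x2, df_demo, na.action = na.omit)
mf_pass <- model.frame(y ~ x1 + x2, df_demo, na.action = na.pass)
mf_null <- model.frame(y ~ x1 + x2, df_demo, na.action = NULL)

nrow(mf_omit)      # incomplete rows removed before fitting
nrow(mf_pass)      # every row kept
nrow(mf_null)      # every row kept
anyNA(mf_pass$x2)  # the NA values are still there when the fit is attempted
```

With `na.pass` or `NULL` the `NA` values reach the fitting step untouched, which is where the `NA/NaN/Inf in 'x'` error comes from.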

