Why doesn't this function work?

fit.model <- function(x) {
summary(lm(formula = crim ~ x, data = boston))
}

fit.model(indus)

It returns the error message:
Error in eval(predvars, data, env) : object 'indus' not found

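When you pass the unquoted name indus to the function, R looks for an object called indus, but indus is only the name of a column in the boston data frame. Inside the function the formula is literally crim ~ x, so lm() finds the argument x, tries to evaluate indus, and fails with "object 'indus' not found". Here is an example that works with the mtcars data set: pass the column name as a string, paste the formula together, and then use as.formula. The as.formula call is not strictly necessary, since lm() will coerce a character string to a formula, but I think it makes the logic clearer.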

fit.model <- function(x) {
  summary(lm(formula = as.formula(paste("mpg ~", x)), data = mtcars))
}

fit.model("disp")
#> 
#> Call:
#> lm(formula = as.formula(paste("mpg ~", x)), data = mtcars)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -4.8922 -2.2022 -0.9631  1.6272  7.2305 
#> 
#> Coefficients:
#>              Estimate Std. Error t value Pr(>|t|)    
#> (Intercept) 29.599855   1.229720  24.070  < 2e-16 ***
#> disp        -0.041215   0.004712  -8.747 9.38e-10 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 3.251 on 30 degrees of freedom
#> Multiple R-squared:  0.7183, Adjusted R-squared:  0.709 
#> F-statistic: 76.51 on 1 and 30 DF,  p-value: 9.38e-10

Created on 2020-02-09 by the reprex package (v0.2.1)
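
If pasting strings together feels fragile, base R's reformulate() builds the same formula from a character vector. This is only a minimal sketch, assuming the response is mpg as in the example above; fit_model_str and its data argument are names made up for illustration, not part of the original answer.

```r
# Hypothetical helper: `predictor` is a column name supplied as a string.
fit_model_str <- function(predictor, data = mtcars) {
  # reformulate("disp", response = "mpg") yields the formula mpg ~ disp
  f <- reformulate(predictor, response = "mpg")
  summary(lm(f, data = data))
}

res <- fit_model_str("disp")
res$coefficients
```

Taking the data frame as an argument means the same function could be pointed at another data frame, provided it has a column matching the response name.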


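Another method is to use non-standard evaluation: substitute() replaces x in the call with the unevaluated name you passed (here cyl), and eval() then runs the resulting call.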

data(mtcars)

fit.model <- function(x) {
  eval(substitute(summary(lm(formula = mpg ~ x, data = mtcars))))
}

fit.model(cyl)
#> 
#> Call:
#> lm(formula = mpg ~ cyl, data = mtcars)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -4.9814 -2.1185  0.2217  1.0717  7.5186 
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)  37.8846     2.0738   18.27  < 2e-16 ***
#> cyl          -2.8758     0.3224   -8.92 6.11e-10 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 3.206 on 30 degrees of freedom
#> Multiple R-squared:  0.7262, Adjusted R-squared:  0.7171 
#> F-statistic: 79.56 on 1 and 30 DF,  p-value: 6.113e-10

Created on 2020-02-09 by the reprex package (v0.3.0)
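
To see what substitute() is doing in the answer above, it may help to look at what it returns on its own. The sketch below is an illustration under assumptions: capture and fit_model_bare are hypothetical names, and the second function is a variant that turns the bare name into a string rather than evaluating a substituted call.

```r
# substitute() returns the expression the caller typed, without evaluating it.
capture <- function(x) substitute(x)
expr <- capture(cyl)
class(expr)  # a symbol ("name"), not the value of a variable called cyl

# Hypothetical variant: accept a bare column name, convert it to a string
# with deparse(), and build the formula from that string.
fit_model_bare <- function(x, data = mtcars) {
  predictor <- deparse(substitute(x))
  lm(reformulate(predictor, response = "mpg"), data = data)
}

fit <- fit_model_bare(cyl)
coef(fit)
```

Both versions let you call the function with an unquoted column name. The eval(substitute(...)) form has the side benefit that the printed Call shows the real column name, as in the output above.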
