Calling the glm() function within a user-defined function shows Error in eval(predvars, data, env)

Hi,
I have been trying to write a function that calls glm() inside it, but I always get an error message. It looks like the function does not retrieve the values of the variables I pass in. I am not sure how to solve this, so any help would be greatly appreciated.

set.seed(234)
sex <- sample(c("M", "F"), size=100, replace=TRUE)
age <- rnorm(n=100, mean=20 + 4*(sex=="F"), sd=0.1)
dsn <- data.frame(sex, age)
rm(sex, age) #remove sex and age from the global environment for reproducibility

to_analyze <- function(dep, indep, data){
  glm(dep~factor(indep), data=data)
}

to_analyze(dep=age, indep=sex, data=dsn)
#> Error in eval(predvars, data, env): object 'age' not found

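Hi @simRock. The error occurs because you pass the bare name age as the argument dep, and no object named age is available where R evaluates it: age exists only as a column of dsn, and the formula dep ~ factor(indep) refers to dep and indep, not to that column. You can change the function and the call as follows, passing the variable names as strings.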

set.seed(234)
sex <- sample(c("M", "F"), size=100, replace=TRUE)
age <- rnorm(n=100, mean=20 + 4*(sex=="F"), sd=0.1)
dsn <- data.frame(sex, age)
rm(sex, age) #remove sex and age from the global environment for reproducibility

to_analyze <- function(dep, indep, data){
 glm(formula(paste0(dep, "~factor(", indep, ")")), data=data)
}

to_analyze(dep="age", indep="sex", data=dsn)
#> 
#> Call:  glm(formula = formula(paste0(dep, "~factor(", indep, ")")), data = data)
#> 
#> Coefficients:
#>  (Intercept)  factor(sex)M  
#>       23.984        -3.984  
#> 
#> Degrees of Freedom: 99 Total (i.e. Null);  98 Residual
#> Null Deviance:       396.2 
#> Residual Deviance: 0.837     AIC: -188.5

Created on 2020-02-28 by the reprex package (v0.3.0)
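As a possible alternative to pasting the formula together by hand, base R's reformulate() builds a formula from character strings. The sketch below is only an illustration under that assumption; the name to_analyze_reformulate is made up for this example and is not from the original answer.

```r
# Recreate the example data from the question
set.seed(234)
sex <- sample(c("M", "F"), size = 100, replace = TRUE)
age <- rnorm(n = 100, mean = 20 + 4 * (sex == "F"), sd = 0.1)
dsn <- data.frame(sex, age)
rm(sex, age)

# Hypothetical variant: dep and indep are column names given as strings
to_analyze_reformulate <- function(dep, indep, data) {
  # reformulate() turns the term label "factor(sex)" and the response "age"
  # into the formula age ~ factor(sex)
  f <- reformulate(sprintf("factor(%s)", indep), response = dep)
  glm(f, data = data)
}

fit <- to_analyze_reformulate("age", "sex", data = dsn)
formula(fit)
```

The fitted model should be the same as with the paste0() version, since both end up passing the formula age ~ factor(sex) to glm().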


After running your code, look at what's in the global environment:

ls()
[1] "dsn"        "to_analyze"

As @raytong notes, the function call has no way to discover that the first and second arguments are meant to be variables inside the data frame passed as the third argument.

You can simplify things somewhat by relying on positional arguments:
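To make that concrete, here is a small sketch (peek_formula is a hypothetical helper, not part of the original code) that inspects which variable names a formula written inside a function actually contains. Because the arguments are never evaluated, it runs even though age and sex do not exist in the global environment.

```r
# Hypothetical helper: build the same formula as in the question and
# list the variable names it refers to
peek_formula <- function(dep, indep) {
  f <- dep ~ factor(indep)  # the formula captures the symbols dep and indep
  all.vars(f)
}

vars <- peek_formula(age, sex)
vars
# the formula mentions "dep" and "indep", not "age" and "sex",
# so glm() has nothing in the data frame to match them against
```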

to_analyze("age", "sex", data=dsn)

Thank you raytong. It worked

Thank you @technocrat

@raytong and @technocrat, I was able to find shorter code that seems to work well; see below.

to_analyze <- function(dep, indep, data){
  glm(substitute(dep ~ factor(indep)), data=data)
}

to_analyze(dep=age, indep=sex, data=dsn)
#> 
#> Call:  glm(formula = substitute(dep ~ factor(indep)), data = data)
#> 
#> Coefficients:
#>  (Intercept)  factor(sex)M  
#>       24.006        -4.034  
#> 
#> Degrees of Freedom: 99 Total (i.e. Null);  98 Residual
#> Null Deviance:       397.3 
#> Residual Deviance: 0.8152    AIC: -191.2
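For anyone reading along, a short sketch of what substitute() does here may help. Inside a function it swaps the argument names for the expressions the caller supplied, and the result is an unevaluated call rather than a formula object. The names show_substitute and to_analyze_eval are made up for this illustration, and evaluating the call with eval() first is just one possible way to hand glm() a real formula.

```r
# Recreate the example data from the question
set.seed(234)
sex <- sample(c("M", "F"), size = 100, replace = TRUE)
age <- rnorm(n = 100, mean = 20 + 4 * (sex == "F"), sd = 0.1)
dsn <- data.frame(sex, age)
rm(sex, age)

# Hypothetical helper: return what substitute() produces
show_substitute <- function(dep, indep) substitute(dep ~ factor(indep))

expr <- show_substitute(age, sex)
class(expr)    # an unevaluated call, not a formula
deparse(expr)  # the argument names have been replaced by age and sex

# Hypothetical variant: evaluate the call first so glm() receives a formula
to_analyze_eval <- function(dep, indep, data) {
  f <- eval(substitute(dep ~ factor(indep)))  # formula age ~ factor(sex)
  glm(f, data = data)
}

fit <- to_analyze_eval(dep = age, indep = sex, data = dsn)
formula(fit)
```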

However, I am stuck again: I am trying to pass the output of this model to lsmeans::lsmeans() to predict marginal means and return the result, but it gives me an error. Any help would be greatly appreciated.

to_predict_lsmeans <- function(dep, indep, data){
  model <- glm(substitute(dep ~ factor(indep)), data=data)
  pred <- lsmeans:: lsmeans(model, substitute(~ factor(indep)), offset=substitute(data)$log(age), type ="response" )
  return(pred)
}

pred <- to_predict_lsmeans(dep=age, indep=sex, data=dsn)
#> Error in ref_grid(object, ...): We are unable to reconstruct the data.
#> The variables needed are:
#>  sex
#> Are any of these actually constants? (specify via 'params = ')
#> The dataset name is:
#>  data
#> Does the data still exist? Or you can specify a dataset via 'data = '
pred
#> Error in eval(expr, envir, enclos): object 'pred' not found
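One thing the message points at is that the model's stored call refers to the dataset as data, a name that only exists inside the function. A common idiom, sketched below with the made-up name to_analyze_call, is to substitute the whole glm() call and evaluate it in the caller's environment so that the stored call records age, sex and dsn. Whether this is enough for lsmeans() is not verified here; the sketch only shows the effect on the stored call.

```r
# Recreate the example data from the question
set.seed(234)
sex <- sample(c("M", "F"), size = 100, replace = TRUE)
age <- rnorm(n = 100, mean = 20 + 4 * (sex == "F"), sd = 0.1)
dsn <- data.frame(sex, age)
rm(sex, age)

# Hypothetical variant: rebuild the full call, then evaluate it where
# the function was called from, so that dsn can be found by name
to_analyze_call <- function(dep, indep, data) {
  cl <- substitute(glm(dep ~ factor(indep), data = data))
  eval(cl, parent.frame())
}

fit <- to_analyze_call(dep = age, indep = sex, data = dsn)
fit$call  # records the formula and the dataset under their original names
```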

Could I bother you to delete it here and repost it as a new topic, since this one already has a marked solution? That will keep searches more efficient.

Thank you. I just reposted it. Here is the link to it:
calling the lsmeans() function within a user-defined function does not work with substitute()

