Why can't I use a character string as the variable in 'fit' or 'predict'? Is there any way around this?

Why can't a character variable be used in fit() in my tidymodels example below? In both calls the outcome should be "ridership".


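library(tidymodels)
data(Chicago) #Chicago ships with the modeldata package, which tidymodels attaches

#Try to predict ridership variable from the variables 'Clark_Lake' & 'Quincy_Wells'

n <- nrow(Chicago)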
Chicago <- Chicago %>% select(ridership, Clark_Lake, Quincy_Wells) #Subset

#Split into training and testing sets
Chicago_train <- Chicago[1:(n - 7), ]
Chicago_test <- Chicago[(n - 6):n, ]

#Model specs
bt_reg_spec <- 
  boost_tree(trees = 15) %>% 
  # This model can be used for classification or regression, so set mode
  set_mode("regression")

#ORIGINAL method (from Tidymodels tutorial):
bt_reg_fit <- bt_reg_spec %>% fit(ridership ~ ., data = Chicago_train)

predict(bt_reg_fit, Chicago_test)

#This way doesn't work: 
myvar <- "ridership"
bt_reg_fit <- bt_reg_spec %>% fit(myvar ~ ., data = Chicago_train) #Doesn't work

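It's the difference between an expression (or formula) and a character string. They are not the same thing. When you write myvar ~ ., the formula stores the name myvar literally; it does not substitute the value "ridership" that myvar holds.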
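To make that concrete, here is a small base R sketch (no tidymodels needed) that inspects what R actually stores in each case. The names f_literal and lhs_literal are only for illustration.

```r
myvar <- "ridership"

# The variable itself is just a character string
class(myvar)
#> [1] "character"

# Written literally, the formula keeps the symbol `myvar` on its left-hand side;
# R does not look up the value stored in myvar when the formula is created
f_literal <- myvar ~ .
class(f_literal)
#> [1] "formula"

# Element 2 of a two-sided formula is its left-hand side
lhs_literal <- as.character(f_literal[[2]])
lhs_literal
#> [1] "myvar"
```

So the model is being asked for an outcome called myvar, which is not a column in Chicago_train.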

You can start with a character string and make it into a formula though:

myvar <- "ridership"
chr_f <- paste(myvar, "~ .")
str(chr_f)
#>  chr "ridership ~ ."

f <- as.formula(chr_f)
str(f)
#> Class 'formula'  language ridership ~ .
#>   ..- attr(*, ".Environment")=<environment: R_GlobalEnv>

Created on 2022-05-10 by the reprex package (v2.0.1)

Then you can pass the formula object to fit():

bt_reg_fit <- bt_reg_spec %>% fit(f, data = Chicago_train)
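If the goal is to fit the same model for several outcomes, the same trick works inside a loop. The sketch below is only an illustration under some assumptions: it uses base R's lm() and the built-in mtcars data as stand-ins so that it runs without xgboost installed, and the outcome and predictor names are hypothetical choices. It also shows stats::reformulate(), a base R alternative to pasting the string yourself.

```r
# Stand-in example: mtcars and lm() replace Chicago_train and the parsnip spec
outcomes <- c("mpg", "hp")      # hypothetical outcome names to loop over
predictors <- c("wt", "cyl")    # hypothetical predictors

fits <- list()
for (outcome in outcomes) {
  # Build the formula from the character string, as in the answer above
  f <- as.formula(paste(outcome, "~", paste(predictors, collapse = " + ")))
  fits[[outcome]] <- lm(f, data = mtcars)
}

# reformulate() builds the same kind of formula without paste()
f_alt <- reformulate(predictors, response = "mpg")
```

With tidymodels, the body of the loop would presumably become bt_reg_spec %>% fit(f, data = Chicago_train), since fit() accepts a formula object in the same way.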

Thanks @Max! This solved my problem and allowed the model to be fit inside a loop.
