Replacing '+' in a formula with ',' and listing the terms in quotation marks

I am trying to reshape the output of a backward selection so that I can extract the columns of the selected variables. After the selection, the variables are only available as a formula. I would like to separate the terms with ',' instead of '+' and put each variable in quotation marks "". I have tried several approaches without success.

external <- ts(mtcars, start = c(2015,1), end = c(2019,52), frequency = 52)
# define all possible variables for variable selection
x <- window(external, end= c(2018,52))
data.variables <- lm(mpg ~cyl+disp+hp+drat+wt+qsec+vs+am+gear+carb, data=window(external, start=c(2015,1), end=c(2018,52)))
# Variable selection backwards
step <- step(data.variables, direction="backward", k=log(208))  
# provide matrix regression
## Here I want to paste the calculated Variables listed like this: "disp","hp","wt","qsec","am"
xregs <- as.matrix(window(external[,c(paste(gsub("+", ",",(deparse(step[["call"]][["formula"]][[3]])), fixed=T))), end = c(2018, 52))) 
# the actual line should look like this:
xregs <- as.matrix(window(external[,c("disp","hp","wt","qsec","am")], end = c(2018, 52)))
# but I only receive this:
xregs <- as.matrix(window(external[,c("disp,hp,wt,qsec,am")], end = c(2018, 52)))

Especially when I use the following:

a <-gsub("+", ",",(deparse(step[["call"]][["formula"]][[3]])), fixed=T)
cat(sapply(strsplit(a, '[, ]+'), function(x) toString(dQuote(x, FALSE))))
#"disp", "hp", "wt", "qsec", "am"

the right line is printed, but I don't know how to use the printed line in the other command.

How about this?

library(stringr) # provides str_split() and str_trim()
vs <- deparse(step[["call"]][["formula"]][[3]]) # right-hand side of the selected formula, as a string
vs <- str_trim(str_split(vs, pattern = "\\+", simplify = TRUE)) # split string
xregs <- as.matrix(window(external[,vs], end = c(2018, 52))) # subset x
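The same split can be sketched in base R, without stringr. The point is that column indexing needs a character vector with one element per variable, not one string containing commas. The string rhs below is typed in by hand for illustration and stands in for the deparsed formula.

```r
# Illustrative only: the right-hand side of the formula as a single string
rhs <- "disp + hp + wt + qsec + am"

# Split on "+" and trim whitespace; the result has one element per variable
vars <- trimws(strsplit(rhs, "+", fixed = TRUE)[[1]])

# A character vector selects several columns at once
selected <- mtcars[, vars]

# One comma-separated string is treated as a single (non-existent) column name
single <- paste(vars, collapse = ",")
res <- tryCatch(mtcars[, single], error = function(e) "error")
```

Here selected has five columns, while the lookup with single fails, which is the difference between the two xregs lines in the question.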

Thank you first of all for your help. It does work!

But when I apply this to my own data, where many more variables are selected, I get the following:

My procedure and my results:

  # variable selection backwards
  step <- step(data.variables, direction="backward", k=log(length(x)))    
  # provide xregs for training data
  vs <- deparse(step[["call"]][["formula"]][[3]]) # get formula from lm
#[1] "F7_1 + F11_1 + F32_1 + F33_0 + F34__1 + F37_1 + F39_2 + F61_0 + "
#[2] "    AO12 + LS39 + AO63 + AO71"   
  a <- str_trim(str_split(vs, pattern = "\\+", simplify = TRUE)) # split string
# [1] "F7_1"   "AO12"   "F11_1"  "LS39"   "F32_1"  "AO63"   "F33_0"  "AO71"   "F34__1" ""      
#[11] "F37_1"  ""       "F39_2"  ""       "F61_0"  ""       ""       ""     
  xregs <- as.matrix(window(external_data.ccf_ts[,a], end = c(2018, 52)+(i-1)/52)) # subset x
#Error in `[.default`(external_data.ccf_ts, , a) : 
#Indexing outside the boundaries

I think it can't handle a formula that spans two rows. Do you have any idea how to solve this?

You need to add something like this, before splitting, to combine the rows of the formula into a single string.

vs <- paste(vs, collapse = "")
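To see why the split went wrong: deparse() breaks long expressions into several strings (its width.cutoff argument defaults to 60), so splitting each piece separately produces a matrix with padding instead of a flat vector of names. A minimal sketch, using a made-up formula with hypothetical variable names:

```r
# Hypothetical long formula; the variable names are invented for illustration
f <- as.formula(paste("y ~", paste(paste0("variable_", 1:12), collapse = " + ")))

# The right-hand side is too long for one line, so deparse() returns several strings
parts <- deparse(f[[3]])
length(parts) > 1

# Collapse the pieces into one string first, then split on "+" and trim
rhs <- paste(parts, collapse = "")
vars <- trimws(strsplit(rhs, "+", fixed = TRUE)[[1]])
```

After collapsing, vars holds the twelve names in their original order with no empty strings.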

Some advice, for your own sanity: try to avoid giving the objects you create the names of existing R functions that you use. With that in mind, assuming the step object is called step_ to differentiate it from the step function, you could do

attr(step_$terms,"term.labels") %>% paste0(collapse=",")

instead of the deparse approach, to get the independent terms as a single comma-separated character string (%>% requires magrittr or dplyr to be loaded).
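For column selection, the term labels can also be used directly as a character vector, without pasting them together. A rough sketch on plain mtcars rather than the time series from the question; the names full, step_ and vars are illustrative, and it assumes the selected terms are plain column names (no interactions or transformations):

```r
# Fit the full model, then run backward selection; `step_` avoids masking step()
full <- lm(mpg ~ cyl + disp + hp + drat + wt + qsec + vs + am + gear + carb,
           data = mtcars)
step_ <- step(full, direction = "backward", k = log(nrow(mtcars)), trace = 0)

# Term labels of the selected model: already one element per variable
vars <- attr(terms(step_), "term.labels")

# Use the vector directly; drop = FALSE keeps a matrix even for a single column
xregs <- as.matrix(mtcars[, vars, drop = FALSE])
```

This avoids deparse() altogether, so the line-wrapping problem with long formulas does not arise.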
