Creating new columns in data.table by renaming

I have written a function that calculates the log of selected variables and stores the results under new names. However, the function works for one data set but not for the other. Please help. A working example is below.

        Year = c("1960Q1", "1960Q2", "1960Q3", "1960Q4", "1961Q1", "1961Q2"),
        CONS = c(6.028278, 6.042633, 6.073044, 6.104793, 6.12905, 6.126869),
         INC = c(6.111467, 6.142037, 6.184149, 6.200509, 6.232448, 6.253829),
         INV = c(5.192957,5.187386,5.220356,

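#function to calculate log of selected variables
log_fn <- function(DT, cols) {
        for(col in cols){
                new_name <- paste0("log_", col) # assumed naming; this line is missing from the post
                DT[,(new_name):= log(get(col))]
        }
}
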
# variable selection
cols=data %>% select_if(is.numeric) %>% names() #select columns

log_fn(data,cols) #calculate log of selected variables

#second data

                                check.names = FALSE,
                            `CALENDAR YEAR` = c("1996Q2",
                `Remittance.Billion Rs...3` = c("9177",
             `GDP_mp (2011).Billion Rs...5` = c("7731.664146858664","7128.9166154954801",
                  `Oil Price.US$ per Barel` = c("19.476666666666667","20.543333333333333",
                        `Exchange rate.US$` = c("34.744",
                         `REER.Trade based` = c("97.23",
                                  NEER....9 = c("103.543333333333","100.61","100.9",
                            `Libor.3-month` = c("5.5197240983606557","5.5965146153846153",
                             `T-bill....11` = c("12.438499999999999","9.1639333329999992",
               `IR_differential.14-90 days` = c("6.9187759016393438","3.567418717615384",
  `US GDP.Billions of Chained 2012 Dollars` = c("10998.322",
               `US_unemployment rate....14` = c("5.5",

data2<-data2 %>% mutate(
        across(-`CALENDAR YEAR`, as.numeric))
#columns to calculate log
vars<-data2 %>% 
        select(-c(`CALENDAR YEAR`,`IR_differential.14-90 days`,
                  `Libor.3-month`,`US_unemployment rate....14`,
                  `T-bill....11`)) %>%
        names()

#calculating log of the above variables
log_fn(data2,cols = vars)
#> Warning in `[.data.table`(DT, , `:=`((new_name), log(get(col)))):
#> Invalid .internal.selfref detected and fixed by taking a (shallow) copy of the
#> data.table so that := can add this new column by reference. At an earlier point,
#> this data.table has been copied by R (or was created manually using structure()
#> or similar). Avoid names<- and attr<- which in R currently (and oddly) may
#> copy the whole data.table. Use set* syntax instead to avoid copying: ?set, ?
#> setnames and ?setattr. If this message doesn't help, please report your use case
#> to the data.table issue tracker so the root cause can be fixed or this message
#> improved.

If you change the order of the steps for data2, does this do what you expect?

# data2 as originally defined
# now define your vars
vars <- data2 %>% 
  select(-c(`CALENDAR YEAR`,`IR_differential.14-90 days`,
            `Libor.3-month`,`US_unemployment rate....14`,
            `T-bill....11`)) %>%
  names()

# If I understand, you only need to apply your function to the cols in vars?
#  perform as.numeric on the 'vars' columns

data2[, (vars) := lapply(.SD, as.numeric), .SDcols = vars]
# try the function and check
log_fn(data2, cols = vars)

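I think the key thing is to avoid the

data2 <- data2 %>% 

pattern with data.table. The pipeline returns a copy of the table, and := then warns because the copy's internal self-reference is no longer valid.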
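
As a minimal sketch of the by-reference alternative, the snippet below keeps every step inside data.table, so the table is never reassigned. The table dt (a few values from the first data set, stored as text to mimic data2) and the "log_" prefix in this version of log_fn are assumptions made for illustration, not the original code.

```r
library(data.table)

# Stand-in table: values from the first data set, stored as character
# to mimic data2 (an assumption for illustration)
dt <- data.table(
  Year = c("1960Q1", "1960Q2", "1960Q3"),
  CONS = c("6.028278", "6.042633", "6.073044"),
  INC  = c("6.111467", "6.142037", "6.184149")
)

vars <- c("CONS", "INC")

# Convert in place with := instead of dt <- dt %>% mutate(...)
dt[, (vars) := lapply(.SD, as.numeric), .SDcols = vars]

# Hypothetical version of log_fn; the "log_" prefix is an assumed naming scheme
log_fn <- function(DT, cols) {
  for (col in cols) {
    new_name <- paste0("log_", col)
    DT[, (new_name) := log(get(col))]  # adds the column by reference
  }
  invisible(DT)
}

log_fn(dt, cols = vars)
names(dt)
```

Because := modifies dt in place, the result of log_fn() does not need to be assigned back to dt.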

That’s so cool, John. I had almost forgotten about .SD in data.table. Thanks a lot.

Hi John, one more question on this: what is the data.table way to convert all columns except the first one to numeric?

data2[,.SD, .SDcols = !c('CALENDAR YEAR')]

Should work

The problem is that this removes my calendar year column. I want to keep that column as it is and convert all the other columns to numeric.

data2[, (vars) := lapply(.SD, as.numeric), .SDcols = !c('CALENDAR YEAR') ]

Thanks, John. However, I don't want to define vars either. What I want is to convert all columns of the data.table to numeric except CALENDAR YEAR, without defining vars.

See below, starting with your data2 data frame.

cols <- data2 %>% select(-`CALENDAR YEAR`) %>% names()

vars <- data2 %>% 
  select(-c(`CALENDAR YEAR`,`IR_differential.14-90 days`,
            `Libor.3-month`,`US_unemployment rate....14`,
            `T-bill....11`)) %>%
  names()

# set all cols except `CALENDAR YEAR` to numeric
for (j in cols) {
  set(data2, j = j, value = as.numeric(data2[[j]]))
}

# apply your function. 
# if you don't want to define vars, then pass in a character vector of column names
log_fn(data2, cols = vars)
data2[]  # use `[]` to print the updated data2 in the console. Or hit refresh in the Environment pane in RStudio
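
For comparison, here is a sketch of the same conversion done in a single := call, with the column list computed from the table itself rather than typed out. The table dt and its values are placeholders that only echo some column names from this thread, and num_cols is a hypothetical helper name.

```r
library(data.table)

# Placeholder table: column names echo the thread, values are illustrative only
dt <- data.table(
  `CALENDAR YEAR`    = c("1996Q2", "1996Q3"),
  `Libor.3-month`    = c("5.5", "5.6"),
  `REER.Trade based` = c("97.23", "98.10")
)

# Every column except the key column, derived from the table's own names
num_cols <- setdiff(names(dt), "CALENDAR YEAR")

# One by-reference update; CALENDAR YEAR is left untouched
dt[, (num_cols) := lapply(.SD, as.numeric), .SDcols = num_cols]

sapply(dt, class)
```

The set() loop above and this := form both update the table in place, so which one to use is mostly a matter of taste.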

That's helpful, John!

