Applying function to columns and generating new column names

jjiao · November 5, 2019, 9:17pm

I have a data frame with multiple columns. For each column, I want to create a new column whose name is derived from the original column and whose values are some transformation of it. For example, for column a, I want a new column named a_fit in which all the values of a are multiplied by 2. The new columns would be named a_fit, b_fit, c_fit, etc. in the data frame.

Then, ideally, I would be able to apply the function to a series of columns (20+ columns) in one call, maybe using lapply?

mydata <- data.frame(a = 1:10, b = 2:11, c = 3:12, d = 4:13, e = 5:14)

newcol <- function(df, x) {
  df[[paste(x, "_fit", sep='')]] <= df[[x]] * 2
}

newcol(mydata, a)

Any guidance would be helpful!

andresrcs · November 5, 2019, 9:28pm

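You are almost there; there are just a few small syntax issues. Use <- rather than <= for assignment, return the modified data frame from the function, and pass the column name as a string ("a" rather than a).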

mydata <- data.frame(a = 1:10, b = 2:11, c = 3:12, d = 4:13, e = 5:14)

newcol <- function(df, x) {
    df[[paste(x, "_fit", sep='')]] <- df[[x]] * 2
    df
}

newcol(mydata, "a")
#>     a  b  c  d  e a_fit
#> 1   1  2  3  4  5     2
#> 2   2  3  4  5  6     4
#> 3   3  4  5  6  7     6
#> 4   4  5  6  7  8     8
#> 5   5  6  7  8  9    10
#> 6   6  7  8  9 10    12
#> 7   7  8  9 10 11    14
#> 8   8  9 10 11 12    16
#> 9   9 10 11 12 13    18
#> 10 10 11 12 13 14    20

Created on 2019-11-05 by the reprex package (v0.3.0.9000)
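Since the question mentions lapply and 20+ columns, here is a minimal base R sketch of doing all columns in one step. It assumes every selected column is numeric and that the same transformation (multiply by 2) applies to all of them; the cols variable is just an illustrative name.

```r
mydata <- data.frame(a = 1:10, b = 2:11, c = 3:12, d = 4:13, e = 5:14)

# Columns to transform; here all of them, but any character vector of names works
cols <- names(mydata)

# lapply() returns a list with one doubled vector per column;
# assigning that list to new names adds all the *_fit columns at once
mydata[paste0(cols, "_fit")] <- lapply(mydata[cols], function(x) x * 2)

names(mydata)
```

To transform only some columns, cols could be set to a subset such as c("a", "b").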

Or, if you are looking for a tidyverse way to do it, this could work:

library(dplyr)

mydata <- data.frame(a = 1:10, b = 2:11, c = 3:12, d = 4:13, e = 5:14)

mydata %>% 
    mutate_all(list(fit = ~ . * 2))
#>     a  b  c  d  e a_fit b_fit c_fit d_fit e_fit
#> 1   1  2  3  4  5     2     4     6     8    10
#> 2   2  3  4  5  6     4     6     8    10    12
#> 3   3  4  5  6  7     6     8    10    12    14
#> 4   4  5  6  7  8     8    10    12    14    16
#> 5   5  6  7  8  9    10    12    14    16    18
#> 6   6  7  8  9 10    12    14    16    18    20
#> 7   7  8  9 10 11    14    16    18    20    22
#> 8   8  9 10 11 12    16    18    20    22    24
#> 9   9 10 11 12 13    18    20    22    24    26
#> 10 10 11 12 13 14    20    22    24    26    28
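For readers on dplyr 1.0.0 or later, the same result can probably be written with across(), which supersedes the scoped verbs such as mutate_all(). This is a sketch under that version assumption, not something tested in this thread; result is an illustrative name.

```r
library(dplyr)

mydata <- data.frame(a = 1:10, b = 2:11, c = 3:12, d = 4:13, e = 5:14)

# across() applies the function to the selected columns;
# .names controls the new column names ({.col} is the original column name)
result <- mydata %>%
    mutate(across(everything(), ~ .x * 2, .names = "{.col}_fit"))

names(result)
```

Replacing everything() with a selection such as c(a, b) would limit the transformation to those columns.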

jjiao · November 6, 2019, 12:54am

Hi!

Thanks so much for the syntax help, that cleared it up! However, when I print mydata after running the function, the new column a_fit is not there. How do I ensure that the function adds the new column to the existing data frame instead of creating a new one as output?

andresrcs · November 6, 2019, 2:43am

Just assign the result back to the original data frame:

mydata <- newcol(mydata, "a")
mydata
#>     a  b  c  d  e a_fit
#> 1   1  2  3  4  5     2
#> 2   2  3  4  5  6     4
#> 3   3  4  5  6  7     6
#> 4   4  5  6  7  8     8
#> 5   5  6  7  8  9    10
#> 6   6  7  8  9 10    12
#> 7   7  8  9 10 11    14
#> 8   8  9 10 11 12    16
#> 9   9 10 11 12 13    18
#> 10 10 11 12 13 14    20
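The reassignment is needed because R functions work on a copy of their arguments, so newcol() returns a modified copy and leaves the original untouched. A small sketch of that behaviour, and of reassigning in a loop to add several columns, assuming the newcol() definition from earlier in the thread:

```r
mydata <- data.frame(a = 1:10, b = 2:11, c = 3:12, d = 4:13, e = 5:14)

newcol <- function(df, x) {
    df[[paste(x, "_fit", sep = "")]] <- df[[x]] * 2
    df  # return the modified copy
}

# Calling the function without assigning the result leaves mydata unchanged
invisible(newcol(mydata, "a"))
"a_fit" %in% names(mydata)
#> [1] FALSE

# Reassigning on each iteration accumulates the new columns
for (col in c("a", "b", "c")) {
    mydata <- newcol(mydata, col)
}
names(mydata)
```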
