Dear helpful R community,
I am in a situation where I have
- many datasets whose columns keep changing positions and names (slight adjustments, typos)
- two types of functions that I need to run by group on all datasets/columns, depending on each column's classification
I asked this question on SO too, but later realised that, given the number of datasets and columns, it would be impractical to code the functions manually.
A sample data set could look like this:
library(tibble)
library(dplyr)
library(glue)
### sample, simplified dataframe
df1 <- tibble(A = c(NA, 1, 2, 3), B = c(1, 2, 1, NA), C = c(NA, NA, NA, 2), D = c(2, 3, NA, 1), E = c(NA, NA, NA, 1))
### sample function dataframe: one row per column of df1, with its type
funcDf <- tibble(colNames = names(df1), type = c(rep("Compulsory", 4), "Conditional"))
### default function for every column, stored as a string
funcDf <- funcDf %>%
  mutate(func = as.character(glue("is.na({colNames})")))
### the conditional column E gets its own function
funcDf$func[funcDf$colNames == "E"] <- "ifelse(is.na(E) & !is.na(A), 1, 0)"
I would like to apply the relevant function to the corresponding column. The function for each column can be looked up in funcDf and needs to be applied to df1.
I thought this would be a standard use case for tidyeval, but I would be thankful for any other advice or suggestions as well.
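To make the goal concrete, here is a minimal sketch of the kind of thing I have in mind. It assumes that every string in funcDf$func is a valid R expression referring to columns of df1, and it writes the results to new columns with a "_check" suffix (a hypothetical naming choice of mine) so that the expression for E still sees the original A. I am not sure this is the idiomatic tidyeval way, so please treat it as an illustration only.

```r
library(tibble)
library(dplyr)
library(glue)
library(rlang)

# Sample data and function table from above
df1 <- tibble(A = c(NA, 1, 2, 3), B = c(1, 2, 1, NA), C = c(NA, NA, NA, 2),
              D = c(2, 3, NA, 1), E = c(NA, NA, NA, 1))

funcDf <- tibble(colNames = names(df1),
                 type = c(rep("Compulsory", 4), "Conditional")) %>%
  mutate(func = as.character(glue("is.na({colNames})")))
funcDf$func[funcDf$colNames == "E"] <- "ifelse(is.na(E) & !is.na(A), 1, 0)"

# Parse each string into an unevaluated expression.
# The "_check" output names are an assumption made for this sketch.
exprs <- setNames(
  lapply(funcDf$func, rlang::parse_expr),
  paste0(funcDf$colNames, "_check")
)

# Splice the named expressions into mutate(); each one is evaluated
# against the columns of df1 and stored under its "_check" name.
result <- df1 %>% mutate(!!!exprs)

print(result)
```

Writing to new columns rather than overwriting A to E is deliberate: mutate() evaluates its arguments in order, so overwriting A first would change what the expression for E sees.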