Tidyeval for user-entered column name

I have a situation where I need to process different files. There is an "ID" field in all the files, but sometimes it is named differently depending on the file. For example, it can be called "ID1" or "ID_1", etc.

I want the user to be able to specify the name of the "ID" column and then use that in my R script.
Here is what I have so far, which doesn't work.

id_col <- "ID_1"

bad_length <- file %>% 
  filter(str_length(.$id_col) != 10)

How do I reference id_col in str_length()? I tried using .$(!!id_col) as the argument to str_length() but that didn't work.

Thanks!

Using dplyr with your files in data frames or tibbles:

new_df <- old_df %>% mutate(ID = ID_1)

Then you can inner_join() on ID and use select(-duplicate_columns) to drop the duplicated columns, which I think gets you where you're trying to go.

You could write a function using quosures to flexibly specify the column name. For example:

library(tidyverse)

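fnc = function(data, id_col, filter.length=10) {
  # Capture the bare column name passed by the caller as a quosure
  id_col = enquo(id_col)
  data %>% 
    # Unquote the captured column; keep rows whose length differs from filter.length
    filter(str_length(!!id_col) != filter.length)
}

# Drops rows where Species is 10 characters long ("versicolor")
fnc(iris, Species)
# Drops rows where cut is 9 characters long ("Very Good")
fnc(diamonds, cut, 9)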

If you also want to always rename the id column to "ID", you could do this:

fnc = function(data, id_col, filter.length=10) {
  id_col = enquo(id_col)
  data %>% 
    rename(ID=!!id_col) %>% 
    filter(str_length(ID) != filter.length)
}
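
For what it's worth, with rlang 0.4.0 or later the enquo() and !! pair can usually be collapsed into the embrace operator {{ }}. Below is a minimal sketch of the first function written that way. The name filter_bad_length and the sample tibble are made up for illustration and are not from the thread.

```r
library(dplyr)
library(stringr)

# Hypothetical helper; same idea as fnc() above, written with {{ }}
# instead of enquo() + !!. Assumes rlang >= 0.4.0.
filter_bad_length <- function(data, id_col, filter_length = 10) {
  data %>%
    filter(str_length({{ id_col }}) != filter_length)
}

# Made-up data for illustration only
df <- tibble(ID_1 = c("0123456789", "12345"))

# The column is passed as a bare name, as with fnc()
result <- filter_bad_length(df, ID_1)
```

Like fnc(), this takes a bare column name rather than a string, so it suits interactive use more than a script that reads the column name from user input.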

Thanks for your reply! Either of these solutions works as well:

bad_length <- file %>% 
  filter(str_length(get(id_col)) != 10) 

bad_length <- file %>% 
  filter(str_length(!!sym(id_col)) != 10) 
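
Here is a small self-contained sketch of the string-based approach, in case it helps someone reproduce it. The sample data is invented, since the real files aren't shown here. It also includes a third spelling using the .data pronoun from dplyr/rlang, which as far as I know raises an error if the named column doesn't exist, whereas get() can fall back to an object of the same name outside the data frame.

```r
library(dplyr)
library(stringr)
library(rlang)

# Hypothetical sample data; stands in for one of the real files
file <- tibble(ID_1 = c("0123456789", "12345", "abcdefghij"), value = 1:3)

id_col <- "ID_1"  # column name supplied by the user as a string

# Convert the string to a symbol and unquote it inside filter()
bad_length_sym <- file %>%
  filter(str_length(!!sym(id_col)) != 10)

# Same lookup through the .data pronoun, subsetting by the string
bad_length_data <- file %>%
  filter(str_length(.data[[id_col]]) != 10)
```

Both calls keep only the row whose ID is not exactly 10 characters long.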
