Passing df column name to function

Hi, all!

I'm trying to create a function that counts the values of a column that I specify in an existing dataframe. I want to select a few specific columns, group by one column (in particular), and then count the column whose name I specify in the function. I feel like this should be straightforward, but I don't know where I'm going wrong. Help would be very much appreciated!


 CoolData <- data.frame(
   Reference = c( rep("Todd", 5), rep("Liz", 5)),
   Var1 = rep(1:2, 5),
   Var2 = rep(1:5, 2),
   Var3 = c( rep(0, 3), rep(1, 7))
 )
 
 CoolData

 Fun1 <- function(df, column){
   
   column <- df[, column]

   df %>%
     select(Reference, column, Var3) %>%
     group_by(Var3) %>%
     count(column) %>%
     print()
 }
 
 Fun1(CoolData, "Var2")

CoolData doesn't have a column named Code2. This leads to the error:

Error in `[.data.frame`(CoolData, , "Code2") : undefined columns selected

I think you may find this link useful: Programming with dplyr
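For anyone who wants to reproduce the "undefined columns selected" error in isolation, here is a minimal base R sketch. It assumes the CoolData definition from the original post; the names has_code2, has_var2, and msg are only illustrative and do not come from the thread.

```r
# Same data frame as in the original post
CoolData <- data.frame(
  Reference = c(rep("Todd", 5), rep("Liz", 5)),
  Var1 = rep(1:2, 5),
  Var2 = rep(1:5, 2),
  Var3 = c(rep(0, 3), rep(1, 7))
)

# Check whether a name exists before indexing with it
has_code2 <- "Code2" %in% names(CoolData)  # no such column
has_var2  <- "Var2" %in% names(CoolData)   # this one exists

# Indexing with a missing name raises an error; capture its message
msg <- tryCatch(
  CoolData[, "Code2"],
  error = function(e) conditionMessage(e)
)
```

Checking names(df) first is one way to fail early with a clearer message, though it does not by itself make the dplyr pipeline in Fun1 work.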

Hi, there!

Thanks for pointing that out. I meant to type "Var2" instead of "Code2." With the change, it still doesn't work!

I know, and that's why I shared the link. Have you gone through that?

Do any of these help?

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

CoolData <- data.frame(Reference = rep(x = c("Todd", "Liz"),
                                       each = 5),
                       Var1 = rep(1:2, 5),
                       Var2 = rep(1:5, 2),
                       Var3 = c( rep(0, 3), rep(1, 7)))

Fun1 <- function(df, column)
{
  df %>%
    select(Reference, !!column, Var3) %>%
    group_by(Var3) %>%
    count(!!column)
}

Fun1(df = CoolData,
     column = quo(Var2))
#> # A tibble: 8 x 3
#> # Groups:   Var3 [2]
#>    Var3  Var2     n
#>   <dbl> <int> <int>
#> 1     0     1     1
#> 2     0     2     1
#> 3     0     3     1
#> 4     1     1     1
#> 5     1     2     1
#> 6     1     3     1
#> 7     1     4     2
#> 8     1     5     2

Fun2 <- function(df, column)
{
  column = enquo(arg = column)
  
  df %>%
    select(Reference, !!column, Var3) %>%
    group_by(Var3) %>%
    count(!!column)
}

Fun2(df = CoolData,
     column = Var2)
#> # A tibble: 8 x 3
#> # Groups:   Var3 [2]
#>    Var3  Var2     n
#>   <dbl> <int> <int>
#> 1     0     1     1
#> 2     0     2     1
#> 3     0     3     1
#> 4     1     1     1
#> 5     1     2     1
#> 6     1     3     1
#> 7     1     4     2
#> 8     1     5     2

Created on 2019-08-12 by the reprex package (v0.3.0)

Yaaaahoooo!!! I just tried out the first function on my dataset, and it works! Thanks so much for taking the time to comment/reply; you just saved me about 25 pages of R markdown. Very much appreciated! Thanks!

Yarnabrina's second option is the better and more "standard" one to use, since you can enter the column as a bare column name and avoid the need to use quo. Also, the latest version of dplyr (together with rlang 0.4.0 or later) has an option that is simpler than the enquo(x) and !!x pair: instead of calling enquo(column) and then writing !!column, you can just write {{column}} each time you want to use column inside the function. For example:

Fun3 <- function(df, column) {
  
  df %>%
    select(Reference, {{column}}, Var3) %>%
    group_by(Var3) %>%
    count({{column}})
}

Also, the select step is unnecessary (unless you need to select specific columns for other things you want to do within this function), and the group_by can be combined into the count function.

Fun3 <- function(df, column) {
  df %>%
    count(Var3, {{column}})
}
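Since the original question called the function with a string ("Var2"), here is a rough sketch of a string-based variant alongside the function above. It assumes a dplyr version that exports the .data pronoun and supports {{ }}; Fun4, res_bare, and res_string are hypothetical names that do not appear elsewhere in this thread.

```r
library(dplyr)

CoolData <- data.frame(
  Reference = rep(c("Todd", "Liz"), each = 5),
  Var1 = rep(1:2, 5),
  Var2 = rep(1:5, 2),
  Var3 = c(rep(0, 3), rep(1, 7))
)

# Fun3 as above: the column is passed as a bare name
Fun3 <- function(df, column) {
  df %>%
    count(Var3, {{ column }})
}

# Hypothetical variant (not from the thread): the column is passed as a string
# and looked up through the .data pronoun
Fun4 <- function(df, column) {
  df %>%
    count(Var3, .data[[column]])
}

res_bare   <- Fun3(CoolData, Var2)    # bare column name
res_string <- Fun4(CoolData, "Var2")  # column name as a string
```

Both calls should produce the same counts. The string version may be handy when column names come from a vector you loop over, while the bare-name version reads more like ordinary dplyr code.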

Hi, joels.

Thanks for this information, too! I've learned a ton today and been able to save a mountain of time. Many thanks for your time, coding brains, and help!
