Passing a list of variables inside quos() for purrr::map

Hi there!

I would like to map the following function over a list of columns from my dataframe in order to produce some descriptive statistics for each one of them. I would like to do it only with tidyverse tools.

Let's assume that we have a dataframe called "df" and that the variables to be summarised are var1, var2 and var3. Then the following code works:

a_function <- function(var, data) {
  data %>%
  summarise(
    Var_Name =  quo_name(var),
    mean = mean({{ var }}, na.rm = TRUE),
    sd   = sd({{ var }}, na.rm = TRUE)
    ) 
}

My_result <- 
  map_dfr(
    .x = quos(var1, var2, var3),
    .f = a_function, 
    data = df
  ) 

The problem starts when I try to pass a list of variables inside quos(). Bear in mind that I want the variable names to appear as character strings in the Var_Name column of the "My_result" df.

#The variable list to be passed on map
var_list <- df %>%
  select(var1:var3) %>%
  colnames() 

My_result<- 
  map_dfr(
  .x = quos(var_list  ),
  .f = a_function , 
  data = df
  ) 

The error says:
Error in summarise():
! Problem while computing Var_Name = quo_name(var_list).
Caused by error in expr_name():
! expr must be a symbol, scalar, or call.

Any ideas?

I am unable to reproduce the same error, but the alternative solution below, which removes the need for quos(), produces the desired descriptive statistics for a list of columns.

library(tidyverse)

df = mtcars %>%
  select(var1 = cyl,
         var2 = mpg,
         var3 = disp)


var_list <- df %>%
  select(var1:var3) %>%
  colnames() 

a_function = function(var, data) {
  d = data[,var]
  
  data.frame(
    Var_Name = var,
    mean = mean(d, na.rm = T),
    sd = sd(d, na.rm = T)
  )
}

My_result = map_dfr(
  .x = var_list,
  .f = a_function,
  data = df
)

My_result
#>   Var_Name      mean         sd
#> 1     var1   6.18750   1.785922
#> 2     var2  20.09062   6.026948
#> 3     var3 230.72188 123.938694

Created on 2022-10-05 with reprex v2.0.2.9000
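
A note on why the original quos(var_list) attempt cannot iterate over the columns, as a minimal sketch. quos() quotes whatever is typed inside it, so it captures the name var_list itself rather than the strings stored in it, and map_dfr() then receives a list of length one. The example uses rlang directly and a hand-written var_list; syms() is shown only as one possible way to turn strings into symbols, not necessarily what the original poster needs.

```r
library(rlang)

# Stand-in for the column names produced by colnames() in the question
var_list <- c("var1", "var2", "var3")

# quos() quotes its arguments: it captures the symbol `var_list`,
# not the three strings that var_list contains
captured <- quos(var_list)
length(captured)         # one quosure, not three
as_label(captured[[1]])  # the label is the variable name itself

# syms() converts each string into a symbol, giving one element per column
symbols <- syms(var_list)
length(symbols)
```

Since var_list already holds plain strings, mapping over the strings directly (as in the solution above) avoids quoting altogether.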


How about something like this?

library(purrr)
library(dplyr, warn.conflicts = FALSE)
df = mtcars
var_list <- df %>%
  select(1:4) %>%
  colnames() 

a_function <- function(var, data) {
  myvar = sym(var)
  data %>%
    summarise(
      Var_Name = var,
      mean = mean({{ myvar }}, na.rm = TRUE),
      sd   = sd({{ myvar}}, na.rm = TRUE)
    ) 
}

(My_result<- 
  map_dfr(
    .x = var_list,
    .f = a_function, 
    data = df
  )
)
#>   Var_Name      mean         sd
#> 1      mpg  20.09062   6.026948
#> 2      cyl   6.18750   1.785922
#> 3     disp 230.72188 123.938694
#> 4       hp 146.68750  68.562868

Created on 2022-10-05 by the reprex package (v2.0.1)
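
As a possible variation on the approach above, the .data pronoun lets summarise() look up a column from a string without creating a symbol first. This is only a sketch: a_function_data is a hypothetical name, and var_list is written out by hand instead of being derived with select().

```r
library(purrr)
library(dplyr, warn.conflicts = FALSE)

# Hypothetical variant of a_function: `var` is a column name given as a string
a_function_data <- function(var, data) {
  data %>%
    summarise(
      Var_Name = var,
      # .data[[var]] fetches the column whose name is stored in `var`
      mean = mean(.data[[var]], na.rm = TRUE),
      sd   = sd(.data[[var]], na.rm = TRUE)
    )
}

var_list <- c("mpg", "cyl", "disp", "hp")

My_result <- map_dfr(
  .x = var_list,
  .f = a_function_data,
  data = mtcars
)
My_result
```

The result should have the same shape as the output above: one row per column name, with Var_Name, mean and sd columns.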


Thanks michaelbgarcia, your solution works. There is another way of doing it with across(), but I was curious how to make it work with purrr::map.
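
For reference, an across() version might look like the sketch below. The across() code mentioned here was not posted, so the "{.col}__{.fn}" naming scheme and the pivot_longer() reshaping step are assumptions made to get the same Var_Name / mean / sd layout.

```r
library(dplyr, warn.conflicts = FALSE)
library(tidyr)

# Same toy data as in the earlier reply
df <- mtcars %>%
  select(var1 = cyl, var2 = mpg, var3 = disp)

My_result <- df %>%
  # Compute both statistics for every column; names look like "var1__mean"
  summarise(across(
    var1:var3,
    list(mean = ~ mean(.x, na.rm = TRUE),
         sd   = ~ sd(.x, na.rm = TRUE)),
    .names = "{.col}__{.fn}"
  )) %>%
  # Split "var1__mean" into a Var_Name column and one column per statistic
  pivot_longer(
    everything(),
    names_to = c("Var_Name", ".value"),
    names_sep = "__"
  )
My_result
```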

Hi scottyd22, thanks for your reply. Your solution works on the example you gave me, but when I switch to my actual problem I receive this:

Error in is.data.frame(x) : 
  'list' object cannot be coerced to type 'double'
In addition: Warning message:
In mean.default(d, na.rm = T) :
  argument is not numeric or logical: returning NA
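
The actual data is not shown, so the following is only a guess: this error and warning are what one would expect if the real "df" is a tibble (for example, read with readr) and not a base data.frame. Subsetting a tibble with data[, var] returns a one-column tibble, not a vector, and mean() and sd() cannot work with that. The sketch below uses made-up data to show the difference; data[[var]] returns a vector in both cases.

```r
library(tibble)

# Made-up data: the same column as a base data.frame and as a tibble
df_base <- data.frame(var1 = c(1, 2, 3))
df_tbl  <- as_tibble(df_base)

# A base data.frame drops a single column to a plain numeric vector
is.numeric(df_base[, "var1"])    # TRUE

# A tibble keeps it as a one-column tibble (a list underneath)
is.data.frame(df_tbl[, "var1"])  # TRUE

# [[ ]] extracts the column as a vector from either kind of object
mean(df_tbl[["var1"]], na.rm = TRUE)  # 2
```

If that is the cause, replacing d = data[,var] with d = data[[var]] in a_function should make it work for both data frames and tibbles.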
