Trying to learn purrr::map functions, and having trouble with map2 and a function that includes group_by. This could be something elementary that I am doing wrong, but I am not clear on what.
Reprex below
#load libraries
library(rlang)
library(tidyverse)
library(fivethirtyeight)
library(purrr)
# make function
mean_grouped <- function(data, groupvar, meanvar) {
  data %>%
    group_by({{ groupvar }}) %>%
    summarise(mean = mean({{ meanvar }}, na.rm = TRUE))
}
# test function
mean_grouped(starwars, homeworld, height)
#> # A tibble: 49 x 2
#> homeworld mean
#> <chr> <dbl>
#> 1 Alderaan 176.
#> 2 Aleen Minor 79
#> 3 Bespin 175
#> 4 Bestine IV 180
#> 5 Cato Neimoidia 191
#> 6 Cerea 198
#> 7 Champala 196
#> 8 Chandrila 150
#> 9 Concord Dawn 183
#> 10 Corellia 175
#> # … with 39 more rows
# try to purrr::map function over possible options for grouping var
# and for mean var
# first create df of all combinations
group_vars <- c("gender", "homeworld", "species",
                "hair_color", "eye_color", "skin_color")
mean_vars <- c("height", "mass", "birth_year")
vars_list <- list(x = group_vars, y = mean_vars)
cross_list <- cross_df(vars_list)
# Now map attempt
starwars %>%
  map2_dfr(.x = cross_list$x,
           .y = cross_list$y,
           .f = mean_grouped)
#> Error in UseMethod("group_by_"): no applicable method for 'group_by_' applied to an object of class "character"
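I think there are a couple of reasons your map2_dfr() is giving an error. The first is that, unlike many other tidyverse functions, the map family doesn't take the data frame you want to work on as its first argument; it takes the vectors to iterate over. When you pipe starwars into map2_dfr() with .x, .y, and .f already named, starwars falls through to ... and is handed to mean_grouped() as its third argument, so data ends up being a string such as "gender". That is why group_by complains about an object of class "character".
In the case of map2(), it expects .x and .y to be vectors, and you are correctly specifying them here as your grouping and mean vars. But you want to iterate over those vectors and use them as the arguments to mean_grouped() while it is applied to the starwars data frame, so starwars has to be supplied inside the function you map (for example in an anonymous function) rather than piped in.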
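To see where the piped value ends up, here is a minimal sketch. It uses a hypothetical helper, show_args(), that only reports which value landed in which argument, and a plain string standing in for the starwars data frame; neither is part of the original reprex.

```r
library(purrr)
library(magrittr)  # for %>%

# Hypothetical helper: reports which value each argument received
show_args <- function(data, groupvar, meanvar) {
  paste0("data=", data, ", groupvar=", groupvar, ", meanvar=", meanvar)
}

# Same call shape as the failing reprex; the string "starwars" stands in
# for the data frame. It is the only unnamed argument, so it lands in `...`
# and is forwarded to show_args() after the two mapped values.
piped <- "starwars" %>%
  map2_chr(.x = c("gender", "homeworld"),
           .y = c("height", "mass"),
           .f = show_args)
piped[1]
# "data=gender, groupvar=height, meanvar=starwars"

# Supplying the data inside an anonymous function puts each value
# in the intended argument
fixed <- map2_chr(c("gender", "homeworld"), c("height", "mass"),
                  function(g, m) show_args("starwars", g, m))
fixed[1]
# "data=starwars, groupvar=gender, meanvar=height"
```

In the first call the grouping variable is received as data, which matches the error about group_by being applied to a character object.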
A second issue is discussed in the last example here: https://www.tidyverse.org/articles/2019/06/rlang-0-4-0/. Because you are passing string vectors to your custom function (by way of map2()), you should refer to them using .data[[var]] rather than {{var}}.
I'm not sure exactly what the final output you're hoping for is, but below is an example that might be what you're after.
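A minimal sketch of that idea follows. The name mean_grouped_chr() is made up here to distinguish the string-based variant from your original function, and tidyr::expand_grid() is swapped in for cross_df(), which newer purrr releases deprecate; treat both as assumptions rather than the only way to do it.

```r
library(dplyr)
library(purrr)
library(tidyr)

# String-based variant of mean_grouped(): column names arrive as character
# values, so they are looked up with the .data pronoun instead of {{ }}
mean_grouped_chr <- function(data, groupvar, meanvar) {
  data %>%
    group_by(.data[[groupvar]]) %>%
    summarise(mean = mean(.data[[meanvar]], na.rm = TRUE))
}

group_vars <- c("gender", "homeworld", "species",
                "hair_color", "eye_color", "skin_color")
mean_vars <- c("height", "mass", "birth_year")

# All 18 combinations of grouping variable and mean variable
combos <- expand_grid(x = group_vars, y = mean_vars)

# Iterate over the two string vectors; starwars is supplied inside the
# anonymous function rather than piped into map2()
results <- map2(combos$x, combos$y,
                function(g, m) mean_grouped_chr(starwars, g, m))
names(results) <- paste(combos$x, combos$y, sep = "_")

results[["homeworld_height"]]
```

This returns a named list with one summary tibble per combination. Each tibble keeps its own grouping column name, so row-binding them directly would produce one column per grouping variable.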
As an FYI, there is an RStudio interactive tutorial on purrr via the swirl course "Advanced R Programming". The module is "Functional Programming with purrr".
Thanks! Ideally I would then gather the groupvars, and gather the meanvars, so that I end up with 3 columns of groupvar, meanvar, and value.
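One way that long layout could be sketched, assuming an extra column for the group level (here called group_value, a made-up name) is acceptable alongside the three you describe. Only two grouping and two mean variables are used to keep the example short, and the grouping columns are assumed to share a type (all character in starwars) so the pieces can be row-bound.

```r
library(dplyr)
library(purrr)

group_vars <- c("gender", "homeworld")
mean_vars <- c("height", "mass")

# Every pairing of grouping variable and mean variable, kept as strings
combos <- expand.grid(group_var = group_vars, mean_var = mean_vars,
                      stringsAsFactors = FALSE)

long_means <- map2(combos$group_var, combos$mean_var, function(g, m) {
  starwars %>%
    # Give the grouping column a common name so the results stack cleanly
    group_by(group_value = .data[[g]]) %>%
    summarise(value = mean(.data[[m]], na.rm = TRUE)) %>%
    # Record which variables produced these rows
    mutate(group_var = g, mean_var = m) %>%
    select(group_var, mean_var, group_value, value)
}) %>%
  bind_rows()

long_means
```

Each row then identifies the grouping variable, the variable that was averaged, the group level, and the mean.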
I end up a little disappointed that I cannot pass string vectors to a function with curly curly. Is this fixable? Part of the appeal of the tidyverse is that it mostly "just works", without a lot of funky exceptions.
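One possible workaround, if the curly-curly version of the function should stay as it is: convert each string to a symbol with rlang::sym() and inject it with !! at the call site. This is only a sketch of that pattern, with two variable pairs chosen for illustration, not a claim that it is the recommended approach.

```r
library(dplyr)
library(purrr)

# Original function from the question, unchanged
mean_grouped <- function(data, groupvar, meanvar) {
  data %>%
    group_by({{ groupvar }}) %>%
    summarise(mean = mean({{ meanvar }}, na.rm = TRUE))
}

# Strings are turned into symbols and injected with !!, so the function
# receives them as if bare column names had been typed
by_string <- map2(
  c("gender", "homeworld"),
  c("height", "mass"),
  function(g, m) mean_grouped(starwars, !!rlang::sym(g), !!rlang::sym(m))
)

by_string[[1]]
```

The function still works with bare names, as in mean_grouped(starwars, homeworld, height); the conversion from strings happens only where the strings are available.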