Unable to inject variable name to `filter()` in a purrr::map() context

Hi all,

I ran into an issue which I do not understand. I can pass a variable name as a string to a function and, with the rlang `.data` pronoun, hand it to dplyr to run a filter(), as shown below.

library(tidyverse)

## Injecting in a function context [THIS WORKS]
foo <- function(.x) {
  filter(mtcars, .data[[.x]] > 300)
}
foo("hp")
#>               mpg cyl disp  hp drat   wt qsec vs am gear carb
#> Maserati Bora  15   8  301 335 3.54 3.57 14.6  0  1    5    8
foo("disp")
#>                      mpg cyl disp  hp drat    wt  qsec vs am gear carb
#> Hornet Sportabout   18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
#> Duster 360          14.3   8  360 245 3.21 3.570 15.84  0  0    3    4
#> Cadillac Fleetwood  10.4   8  472 205 2.93 5.250 17.98  0  0    3    4
#> Lincoln Continental 10.4   8  460 215 3.00 5.424 17.82  0  0    3    4
#> Chrysler Imperial   14.7   8  440 230 3.23 5.345 17.42  0  0    3    4
#> Dodge Challenger    15.5   8  318 150 2.76 3.520 16.87  0  0    3    2
#> AMC Javelin         15.2   8  304 150 3.15 3.435 17.30  0  0    3    2
#> Camaro Z28          13.3   8  350 245 3.73 3.840 15.41  0  0    3    4
#> Pontiac Firebird    19.2   8  400 175 3.08 3.845 17.05  0  0    3    2
#> Ford Pantera L      15.8   8  351 264 4.22 3.170 14.50  0  1    5    4
#> Maserati Bora       15.0   8  301 335 3.54 3.570 14.60  0  1    5    8

However, this code breaks when put into a purrr::map() context, and I do not understand why. Please note that this has been simplified from a more complex example, hence the need for group_by()/map().

library(tidyverse)

## Injecting in a purrr::map context [ERROR]
df <- tibble(var = c("hp", "disp")) %>%
  group_by(var) %>% 
  mutate(out = map(
    .x = var,
    .f = ~filter(mtcars, .data[[.x]] > 300) # If this code is replaced with message("Filtering data based on ", .x)  the variable name is properly passed to .f
  )
  )
#> Error in local_error_context(dots = dots, .index = i, mask = mask): promise already under evaluation: recursive default argument reference or earlier problems?

All suggestions are welcome.

Created on 2022-08-03 by the reprex package (v2.0.1)

Hello,

I do not understand what this second piece of code is supposed to do. Can you show the expected output and describe a bit better what the mapping is for? The function in the map needs to be a function that's applied to the current data, not output generated by filtering something else...

PJ

Hi thank you for your reply,

The second piece of code was expected to behave like the first piece since the code is the same but called from within purrr::map instead of from a standard function.

The motivation behind the original code is that I have a large dataset from which I want to create different subsets based on several criteria (which I can then feed into models or diagnostic plots). To do this I generate a tibble in which each row contains the criteria for one specific subset of the original data. The criteria vary but the original data does not, hence the choice to map over the arguments rather than over the original data. This approach usually works very well for me, but I do not understand the error I am getting from rlang.

Below is a slightly more complex example which hopefully better captures what I am trying to achieve:

library(tidyverse)

df <- tibble(
  ## Define variables on which the data should be filtered
  var    = c("hp", "disp"),
  
  ## Define cutoff values for the filtering
  cutoff = c(300, 400)
) %>% 
  group_by(var, cutoff) %>% 
  
  ## Creating the data subsets
  mutate(out = map2(
    .x = var,
    .y = cutoff,
    .f = ~filter(input, .data[[.x]] > .y),
    input = mtcars
  )
  )
#> Error in local_error_context(dots = dots, .index = i, mask = mask): promise already under evaluation: recursive default argument reference or earlier problems?

Created on 2022-08-04 by the reprex package (v2.0.1)

Of note, I was expecting this code to add a list column (out) to my tibble (df), containing the corresponding subsets.


Here is code with two approaches. The second uses the slider package.

library(tidyverse)

df <- tibble(
  var    = c("hp", "disp"),
  cutoff = c(300, 400)
) 

df |> rowwise() |> 
  group_split() |> map(
    ~{
    filter(mtcars,
           !!sym( .x[["var"]]) >  .x[["cutoff"]])
    }
  )



library(slider)

slider::slide2(df$var,
               df$cutoff,
               ~filter(mtcars,
                       !!sym(.x) > .y)
               )
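
Both approaches above swap the `.data[[.x]]` pronoun for `!!sym(.x)`. As a minimal sketch (the object names `via_sym` and `via_pronoun` are made up for illustration), the two spellings can be compared in a plain `filter()` call, outside of any `mutate()`, to check that they select the same rows:

```r
library(dplyr)

col <- "hp"

# Unquote a symbol built from the string (sym() called via rlang:: explicitly).
via_sym <- filter(mtcars, !!rlang::sym(col) > 300)

# Subset the .data pronoun with the string.
via_pronoun <- filter(mtcars, .data[[col]] > 300)

# TRUE if both spellings return the same data frame.
isTRUE(all.equal(via_sym, via_pronoun))
```

This only shows that the two forms agree in a simple call; it does not explain the error raised inside `mutate()`.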

Thank you for your reply @nirgrahamuk. These are two nice workarounds for my problem. But would you happen to know why the unquoting of the variable name fails in the example I provided?

I don't know. There's probably some pronoun ambiguity going on, or something about the order of evaluation. The closest I can get to resurrecting your own code is the following (though I prefer my earlier solutions).

library(tidyverse)


myfilt <- function(x,y){
  filter(mtcars,
         .data[[x]] > y)
}

df <- tibble(
  ## Define variables on which the data should be filtered
  var    = c("hp", "disp"),
  
  ## Define cutoff values for the filtering
  cutoff = c(300, 400)
) %>% 
  group_by(var, cutoff) %>% 
  ## Creating the data subsets
  mutate(out = map2(
    .x = var,
    .y = cutoff,
    .f = myfilt
  ))
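
If the list column does not have to be built inside `mutate()`, a named helper like the one above can also be mapped over the criteria columns directly, which keeps the `filter()` call out of `mutate()`'s evaluation entirely. This is only a sketch: `subset_above` and `criteria` are hypothetical names, and the data is passed in as an argument rather than hard-coded.

```r
library(dplyr)
library(purrr)
library(tibble)

# Hypothetical helper: keep rows of `data` where column `var` exceeds `cutoff`.
subset_above <- function(data, var, cutoff) {
  filter(data, .data[[var]] > cutoff)
}

criteria <- tibble(
  var    = c("hp", "disp"),
  cutoff = c(300, 400)
)

# Build the list column outside of mutate(), one subset per row of `criteria`.
criteria$out <- map2(
  criteria$var,
  criteria$cutoff,
  function(v, k) subset_above(mtcars, v, k)
)

# Number of rows in each subset, in the same order as `criteria`.
map_int(criteria$out, nrow)
```

Passing the data as the first argument makes it easy to reuse the helper with a different dataset later on.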

@nirgrahamuk thanks a lot for looking into this!