How to better use pmap inside mutate?

What do you recommend for using pmap inside mutate? I am looking for code that is both readable and efficient.
In my real application the data is more complex (nested data frames), and new columns are added by a series of piped mutate statements in which pmap invokes some custom functions.

pmap is easy to use with data frames because the column names are preserved: arguments are matched by name, so the column order does not matter.

Here I am reusing an example from Advanced R.

library(tidyverse)

params <- tibble::tribble(
  ~ n, ~ min, ~ max,
   1L,     2,    10,
   2L,     4,   100,
   3L,     8,  1000
)

params |> pmap(runif)
#> [[1]]
#> [1] 4.022187
#> 
#> [[2]]
#> [1]  8.699843 89.863384
#> 
#> [[3]]
#> [1] 278.3641 781.0018 194.5025

params |> select(max, min, n) |> pmap(runif)
#> [[1]]
#> [1] 6.974362
#> 
#> [[2]]
#> [1] 65.57562 90.35880
#> 
#> [[3]]
#> [1] 143.0269 628.9549 297.8434

params |> select(min, n, max) |> pmap(runif)
#> [[1]]
#> [1] 7.684047
#> 
#> [[2]]
#> [1] 33.58746 24.00720
#> 
#> [[3]]
#> [1] 592.7216 171.2659 581.2717
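
Because the three runs above draw different random numbers, the outputs alone do not prove that the column order is ignored. A minimal check, assuming the same `params` tibble and an arbitrary seed (the seed value and the two variable names are only for illustration), is to reset the seed before each call and compare the results:

```r
library(dplyr)
library(purrr)

params <- tibble::tribble(
  ~ n, ~ min, ~ max,
   1L,     2,    10,
   2L,     4,   100,
   3L,     8,  1000
)

# Reset the seed before each call so that any difference could only
# come from how the columns are matched to the arguments of runif().
set.seed(42)
by_original_order <- params |> pmap(runif)

set.seed(42)
by_shuffled_order <- params |> select(max, min, n) |> pmap(runif)

# Compare the two result lists.
identical(by_original_order, by_shuffled_order)
```

If matching is done by name, as described above, the two lists should be identical.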

A new column can be added in the following way, but this solution does not carry over to a series of piped mutate statements, because the initial data frame (params) has to appear on both the left and the right side of the pipe operator.

params |> mutate(result = pmap(params , runif))
#> # A tibble: 3 x 4
#>       n   min   max result   
#>   <int> <dbl> <dbl> <list>   
#> 1     1     2    10 <dbl [1]>
#> 2     2     4   100 <dbl [2]>
#> 3     3     8  1000 <dbl [3]>

Something like the following would be ideal, but the .data pronoun does not work here (it is meant for a different use).

params |> mutate(result = pmap(.data , runif))
#> Error in `mutate()`:
#> ! Problem while computing `result = pmap(.data, runif)`.
#> Caused by error in `stop_bad_type()`:
#> ! Element 1 of `.l` must be a vector, not an environment
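
One possible way to get the "current data frame" inside mutate, assuming dplyr 1.1.0 or later is installed, is `pick()`, which returns the selected columns as a tibble. This is only a sketch under that version assumption; the name `with_result` is just for illustration:

```r
library(dplyr)
library(purrr)

params <- tibble::tribble(
  ~ n, ~ min, ~ max,
   1L,     2,    10,
   2L,     4,   100,
   3L,     8,  1000
)

# pick() builds a tibble from the named columns of the current data,
# so pmap() still matches the arguments of runif() by column name
# and `params` does not have to be repeated on the right-hand side.
with_result <- params |>
  mutate(result = pmap(pick(n, min, max), runif))

with_result
```

Selecting the columns explicitly seems safer than `pick(everything())` here: any extra column would also be passed to `runif()`, which would fail with an unused-argument error.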

My alternative solution so far is the following. Its main drawback is that the names are not preserved, so the arguments are matched by position instead. I am also not sure whether this code is efficient, considering the additional copying.

params |> mutate(result = pmap(list(n, min, max) , runif))
#> # A tibble: 3 x 4
#>       n   min   max result   
#>   <int> <dbl> <dbl> <list>   
#> 1     1     2    10 <dbl [1]>
#> 2     2     4   100 <dbl [2]>
#> 3     3     8  1000 <dbl [3]>

params |> mutate(result = pmap(list(min, n, max) , runif))
#> # A tibble: 3 x 4
#>       n   min   max result   
#>   <int> <dbl> <dbl> <list>   
#> 1     1     2    10 <dbl [2]>
#> 2     2     4   100 <dbl [4]>
#> 3     3     8  1000 <dbl [8]>

Can you suggest any better solution? Thank you.

Here are two other formulations that work:

params %>% mutate(result = pmap(list(min=min, n=n, max=max) , runif))
params %>% mutate(result = pmap(list(min,n,max) , ~runif(..2,..1,..3)))
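
A third option, shown here only as a sketch, is to drop pmap and use `rowwise()`, so that the function is called with the column names written out as ordinary arguments. The name `rowwise_result` is just for illustration:

```r
library(dplyr)

params <- tibble::tribble(
  ~ n, ~ min, ~ max,
   1L,     2,    10,
   2L,     4,   100,
   3L,     8,  1000
)

# With rowwise(), each column is a single value inside mutate(),
# so runif() is called once per row. The result is wrapped in list()
# because it can have more than one element per row.
rowwise_result <- params |>
  rowwise() |>
  mutate(result = list(runif(n, min, max))) |>
  ungroup()   # drop the row-wise grouping before further steps

rowwise_result
```

This keeps each mutate step self-contained in a longer pipe, at the cost of having to remember the `ungroup()`.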

I would prefer a more "compact" solution (especially with more arguments), but this one is perfectly fine. Thank you.

On the other hand, I was wondering about the efficiency of this approach (nested data frame + pmap). I played a bit with profiling, but have no conclusion yet.
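
For a first rough comparison, something like the sketch below could be used. It relies only on base `system.time()`; the tibble `big`, its size, and the two variants being compared are assumptions made for illustration, not measurements from the real application:

```r
library(dplyr)
library(purrr)

# Hypothetical larger input: 3000 rows with the same three columns.
big <- tibble::tibble(n = rep(1:3, 1000), min = 0, max = 1)

# Variant 1: pmap() over a named list of columns.
t_pmap <- system.time(
  res_pmap <- big |>
    mutate(result = pmap(list(n = n, min = min, max = max), runif))
)

# Variant 2: rowwise() with a direct call to runif().
t_rowwise <- system.time(
  res_rowwise <- big |>
    rowwise() |>
    mutate(result = list(runif(n, min, max))) |>
    ungroup()
)

# Elapsed wall-clock time in seconds for each variant.
c(pmap = t_pmap[["elapsed"]], rowwise = t_rowwise[["elapsed"]])
```

A single `system.time()` run is noisy, so the numbers are only a rough indication; repeating the measurement or using a dedicated benchmarking package would give a more reliable picture.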
