purrr pmap over 1 list, and several vectors

wvictor14 · June 6, 2019, 5:12pm

I'm learning how to use purrr and thought it would be useful in keeping track of some calculations.

However, I'm not sure why I can't do a particular operation using purrr::pmap involving the following components:

1. List with each element of length n
2. Vector of length 1
3. Vector of length 1
4. Vector of length n

Items 1, 2, and 3 are all columns of the same data frame (named 'operations_df'). Item 4 is outside the data frame, but it is a vector of the same length as each list element (which are all the same length). So the function call basically involves multiplying each vector in 1 elementwise by 4, summing the result, and then combining that single value with 2 and 3 (scaling and subtracting).

This works fine if I break things up into separate map2 calls. But I'm wondering how I can get this to work in a single pmap call?

library(purrr)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

# generate data
data <- rbeta(n = 10, shape1 = 80, shape2 = 80)

prob_k1 <- rbeta(n = 10, shape1 = 80, shape2 = 10)
prob_k2 <- 1-prob_k1


# perform operations on prob_k and data in a data.frame
operations_df <- tibble(components = c('1', '2'),
                        probability = list(prob_k1, prob_k2)) %>%
  
  # sum over list column
  mutate(n = map_dbl(probability, sum)) %>%
  
  # mean for each row, using list column and a single 1-element vector
  mutate(mu = map2_dbl(probability, n, ~ (1/.y) * sum(data * .x))) 

operations_df
#> # A tibble: 2 x 4
#>   components probability     n    mu
#>   <chr>      <list>      <dbl> <dbl>
#> 1 1          <dbl [10]>   8.93 0.504
#> 2 2          <dbl [10]>   1.07 0.506
  
# this doesn't work
# variance for each row, using list column, and two 1-element vectors
operations_df %>%
  mutate(var = pmap_dbl(probability, n, mu, ~ (1/(..2-1)) * sum(..1 * data^2) - ..3^2))
#> Result 1 must be a single double, not NULL of length 0

# this does work
(1/(operations_df$n[1]-1)) * sum(operations_df$probability[[1]] * data^2) - operations_df$mu[1]^2
#> [1] 0.0342961
(1/(operations_df$n[2]-1)) * sum(operations_df$probability[[2]] * data^2) - operations_df$mu[2]^2
#> [1] 3.800814

# breaking it up into two map2 calls works:
operations_df %>%
  mutate(var = map2_dbl(n, probability, ~ (1/(.x-1)) * sum(.y * data^2))) %>%
  mutate(var = map2_dbl(var, mu, ~ .x - .y^2))
#> # A tibble: 2 x 5
#>   components probability     n    mu    var
#>   <chr>      <list>      <dbl> <dbl>  <dbl>
#> 1 1          <dbl [10]>   8.93 0.504 0.0343
#> 2 2          <dbl [10]>   1.07 0.506 3.80

aosmith · June 6, 2019, 5:43pm

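The pmap() function expects its inputs as a single list in its first argument (.l), with the function as its second argument (.f), instead of as separate vectors. In your call only probability was treated as the input list, and n ended up in the position where pmap() looks for the function.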

So to do the work with pmap() you can wrap the variables in list().

operations_df %>%
    mutate(var = pmap_dbl(list(probability, n, mu), ~ (1/(..2-1)) * sum(..1 * data^2) - ..3^2))

# A tibble: 2 x 5
  components probability     n    mu    var
  <chr>      <list>      <dbl> <dbl>  <dbl>
1 1          <dbl [10]>   8.84 0.505 0.0335
2 2          <dbl [10]>   1.16 0.506 1.59
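
As a possible variation on the same idea, here is a minimal sketch that names the elements of the list passed to pmap_dbl(), so the function can refer to its arguments by name instead of ..1, ..2, ..3. The seed and the names var_pmap and var_direct are assumptions added for illustration and are not from the original post; the formula is the same one as in the question.

```r
library(purrr)

# Hypothetical seed, only so the sketch is reproducible
set.seed(2019)

# Same kind of data as in the question
data <- rbeta(n = 10, shape1 = 80, shape2 = 80)
prob_k1 <- rbeta(n = 10, shape1 = 80, shape2 = 10)
probability <- list(prob_k1, 1 - prob_k1)

# Per-component sum and weighted mean, as in the question
n <- map_dbl(probability, sum)
mu <- map2_dbl(probability, n, ~ (1 / .y) * sum(data * .x))

# pmap_dbl() takes ONE list of inputs; when that list is named,
# its elements are matched to the function's arguments by name
var_pmap <- pmap_dbl(
  list(probability = probability, n = n, mu = mu),
  function(probability, n, mu) {
    (1 / (n - 1)) * sum(probability * data^2) - mu^2
  }
)

# Direct calculation for comparison, one component at a time
var_direct <- c(
  (1 / (n[1] - 1)) * sum(probability[[1]] * data^2) - mu[1]^2,
  (1 / (n[2] - 1)) * sum(probability[[2]] * data^2) - mu[2]^2
)

print(var_pmap)
print(var_direct)
```

Inside mutate() the same pattern should work with list(probability = probability, n = n, mu = mu) built from the data frame's columns. Naming the arguments is mainly a readability choice once there are more than two inputs.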
