I'm learning how to use purrr and thought it would be useful for keeping track of some calculations.
However, I'm not sure why I can't do a particular operation using purrr::pmap involving the following components:
1. List with each element of length n
2. Vector of length 1
3. Vector of length 1
4. Vector of length n
Items 1., 2., and 3. are all columns of the same data frame (named 'operations_df'). Item 4. sits outside the data frame but is a vector of the same length as each list element (which are all the same length). So the function call basically involves multiplying each vector in 1. elementwise by 4., summing the products, and then combining the resulting one-element vector with 2. and 3. through simple arithmetic (adding / subtracting).
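Written out, the per-row calculation in the code appears to be the following. The symbols are labels introduced here for illustration only: $x_i$ stands for `data`, $p_{ik}$ for the $i$-th entry of the `probability` vector in row $k$, $N$ for the common vector length (10 in the example), and $n_k$ and $\mu_k$ for the `n` and `mu` columns.

```latex
n_k = \sum_{i=1}^{N} p_{ik}, \qquad
\mu_k = \frac{1}{n_k} \sum_{i=1}^{N} p_{ik}\, x_i, \qquad
\mathrm{var}_k = \frac{1}{n_k - 1} \sum_{i=1}^{N} p_{ik}\, x_i^{2} - \mu_k^{2}
```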
This works fine if I break things up into separate map2 calls. But I'm wondering how I can get this to work in one line with pmap?
    library(purrr)
    library(dplyr)
    #>
    #> Attaching package: 'dplyr'
    #> The following objects are masked from 'package:stats':
    #>
    #>     filter, lag
    #> The following objects are masked from 'package:base':
    #>
    #>     intersect, setdiff, setequal, union

    # generate data
    data <- rbeta(n = 10, shape1 = 80, shape2 = 80)
    prob_k1 <- rbeta(n = 10, shape1 = 80, shape2 = 10)
    prob_k2 <- 1-prob_k1

    # perform operations on prob_k and data in a data.frame
    operations_df <- tibble(components = c('1', '2'),
                            probability = list(prob_k1, prob_k2)) %>%
      # sum over list column
      mutate(n = map_dbl(probability, sum)) %>%
      # mean for each row, using list column and a single 1-element vector
      mutate(mu = map2_dbl(probability, n, ~ (1/.y) * sum(data * .x)))

    operations_df
    #> # A tibble: 2 x 4
    #>   components probability     n    mu
    #>   <chr>      <list>      <dbl> <dbl>
    #> 1 1          <dbl [10]>   8.93 0.504
    #> 2 2          <dbl [10]>   1.07 0.506

    # this doesn't work
    # variance for each row, using list column, and two 1-element vectors
    operations_df %>%
      mutate(var = pmap_dbl(probability, n, mu, ~ (1/(..2-1)) * sum(..1 * data^2) - ..3^2))
    #> Result 1 must be a single double, not NULL of length 0

    # this does work
    (1/(operations_df$n[1]-1)) * sum(operations_df$probability[[1]] * data^2) - operations_df$mu[1]^2
    #> [1] 0.0342961
    (1/(operations_df$n[2]-1)) * sum(operations_df$probability[[2]] * data^2) - operations_df$mu[2]^2
    #> [1] 3.800814

    # breaking it up into two map2 calls works:
    operations_df %>%
      mutate(var = map2_dbl(n, probability, ~ (1/(.x-1)) * sum(.y * data^2))) %>%
      mutate(var = map2_dbl(var, mu, ~ .x - .y^2))
    #> # A tibble: 2 x 5
    #>   components probability     n    mu    var
    #>   <chr>      <list>      <dbl> <dbl>  <dbl>
    #> 1 1          <dbl [10]>   8.93 0.504 0.0343
    #> 2 2          <dbl [10]>   1.07 0.506 3.80
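For comparison, here is a minimal sketch of what a one-line pmap version might look like. It relies on the documented signature `pmap(.l, .f, ...)`, where all inputs go into a single list `.l`. In the failing call above, `probability` would then be taken as `.l` and `n` as `.f`; since purrr turns a numeric `.f` into an extractor, that would plausibly explain the `NULL` result, though this reading is an interpretation rather than something confirmed here. The `set.seed()` call and the names `result` and `two_step` are additions for illustration and are not part of the original code.

```r
library(purrr)
library(dplyr)

set.seed(1)  # assumption: added only so the sketch is reproducible

# same setup as in the question
data <- rbeta(n = 10, shape1 = 80, shape2 = 80)
prob_k1 <- rbeta(n = 10, shape1 = 80, shape2 = 10)
prob_k2 <- 1 - prob_k1

operations_df <- tibble(components = c("1", "2"),
                        probability = list(prob_k1, prob_k2)) %>%
  mutate(n = map_dbl(probability, sum)) %>%
  mutate(mu = map2_dbl(probability, n, ~ (1 / .y) * sum(data * .x)))

# pmap_dbl() takes ONE list of inputs as its first argument, so the three
# columns are wrapped in list(); ..1, ..2 and ..3 then refer to
# probability, n and mu respectively
result <- operations_df %>%
  mutate(var = pmap_dbl(list(probability, n, mu),
                        ~ (1 / (..2 - 1)) * sum(..1 * data^2) - ..3^2))

# the two-step map2 version from the question, kept for comparison
two_step <- operations_df %>%
  mutate(var = map2_dbl(n, probability, ~ (1 / (.x - 1)) * sum(.y * data^2))) %>%
  mutate(var = map2_dbl(var, mu, ~ .x - .y^2))

# check that both approaches agree
all.equal(result$var, two_step$var)
```

Naming the list elements and using an ordinary function, e.g. `pmap_dbl(list(p = probability, n = n, mu = mu), function(p, n, mu) ...)`, is an alternative that avoids the positional `..1`/`..2`/`..3` placeholders.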