 # Should I move away from do() and rowwise()?

I'm not sure I understand `do()` correctly, but which approach is faster may depend on the relative cost of allocation.

As far as I can tell, `do()` iterates over the groups by index and runs the calculation without allocating a separate data.frame per group, whereas `nest()` actually splits the data.frame into many pieces, which needs allocation and thus takes time.

Put another way, `nest()` gives you data.frames that are already split and can be reused, while `do()` can't. So, if you run calculations over the same data.frame many times, `nest()` + `map()` can be faster, because the cost of splitting is paid only once.

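```r
library(dplyr)
library(tidyr)
library(purrr)
library(microbenchmark)

# `xx` and `mean_and_sd()` are assumed to be defined already (not shown in this post)
g <- xx %>%
  group_by(x, y)

microbenchmark(
  # do(): every call has to walk the groups again
  usedo = {
    do(g, zz = mean_and_sd(.$z))
    do(g, zz = mean_and_sd(.$z))
  },
  # nest(): split once, then reuse the split data for each map() pass
  usemap = {
    # ungroup() is needed with tidyr >= 1.0, where nest() keeps the grouping;
    # it is a no-op on older versions
    n <- ungroup(nest(g))
    transmute(n, x = x, y = y, zz = map(data, ~ mean_and_sd(.$z)))
    transmute(n, x = x, y = y, zz = map(data, ~ mean_and_sd(.$z)))
  },
  times = 20
)
#> Unit: milliseconds
#>    expr      min        lq      mean    median        uq       max neval
#>   usedo 909.9741 1040.7445 1164.2193 1190.0480 1290.1719 1361.6217    20
#>  usemap 533.2164  651.7122  735.9906  716.0828  803.2196  948.5835    20
```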
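
If anyone wants to try this end to end, here is a minimal sketch. Note that `xx` and `mean_and_sd()` are not defined in this post, so the toy data and the helper below are hypothetical stand-ins of my own, not the originals. It only checks that both approaches give the same per-group values; timings will depend on your data and package versions.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Hypothetical stand-ins for the undefined `xx` and `mean_and_sd()`
set.seed(1)
xx <- tibble(
  x = sample(letters[1:5], 1000, replace = TRUE),
  y = sample(1:4, 1000, replace = TRUE),
  z = rnorm(1000)
)
mean_and_sd <- function(v) c(mean = mean(v), sd = sd(v))

g <- group_by(xx, x, y)

# do(): evaluates the expression once per group; a named argument
# produces a list-column called `zz`
res_do <- do(g, zz = mean_and_sd(.$z))

# nest(): materialise the per-group data.frames once ...
n <- ungroup(nest(g))

# ... then reuse `n` for as many map() passes as needed
res_map <- n %>%
  transmute(x, y, zz = map(data, ~ mean_and_sd(.x$z))) %>%
  arrange(x, y)

# Both results hold the same per-group values
all.equal(res_do$zz, res_map$zz)
```

The point of keeping `n` around is that the split happens once, so any further `map()` pass over `n$data` skips that cost.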

6 posts were split to a new topic: Is nest() + mutate() + map() + unnest() really the best alternative to dplyr::do()