The function descriptions are very helpful (for some of them at least).
1st case:
library(dplyr)
mtcars %>%
group_by(cyl) %>%
summarise_if(is.double, mean)
#> # A tibble: 3 x 11
#> cyl mpg disp hp drat wt qsec vs am gear carb
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 4 26.7 105. 82.6 4.07 2.29 19.1 0.909 0.727 4.09 1.55
#> 2 6 19.7 183. 122. 3.59 3.12 18.0 0.571 0.429 3.86 3.43
#> 3 8 15.1 353. 209. 3.23 4.00 16.8 0 0.143 3.29 3.5
From summarise() help file:
‘summarise()’ is typically used on grouped data created by
‘group_by()’. The output will have one row for each group.
So summarise() works at the group level.
‘summarise_if()’ operates on columns for which a predicate returns ‘TRUE’.
So you get one mean per group for every column of type double (here, every column except the grouping column cyl, which is kept as the group key), and only one row per group (here, 3).
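To see what "one row per group" means for a single cell, here is a small base-R cross-check. It is only a sketch to illustrate the point, and the name four_cyl is made up for this example:

```r
# Rows of mtcars that belong to the cyl == 4 group.
four_cyl <- mtcars[mtcars$cyl == 4, ]

# All of these rows collapse into the single cyl == 4 row of the summary.
nrow(four_cyl)

# Mean of mpg within that group: about 26.7, matching row 1 of the output above.
mean(four_cyl$mpg)
```

The other cells of the summary are computed the same way, one group and one column at a time.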
2nd case:
library(purrr)
mtcars %>%
group_by(cyl) %>%
modify_if(is.double, mean) %>%
head(3)
#> # A tibble: 3 x 11
#> # Groups: cyl [1]
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 20.1 6.19 231. 147. 3.60 3.22 17.8 0.438 0.406 3.69 2.81
#> 2 20.1 6.19 231. 147. 3.60 3.22 17.8 0.438 0.406 3.69 2.81
#> 3 20.1 6.19 231. 147. 3.60 3.22 17.8 0.438 0.406 3.69 2.81
From modify() help file:
‘modify()’ is a short-cut for ‘x[ ] <- map(x, .f); return(x)’.
‘modify_if()’ only modifies the elements of ‘x’ that satisfy a
predicate and leaves the others unchanged.
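So modify() replaces every value in your data frame with the mean of its column, and you get a data frame with the same number of rows as your input. The groups have no effect: every row holds the overall means, and even cyl itself is overwritten (which is why the output shows a single group).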
And since you have doubles everywhere, you would have gotten the same thing simply with:
mtcars %>%
group_by(cyl) %>%
modify(mean) %>%
head(3)
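To make the help file's short-cut concrete, here is the same assignment written out by hand on the ungrouped data. This is a sketch of the idea rather than purrr's actual implementation, and x is just a scratch copy:

```r
library(purrr)

x <- mtcars
# map() returns one length-1 mean per column; assigning into x[] keeps the
# data frame's shape, so each mean is recycled down its whole column.
x[] <- map(x, mean)

nrow(x)        # still 32 rows, same as mtcars
unique(x$mpg)  # a single value: the overall mpg mean, about 20.1
head(x, 3)
```

Nothing in that assignment knows about groups, which is why group_by() makes no difference here.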
3rd case:
mtcars %>%
group_by(cyl) %>%
map_df(mean)
#> # A tibble: 1 x 11
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 20.1 6.19 231. 147. 3.60 3.22 17.8 0.438 0.406 3.69 2.81
map() does not take the groups into account either, but unlike modify(), it does not "modify" the input. Instead, it returns the mean of each column computed over all rows (so only one mean per variable).
map() would have returned a list, but with map_df() you get a data frame with those means. Hence the single row.
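A quick way to see the difference between the two return types, using the ungrouped data since the groups are ignored anyway. This is a sketch: the names as_list and as_row are made up for the example, and map_df() needs dplyr to be installed:

```r
library(purrr)

as_list <- map(mtcars, mean)     # named list, one element per column
as_row  <- map_df(mtcars, mean)  # the same means bound into one data frame row

class(as_list)   # "list"
length(as_list)  # 11, one mean per column
dim(as_row)      # 1 row, 11 columns
```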
You could have added a 4th case:
mtcars %>%
group_by(cyl) %>%
mutate_if(is.double, mean)
#> # A tibble: 32 x 11
#> # Groups: cyl [3]
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 19.7 6 183. 122. 3.59 3.12 18.0 0.571 0.429 3.86 3.43
#> 2 19.7 6 183. 122. 3.59 3.12 18.0 0.571 0.429 3.86 3.43
#> 3 26.7 4 105. 82.6 4.07 2.29 19.1 0.909 0.727 4.09 1.55
#> 4 19.7 6 183. 122. 3.59 3.12 18.0 0.571 0.429 3.86 3.43
#> 5 15.1 8 353. 209. 3.23 4.00 16.8 0 0.143 3.29 3.5
#> 6 19.7 6 183. 122. 3.59 3.12 18.0 0.571 0.429 3.86 3.43
#> 7 15.1 8 353. 209. 3.23 4.00 16.8 0 0.143 3.29 3.5
#> 8 26.7 4 105. 82.6 4.07 2.29 19.1 0.909 0.727 4.09 1.55
#> 9 26.7 4 105. 82.6 4.07 2.29 19.1 0.909 0.727 4.09 1.55
#> 10 19.7 6 183. 122. 3.59 3.12 18.0 0.571 0.429 3.86 3.43
#> # ... with 22 more rows
mutate(), like summarise(), takes the groups into account, but it does not collapse the data frame down to one "summary" row per group. Instead, all the rows are kept (as with modify()), and each row holds the means of its own group.
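Comparing the shapes side by side makes the contrast with summarise() easy to check. This is a sketch with made-up names (grouped, summarised, mutated):

```r
library(dplyr)

grouped <- group_by(mtcars, cyl)

summarised <- summarise_if(grouped, is.double, mean)
mutated    <- mutate_if(grouped, is.double, mean)

nrow(summarised)         # 3: one row per group
nrow(mutated)            # 32: every input row is kept
n_distinct(mutated$mpg)  # 3: each row carries the mean of its own group
```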
As for your additional question in your edit:
It doesn't really matter. The main difference between modify() and mutate() is that the former does not take groups into account while the latter does. But since you are not using group_by() in that last example, the outputs are pretty much the same (except that mutate() got rid of the row names here and modify() did not; this may depend on your dplyr version). So I guess you could pick one or the other depending on whether you want the row names or not.