Is colMeans(df) the same as map_dbl(df, mean)?

I'm learning purrr.
I understand that furrr runs functions on multiple cores, which is why it is fast.
What I don't understand is what the benefits of purrr itself are.

For example, suppose I have this data:


library(tibble)

# Four numeric columns of 100 million random values each
df <- tibble(
  a = rnorm(100000000),
  b = rnorm(100000000),
  c = rnorm(100000000),
  d = rnorm(100000000)
)

The three approaches below all take about the same time to run:

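library(tictoc)
library(robustbase)  # provides colMedians()
library(purrr)       # provides map_dbl()

# 1. Explicit for loop over the columns
output <- vector("double", length(df))
for (i in seq_along(df)) {
  output[[i]] <- median(df[[i]])
}

# 2. Column medians on a matrix
colMedians(as.matrix(df))

# 3. purrr: apply median() to each column
map_dbl(df, median)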

Can you explain the low-level differences in how these approaches work internally?

I've heard that map is implemented in C and is a bit faster, but is the difference too small to notice at this scale?
Is the other possible advantage that the code is easier to read once you get used to it?

Thank you.

colMeans() is written in C; you can check the source code.

There is essentially no difference between using colMeans(df) and purrr::map_dbl(df, mean) except that the latter has a tiny bit of additional overhead.
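A minimal sketch of how this could be checked, assuming purrr is installed. The small data frame, the seed, and the repeat count of 100 are illustrative assumptions, not values from the question; timings will vary by machine, so none are shown.

```r
library(purrr)

set.seed(1)
# Hypothetical data, far smaller than the 100-million-row example in the question
df <- data.frame(a = rnorm(1e5), b = rnorm(1e5), c = rnorm(1e5), d = rnorm(1e5))

res_col <- colMeans(df)       # converts df to a matrix, then averages each column in C
res_map <- map_dbl(df, mean)  # calls mean() once per column, returns a double vector

# Both are named numeric vectors with one value per column;
# they should agree up to floating-point tolerance
print(all.equal(res_col, res_map))

# Rough timing of repeated calls; exact numbers depend on the machine
print(system.time(for (i in 1:100) colMeans(df)))
print(system.time(for (i in 1:100) map_dbl(df, mean)))
```

With only a handful of columns, either call spends most of its time in the per-column arithmetic, so any per-call overhead from map_dbl() is likely to be hard to see.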


@martin.R

Thank you!