Split dataframe by two columns

I am starting to incorporate the purrr package in my daily work. One thing I can't accomplish is splitting a dataframe into multiple dataframes. Following the example of purrr, I can do this:

mtcars %>% 
  split(mtcars$gear)

Which splits the dataframe into three smaller dataframes by "gear".

Now I would like to split the dataframe again, for instance by "am".

I have tried this:

mtcars %>% 
  split(mtcars$gear) %>% 
  map(split, mtcars$am)

which works, but throws a warning.

Following purrr to fit a model, I tried this, but I got an error:

mtcars %>% 
  split(mtcars$gear) %>% 
  map(split, mtcars$am) %>% 
  map(\(df) lm(mpg ~ wt, data = df)) |> 
  map(summary) %>%
  map_dbl("r.squared")

"object 'wt' not found".

How could I fix this?

I recommend going about it using a tidy approach. See my example here:

In your case, that translates to something along the lines of:

library("tidyverse")
library("broom")
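mtcars %>% 
  group_by(gear, am) %>% 
  # one row per gear/am combination, with that group's rows in a list-column
  nest() %>% 
  # fit one model per group, then get a one-row summary of each model
  mutate(mdl = map(data, ~lm(mpg ~ wt, data = .x)),
         mdl_summary = map(mdl, glance)) %>% 
  unnest(mdl_summary) %>% 
  ungroup() %>% 
  select(gear, am, r.squared)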
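
If you would rather stay close to your original split-based pipeline, here is a minimal sketch of one way to do it. In your attempt, map(split, mtcars$am) turns each element into a list of data frames rather than a single data frame, so lm() cannot find the columns. It also passes the full-length mtcars$am to every subset, which is likely where the warning comes from. Splitting once on both columns avoids both problems. The names groups and r_squared are just illustrative.

```r
library(purrr)

# Split once on the combination of both columns. drop = TRUE discards
# gear/am combinations that have no rows (e.g. gear 5 with am 0).
groups <- split(mtcars, list(mtcars$gear, mtcars$am), drop = TRUE)

# Each element of `groups` is a plain data frame, so lm() sees its columns.
r_squared <- groups |>
  map(\(df) lm(mpg ~ wt, data = df)) |>
  map(summary) |>
  map_dbl("r.squared")

r_squared
```

The result is a named numeric vector with names of the form "gear.am", one entry per non-empty combination.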

Interesting approach, works like a charm!