Group label is lost in the output of group_map()

Hello experts. Let me start with some example code.

library(tidyverse)

depth_by_cut = diamonds %>% group_by(cut) %>% group_map(~pull(., depth))

length(depth_by_cut)
#[1] 5

names(depth_by_cut)
#NULL

In the above example, how do I know which element of depth_by_cut belongs to which cut category?

Hi @junghoonshin,
The short answer is:

names(depth_by_cut) <- levels(diamonds$cut)

But I suspect you want the list to be named on output! I don't know how to achieve that inside the same dplyr pipe.
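One possible way to name the list without leaving the pipe, as a minimal sketch rather than a definitive answer: group_keys() returns the group labels in the same order that group_map() visits the groups, so they can be passed to setNames(). The intermediate name `grouped` is just an illustrative choice. Unlike levels(diamonds$cut), this should also work when some factor levels have no rows.

```r
library(dplyr)
library(ggplot2)  # provides the diamonds dataset

# Keep the grouped tibble so the keys can be looked up afterwards
grouped <- diamonds %>% group_by(cut)

depth_by_cut <- grouped %>%
  group_map(~ pull(.x, depth)) %>%
  # group_keys() has one row per group, in the order group_map() used
  setNames(as.character(group_keys(grouped)$cut))

names(depth_by_cut)
#> [1] "Fair"      "Good"      "Very Good" "Premium"   "Ideal"

# Base R alternative: split() names each piece by its factor level
depth_by_cut_base <- split(diamonds$depth, diamonds$cut)
```

If the dplyr pipe itself is not a requirement, the split() call on the last line is probably the shortest route to a named list.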

My solution got a bit complicated...

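library(tidyverse)
library(rlang)

depth_by_cut <- diamonds %>%
  mutate(cut = as.character(cut)) %>%
  group_by(cut) %>%
  # .keep = TRUE keeps the grouping column in .x
  # (this argument was called `keep` before dplyr 1.0.0)
  group_map(
    ~ list2(!!(unique(pull(.x, cut))) := pull(.x, depth)),
    .keep = TRUE
  ) %>%
  flatten()  # collapse the one-element named lists into a single named list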

length(depth_by_cut)

depth_by_cut[["Very Good"]] %>% head

It works, though!

Thank you @DavoWW. One question: if cut is a character vector, how does group_by() decide the order of the groups?

Hi @junghoonshin,
In this case cut is an "ordered factor", so the order of its levels has already been specified, and group_by() follows that level order:
str(diamonds$cut)

Otherwise, the levels are allocated alphanumerically at the point of reading or creating the vector, and group_by() sorts a plain character column the same way. Of course, you can change, reorder, delete, or merge levels as required.
HTH
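To make the ordering rule concrete, here is a small sketch comparing the group order for the original ordered factor with the order after converting cut to character. The variable names `factor_order` and `char_order` are only illustrative.

```r
library(dplyr)
library(ggplot2)  # provides the diamonds dataset

# Ordered factor: groups follow the declared level order
factor_order <- diamonds %>%
  group_by(cut) %>%
  group_keys() %>%
  pull(cut) %>%
  as.character()
factor_order
#> [1] "Fair"      "Good"      "Very Good" "Premium"   "Ideal"

# Character vector: groups are sorted alphabetically
char_order <- diamonds %>%
  mutate(cut = as.character(cut)) %>%
  group_by(cut) %>%
  group_keys() %>%
  pull(cut)
char_order
#> [1] "Fair"      "Good"      "Ideal"     "Premium"   "Very Good"
```

So any solution that converts cut to character first will return the list elements in alphabetical order rather than in level order.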

Here is a rather wacky way to get to the same place.

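library(dplyr)
library(purrr)
library(ggplot2)  # provides the diamonds dataset
# Pull column COL from DF and tag the vector with the group label NM
MyFunc <- function(DF, NM, COL) {
  tmp <- DF[[COL]]
  attr(tmp, "name") <- as.character(NM)
  tmp
}
data("diamonds")
# .y is the one-row tibble of group keys for the current group
depth_by_cut <- diamonds %>% group_by(cut) %>% group_map(~MyFunc(.x, .y$cut, "depth"))
# map_chr() returns the labels as a character vector rather than a list
names(depth_by_cut) <- map_chr(depth_by_cut, attr, "name")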

length(depth_by_cut)
#> [1] 5

names(depth_by_cut)
#> [1] "Fair"      "Good"      "Very Good" "Premium"   "Ideal"

str(depth_by_cut)
#> List of 5
#>  $ Fair     : num [1:1610] 65.1 55.1 66.3 64.5 65.3 64.4 65.7 67.9 55.1 64.5 ...
#>   ..- attr(*, "name")= chr "Fair"
#>  $ Good     : num [1:4906] 56.9 63.3 64 63.4 63.8 63.3 58.2 64.1 64 65.2 ...
#>   ..- attr(*, "name")= chr "Good"
#>  $ Very Good: num [1:12082] 62.8 62.3 61.9 59.4 62.7 63.8 61 59.4 58.1 60.4 ...
#>   ..- attr(*, "name")= chr "Very Good"
#>  $ Premium  : num [1:13791] 59.8 62.4 60.4 60.2 60.9 62.5 62.4 61.6 59.3 59.3 ...
#>   ..- attr(*, "name")= chr "Premium"
#>  $ Ideal    : num [1:21551] 61.5 62.8 62.2 62 61.8 61.2 61.1 61.9 60.9 61 ...
#>   ..- attr(*, "name")= chr "Ideal"

Created on 2020-06-02 by the reprex package (v0.3.0)

Hi @nirgrahamuk, thank you for your great solution! Where can I find a description of things like !! and :=? I've never seen anything like them before.
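For context, !! and := come from rlang's tidy evaluation tools: !! injects the value of an expression, and := is used in place of = when the name on the left-hand side has to be computed. A minimal sketch, using a made-up label `nm` and made-up values:

```r
library(rlang)

nm <- "Very Good"

# With plain list(), the element is literally called "nm"
plain <- list(nm = c(1, 2, 3))
names(plain)
#> [1] "nm"

# With list2(), !! injects the value of nm and := allows it as the name
injected <- list2(!!nm := c(1, 2, 3))
names(injected)
#> [1] "Very Good"
```

The rlang help pages for list2() and for dynamic dots describe this syntax in more detail.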

Perhaps a more "basic" solution here (not using rlang).

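library(tidyverse)

depth_by_cut <- diamonds %>%
  mutate(cut = as.character(cut)) %>%
  group_by(cut) %>%
  # .keep = TRUE keeps the grouping column in .x
  # (this argument was called `keep` before dplyr 1.0.0)
  group_map(~ list(pull(.x, depth)) %>% setNames(unique(pull(.x, cut))), .keep = TRUE) %>%
  flatten()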

str(depth_by_cut)
# cut was converted to character, so the groups come out in alphabetical order
#List of 5
# $ Fair     : num [1:1610] 65.1 55.1 66.3 64.5 65.3 64.4 65.7 67.9 55.1 64.5 ...
# $ Good     : num [1:4906] 56.9 63.3 64 63.4 63.8 63.3 58.2 64.1 64 65.2 ...
# $ Ideal    : num [1:21551] 61.5 62.8 62.2 62 61.8 61.2 61.1 61.9 60.9 61 ...
# $ Premium  : num [1:13791] 59.8 62.4 60.4 60.2 60.9 62.5 62.4 61.6 59.3 59.3 ...
# $ Very Good: num [1:12082] 62.8 62.3 61.9 59.4 62.7 63.8 61 59.4 58.1 60.4 ...