Transposing grouped/ filtered selects into a new column.

Hi there,

I'm trying to figure out how to group by operators all shipments id's assigned to them. spread() & transpose() were what I thought of or a for loop.

transpose() generated a list, which I knew wasn't what I would expect but expresses what I'm trying to achieve.

library(tidyverse)
# Sample data
dt <-  tibble(
  "Shipment ID" = c("S00001009", "S00001033",
"S00001034","S00001036", "S00001038", "S00001039", "S00001040","S00001041"),
  "Job Operator" = c("Wayne Martin", "Wayne Martin", "Emil Manuel", 
"Emil Manuel", "Emil Manuel", "Joanne Lano","Joanne Lano" ,"Tony Solis")
)                     

# Group by operators all Shipment ID's assigned to them.
dt %>% group_by(`Job Operator`) %>% 
transpose()

# Expected output:
tibble(
  "Job Operator" = c("Wayne Martin","Emil Manuel",
"Joanne Lano","Tony Solis"),
  "Shipment ID" = c("S00001009, S00001033","S00001034, S00001036, 
S00001038","S00001039, S00001040", "S00001041")
)

`Job Operator` `Shipment ID`                  
  <chr>          <chr>                          
1 Wayne Martin   S00001009, S00001033           
2 Emil Manuel    S00001034, S00001036, S00001038
3 Joanne Lano    S00001039, S00001040           
4 Tony Solis     S00001041           

Any insight would be much appreciated.

Best regards,
LF.

I would group by and summarise the various shipment ids into a list, then process that list into a character string. The code below could be somewhat simplified (no interim summary_dt1 for example)
but I made it a little more verbose so its easier to follow.

summary_dt1<- dt %>% group_by(`Job Operator`) %>% 
  summarise(ship_ids = list(`Shipment ID`))
summary_dt1$ship_ids_flat <- purrr::map_chr(summary_dt1$ship_ids,~paste0(.,collapse=","))
summary_dt1
1 Like

To complement Nir's answer, if you are only interested in getting the result as a character string, you can directly apply paste0() on the summarise() function.

library(tidyverse)
# Sample data
dt <-  tibble(
    "Shipment ID" = c("S00001009", "S00001033",
                      "S00001034","S00001036", "S00001038", "S00001039", "S00001040","S00001041"),
    "Job Operator" = c("Wayne Martin", "Wayne Martin", "Emil Manuel", 
                       "Emil Manuel", "Emil Manuel", "Joanne Lano","Joanne Lano" ,"Tony Solis")
)                     

dt %>% 
    group_by(`Job Operator`) %>%
    summarise(ship_ids = paste0(`Shipment ID`, collapse = ", "))
#> # A tibble: 4 x 2
#>   `Job Operator` ship_ids                       
#>   <chr>          <chr>                          
#> 1 Emil Manuel    S00001034, S00001036, S00001038
#> 2 Joanne Lano    S00001039, S00001040           
#> 3 Tony Solis     S00001041                      
#> 4 Wayne Martin   S00001009, S00001033
2 Likes

Thank you very much for your assistance! What I was thinking of was what @andresrcs proposed, but this is brilliant, I clearly see the applicability with yours !

Thank you once again!
:metal:

Cheers mate!

Once again spot on! :facepunch:

@nirgrahamuk could you please expand a little more on how to do it without the intermediate summary_dt1, I'm trying to implement this on a pipe:

library(tidyverse)
library(purrr)

df_all %>% 
  filter(`Job.Recog.Date`>ydm(20190101) & `Job.Recog.Date`<= ymd(20200201), 
         `Job Status` == "CLS"
  ) %>%  
  group_by(`Job Operator`) %>% summarise(
    Tot.Income = sum(`Total Income (Recognized+Unrecognized REV+WIP)`),
    Tot.Expense = sum(`Total Expense (Recognized+Unrecognized CST+ACR)`),
    Tot.Profit = sum(`Job Profit`),
    Mean_Profit = mean(`Job Profit`),
    Count = n(),
    ship_ids = list(`Shipment ID`)
  ) %>% map_chr(`ship_ids`, ~paste0(., collapse = ", ")) %>%
filter(`Job Operator` == "Emil Manuel") %>% 
  arrange(desc(`Mean_Profit`))

The issue is on the map_chr() since I am able to get an output until executing the map_chr().

df_all %>% 
  filter(`Job.Recog.Date`>ydm(20190101) & `Job.Recog.Date`<= ymd(20200201), 
         `Job Status` == "CLS"
  ) %>%  
  group_by(`Job Operator`) %>% summarise(
    Tot.Income = sum(`Total Income (Recognized+Unrecognized REV+WIP)`),
    Tot.Expense = sum(`Total Expense (Recognized+Unrecognized CST+ACR)`),
    Tot.Profit = sum(`Job Profit`),
    Mean_Profit = mean(`Job Profit`),
    Count = n(),
    ship_ids = list(`Shipment ID`)
  ) 
# A tibble: 63 x 7
   `Job Operator`     Tot.Income Tot.Expense Tot.Profit Mean_Profit Count ship_ids   
   <chr>                   <dbl>       <dbl>      <dbl>       <dbl> <int> <list>     
 1 Abey Mena             210627.    -150373.     60254.        275.   219 <chr [219]>
 2 Agent Routed            4023.      -3868.       155         155      1 <chr [1]>  
 ...

Thank you for your time,
LF.

If you want to use purr::map_() functions on a pipe you have to do it inside a mutate statement. BTW please stay within the scope of the original reprex or provide a new one.

library(tidyverse)
# Sample data
dt <-  tibble(
    "Shipment ID" = c("S00001009", "S00001033",
                      "S00001034","S00001036", "S00001038", "S00001039", "S00001040","S00001041"),
    "Job Operator" = c("Wayne Martin", "Wayne Martin", "Emil Manuel", 
                       "Emil Manuel", "Emil Manuel", "Joanne Lano","Joanne Lano" ,"Tony Solis")
)

dt %>%
    group_by(`Job Operator`) %>% summarise(
        ship_ids = list(`Shipment ID`)
    ) %>% 
    mutate(ship_ids_flat = map_chr(`ship_ids`, ~paste0(., collapse = ", ")))
#> # A tibble: 4 x 3
#>   `Job Operator` ship_ids  ship_ids_flat                  
#>   <chr>          <list>    <chr>                          
#> 1 Emil Manuel    <chr [3]> S00001034, S00001036, S00001038
#> 2 Joanne Lano    <chr [2]> S00001039, S00001040           
#> 3 Tony Solis     <chr [1]> S00001041                      
#> 4 Wayne Martin   <chr [2]> S00001009, S00001033

Created on 2020-03-30 by the reprex package (v0.3.0.9001)

2 Likes

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.