Merge Row Values by Grouped Column

Hello!

I am trying to merge row values based on a grouping column. In this example I want to combine the values in "Teacher", "Unit", and "Day" for the rows where "Course.Name" equals "English". The "Date" column for the combined row should be the earliest date available for "English".

Example:
Turn df into df2

df <- data.frame(Course.Name = c("English", "English", "English", "French", "Spanish"),
                 Teacher = c("Gonzalez", "Smith", "Johnson", "Applegate", "Lowell"),
                 Unit = c("A", "B", "C", "D", "E"),
                 Day = c("Monday", "Tuesday", "Thursday", "Wednesday", "Friday"),
                 Date = c("2022-01-01", "2022-01-02", "2022-02-05", "2022-03-31", "2022-02-14"))

df2 <- data.frame(Course.Name = c("English", "French", "Spanish"),
                  Teacher = c("Gonzalez; Smith; Johnson", "Applegate", "Lowell"),
                  Unit = c("A; B; C", "D", "E"),
                  Day = c("Monday; Tuesday; Thursday", "Wednesday", "Friday"),
                  Date = c("2022-01-01", "2022-03-31", "2022-02-14"))

I think using dplyr's `group_by` function along with `paste` and its `collapse` argument is the way to go, but I can't make it work.

Is this what you are trying to do?

df <- data.frame(Course.Name = c("English", "English", "English", "French", "Spanish"),
                 Teacher = c("Gonzalez", "Smith", "Johnson", "Applegate", "Lowell"),
                 Unit = c("A", "B", "C", "D", "E"),
                 Day = c("Monday", "Tuesday", "Thursday", "Wednesday", "Friday"),
                 Date = c("2022-01-01", "2022-01-02", "2022-02-05", "2022-03-31", "2022-02-14"))

df2 <- data.frame(Course.Name = c("English", "French", "Spanish"),
                  Teacher = c("Gonzalez; Smith; Johnson", "Applegate", "Lowell"),
                  Unit = c("A; B; C", "D", "E"),
                  Day = c("Monday; Tuesday; Thursday", "Wednesday", "Friday"),
                  Date = c("2022-01-01", "2022-03-31", "2022-02-14"))
library(dplyr)

library(lubridate)

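# Group by course, then collapse each group's rows into a single row
New_df <- df %>% group_by(Course.Name) %>%
  mutate(Date = ymd(Date)) %>%                         # parse text dates into Date objects
  summarize(Teacher = paste(Teacher, collapse = "; "), # join the group's values with "; "
            Unit = paste(Unit, collapse = "; "),
            Day = paste(Day, collapse = "; "),
            Date = min(Date))                          # earliest date within each course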

New_df
#> # A tibble: 3 × 5
#>   Course.Name Teacher                  Unit    Day               Date      
#>   <fct>       <chr>                    <chr>   <chr>             <date>    
#> 1 English     Gonzalez; Smith; Johnson A; B; C Monday; Tuesday;… 2022-01-01
#> 2 French      Applegate                D       Wednesday         2022-03-31
#> 3 Spanish     Lowell                   E       Friday            2022-02-14

Created on 2022-04-18 by the reprex package (v0.2.1)
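
If more columns need collapsing, repeating `paste()` for each one gets tedious. As a sketch only, and assuming dplyr 1.0.0 or later (for `across()` and the `.groups` argument), the same summary can be written once for all three text columns. The name `New_df2` is just a placeholder for this illustration, and base R's `as.Date()` is swapped in for `ymd()` so that lubridate is not needed:

```r
library(dplyr)

# Same input as in the question; stringsAsFactors = FALSE keeps the columns as
# character regardless of the R version.
df <- data.frame(
  Course.Name = c("English", "English", "English", "French", "Spanish"),
  Teacher = c("Gonzalez", "Smith", "Johnson", "Applegate", "Lowell"),
  Unit = c("A", "B", "C", "D", "E"),
  Day = c("Monday", "Tuesday", "Thursday", "Wednesday", "Friday"),
  Date = c("2022-01-01", "2022-01-02", "2022-02-05", "2022-03-31", "2022-02-14"),
  stringsAsFactors = FALSE
)

# New_df2 is a hypothetical name used only for this sketch.
New_df2 <- df %>%
  mutate(Date = as.Date(Date)) %>%   # ISO "yyyy-mm-dd" strings parse with the default format
  group_by(Course.Name) %>%
  summarize(
    # apply the same collapse step to every listed column
    across(c(Teacher, Unit, Day), ~ paste(.x, collapse = "; ")),
    Date = min(Date),                # earliest date per course
    .groups = "drop"                 # return an ungrouped tibble
  )

New_df2
```

Adding another column to collapse then only means adding its name inside `c(...)`.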

That's exactly what I was trying to do!
I appreciate the help and all the hard work you do in this community.
