Associate Earliest Dates with Names

Hi, I have a dataset in which names are repeated across rows, with the corresponding date for each row in a second column:

John G. 2001-03-14
Jane D. 2002-04-22
John G. 2003-06-24
Jane D. 2005-07-27

I'd like to create a new column with the earliest dates for each name. How do I do this?


library(dplyr)
library(lubridate)

# %Y matches the four-digit year in the input (%y would read only "20")
df <- tibble(
  name = c("John G.", "Jane D.", "John G.", "Jane D."),
  date = c("14/03/2001", "22/04/2002", "24/06/2003", "27/07/2005")) %>% 
  mutate(date = as_date(date, format = "%d/%m/%Y")) 

# Keep the earliest row for each name
df2 <- df %>% 
  group_by(name) %>% 
  arrange(date) %>% 
  slice(1) %>% 
  ungroup() %>% 
  rename(first_date = date)

# Attach each name's earliest date to every original row
df3 <- df %>% 
  left_join(df2, by = "name")

df3

# A tibble: 4 x 3
  name    date       first_date
  <chr>   <date>     <date>    
1 John G. 2001-03-14 2001-03-14
2 Jane D. 2002-04-22 2002-04-22
3 John G. 2003-06-24 2001-03-14
4 Jane D. 2005-07-27 2002-04-22
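
One thing worth checking whenever dates like "14/03/2001" are parsed is the year specifier in the format string. The short sketch below uses base R's as.Date() (not the lubridate call from the post) to show the difference between %Y and %y; the variable names are only for illustration.

```r
x <- "14/03/2001"

# %Y reads a four-digit year
full_year <- as.Date(x, format = "%d/%m/%Y")

# %y reads a two-digit year: it consumes only "20" and ignores the trailing "01"
two_digit <- as.Date(x, format = "%d/%m/%y")

print(full_year)   # 2001-03-14
print(two_digit)   # 2020-03-14
```

If every parsed date lands in the same unexpected year, the format string is usually the first place to look.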

Here's another solution using the minimum function:

library(dplyr)
library(lubridate)
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#>     date, intersect, setdiff, union

# %Y matches the four-digit year in the input
df <- tibble(
  name = c("John G.", "Jane D.", "John G.", "Jane D."),
  date = c("14/03/2001", "22/04/2002", "24/06/2003", "27/07/2005")) %>% 
  mutate(date = as_date(date, format = "%d/%m/%Y")) 

# min() is evaluated within each name group
df %>%
  group_by(name) %>%
  mutate(first_date = min(date))
#> # A tibble: 4 x 3
#> # Groups:   name [2]
#>   name    date       first_date
#>   <chr>   <date>     <date>    
#> 1 John G. 2001-03-14 2001-03-14
#> 2 Jane D. 2002-04-22 2002-04-22
#> 3 John G. 2003-06-24 2001-03-14
#> 4 Jane D. 2005-07-27 2002-04-22

Created on 2021-02-11 by the reprex package (v0.3.0)


@StatSteph Hi there. Thanks for your assistance. This was a huge help!

Afterwards, I used a merge to bring the earliest dates into the larger data frame.
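
For anyone following along, that merge step might look roughly like the sketch below. The post does not show the larger data frame, so `visits` and its `score` column are made-up stand-ins, and the one-row-per-name lookup table is just one possible way to set it up.

```r
library(dplyr)

# Hypothetical larger data frame (names and columns are assumptions)
visits <- data.frame(
  name  = c("John G.", "Jane D.", "John G.", "Jane D."),
  date  = as.Date(c("2001-03-14", "2002-04-22", "2003-06-24", "2005-07-27")),
  score = c(10, 20, 30, 40)
)

# Lookup table: one row per name with its earliest date
first_dates <- visits %>%
  group_by(name) %>%
  summarise(first_date = min(date), .groups = "drop")

# Base R merge() on the shared key; by default the result is sorted by `name`
merged <- merge(visits, first_dates, by = "name")
print(merged)
```

Because merge() reorders rows by the key unless sort = FALSE is passed, left_join() may be the safer choice when the original row order matters.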


@williaml Thanks for your assistance. This was a huge help!

Afterwards, I used a merge to bring the earliest dates into the larger data frame.


