Using case_when to create a column based on a month and day range

Hi,
I tried the code below to generate a new column in my data frame, but case_when seems to be assigning the age.group column based only on the day(DOB) condition.
Thank you.

library(tidyverse)
library(lubridate)

 df <- ohl %>%
  mutate(id = as.numeric(id)) %>%
  select(id, DOB) %>%
  mutate(age.group = case_when(
    month(DOB) >=9 & day(DOB) >15 ~ "late",
    TRUE ~ "early"))

dput(head(df, n = 20))
structure(list(id = c(20576, 20320, 22665, 24023, 19214, 19445,
20162, 19589, 19158, 18882, 19822, 22113, 20498, 17774, 21235,
23625, 20168, 18913, 18271, 26143), DOB = structure(c(6932, 6848,
7342, 7567, 6631, 6665, 6811, 6687, 6623, 6575, 6732, 7214, 6902,
6293, 7036, 7476, 6812, 6579, 6391, 7987), class = "Date")), row.names = c(NA,
-20L), class = c("tbl_df", "tbl", "data.frame"))

I don't see any issues with the code. The observations classified as late all have dates on/after 15th September.

# A tibble: 3 x 3
     id DOB        age.group
  <dbl> <date>     <chr>    
1 20576 1988-12-24 late     
2 24023 1990-09-20 late     
3 20498 1988-11-24 late   

Is the result different from your desired output?

I'm looking for any birthdate that is after September 15th to be classified as late.
My code only seems to be picking up the day of the month.
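
To make the symptom concrete, here is a minimal sketch that evaluates the original condition on three birthdates taken from the sample data above. The names dob and flag are only for illustration and are not part of the original code.

```r
library(lubridate)

# Three birthdates from the sample data: two in early October, one in late September
dob <- as.Date(c("1988-10-01", "1989-10-02", "1990-09-20"))

# The condition from the original case_when(): both parts must be TRUE,
# so the day-of-month test is applied in every month from September onward
flag <- month(dob) >= 9 & day(dob) > 15
flag
```

The two October dates are after September 15th, but their day of the month is not above 15, so the condition is FALSE for them and they fall through to "early".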

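OK, I see your problem now, and this is how I would solve it. Your original condition requires day(DOB) > 15 in every month from September onward, so a date such as October 1st is not matched. The condition below matches any date from September 16 to 30, or any date in October, November, or December regardless of the day of the month.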

library(tidyverse)
library(lubridate)
#> 
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:dplyr':
#> 
#>     intersect, setdiff, union
#> The following objects are masked from 'package:base':
#> 
#>     date, intersect, setdiff, union


df <- structure(
  list(
    id = c(20576, 20320, 22665, 24023, 19214, 19445, 20162, 19589,
           19158, 18882, 19822, 22113, 20498, 17774, 21235, 23625,
           20168, 18913, 18271, 26143),
    DOB = structure(c(6932, 6848, 7342, 7567, 6631, 6665, 6811, 6687,
                      6623, 6575, 6732, 7214, 6902, 6293, 7036, 7476,
                      6812, 6579, 6391, 7987), class = "Date")
  ),
  row.names = c(NA, -20L),
  class = c("tbl_df", "tbl", "data.frame")
)
df %>%
  mutate(
    age.group = case_when(
      (month(DOB) >=9 & day(DOB) >15) | month(DOB)>=10 ~ "late",
      TRUE ~ "early"))
#> # A tibble: 20 x 3
#>       id DOB        age.group
#>    <dbl> <date>     <chr>    
#>  1 20576 1988-12-24 late     
#>  2 20320 1988-10-01 late     
#>  3 22665 1990-02-07 early    
#>  4 24023 1990-09-20 late     
#>  5 19214 1988-02-27 early    
#>  6 19445 1988-04-01 early    
#>  7 20162 1988-08-25 early    
#>  8 19589 1988-04-23 early    
#>  9 19158 1988-02-19 early    
#> 10 18882 1988-01-02 early    
#> 11 19822 1988-06-07 early    
#> 12 22113 1989-10-02 late     
#> 13 20498 1988-11-24 late     
#> 14 17774 1987-03-26 early    
#> 15 21235 1989-04-07 early    
#> 16 23625 1990-06-21 early    
#> 17 20168 1988-08-26 early    
#> 18 18913 1988-01-06 early    
#> 19 18271 1987-07-02 early    
#> 20 26143 1991-11-14 late

Created on 2020-05-27 by the reprex package (v0.3.0)
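
As an alternative, one possible sketch is to fold month and day into a single number (September 15 becomes 915) and compare against that cutoff once. This is not from the answer above; the names dob, by_parts, md, and by_number are made up for illustration, and the test dates are chosen to sit on either side of the cutoff.

```r
library(lubridate)

# Hypothetical test dates around the September 15 cutoff
dob <- as.Date(c("1988-09-15", "1988-09-16", "1988-10-01",
                 "1988-12-24", "1988-02-27"))

# Condition from the answer above
by_parts <- (month(dob) >= 9 & day(dob) > 15) | month(dob) >= 10

# Alternative: encode month and day as one number, e.g. Sep 15 -> 915
md <- month(dob) * 100 + day(dob)
by_number <- md > 915

# Check that both approaches agree on these dates
identical(by_parts, by_number)
```

The same expression could be dropped into case_when() in place of the two-part condition. Whether it reads more clearly is a matter of taste, but it keeps the cutoff in one place if the date ever needs to change.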


Thank you very much. That worked! I was beginning to think my & key wasn't working... :)
