Why am I getting warnings in my case_when process?

My understanding of dplyr::case_when() was that it goes through the cases one by one until every element has been dealt with, so if all elements are handled by case 1, it never needs to evaluate case 2. That is how I think it has always behaved when I have used it before.

But I am getting some warnings in my code that make me think I am missing something. Please can someone explain why I am getting the warnings in the reprex below? It may be a lubridate issue but I think it's probably something more obvious that I am getting wrong! Thanks.

convert_excel_date <- function(x) {
  x |>
    as.numeric() |>
    lubridate::as_date(origin = "1899-12-30")
}

tib <- tibble::tribble(
  ~name, ~date,
  "a", "43831", 
  "b", "43862",
  "c", "43891",
  "d", "2022-04-01"
)

tib |>
  dplyr::mutate(across(date, ~ dplyr::case_when(
    stringr::str_detect(., "^[0-9]{5}$") ~ convert_excel_date(.),
    TRUE ~ lubridate::ymd(.)
  )))
#> Warning in lubridate::as_date(as.numeric(x), origin = "1899-12-30"): NAs
#> introduced by coercion
#> Warning: 3 failed to parse.
#> # A tibble: 4 × 2
#>   name  date      
#>   <chr> <date>    
#> 1 a     2020-01-01
#> 2 b     2020-02-01
#> 3 c     2020-03-01
#> 4 d     2022-04-01

tib |>
  dplyr::filter(!name == "d") |>
  dplyr::mutate(across(date, ~ dplyr::case_when(
    stringr::str_detect(., "^[0-9]{5}$") ~ convert_excel_date(.),
    TRUE ~ lubridate::ymd(.)
  )))
#> Warning: All formats failed to parse. No formats found.
#> # A tibble: 3 × 2
#>   name  date      
#>   <chr> <date>    
#> 1 a     2020-01-01
#> 2 b     2020-02-01
#> 3 c     2020-03-01

tib |>
  dplyr::filter(!name == "d") |>
  dplyr::mutate(across(date, convert_excel_date))
#> # A tibble: 3 × 2
#>   name  date      
#>   <chr> <date>    
#> 1 a     2020-01-01
#> 2 b     2020-02-01
#> 3 c     2020-03-01

Created on 2022-12-05 with reprex v2.0.2

Here is a quote from the Examples section of the help file for case_when():

# case_when() evaluates all RHS expressions, and then constructs its
# result by extracting the selected (via the LHS expressions) parts.
# In particular NaNs are produced in this case:
y <- seq(-2, 2, by = .5)
case_when(
  y >= 0 ~ sqrt(y),
  TRUE   ~ y
)

I do not think that is new behavior, though I can't say it was never different.
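
To make the quoted behaviour concrete, here is a rough base R sketch of what "evaluate every RHS, then select" amounts to. It only illustrates the idea and is not how dplyr implements case_when() internally; the names rhs1, rhs2 and result are made up for this example.

```r
y <- seq(-2, 2, by = 0.5)

# Step 1: every right-hand side is computed on the WHOLE vector.
# sqrt() sees the negative values too, which is where the
# "NaNs produced" warning comes from (silenced here on purpose).
rhs1 <- suppressWarnings(sqrt(y))
rhs2 <- y

# Step 2: the conditions only decide which precomputed value is kept.
result <- ifelse(y >= 0, rhs1, rhs2)

anyNA(rhs1)    # TRUE: the NaNs exist in the intermediate vector
anyNA(result)  # FALSE: none of them were selected
```

The same thing happens in the reprex above: convert_excel_date() and lubridate::ymd() each run on all of the date strings, so each one warns about the strings it cannot handle, even though those results are then discarded.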


It is documented but the relevant part of the documentation is amongst the examples.

# case_when() evaluates all RHS expressions, and then constructs its
# result by extracting the selected (via the LHS expressions) parts.

Thanks both, I can see how it works better now. I've used case_when() a lot in the past and I think it was only when using it with lubridate functions that I learned how it really works, because lubridate is keen on returning those NA warnings.
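
For anyone who wants to avoid the warnings rather than ignore them, one possible approach is to hand each parser only the elements it is meant for. This is just a sketch: parse_mixed_date() is a hypothetical helper, not part of dplyr or lubridate, and it assumes Excel serial dates always appear as exactly five digits, as in the reprex.

```r
convert_excel_date <- function(x) {
  x |>
    as.numeric() |>
    lubridate::as_date(origin = "1899-12-30")
}

# Hypothetical helper: subset first, then parse, so neither parser
# ever sees strings in the other format.
parse_mixed_date <- function(x) {
  # grepl() returns FALSE for NA, so the index never contains NA
  is_serial <- grepl("^[0-9]{5}$", x)
  out <- rep(as.Date(NA), length(x))
  if (any(is_serial)) {
    out[is_serial] <- convert_excel_date(x[is_serial])
  }
  if (any(!is_serial)) {
    out[!is_serial] <- lubridate::ymd(x[!is_serial])
  }
  out
}

parse_mixed_date(c("43831", "43862", "43891", "2022-04-01"))
#> [1] "2020-01-01" "2020-02-01" "2020-03-01" "2022-04-01"
```

Inside the pipeline this would be used as dplyr::mutate(across(date, parse_mixed_date)). If you would rather keep case_when(), the alternative is to silence each branch instead, for example with lubridate::ymd(., quiet = TRUE) and suppressWarnings() around the as.numeric() step.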
