case_when() with heterogeneous date formats

I wrote a reprex and now I'm doubly confused.

I'm importing a bunch of Excel files in which the dates have been stored in a couple of different ways. I've been working with a list where each input file is a data frame:

library(lubridate)
#> 
#> Attaching package: 'lubridate'
#> The following object is masked from 'package:base':
#> 
#>     date
library(purrr)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:lubridate':
#> 
#>     intersect, setdiff, union
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(janitor)
  
  char_dates <- tibble(
    date = c("July 6, 2009", "July 7, 2009"),
    name = "string dates"
    )
  
  excel_dates <- 
    tibble(
      date = c("40000", "40001"),
      name = "excel dates"
    )
  
  list(char_dates, excel_dates) %>% 
    map(
      ~transmute(.x,
        common_date = case_when(
          name == "string dates" ~ mdy(date),
          name == "excel dates" ~ excel_numeric_to_date(as.double(date))
          )
        )
      )
#> Warning in excel_numeric_to_date(as.double(date)): NAs introduced by
#> coercion
#> Warning: All formats failed to parse. No formats found.
#> [[1]]
#> # A tibble: 2 x 1
#>   common_date
#>   <date>     
#> 1 2009-07-06 
#> 2 2009-07-07 
#> 
#> [[2]]
#> # A tibble: 2 x 1
#>   common_date
#>   <date>     
#> 1 2009-07-06 
#> 2 2009-07-07
  
  list(char_dates, excel_dates) %>% 
    map(
      ~transmute(.x,
        common_date = if_else(
          name == "string dates",
          mdy(date),
          excel_numeric_to_date(as.double(date))
          )
        )
      )
#> Warning in excel_numeric_to_date(as.double(date)): NAs introduced by
#> coercion

#> Warning in excel_numeric_to_date(as.double(date)): All formats failed to
#> parse. No formats found.
#> [[1]]
#> # A tibble: 2 x 1
#>   common_date
#>   <date>     
#> 1 2009-07-06 
#> 2 2009-07-07 
#> 
#> [[2]]
#> # A tibble: 2 x 1
#>   common_date
#>   <date>     
#> 1 2009-07-06 
#> 2 2009-07-07

Created on 2019-04-12 by the reprex package (v0.2.1)

Two questions:

  1. Why are both sets of warnings generated if both date parsing strategies succeed?
  2. This is not actually the behavior I observe on the real data I'm working on. There, the call to lubridate::mdy() issues the warning, but all of the resulting dates are NA. I tried parse_date_time() with a more granular order specification, which also failed. excel_numeric_to_date() works fine and returns date values in the relevant data frames.

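case_when() evaluates every condition and every right-hand side over the whole vector, and only then keeps, for each row, the value belonging to the first condition that is TRUE. So each parser also runs on the rows it cannot handle, which is where the warnings come from. Here is an example: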

library(purrr)

is_even <- function(n){
  rlang::warn("I am even")
  n %% 2 == 0
}

is_odd <- function(n){
  rlang::warn("I am odd")
  n %% 2 == 1
}

list(tibble::tibble(x = 1)) %>%
  purrr::map( ~ .x  %>%
                dplyr::mutate(is_odd = dplyr::case_when(
                  is_even(x) ~ "even",
                  is_odd(x)  ~ "odd"
                )))
#> Warning: I am even
#> Warning: I am odd
#> [[1]]
#> # A tibble: 1 x 2
#>       x is_odd
#>   <dbl> <chr> 
#> 1     1 odd

Created on 2019-04-12 by the reprex package (v0.2.1)

As you can see, both warnings are printed, but only one of them is relevant.
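
If the warnings bother you, one way around them is to hand each parser only the rows it can actually handle, instead of letting every branch run on everything. Below is a minimal sketch of that idea, not the only way to do it. The helper name parse_mixed_dates() and its is_excel argument are made up for illustration, and I use base R's as.Date() rather than mdy() and excel_numeric_to_date() so the example runs without extra packages. It assumes the Excel serials use the Windows date system (origin 1899-12-30) and an English locale for the month names.

```r
# Hypothetical helper: parse each subset with the parser that fits it,
# so no parser ever sees input it cannot handle (and so nothing warns).
parse_mixed_dates <- function(date, is_excel) {
  # Start with an all-NA Date vector of the right length.
  out <- rep(as.Date(NA), length(date))

  # Excel serial numbers: days counted from 1899-12-30
  # (assumption: Windows date system).
  out[is_excel] <- as.Date(as.numeric(date[is_excel]), origin = "1899-12-30")

  # Strings such as "July 6, 2009"; %B is locale-dependent
  # (assumption: English month names).
  out[!is_excel] <- as.Date(date[!is_excel], format = "%B %d, %Y")

  out
}

dates <- c("July 6, 2009", "40000")
parse_mixed_dates(dates, is_excel = c(FALSE, TRUE))
```

The same subset-then-assign pattern should carry over to the lubridate and janitor parsers. In the reprex above, the logical index would be name == "excel dates".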

Thanks! As it turns out I had misspecified the condition, which is why I was getting wacky results.
