correctly modify the wrong imported dates

I have this column of dates that for some reason R import t badly expect some cases:

column
<chr>
1 44686
2 44662
3 16/05/22
4 44686
5 44648
6 44564
7 44571
8 44571
9 16/05/22
10 44564

I want to modify the column in such a way that the dates are displayed correctly, the problem is that when I do this:

column %>%
mutate(column= excel_numeric_to_date(as.numeric(column)))

It converted the dates that were well imported into NA, like this:

column
<date>
1 2022-05-05
2 2022-04-11
3 NA
4 2022-05-05
5 2022-04-04
6 2022-02-08
7 2022-04-04
8 2022-04-04
9 NA
10 2022-02-08

What I can do?

Hi @juandmaz,

Did you try the convert_to_date() function from the janitor package? It accepts different input formats and comes in quite handy here.

My_dates = c("44686", "44662", "16/05/22", "44686", "44648", "44564", "44571", "44571", "16/05/22", "44564" )

janitor::convert_to_date(My_dates)

JW

Hi, thanks but it doesn't work.
When the date is 16/05/2020 turns it well, but when the date is 30/05/22 it turns as 2030-05-22.
Why this happen?

Your original dates are not in an unambiguous format, so it is better to explicitly define the date format, I would say you should take a more individualized approach for both described cases:

library(dplyr)
library(lubridate)
library(stringr)

sample_data <- data.frame(
  stringsAsFactors = FALSE,
            column = c("44686","44662","16/05/22",
                       "44686","44648","44564","44571","44571","16/05/22",
                       "30/05/22")
)

sample_data %>% 
    mutate(column = if_else(
        str_detect(column, "/"),
        dmy(column),
        as.Date(as.numeric(column), origin = "1899-12-30")
    ))
#> Warning: 7 failed to parse.
#> Warning in as.Date(as.numeric(column), origin = "1899-12-30"): NAs introducidos
#> por coerciĆ³n
#>        column
#> 1  2022-05-05
#> 2  2022-04-11
#> 3  2022-05-16
#> 4  2022-05-05
#> 5  2022-03-28
#> 6  2022-01-03
#> 7  2022-01-10
#> 8  2022-01-10
#> 9  2022-05-16
#> 10 2022-05-30

Created on 2022-06-06 by the reprex package (v2.0.1)

Hi, thanks! It works. But it shows me this message:
"

Warning messages:
1: Problem while computing `column = if_else(...)`.
i  221 failed to parse. 
2: Problem while computing `column = if_else(...)`.
i NAs introduced by coercion

"
Should I be worried?

Also, i try with this way very similar and it works:

sample_data %>%
  mutate(column = if_else(str_detect(column , "/"), 
                       dmy(column ),
                       column %>%
                         as.numeric() %>%
                         excel_numeric_to_date()))

But it also shows me the same message

This seems like a bug to me, both commands are being evaluated on the complete dataset regardless of the logical condition but at the end you get the desired result so I would say that you should not be worried.

Yes, it works, I simply didn't want to install an additional package (janitor) just for one function, that is why I did the same thing using base R.

I am looking more carefully and I notice that the dates have been changed so that the year is first, and then the month and then the day, even when I have set it to be dmy(). and I can't change it.
it looks like this:

column                   new column
<chr>                      <date>  
 1 44564                 2022-01-03
 2 16/05/22              2022-05-16
 3 44686                 2022-05-05
 4 44648                 2022-03-28
 5 44564                 2022-01-03
 6 44571                 2022-01-10
 7 44571                 2022-01-10
 8 44571                 2022-01-10
 9 44592                 2022-01-31
 10 44578                2022-01-17
 11 16/05/22             2022-05-16

I think this is what the warning message was referring to.

That function is reading dates in that format not converting them to it

No, it has nothing to do with the warning message, that format is the ISO standard and is the only way to have actual dates in R, if you change the format to anything else the column is converted to character class. Have in mind that you can always change the date format in your output for presentation purposes like in tables (not data frames) and plots.

Oh, so I can't figure out the format of the dates expect for presentation purposes? Too bad. But at least it works

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.