Error with date format

Thanks for including the data, as recommended in the FAQ: What's a reproducible example (`reprex`) and how do I do one?

A preliminary issue: `date` is a base function, always on your search path, and you should avoid using it as a variable name. The same goes for `df`, which is a function in the stats package; `dat` is a good substitute.
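A small sketch of why those names cause trouble, assuming a fresh session in which neither name has been assigned (the exact error text can differ between R versions):

```r
# In a fresh session both names already resolve to functions
is.function(date)  # base::date() returns the current date-time as a string
is.function(df)    # stats::df() is the density of the F distribution

# If the assignment to `df` was skipped or failed, R finds the function
# instead of stopping with "object not found", so the error is harder to read
res <- tryCatch(df$rf, error = function(e) conditionMessage(e))
res

# A neutral name avoids the ambiguity
dat <- data.frame(rf = c(7, 7.2))
dat$rf
```

Nothing breaks outright when a data frame is called `df`; the cost shows up only when something goes wrong and the message points at a function rather than at your data.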

Second, lubridate takes its best shot at parsing the strings, but a two-digit year carries no century information, so it has to assume one. That is why the years below come out as 20xx.
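To illustrate the century guess, here is a minimal sketch. It assumes a lubridate version in which `parse_date_time2()` has the `cutoff_2000` argument (documented default 68, meaning two-digit years up to 68 are read as 20xx and the rest as 19xx); the cutoff of 30 is only an illustrative value, not something from the original question.

```r
suppressPackageStartupMessages(library(lubridate))

# With the default rule, the two-digit year 31 is read as 2031
default_guess <- ymd("31-07-07")
year(default_guess)

# parse_date_time2() lets you move the pivot year: with cutoff_2000 = 30,
# a two-digit year above 30 is read as 19xx instead
shifted_guess <- parse_date_time2("31-07-07", orders = "ymd", cutoff_2000 = 30L)
year(shifted_guess)
```

Moving the pivot is an alternative to subtracting 100 years afterwards, but either way you have to supply the century knowledge yourself.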

So, to begin, let's do the reprex with the wrong answer (years 2029 to 2031), and then shift the dates back by 100 years.

suppressPackageStartupMessages(library(dplyr))
suppressPackageStartupMessages(library(lubridate))
dat <- data.frame(
  rf = c(7, 7.2, 7.6, 7.35, 7.7, 7.9, 7.6, 7.6, 7.35, 8.45),
  date = as.factor(c("31-07-07", "31-08-07",
                     "30-09-07", "31-10-07", "30-11-07", "31-12-07",
                     "31-01-08", "29-02-08", "31-03-08", "30-04-08"))
)

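colnames(dat) <- c("rf", "Date")
# convert the factor to character before parsing;
# it can be converted back to a factor later if really needed
dat <- dat %>% mutate(Date = as.character(Date))
# ymd() reads the strings as year-month-day and assumes a century for the two-digit year
dat <- dat %>% mutate(Date = ymd(Date))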
dat
#>      rf       Date
#> 1  7.00 2031-07-07
#> 2  7.20 2031-08-07
#> 3  7.60 2030-09-07
#> 4  7.35 2031-10-07
#> 5  7.70 2030-11-07
#> 6  7.90 2031-12-07
#> 7  7.60 2031-01-08
#> 8  7.60 2029-02-08
#> 9  7.35 2031-03-08
#> 10 8.45 2030-04-08
# shift every date back by 100 years
dat <- dat %>% mutate(Date = Date - years(100))
dat
#>      rf       Date
#> 1  7.00 1931-07-07
#> 2  7.20 1931-08-07
#> 3  7.60 1930-09-07
#> 4  7.35 1931-10-07
#> 5  7.70 1930-11-07
#> 6  7.90 1931-12-07
#> 7  7.60 1931-01-08
#> 8  7.60 1929-02-08
#> 9  7.35 1931-03-08
#> 10 8.45 1930-04-08

Created on 2020-01-23 by the reprex package (v0.3.0)
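One more possibility, offered as an assumption on my part rather than something I know about your data: values such as 31-07-07 and 29-02-08 look like month-end dates written day-month-year. If that is the case, `dmy()` reads them as dates in 2007 and 2008 with no shifting needed. A sketch on three of the rows:

```r
suppressPackageStartupMessages(library(dplyr))
suppressPackageStartupMessages(library(lubridate))

# Subset of the original data, kept as character rather than factor
dat <- data.frame(
  rf = c(7, 7.2, 7.6),
  Date = c("31-07-07", "31-08-07", "29-02-08"),
  stringsAsFactors = FALSE
)

# dmy() assumes day-month-year order (an assumption about the data)
dat <- dat %>% mutate(Date = dmy(Date))
dat
```

Which parser is right depends on what the strings were meant to encode, so check a few rows against a date you know.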
