Dates before 1970 in Time Series ggplot

Hello! I have the following dates / values with the Period stored in a vector s (which will go into ggplot):

Period Value
Jan-63 42
Feb-63 35
Mar-63 44
Apr-63 52
May-05 58
Jun-11 48

The -63 refers to 1963, and -05 and -11 are 2005 and 2011 respectively. When I use the following function:

as.Date(format(as.Date(s,format="%m-%y"), "19%y%m%d"), "%Y%m%d")

I get 2063 for all the dates that should be 1963. From what I gather, R starts in 1970. I would like to modify this script to take into account dates I have in the 1900s and 2000s.

Would you happen to have any suggestions? Thank you!

Greg

Hi,

Thank you for that question, actually! Thanks to it I've learned that the LC_ALL locale setting matters when parsing month names :wink:

It seems that you pasted the wrong data or the wrong line of code, because you specified a format with %m (numeric month) but your data has abbreviated month names, which need %b.

If I try to replicate your line of code:

per <- c("Jan-63", "Feb-63", "Mar-63", "Apr-63", "May-05", "Jun-11")

as.Date(per[1], format = "%b-%y") # returns NA for me (no day of month is given)
as.Date(zoo::as.yearmon(per[1], format = "%b-%y")) # returns 2063-01-01 (Jan 2063)
as.Date(paste0("01-", per[1]),  format = "%d-%b-%y") # returns 2063-01-01

So one line returns everything in the 1900s and the other everything in the 2000s:

as.Date(format(as.Date(paste0("01-", per), format = "%d-%b-%y"), "19%y-%m-%d")) # 1: every year becomes 19xx (May-05 -> 1905)
as.Date(paste0("01-", per), format = "%d-%b-%y") # 2: every year becomes 20xx (Jan-63 -> 2063)
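As far as I understand it, the 2063 does not come from R "starting in 1970" but from the rule documented in ?strptime for %y: on input, two-digit years 00 to 68 get the prefix 20 and 69 to 99 get the prefix 19. A small sketch to see the pivot (the vector name edge is just for illustration; numeric months are used so the locale does not matter):

```r
# Two-digit years on either side of the documented %y pivot (68 / 69)
edge <- c("01-01-68", "01-01-69")

# Base R parser: 68 should come back as 2068, 69 as 1969
pivot <- as.Date(edge, format = "%d-%m-%y")
format(pivot, "%Y")
```

Since 63 falls on the 20xx side of that pivot, base R alone cannot tell 1963 from 2063 here.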

So you can't really apply the line of code you provided as an example. I'm sorry to say that I haven't figured out a better solution than reconstructing the time period yourself.
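Reconstructing the period by hand could look roughly like the sketch below. It is only one possible way to do it: parse_period and its cutoff argument are hypothetical names made up for this example, and it assumes English month abbreviations (it matches against R's built-in month.abb, so it does not depend on the locale).

```r
per <- c("Jan-63", "Feb-63", "Mar-63", "Apr-63", "May-05", "Jun-11")

# Hypothetical helper: split "Mon-yy", pick the century by hand,
# and rebuild an ISO date on the first day of the month.
parse_period <- function(x, cutoff = 62L) {
  parts <- strsplit(x, "-", fixed = TRUE)
  mon <- match(vapply(parts, `[`, character(1), 1), month.abb)  # "Jan" -> 1
  yy  <- as.integer(vapply(parts, `[`, character(1), 2))        # "63"  -> 63
  # Assumption: two-digit years <= cutoff are 20xx, the rest are 19xx
  year <- ifelse(yy <= cutoff, 2000L + yy, 1900L + yy)
  as.Date(sprintf("%04d-%02d-01", year, mon))
}

dates <- parse_period(per)
dates
```

With the default cutoff of 62, the -63 entries land in 1963 while -05 and -11 stay in 2005 and 2011.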

If someone comes up with a "real" solution I would be interested as well!
Sorry again


It is not common to find a parsing function that can deal with both at the same time and guess which years are 20-- and which are 19--.
Luckily, lubridate :package: has functions that support this: parse_date_time2 and fast_strptime, which use a C parser, have a cutoff_2000 argument that lets you specify which two-digit years are read as 20xx and which as 19xx.

You can use a cutoff just below 63: that way, two-digit years less than or equal to the cutoff are read as 20xx (the 2000s), and all those above it as 19xx (the 1900s).

# your data
per <- c("Jan-63", "Feb-63", "Mar-63", "Apr-63", "May-05", "Jun-11")

# parse_date_time2 uses lubridate's abbreviated orders
lubridate::parse_date_time2(per, "my", cutoff_2000 = 62L)
#> [1] "1963-01-01 UTC" "1963-02-01 UTC" "1963-03-01 UTC" "1963-04-01 UTC"
#> [5] "2005-05-01 UTC" "2011-06-01 UTC"
lubridate::parse_date_time2(per, "m-y", cutoff_2000 = 62L)  
#> [1] "1963-01-01 UTC" "1963-02-01 UTC" "1963-03-01 UTC" "1963-04-01 UTC"
#> [5] "2005-05-01 UTC" "2011-06-01 UTC"

# fast_strptime needs full format
# use %b
lubridate::fast_strptime(per, "%b-%y", cutoff_2000 = 62L)    
#> [1] "1963-01-01 UTC" "1963-02-01 UTC" "1963-03-01 UTC" "1963-04-01 UTC"
#> [5] "2005-05-01 UTC" "2011-06-01 UTC"

# %m also works
lubridate::fast_strptime(per, "%m-%y", cutoff_2000 = 62L)
#> [1] "1963-01-01 UTC" "1963-02-01 UTC" "1963-03-01 UTC" "1963-04-01 UTC"
#> [5] "2005-05-01 UTC" "2011-06-01 UTC"

Created on 2019-01-12 by the reprex package (v0.2.1)

See the documentation for those functions:
https://www.rdocumentation.org/packages/lubridate/versions/1.7.4/topics/parse_date_time
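Since the goal was to feed these dates into ggplot, here is a rough sketch of how the parsed result might be plotted. The data frame df, its Date column and the plot object p are names I made up for the example. It assumes ggplot2 and lubridate are installed, and it converts the POSIXct result to Date so that scale_x_date() can be used.

```r
library(ggplot2)

# Data from the question
df <- data.frame(
  Period = c("Jan-63", "Feb-63", "Mar-63", "Apr-63", "May-05", "Jun-11"),
  Value  = c(42, 35, 44, 52, 58, 48)
)

# Parse with the cutoff, then convert POSIXct -> Date for a date axis
df$Date <- as.Date(lubridate::parse_date_time2(df$Period, "my", cutoff_2000 = 62L))

# Simple time series plot with one label per year shown on the axis
p <- ggplot(df, aes(x = Date, y = Value)) +
  geom_line() +
  geom_point() +
  scale_x_date(date_labels = "%Y")
p
```

The as.Date() step is a design choice rather than a requirement: ggplot2 can also plot date-times with scale_x_datetime(), but monthly data reads more naturally on a Date axis.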

