Issues creating datetime

Hello! I am trying to bring date and time data into RStudio and create a datetime column. I seem to be having trouble when I try and use lubridate functions; I receive warning messages that all formats failed to parse. The datetime column is created but all values are "na". Even under "#Parse Date column", I had tried to use mdy() opposed to as.Date(), but mdy() would not work. Please see reprex below. Thank you!

library("tidyverse")
library("lubridate")
#> 
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#> 
#>     date, intersect, setdiff, union

#Create sample dataset
MV.71 <- data.frame(stringsAsFactors = FALSE,
                    DATE = c(NA, NA, "10/15/2020", "10/15/2020", "10/15/2020", 
                             "10/15/2020"),
                    TIME = structure(c(NA, NA, 48600, 50400, 52200, 54000), 
                            class = c("hms", "difftime"), units = "secs"))

#Parse DATE column 
MV.71 <- MV.71 %>%
  mutate(DATE = as.Date(DATE,
                        tz = "America/New_York",
                        format = "%m/%d/%Y"))  

#Create datetime_string as character from DATE and TIME
#Create datetime from datetime_string
MV.71 <- MV.71 %>% 
  mutate(
    datetime_string = paste(DATE, TIME, sep = "_"),
    datetime = lubridate::mdy_hms(datetime_string,
                                  tz = "America/New_York"))


#> Warning: Problem with `mutate()` input `datetime`.
#> i All formats failed to parse. No formats found.
#> i Input `datetime` is `lubridate::mdy_hms(datetime_string, tz = "America/New_York")`.
#> Warning: All formats failed to parse. No formats found.

Created on 2021-01-08 by the reprex package (v0.3.0)

The TIME variable in your data frame is a duration. Is this representing the number of seconds elapsed since midnight?

You need to change lubridate::mdy_hms to lubridate::ymd_hms:

library("tidyverse")

MV.71 <- tibble(stringsAsFactors = FALSE,
                    DATE = c(NA, NA, "10/15/2020", "10/15/2020", "10/15/2020", 
                             "10/15/2020"),
                    TIME = structure(c(NA, NA, 48600, 50400, 52200, 54000), 
                                     class = c("hms", "difftime"), units = "secs"))

MV.71 <- MV.71 %>%
  mutate(DATE = as.Date(DATE,
                        tz = "America/New_York",
                        format = "%m/%d/%Y"))   

MV.71 %>% 
  mutate(
    datetime_string = paste(DATE, TIME, sep = "_"),
    datetime = lubridate::ymd_hms(datetime_string,
                                  tz = "America/New_York"))

# A tibble: 6 x 5
  stringsAsFactors DATE       TIME   datetime_string     datetime           
  <lgl>            <date>     <time> <chr>               <dttm>             
1 FALSE            NA            NA  NA_NA               NA                 
2 FALSE            NA            NA  NA_NA               NA                 
3 FALSE            2020-10-15 13:30  2020-10-15_13:30:00 2020-10-15 13:30:00
4 FALSE            2020-10-15 14:00  2020-10-15_14:00:00 2020-10-15 14:00:00
5 FALSE            2020-10-15 14:30  2020-10-15_14:30:00 2020-10-15 14:30:00
6 FALSE            2020-10-15 15:00  2020-10-15_15:00:00 2020-10-15 15:00:00
Warning messages:
1: Problem with `mutate()` input `datetime`.
i  2 failed to parse.
i Input `datetime` is `lubridate::ymd_hms(datetime_string, tz = "America/New_York")`. 
2:  2 failed to parse.

Hi Siddharth, yes it is! In my raw data excel spreadsheet, the TIME column is a 24-hour clock (also made sure this was correct in the excel format).

Thank you so much! I tried this and it worked for me. I had tried this before since the console shows the DATE column in ymd, despite having set it to mdy with as.DATE() but I guess the ones that don't parse are just the "na" values in my spreadsheet. Thank you again, I really appreciate your help!

Another approach would be to use lubridate::make_datetime() to hand craft the datetime object.

library(dplyr, warn.conflicts = FALSE)
library(lubridate, warn.conflicts = FALSE)

# Create sample dataset
MV.71 <- data.frame(
  stringsAsFactors = FALSE,
  DATE = c(NA, NA, "10/15/2020", "10/15/2020", "10/15/2020", "10/15/2020"),
  TIME = structure(c(NA, NA, 48600, 50400, 52200, 54000), class = c("hms", "difftime"), units = "secs")
)

# Parse DATE column
MV.71 <- mutate(MV.71, DATE = as.Date(DATE, tz = "America/New_York", format = "%m/%d/%Y"))

# Create datetime from DATE and TIME
make_datetime(
  year = year(MV.71$DATE),
  month = month(MV.71$DATE),
  day = day(MV.71$DATE),
  sec = MV.71$TIME, tz = "America/New_York"
)
#> [1] NA                        NA                       
#> [3] "2020-10-15 13:30:00 EDT" "2020-10-15 14:00:00 EDT"
#> [5] "2020-10-15 14:30:00 EDT" "2020-10-15 15:00:00 EDT"

Created on 2021-01-09 by the reprex package (v0.3.0)

Thank you so much for helping me with this! make_datetime() will definitely be useful. I also really like how you set the warn.conflicts argument to false, this is a fantastic tip that I will use in the future! Thanks again!

1 Like