as.POSIXct when converting time it shows values as NA


#1

I am trying to convert a timestamp into time and date. The date seems to work okay but the time shows as char. How can i convert it into time format and not char. The data frame is a sample of my data.

library(lubridate)
#> Warning: package 'lubridate' was built under R version 3.4.4
#> 
#> Attaching package: 'lubridate'
#> The following object is masked from 'package:base':
#> 
#>     date
library(tidyr)
#> Warning: package 'tidyr' was built under R version 3.4.4

df <- data.frame(timestamp = c("1 2018-11-08T07:41:55.921Z",
                               "2 2018-11-08T07:42:29.842Z",
                               "3 2018-11-08T07:42:29.845Z",
                               "4 2018-11-08T07:42:29.845Z",
                               "5 2018-11-08T07:43:13.957Z"))
df %>%
  mutate(time_clean = ymd_hms(timestamp,tz = 'UTC')) %>%
  separate(time_clean, c('date', 'time'), sep = ' ') %>%
  mutate_at(vars(date), funs(as.Date))


df$time = as.POSIXct(df$time, format = '%h:%m:%s')

#2

Posixct is not a time only class, POSIXct represents the (signed) number of seconds since the beginning of 1970 (in the UTC time zone) as a numeric vector. It's a datetime class.


#3

I tried using this instead, adding this at the end of my code it didn't work. Any idea where i am going wrong please?

mutate(datetime = lubridate::ymd_hms(time))

#4

Is not clear to me what are you trying to accomplish, can you elaborate a little more in your question?


#5

Sorry for not been clear. I have dataframe that contains timestamp. I am spliting into date and time then changing the date format to date and the time to time, but the time isn't converting into time it's getting converted as char


#6

There is no time class in base R, because time without date would be just a discrete variable, that is why it gets converted to character.
What do you want to do wit this time variable?


#7

Simply convert into time format instead of char.


#8

Yes, but with what purpose? Perform aggregation, use it as an axis for a plot?
You can have a character variable with time format but there is not such thing as a time class variable in base R.


#9

I need it to subtract time difference between two events in my data then plot it into a graph


#10

You can do that with a datetime variable (i.e. timestamp)


#11

What do i need to amend in my code to do this please? Thanks


#12

You can get time periods with the lubridate package.

library(lubridate)
x <- interval(as.POSIXct("2018-11-08 07:41:47"), as.POSIXct("2018-11-08 07:41:50")) #interval
as.period(x)
#> [1] "3S"

I can't work directly in your code since you haven't provided a reproducible example.


#13

I have amned the code now. I used the reprex to create it.


#14

Making a reproducible example doesn't mean pasting non reproducible code inside the reprex() function and just posting the results.

You have to make your code reproducible first, your code above is not reproducible because you are not providing complete example data (e.g. there is no hostname, eventType variables in data 1)


#15

I have generated my code using a reprex. Please can someone help me out?


#16

Ah, I see, you edited your original post!


Heres an option.

library(lubridate)
library(tidyr)
library(dplyr)

df <- data.frame(timestamp = c("1 2018-11-08T07:41:55.921Z",
                               "2 2018-11-08T07:42:29.842Z",
                               "3 2018-11-08T07:42:29.845Z",
                               "4 2018-11-08T07:42:29.845Z",
                               "5 2018-11-08T07:43:13.957Z"))

df %>%
  
  #note your timestamp has index info, separating that out
  separate(timestamp, c('index', 'datetime'), sep = ' ') %>% 
  
  # converting the form of "2018-11-08T07:41:55.921Z" to POSIXct date-time objects
  # check out `?ymd_hms` docs if you want to force a timezone other than UTC
  mutate(time_clean = lubridate::ymd_hms(datetime)) %>% 
  
  mutate(
    # see https://stat.ethz.ch/R-manual/R-devel/library/base/html/strptime.html 
    # formating details
    date = format(time_clean,format = '%F'),
    time = format(time_clean,format = '%T')
  )
#>   index                 datetime          time_clean       date     time
#> 1     1 2018-11-08T07:41:55.921Z 2018-11-08 07:41:55 2018-11-08 07:41:55
#> 2     2 2018-11-08T07:42:29.842Z 2018-11-08 07:42:29 2018-11-08 07:42:29
#> 3     3 2018-11-08T07:42:29.845Z 2018-11-08 07:42:29 2018-11-08 07:42:29
#> 4     4 2018-11-08T07:42:29.845Z 2018-11-08 07:42:29 2018-11-08 07:42:29
#> 5     5 2018-11-08T07:43:13.957Z 2018-11-08 07:43:13 2018-11-08 07:43:13

Created on 2019-01-11 by the reprex package (v0.2.1)


#17

I have an example with my own data that I think it's similar to what you want to do at the end.

library(tidyverse, quietly = TRUE)
library(lubridate)

conection_speed <- data.frame(
    time = c("2019-01-09 00:01:30", "2019-01-09 01:01:27",
             "2019-01-09 02:01:18", "2019-01-09 03:01:05",
             "2019-01-09 04:01:36", "2019-01-09 05:01:27", "2019-01-09 06:01:12",
             "2019-01-09 07:02:11", "2019-01-09 08:01:17",
             "2019-01-09 14:00:33", "2019-01-09 15:00:37", "2019-01-09 16:00:39",
             "2019-01-09 17:00:36", "2019-01-09 18:00:41",
             "2019-01-09 19:00:33", "2019-01-09 20:00:33", "2019-01-09 21:00:35",
             "2019-01-09 22:00:34", "2019-01-09 23:00:34",
             "2019-01-10 00:00:37", "2019-01-10 01:01:22", "2019-01-10 02:00:39",
             "2019-01-10 03:00:34", "2019-01-10 04:00:36", "2019-01-10 05:00:34",
             "2019-01-10 06:00:38", "2019-01-10 07:00:37",
             "2019-01-10 08:00:37", "2019-01-10 09:00:13", "2019-01-10 10:01:16",
             "2019-01-10 11:00:33", "2019-01-10 12:00:46",
             "2019-01-10 13:00:32", "2019-01-10 14:00:31", "2019-01-10 15:00:36",
             "2019-01-10 16:00:32", "2019-01-10 17:00:33",
             "2019-01-10 18:00:32", "2019-01-10 19:00:34", "2019-01-10 20:00:34",
             "2019-01-10 21:00:32", "2019-01-10 22:00:32", "2019-01-10 23:00:30",
             "2019-01-11 00:00:31", "2019-01-11 01:00:31",
             "2019-01-11 02:00:33", "2019-01-11 03:00:38", "2019-01-11 04:00:32",
             "2019-01-11 05:00:32", "2019-01-11 06:00:32",
             "2019-01-11 07:00:37", "2019-01-11 08:00:30", "2019-01-11 09:00:32",
             "2019-01-11 10:01:19", "2019-01-11 11:00:31",
             "2019-01-11 12:00:32", "2019-01-11 13:00:30", "2019-01-11 14:00:32",
             "2019-01-11 15:00:31"),
    download_speed = c(1.39, 1, 1.7, 1.67, 2.11, 1.12, 2.23, 0.94, 1.18, 7.94,
                       18.64, 2.1, 17.74, 19.91, 21.06, 21.12, 20.32, 20.96,
                       4.05, 19.14, 17.13, 14.63, 19.28, 21.03, 20.68, 21.23, 4.07,
                       18.61, 0, 1.19, 21.23, 18.73, 20.62, 20.63, 17.1, 20.67,
                       20.02, 21.32, 21.01, 20.69, 21.03, 20.5, 20.9, 19, 20.95,
                       20.88, 19.24, 21.2, 20.91, 21.36, 17.24, 21.12, 20.8, 2.7,
                       21.47, 20.86, 20.91, 20.57, 21.09)
)
conection_speed %>% 
    mutate(event_type = ifelse(download_speed <= 8, 'fault', 'normal')) %>% 
    filter(event_type == 'fault') %>% 
    mutate(lag_time = lag(time), tbf = as.period(interval(lag_time, time), unit = 'seconds') / 3600) %>% 
    tail(-1) %>% 
    ggplot(aes(x = '', y = tbf)) +
    geom_boxplot(fill = '#FF303094') +
    coord_flip() +
    labs(title = 'Time Between Failures',
         x = '',
         y = 'Hours')