Mean function and format the result to H:M:S

Hi everyone,
I am calculating the mean trip duration by running this code:
mean(new2019_2020$trip_duration)
[1] 879.7847
Then I tried to format the result as H:M:S by running this code:
format(as.POSIXct(mean(new2019_2020$trip_duration)), "%H:%M:%S")
[1] "19:14:39"
As you can see, the result is wrong: it should be 00:14:39, which is 879.7847 seconds.
What am I doing wrong here?

POSIXct is for recording a particular date-time, for example "the 12th of December 2023 at 2 pm". What you have here is an interval of time, which R calls a difftime. A difftime is the difference between two date-times.

You can convert a number to a difftime like this:

my_difft <- as.difftime(879.7847, units = "secs")

Then you can recover this difftime in various units:

my_difft
#> Time difference of 879.7847 secs
as.numeric(my_difft, "secs")
#> [1] 879.7847
as.numeric(my_difft, "mins")
#> [1] 14.66308
as.numeric(my_difft, "hours")
#> [1] 0.2443846
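If the whole column is stored as a difftime, the mean can be taken directly and the unit changed afterwards. This is only a small sketch: the vector `trips` and its values are made up for illustration and are not the data from the question.

```r
# Hypothetical trip durations in seconds (illustrative values only)
trips <- as.difftime(c(600, 900, 1200), units = "secs")

# mean() has a difftime method, so the result is still a difftime
avg <- mean(trips)

# Switch the display unit in place; the underlying interval is unchanged
units(avg) <- "mins"
avg
as.numeric(avg, units = "secs")
```

Keeping the value as a difftime avoids treating an interval as a point in time, which is what causes the time zone surprise with as.POSIXct().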

If you use the package {lubridate}, a similar object is called a duration, and by default is given in seconds:

library(lubridate)

mean_duration <- duration(879.7847)
mean_duration
#> [1] "879.7847s (~14.66 minutes)"

While R's difftime and lubridate's duration are similar, durations are a bit more practical in many small ways, if you need to work with them a lot.

Now, the format "H:M:S" is common when giving a time of day, but it is uncommon for a duration. So, as far as I can tell, neither base R nor lubridate offers a way to print a duration that way. I would also ask whether it's a good idea, since for many users 3:0:0 would mean 3 o'clock rather than a duration of 3 hours. Maybe it would be better as 3h0'0"? But if needed, it's pretty simple to implement it ourselves:

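my_format <- function(dt){
  # Accept a base R difftime or a lubridate Duration
  stopifnot(inherits(dt, "difftime") || inherits(dt, "Duration"))
  
  # Total whole hours, minutes, and seconds contained in the interval
  h <- floor(as.numeric(dt, units = "hours"))
  m <- floor(as.numeric(dt, units = "mins"))
  s <- floor(as.numeric(dt, units = "secs"))
  
  # Subtract the part already counted by the larger unit
  paste(h, m - h*60, s - m*60,
        sep = ":")
}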
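If zero-padded output such as 00:14:39 is wanted, a variant that works on a plain number of seconds might look like the sketch below. The name `format_hms` is hypothetical (it is not part of base R or lubridate), and it assumes non-negative durations given in seconds.

```r
# Hypothetical helper: format a duration in seconds as zero-padded HH:MM:SS.
# Assumes non-negative input; fractional seconds are dropped.
format_hms <- function(secs) {
  stopifnot(is.numeric(secs), all(secs >= 0))
  total <- floor(secs)
  h <- total %/% 3600           # whole hours
  m <- (total %% 3600) %/% 60   # remaining whole minutes
  s <- total %% 60              # remaining whole seconds
  sprintf("%02d:%02d:%02d", h, m, s)
}

format_hms(879.7847)
#> [1] "00:14:39"

# Hours are not wrapped at 24, so long durations stay readable
format_hms(as.numeric(as.difftime(25, units = "hours"), units = "secs"))
#> [1] "25:00:00"
```

Because the hours are computed arithmetically rather than through a date-time, the result does not depend on any time zone.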

I think I didn't make my question clear. I am calculating the average trip time, so I wrote this code:

mean(v2$trip_duration)
[1] 879.7847

Then, to format the result as H:M:S, I ran this code:

average_trip_t <- mean(v2$trip_duration)
average_trip_t <- format(as.POSIXct(average_trip_t), "%H:%M:%S")
The result shows
"00:14:39"
However, when I run the same code in RStudio, the result is "19:14:39".

As you can see, the correct answer is "00:14:39", so why are there 19 hours in RStudio?

I hope this makes it more clear, thank you.

as.POSIXct() treats the number as seconds since 1970-01-01 00:00:00 UTC, and format() then prints that instant in your local time zone. I'm in the Mountain time zone and I get the following:

format(as.POSIXct(879.7847), "%H:%M:%S")
#> [1] "17:14:39"
as.POSIXct(879.7847)
#> [1] "1969-12-31 17:14:39 MST"
format(as.POSIXct(879.7847), "%H:%M:%S", tz = "UTC")
#> [1] "00:14:39"

Created on 2024-02-06 with reprex v2.0.2
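One caveat with the tz = "UTC" approach: the value is still a clock time, so %H wraps around after 24 hours. A minimal sketch, assuming the duration is in seconds; `secs_to_clock` is a hypothetical helper name and the 90000 value is made up for illustration.

```r
# Hypothetical helper: pin the time zone both when creating and when
# formatting, so the output does not depend on the machine's local zone.
secs_to_clock <- function(secs) {
  t <- as.POSIXct(secs, origin = "1970-01-01", tz = "UTC")
  format(t, "%H:%M:%S", tz = "UTC")
}

secs_to_clock(879.7847)
#> [1] "00:14:39"

# 90000 seconds is 25 hours, but the clock hour wraps to 01
secs_to_clock(90000)
#> [1] "01:00:00"
```

For an average trip duration of a few minutes this is harmless, but for durations that can reach a day or more, arithmetic on the seconds is safer.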


Thank you so much, it worked!