Running analysis on time data - estimating average time

Hi, I am trying to analyze data and have not been able to find an answer anywhere. My data looks like this:

#> sample dataset below
df <- data.frame(
    site = c(1, 2, 2, 3, 3, 3, 4),
    time.spent = c("05:00", "12:00", "06:43", "02:00", "17:00", "09:45", "06:00"),
    emissions = c(1.2, 3.6, 2.4, 9.6, 1.8, 5.3, 1.6))

#> this is what I ran
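library(plyr)  # ddply() lives in the plyr package; there is no package named ddply
emit <- ddply(df, c("site"), summarize,
              emission = sum(emissions, na.rm = TRUE),
              Time     = mean(time.spent, na.rm = TRUE))  # fails: time.spent is not numeric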

What I want is a summary by site of average time and emissions.
However, I cannot run it because the time.spent variable is in character or factor format. If I try to change it to numeric, it does not convert 06:00 to 6 but instead seems to pick a random number and gives 5, or for 29:00 gives 17.
I have searched and searched, so please help me. I even tried strptime and other things. Hoping someone here can help me out now!

Would this work for you?
Note: In the future, please post your questions with a REPRoducible EXample (reprex) like this one.

library(hms)
library(dplyr)

# Sample data
df <- data.frame(
    site = c(1, 2, 2, 3, 3, 3, 4),
    time.spent = c("05:00", "12:00", "06:43", "02:00", "17:00", "09:45", "06:00"),
    emissions = c(1.2, 3.6, 2.4, 9.6, 1.8, 5.3, 1.6))

df %>% 
    mutate(time.spent = as.numeric(parse_hm(time.spent)) / 3600) %>%  # "HH:MM" -> seconds -> hours
    group_by(site) %>% 
    summarise(mean_time_spent = mean(time.spent),
              mean_emissions = mean(emissions))
#> # A tibble: 4 x 3
#>    site mean_time_spent mean_emissions
#>   <dbl>           <dbl>          <dbl>
#> 1     1            5              1.2 
#> 2     2            9.36           3   
#> 3     3            9.58           5.57
#> 4     4            6              1.6

Created on 2019-07-30 by the reprex package (v0.3.0)

Thanks @andresrcs - I will post the data in the format you suggested next time. It was my first time posting. I will try to edit it right now too, so it is easier for others to read (thank you).
I tried your code, but the issue is the same. When the time values are converted to numeric, they change in my output. For example, the "05:00" time changes to "0.08333" in numeric, or 12 changes to 0.2.
I am not sure why it is doing that. Hopefully you can help with this issue. Maybe it is because it is changing the values to fractions?
But now I have some of your code to work with. Thanks!

Maybe your problem has to do with the structure of your actual dataset. Try to make a reproducible example, using dput() to share the sample data; it would be easier if we were working with the same data.

Ok, here is the data

   number time frequency
1     1      0:05    2.44
2     1      0:05    0.00
3     1      0:05    0.00
4     2      0:12    1.19
5     2      0:12    0.63
6     2      0:12    0.00
7     3      0:05    0.36
8     3      0:05    1.13
9     4      0:07    2.37
10    4      0:07    0.92
11    5      0:07    1.11
12    5      0:07    2.53
13    5      0:07    0.25
14    6      0:12    2.71
15    6      0:12    0.04
16    7      0:07    0.56
17    8      0:10    0.04
18    8      0:10    1.29
19    9      0:03    1.89
20    9      0:03    1.96
21   10      0:16    0.74
22   10      0:16    2.71
23   10      0:16    0.86
24   10      0:16    2.08

#> how the dataframe looks
> str(df)
#>'data.frame':	24 obs. of  3 variables:
#> $ number   : int  1 1 1 2 2 2 3 3 4 4 ...
#> $ time     : Factor w/ 6 levels "0:03","0:05",..: 2 2 2 5 5 5 2 2 3 3 ...
#> $ frequency: num  2.44 0 0 1.19 0.63 0 0.36 1.13 2.37 0.92 ...

Please let me know if you see the same issue.
I am not sure how to use dput() or attach my file here. Hopefully this helps.

OK, your data says "0:05" (five minutes), not "05:00" (five hours) as you mentioned, so the result is correct: "0:05" (five minutes) equals 0.08333 hours.

library(hms)
as.numeric(parse_hm("0:05")) / 3600
#> [1] 0.08333333

I was assuming that your data is in HH:MM format. Could you specify the actual format of your data and the units of your desired output?
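
As a side note on the original symptom (as.numeric() returning what looks like a random number): the str() output shows that time is a factor, and converting a factor directly to a number returns its internal level codes, not the values printed on screen. A minimal sketch in base R, using the six distinct times from the posted data (the vector name `time` is only for illustration):

```r
# The six distinct times from the posted data, stored as a factor
time <- factor(c("0:05", "0:12", "0:07", "0:03", "0:10", "0:16"))

# Levels are sorted alphabetically: "0:03" "0:05" "0:07" "0:10" "0:12" "0:16"
levels(time)

# Converting the factor directly returns the position of each value
# in levels(time), which is why the numbers look arbitrary
as.integer(time)
#> [1] 2 5 3 1 4 6

# Convert to character first to get the original text back,
# then parse that text as a time
as.character(time)
```

This matches the codes 2 and 5 that str() shows for "0:05" and "0:12". Converting to character first avoids the level codes; the time parsing itself is a separate step.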

If you want to learn how to post a proper reproducible example (and use dput) check this link
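
In the meantime, here is a minimal sketch of how dput() is typically used to share data. The data frame `sample_df` is a hypothetical three-row excerpt of the data posted above, not your full dataset:

```r
# Hypothetical three-row excerpt of the data posted in this thread
sample_df <- data.frame(
  number    = c(1L, 1L, 2L),
  time      = c("0:05", "0:05", "0:12"),
  frequency = c(2.44, 0, 1.19)
)

# dput() prints R code that rebuilds the object;
# copy that printed text into your post
dput(sample_df)

# For a large data frame, share only the first few rows
dput(head(sample_df, 2))

# Round trip: evaluating the printed text recreates the data
rebuilt <- eval(parse(text = capture.output(dput(sample_df))))
all.equal(sample_df, rebuilt)
```

Anyone reading the post can then paste the dput() output into their own session and work with exactly the same column types (factor, character, integer) that you have.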

Thanks for the update. Yes, I want the output in minutes. Once you pointed out the format the data was in and what the code was doing, it worked: I used 60 instead of 3600 and it works out. Thanks so much, I really appreciate your help! I will read up on dput() etc. if I have to post again in the future.
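
For later readers, the resolution can be sketched roughly as follows. This is a base R variant, not the hms/dplyr code used earlier in the thread, and it assumes the times are "H:MM" strings meaning hours and minutes. The helper `hm_to_minutes()` is a hypothetical name introduced here, and the data is only the first ten rows of the posted table:

```r
# Hypothetical helper: convert "H:MM" text (or a factor of it) to minutes
hm_to_minutes <- function(x) {
  parts <- strsplit(as.character(x), ":", fixed = TRUE)
  vapply(parts,
         function(p) as.numeric(p[1]) * 60 + as.numeric(p[2]),
         numeric(1))
}

# First ten rows of the data posted above, with time stored as a factor
df <- data.frame(
  number    = c(1, 1, 1, 2, 2, 2, 3, 3, 4, 4),
  time      = factor(c("0:05", "0:05", "0:05", "0:12", "0:12", "0:12",
                       "0:05", "0:05", "0:07", "0:07")),
  frequency = c(2.44, 0, 0, 1.19, 0.63, 0, 0.36, 1.13, 2.37, 0.92)
)

# Add a numeric minutes column, then average both columns per group
df$minutes <- hm_to_minutes(df$time)
by_number <- aggregate(cbind(minutes, frequency) ~ number, data = df, FUN = mean)
by_number
# mean minutes per group: 5, 12, 5, 7
```

Splitting on ":" rather than parsing as a clock time also accepts values above 24 hours, such as the "29:00" mentioned in the first post (1740 minutes). With the hms approach, the equivalent change is the one described above: divide the seconds by 60 instead of 3600.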
