Sum not working as desired on a column with time pattern and of character type and how to use top n

I have a dataframe which has three columns:Machine Name,Usage Hours and Region.The first two are of character type and last is of factor type.I want to sum up usage hours per machine name so that i get total usage hours per machine and after that i want to find out the top 10 machines based on usage hours both increasing and decreasing.When i am summing up usage hours using aggregate function,its summing up to a whole number instead of displaying in hours:minutes:seconds the way the usage hours data is in the df.

Please find below a snapshot of data.
Machine Name Usage hours Region
ABC 41:43:00 APAC
DEF 0:00:00 APAC
EFH 133:20:00 APAC

I added another row for DEF so you could see summation happening

(df<- tribble(~MachineName,~Usagehours,~Region,
"ABC"," 41:43:00", "APAC",
"DEF","0:00:00","APAC",
"EFH","133:20:00", "APAC",
"DEF","1:1:1", "APAC"))

library(lubridate)
library(tidyverse)

(df2 <- mutate(df,
              Usagehours_dur = hms(Usagehours) %>% as.duration()))

(df3 <- group_by(df2,
                 MachineName) %>% summarise(sum_as_hours=sum(Usagehours_dur) /60^2)
)
library(lubridate)
library(tidyverse)

(df<- tribble(~MachineName,~Usagehours,~Region,
              "ABC"," 41:43:00", "APAC",
              "DEF","0:00:00","APAC",
              "EFH","133:20:00", "APAC",
              "DEF","1:1:1", "APAC"))


(df2 <- mutate(df,
               hms = hms(Usagehours),
               h=hour(hms),
               m=minute(hms),
               s=second(hms)))


(df3 <- group_by(df2,
                 MachineName) %>% summarise_at(.vars = c("h","m","s"),
                                               .funs = sum)
)

(d4 <- mutate(df3,
              hms_elapsed = paste(
                h, str_pad(m, width = 2, pad = "0"),
                str_pad(s, width = 2, pad = "0"), sep = ":")
))

Thanks for all the inputs.Really appreciate.One more thing i wanted to ask you is after doing this and then trying to find out top 10 records,it doesn't give the top 10 records actually.It somehow gives records which have hours in two digits like below:

top_10_machines_by_usage <- d4%>%top_n(10,d4$hms_elapsed)

top_10_machines_by_usage_sort<-top_10_machines_by_usage%>%arrange(desc(hms_elapsed))

Machine.Name h m s hms_elapsed

1 xxx 99 49 0 99:49:00
2 yyy 98 71 0 98:71:00
3 PFNCLI 98 68 0 98:68:00
4 RIJKES 98 65 0 98:65:00
5 ABETUL 98 20 0 98:20:00
6 HEPC 98 135 0 98:135:00
7 L1412 96 24 0 96:24:00
8 AM-L-9512 95 41 0 95:41:00
9 SERH 94 73 0 94:73:00
10 D180 94 42 0 94:42:00

whereas there are records with hours in 3 digits.i have tried to do some debugging but its still not producing the desired result.I am relatively new to R.so trying to figure out things.

its simply not going to work to arrange by a character string with representations where its possible to have 0 hours and 10000minutes as one entry..
If you want to make principled comparisons of relative elapsed durations, I think you'll agree that some standardisation is necessary, like total seconds, and arrange on that.
just make a new variable, sum_seconds, with 60^2 hours + 60* minutes + seconds as its equation and use that one to arrange. use the pasted together character representation for display purposes only, not to compute on.

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

So it will return the sum for each machine in hours:minutes: seconds right.Also how and which function to use for top n since I used dplyr and base r for the same,it didn't give the desired result.

I tried this but its giving output in the form of 234.56,0.000 but i want summation in the same format as hours:minutes:seconds such as below:

Machine Name usage Hours region
ABC 230:45:24 APAC
DEF 50:36:47 APAC
ABC 25:32:56 APAC

So the output for ABC should be as follows:
ABC 255:77:80 APAC

It should return in a similar way for other machines also.

ABC 255:77:80 APAC

Thats one possibility , but wouldnt it be better to put minutes over 60 as some number in the hours field ?
ditto for having seconds over 60 as some number of additional minutes ?

Requirement of client is such and they are not budging a bit on the same.Hence,the output has to be shown that way.