Hey everyone,
I've run into this exact issue before. It looks like the column in your Excel file stores the duration in a "days:hours:minutes" format, which is common in Excel for durations longer than 24 hours. When the data is imported into R, though, the values are read as a time rather than a duration, which is where the confusion comes from.
Here's a potential way to handle this in R:
Import the CSV file as usual. Let's say the problematic column is called "Duration."
Use the strsplit function to split each value on ":" into three parts (days, hours, and minutes), and store them as three separate columns.
Convert these columns to numeric and then combine them into a single duration column in hours.
Here's a bit of code that might help:
# Assuming df is your data frame and Duration is the column
duration_split <- strsplit(as.character(df$Duration), ":")
df$Days <- sapply(duration_split, "[", 1)
df$Hours <- sapply(duration_split, "[", 2)
df$Minutes <- sapply(duration_split, "[", 3)

# Convert to numeric
df$Days <- as.numeric(df$Days)
df$Hours <- as.numeric(df$Hours)
df$Minutes <- as.numeric(df$Minutes) / 60 # convert minutes to a fraction of an hour

# Combine into a single duration column in hours
df$TotalHours <- df$Days * 24 + df$Hours + df$Minutes

# Optional: remove the split columns
df$Days <- NULL
df$Hours <- NULL
df$Minutes <- NULL
This way, TotalHours should now have the correct duration in hours. Do give it a try and see if it helps.
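If you'd rather not create and delete the helper columns, the same idea can be wrapped in a small function. This is only a sketch: the function name dhm_to_hours and the sample data are made up for illustration, and it assumes every value really has the three-part "days:hours:minutes" shape (anything else becomes NA). Reading the column with colClasses set to "character" is one way to keep the raw text intact on import.

```r
# Hypothetical helper (not part of the code above): converts
# "days:hours:minutes" strings to a numeric number of hours.
dhm_to_hours <- function(x) {
  parts <- strsplit(as.character(x), ":", fixed = TRUE)
  vapply(parts, function(p) {
    # Anything that does not split into exactly three parts is returned as NA
    if (length(p) != 3) return(NA_real_)
    p <- as.numeric(p)
    p[1] * 24 + p[2] + p[3] / 60
  }, numeric(1))
}

# Made-up sample data standing in for the exported CSV file
csv_text <- "Id,Duration\n1,1:02:30\n2,0:23:15\n3,12:45"

# Force Duration to be read as plain text so it is not reinterpreted
df <- read.csv(text = csv_text, colClasses = c(Duration = "character"))

df$TotalHours <- dhm_to_hours(df$Duration)
print(df)
```

The third row in the sample only has two parts, so it comes back as NA instead of a silently wrong number. That makes it easier to spot rows that need a closer look.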
Cheers and happy coding!
Ahmad