Dear R-community,
I am quite new to R.
My data frame, which contains more than 800,000 patients, looks like this:
df <- data.frame(
  "ID" = c("1","1","1","1","1","2","2","2","3","3"),
  "therapies" = c("A51","B32","A67","A99","L37","A64","A51","L45","B32","A55"),
  "number of therapies" = c(8, 2, 6, 1, 7, 15, 3, 2, 9, 10),
  "duration of therapie in min." = c(240, 120, 189, 30, 210, 450, 60, 60, 180, 400),
  "Total duration of inpatient treatment" = c(21, 21, 21, 21, 21, 24, 24, 24, 18, 18)
)
- First, I want to aggregate all A-therapies, all B-therapies and all L-therapies by ID, so that I can compute descriptive statistics for the three therapy groups (their numbers and durations).
- After that, I want to sum up all therapies (A + B + L), including their numbers and durations, by ID.
- To calculate the average duration and average number of therapies for each ID, I would use dplyr::mutate(average_duration = aggregated_duration / total_duration_of_inpatient_treatment). Is that right?
- After aggregating everything (to get one row per ID), I want to merge this data frame with another one by ID.
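To make the four steps concrete, here is a minimal sketch of what I have in mind. It assumes that dplyr (version 1.0 or later) is installed and that the first letter of the therapy code identifies the therapy group. The shortened column names (n_therapies, duration_min, total_inpatient_duration), the helper column therapy_group, and the second data frame other_df with its made-up column are placeholders of my own, not part of my real data.

```r
library(dplyr)

# Same example data as above, but with shorter, syntactic column names
# (my own renaming, to avoid the names that data.frame() mangles).
df <- data.frame(
  ID = c("1","1","1","1","1","2","2","2","3","3"),
  therapies = c("A51","B32","A67","A99","L37","A64","A51","L45","B32","A55"),
  n_therapies = c(8, 2, 6, 1, 7, 15, 3, 2, 9, 10),
  duration_min = c(240, 120, 189, 30, 210, 450, 60, 60, 180, 400),
  total_inpatient_duration = c(21, 21, 21, 21, 21, 24, 24, 24, 18, 18)
)

# Step 1: one row per ID and therapy group.
# Assumption: the group (A, B or L) is the first character of the code.
by_group <- df %>%
  mutate(therapy_group = substr(therapies, 1, 1)) %>%
  group_by(ID, therapy_group) %>%
  summarise(
    n_therapies = sum(n_therapies),
    duration_min = sum(duration_min),
    .groups = "drop"
  )

# Descriptive statistics per therapy group, computed across patients.
group_stats <- by_group %>%
  group_by(therapy_group) %>%
  summarise(
    patients = n(),
    mean_n = mean(n_therapies),
    median_n = median(n_therapies),
    mean_duration = mean(duration_min),
    sd_duration = sd(duration_min)
  )

# Steps 2 and 3: one row per ID with totals over all therapies,
# then the averages relative to the total inpatient duration.
by_id <- df %>%
  group_by(ID) %>%
  summarise(
    n_therapies = sum(n_therapies),
    duration_min = sum(duration_min),
    total_inpatient_duration = first(total_inpatient_duration),
    .groups = "drop"
  ) %>%
  mutate(
    average_duration = duration_min / total_inpatient_duration,
    average_number = n_therapies / total_inpatient_duration
  )

# Step 4: merge with another data frame by ID.
# other_df and example_value are hypothetical stand-ins for my second table.
other_df <- data.frame(ID = c("1", "2", "3"), example_value = c(10, 20, 30))
merged <- left_join(by_id, other_df, by = "ID")
```

I chose left_join() so that every aggregated patient is kept even if the second table has no matching ID; base R merge(by_id, other_df, by = "ID") would keep only the IDs present in both tables by default.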
I would be very grateful if you could help me, especially with the first two steps. Thanks in advance.