Greetings fellow R users,
I started using R only a week ago and am stuck on a problem I have been hammering my head against for days now (unsuccessfully, of course).
My data contains records of the duration of activities over several years, with month and year as separate variables. I now want to create a new variable that holds the mean duration for the respective month next to each observation, as a separate column in the same dataframe, so that I can later generate a plot showing this monthly average as a reference.
Here is a data example, similar to my original dataset but much smaller.
year <- c(2017, 2017, 2017, 2017, 2018, 2018, 2018, 2018, 2019, 2019, 2019, 2019)
month <- c(6, 7, 9, 2, 2, 6, 6, 9, 7, 2, 2, 2)
duration <- c(13, 8, 15, 7, 18, NA, 13, 19, 4, 9, 11, 11)
activity <- data.frame(year, month, duration)
Created on 2020-07-21 by the reprex package (v0.3.0)
My approach was to use the group_by() and mutate() commands as follows. I want to keep everything inside that one dataframe because I have multiple dataframes in my project.
activity$duration_mean_month <- activity %>%
  group_by(activity$year, activity$month) %>%
  mutate(activity$duration_mean_month = mean(activity$duration, na.rm = TRUE))
But I always get the error message:
Error: unexpected '=' in:
"  group_by(activity$year, activity$month) %>%
  mutate(activity$duration_mean_month ="
I have been trying to find a solution for several days now, even going as far as trying to write a loop (which, with my very limited programming experience, went poorly). I hope some wise soul out there will find a way to fix my code and thus expand my horizon.
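For readers with the same error, here is a minimal sketch of one possible fix, assuming the dplyr package is installed and that the mean should be taken per year-and-month combination (as the grouping in the attempt above suggests). The parse error comes from using activity$duration_mean_month as an argument name inside mutate(): argument names have to be bare names. Inside group_by() and mutate() the columns can be referred to directly, and writing activity$duration there would in any case refer to the whole column rather than to the current group. The result of the pipeline is the whole dataframe, so it is assigned back to activity instead of to a single column.

```r
library(dplyr)

# Example data from the question
year <- c(2017, 2017, 2017, 2017, 2018, 2018, 2018, 2018, 2019, 2019, 2019, 2019)
month <- c(6, 7, 9, 2, 2, 6, 6, 9, 7, 2, 2, 2)
duration <- c(13, 8, 15, 7, 18, NA, 13, 19, 4, 9, 11, 11)
activity <- data.frame(year, month, duration)

# Assumption: one mean per (year, month) combination.
activity <- activity %>%
  group_by(year, month) %>%                                      # bare column names, no activity$
  mutate(duration_mean_month = mean(duration, na.rm = TRUE)) %>% # mean within each group, ignoring NA
  ungroup()                                                      # drop the grouping again

print(activity)
```

If the reference line should instead be the average for each calendar month across all years, grouping by month alone (group_by(month)) should give that.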