I think we are getting closer. This worked for the top (first year) of my dataset, but it did not produce the same outcome at the beginning of any of the following years; there it still gave me the same result as when I used the 'rollmax' function. Since the discharge period for my dataset runs from March to the end of October, I have NAs from November to the end of February. I would prefer not to split my data into individual years and then merge it back together to accomplish this, since I have about seven decades' worth of data.
So here is a dataset that has NAs in the middle of it. Using the code you provided, the dates at the top get filled, but the dates following a large chunk of NAs do not get filled the way the top ones do.
df <- data.frame(
date = c('1996-03-01','1996-03-02','1996-03-03','1996-03-04','1996-03-05','1996-03-06','1996-03-07','1996-03-08','1996-03-09','1996-03-10','1996-03-11','1996-03-12','1996-03-13','1996-03-14','1996-03-15','1996-03-16','1996-03-17','1996-03-18','1996-03-19','1996-03-20','1996-03-21','1996-03-22','1996-03-23','1996-03-24','1996-03-25','1996-03-26','1996-03-27','1996-03-28','1996-03-29','1996-03-30','1996-03-31','1996-04-01','1996-04-02','1996-04-03','1996-04-04','1996-04-05','1996-04-06','1996-04-07','1996-04-08','1996-04-09','1996-04-10','1996-04-11','1996-04-12','1996-04-13','1996-04-14','1996-04-15','1996-04-16','1996-04-17','1996-04-18','1996-04-19','1996-04-20','1996-04-21','1996-04-22','1996-04-23','1996-04-24','1996-04-25','1996-04-26','1996-04-27','1996-04-28','1996-04-29','1996-04-30'),
discharge = c(0.236,0.241,0.254,0.322,0.363,0.42,0.463,0.515,0.506,0.497,0.484,0.464,0.478,0.495,0.518,0.526,0.519,0.515,0.509,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,4.13,8.41,8.62,8.67,8.69,8.8,8.87,9.56,10.1,10.4,10.5,10.7,10.8,10.9,11,12.7,18.1,27.3,34.2,63.3)
)
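library(dplyr, warn.conflicts = FALSE)
library(zoo, warn.conflicts = FALSE)
# Right-aligned rolling maximum over 2-, 4- and 8-day windows
df_prev <- df %>%
  mutate(`2_d` = rollapplyr(discharge, 2L, max, fill = NA, partial = TRUE)) %>%
  mutate(`4_d` = rollapplyr(discharge, 4L, max, fill = NA, partial = TRUE)) %>%
  mutate(`8_d` = rollapplyr(discharge, 8L, max, fill = NA, partial = TRUE))
print(df_prev)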
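To show where I think it goes wrong, here is a small check on a made-up vector (`x` and its values are hypothetical, not my real data). As far as I can tell, `max()` returns NA whenever the window still contains an NA, so the first values after a gap stay NA instead of the window restarting at the first non-NA value.

```r
library(zoo, warn.conflicts = FALSE)

# Toy vector (hypothetical values): two values, a gap of NAs, then four values
x <- c(1, 2, NA, NA, NA, 5, 3, 4, 6)

# Same kind of call as above, with a 3-value window
got <- rollapplyr(x, 3L, max, fill = NA, partial = TRUE)

# Positions that have data but still come back as NA
stuck <- which(!is.na(x) & is.na(got))
print(stuck)  # positions 6 and 7, the first two values after the gap
```

The values at the very top are filled because of `partial = TRUE`, but that only shortens the window at the ends of the vector, not after a gap in the middle.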
My goal is for the beginning of each year to be filled with the rolling MAX values, even though there are NAs before the first discharge value of that year (similar to the results you provided in your last reply). Below is an example of what I am trying to get when there are NA values between years. Here I put the NAs in the middle of the two months just to illustrate what I mean, and I am only showing the "4_d" and "8_d" output.
df_wanted_output <- data.frame(
date = c('1996-03-01','1996-03-02','1996-03-03','1996-03-04','1996-03-05','1996-03-06','1996-03-07','1996-03-08','1996-03-09','1996-03-10','1996-03-11','1996-03-12','1996-03-13','1996-03-14','1996-03-15','1996-03-16','1996-03-17','1996-03-18','1996-03-19','1996-03-20','1996-03-21','1996-03-22','1996-03-23','1996-03-24','1996-03-25','1996-03-26','1996-03-27','1996-03-28','1996-03-29','1996-03-30','1996-03-31','1996-04-01','1996-04-02','1996-04-03','1996-04-04','1996-04-05','1996-04-06','1996-04-07','1996-04-08','1996-04-09','1996-04-10','1996-04-11','1996-04-12','1996-04-13','1996-04-14','1996-04-15','1996-04-16','1996-04-17','1996-04-18','1996-04-19','1996-04-20','1996-04-21','1996-04-22','1996-04-23','1996-04-24','1996-04-25','1996-04-26','1996-04-27','1996-04-28','1996-04-29','1996-04-30'),
discharge = c(0.236,0.241,0.254,0.322,0.363,0.42,0.463,0.515,0.506,0.497,0.484,0.464,0.478,0.495,0.518,0.526,0.519,0.515,0.509,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,4.13,8.41,8.62,8.67,8.69,8.8,8.87,9.56,10.1,10.4,10.5,10.7,10.8,10.9,11,12.7,18.1,27.3,34.2,63.3),
"4_d" = c(0.236,0.241,0.254,0.322,0.363,0.42,0.463,0.515,0.515,0.515,0.515,0.506,0.497,0.495,0.518,0.526,0.526,0.526,0.526,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,4.13,8.41,8.62,8.67,8.69,8.8,8.87,9.56,10.1,10.4,10.5,10.7,10.8,10.9,11,12.7,18.1,27.3,34.2,63.3),
"8_d" = c(0.236,0.241,0.254,0.322,0.363,0.42,0.463,0.515,0.515,0.515,0.515,0.515,0.515,0.515,0.518,0.526,0.526,0.526,0.526,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,4.13,8.41,8.62,8.67,8.69,8.8,8.87,9.56,10.1,10.4,10.5,10.7,10.8,10.9,11,12.7,18.1,27.3,34.2,63.3),
check.names = FALSE  # keep "4_d" and "8_d" as column names instead of X4_d and X8_d
)
print(df_wanted_output)
Is there any way this can be accomplished?
Also, just so I can better understand what you did in your previous reply...
- The 'warn.conflicts = FALSE' when loading the package: does this allow the NAs to be ignored?
- Why does the width argument have an 'L' after the number?