Hi and welcome, @budugulo! Here's one approach using the relatively new slider package. I've found it very useful for creating moving averages and the like in a tidyverse-friendly way.
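You can use the .before and .after arguments of slide_dbl() to define the window to use as you iterate over the rows. Here, .before = 2 and .after = -1 say to look back 2 rows and exclude the current row. If you do this on a grouped data frame, it respects the groups and only looks back within each group: in this case, region, gender, and month. Since each group in the toy data holds one row per year, in year order, a missing value is replaced by the mean of the same region, gender, and month over the previous two years.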
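To make the window easier to picture, here is a minimal sketch on a plain vector (the vector `x` and the names `windows` and `prev_means` are made up for illustration). It uses slide() to show which elements land in each window, then slide_dbl() to average them.

```r
library(slider)

x <- c(10, 20, 30, 40, 50)

# Collect the elements in each window: up to two preceding values,
# with the current value excluded (.after = -1 ends the window one step back).
windows <- slide(x, ~ .x, .before = 2, .after = -1)

# Average each window. The first element has nothing before it,
# so its window is empty and mean() of an empty vector is NaN.
prev_means <- slide_dbl(x, mean, .before = 2, .after = -1)

print(windows)
print(prev_means)
```

The second element only has one value before it, so its "mean" is just that value; from the third element on, the window holds two values.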
library(dplyr, warn.conflicts = FALSE)
library(slider, warn.conflicts = FALSE)
library(lubridate, warn.conflicts = FALSE)
toy_data <- tibble(
  var_missing = rnorm(mean = 10, sd = 3, n = 360),
  region = rep(c("A", "B", "C"), each = 2, times = 60),
  gender = rep(c("Male", "Female"), times = 180),
  year = rep(2010:2014, each = 72),
  month = rep(1:12, each = 6, times = 5)
) %>%
  mutate(month = month(month, label = TRUE))
# set two values of var_missing to NA to simulate missing data
toy_data[355, 1] <- NA
toy_data[360, 1] <- NA
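toy_data %>%
  group_by(region, gender, month) %>%
  mutate(
    # where var_missing is NA, fill it with the mean of the previous
    # two rows in the same group; otherwise keep the observed value
    var_missing =
      if_else(
        is.na(var_missing),
        slider::slide_dbl(var_missing, mean, .before = 2, .after = -1),
        var_missing
      )
  ) %>%
  ungroup()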
#> # A tibble: 360 x 5
#> var_missing region gender year month
#> <dbl> <chr> <chr> <int> <ord>
#> 1 6.24 A Male 2010 Jan
#> 2 8.40 A Female 2010 Jan
#> 3 9.71 B Male 2010 Jan
#> 4 7.50 B Female 2010 Jan
#> 5 8.08 C Male 2010 Jan
#> 6 6.72 C Female 2010 Jan
#> 7 10.0 A Male 2010 Feb
#> 8 9.42 A Female 2010 Feb
#> 9 12.2 B Male 2010 Feb
#> 10 14.2 B Female 2010 Feb
#> # ... with 350 more rows
Created on 2020-06-23 by the reprex package (v0.3.0)
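One caveat: the windows are built from the original values, so if one of the two previous rows is itself NA, mean() returns NA and the gap stays unfilled. A possible variation, sketched here on a made-up vector, is to pass na.rm = TRUE through slide_dbl() to mean(). Whether averaging over a single remaining value is acceptable depends on your data, so treat this as an option rather than a recommendation.

```r
library(slider)

# Hypothetical series with two consecutive missing values
x <- c(4, 6, NA, NA, 10)

# Mean of the previous two values; NA if either of them is missing
plain <- slide_dbl(x, mean, .before = 2, .after = -1)

# Same window, but mean() drops NAs; extra arguments are forwarded to mean()
robust <- slide_dbl(x, mean, na.rm = TRUE, .before = 2, .after = -1)

# Fill only the positions that were missing in the original data
filled_plain <- ifelse(is.na(x), plain, x)
filled_robust <- ifelse(is.na(x), robust, x)

print(filled_plain)
print(filled_robust)
```

With na.rm = TRUE, the second gap is filled from the one observed value left in its window. If a window contains no observed values at all, the result is NaN, which is.na() also treats as missing.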