How to use recipes to impute unknown variables in the test set

I have the following reproducible example, which represents a common situation: some predictors are useful to train on but are not known when a forecast is generated (e.g., temperature).

The goal is to fill in these missing values with a rolling median imputation based on the training set.

Is it possible to modify the following example so that the NA values in the test set are filled in with rolling median values calculated from the training set?

library(lubridate)  # ymd(), days()
library(rsample)    # initial_time_split(), training(), testing()

set.seed(145)
example_data <-
  data.frame(
    day = ymd("2012-06-07") + days(1:12),
    x1 = round(runif(12), 2),
    x2 = round(runif(12), 2),
    x3 = round(runif(12), 2)
  )
d <- initial_time_split(example_data)
trn <- training(d)
tst <- testing(d)
tst$x2 <- NA_real_  # keep x2 numeric; a bare NA would turn the column logical

library(recipes)
seven_pt <- recipe(~ . , data = trn) %>%
  update_role(day, new_role = "time_index") %>%
  step_rollimpute(x2, window = 7) %>%
  prep(training = trn, retain = TRUE)

juice(seven_pt)
bake(seven_pt, new_data = tst)
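
To make the target concrete, here is a minimal base R sketch of one reading of the goal: every missing x2 in the test set is filled with the median of the last seven training observations. The manual 9/3 split (mirroring the default prop = 3/4 of initial_time_split()), the `window` and `fill_value` names, and the use of a single fill value are assumptions for illustration; this is not something a recipes step does for you.

```r
set.seed(145)
example_data <- data.frame(
  day = as.Date("2012-06-07") + 1:12,
  x1 = round(runif(12), 2),
  x2 = round(runif(12), 2),
  x3 = round(runif(12), 2)
)

# Time-ordered split by hand: first 9 rows for training, last 3 for testing
trn <- example_data[1:9, ]
tst <- example_data[10:12, ]
tst$x2 <- NA_real_  # x2 is unknown at forecast time

# Median of the most recent `window` training values (assumed window of 7)
window <- 7
fill_value <- median(tail(trn$x2, window))

# Fill every missing test value with that training-set median
tst$x2[is.na(tst$x2)] <- fill_value
tst
```

Because the fill value comes only from trn, nothing from the test period leaks into the imputation.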

How about step_knnimpute() instead?
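
A rough sketch of what that could look like, assuming a recent recipes release where step_knnimpute() has been renamed step_impute_knn() (on older versions, swap the old name back in) and that the gower package is installed. The `neighbors = 3` value and the manual 9/3 split are arbitrary choices for this small example. Note that this is not a rolling median: each missing x2 is filled from the training rows that are closest in x1 and x3.

```r
library(recipes)  # step_impute_knn() relies on the gower package

set.seed(145)
example_data <- data.frame(
  day = as.Date("2012-06-07") + 1:12,
  x1 = round(runif(12), 2),
  x2 = round(runif(12), 2),
  x3 = round(runif(12), 2)
)

# Time-ordered split by hand: first 9 rows for training, last 3 for testing
trn <- example_data[1:9, ]
tst <- example_data[10:12, ]
tst$x2 <- NA_real_  # x2 is unknown at forecast time

# The neighbours are learned from trn when the recipe is prepped
knn_rec <- recipe(~ ., data = trn) %>%
  update_role(day, new_role = "time_index") %>%
  step_impute_knn(x2, neighbors = 3) %>%
  prep(training = trn)

# Missing x2 values in tst are filled using the training rows
imputed <- bake(knn_rec, new_data = tst)
imputed
```

Since day has the time_index role, it is not used as a predictor, so only x1 and x3 determine which training rows count as neighbours.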