I have a model in which one of the predictors is the model's own previous prediction, e.g. target ~ lag(predicted target).
Using purrr::accumulate() I'm able to write a custom function to predict. Here is an example with some silly data and a silly model to illustrate:
### A model that uses a lag prediction as a predictor using purrr::accumulate() ###
library(dplyr)    # group_by(), mutate(), lag(), filter(), select()
library(purrr)    # accumulate()
library(ggplot2)  # diamonds data set

my_diamonds <- diamonds %>%
  group_by(cut) %>%
  mutate(cumprice = cumsum(price)) %>% # cumulative within groups
  mutate(lag_cumprice = lag(cumprice)) %>%
  mutate(InitialValue = min(cumprice)) %>%
  filter(!is.na(lag_cumprice)) %>%
  select(cut, cumprice, lag_cumprice, x, InitialValue)

silly_model <- glm(formula = cumprice ~ x + lag_cumprice, family = 'poisson', data = my_diamonds)
This model uses the previous prediction as input to the next prediction. I'm able to write a custom function that computes the prediction inside mutate():
# When predicting there is no lag_cumprice; the result of the previous prediction should be the input to the model instead:
accPrice <- function(mod, acc, cur) {
  db <- cur_data_all()  # data for the current group
  x <- db$x[cur]        # cur is the current row index; use it to get this iteration's value of x
  total_exponent <- mod$coefficients['(Intercept)'] +
    (mod$coefficients['x'] * x) +
    (mod$coefficients['lag_cumprice'] * acc)  # acc is the accumulated prediction for cumprice
  # Note: this is the linear predictor (log scale for a Poisson GLM); exp(total_exponent) would give the response scale
  total_exponent
}
# now predict
my_diamonds <- my_diamonds %>%
  mutate(predicted = accumulate(.x = row_number()[-1], .init = unique(InitialValue), .f = accPrice, mod = silly_model))
So far so good. In this example I used the previous prediction, acc, as an input.
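To make the roles of acc and cur concrete, here is a minimal sketch of the same pattern outside of mutate(). The coefficients b0, b_lag and b_x, the vector x and the starting value init are all made up for illustration; they are not taken from silly_model.

```r
library(purrr)

# Hypothetical coefficients, for illustration only (not from silly_model)
b0    <- 1
b_lag <- 0.5
b_x   <- 2

x    <- c(10, 20, 30)  # one predictor value per step
init <- 4              # starting value, playing the role of InitialValue

# acc is the result of the previous step, cur is the index of the current step
step <- function(acc, cur) b0 + b_lag * acc + b_x * x[cur]

# The result has length(x) + 1 elements because .init is kept as the first one
preds <- accumulate(seq_along(x), step, .init = init)
```

Each element of preds after the first is computed from the element before it, which is the behaviour accPrice() relies on.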
However, I then created a variation of the model that uses two lagged variables as predictors:
### now a model with lag on two variables not just one ###
my_diamonds2 <- diamonds %>%
  group_by(cut) %>%
  mutate(cumprice = cumsum(price)) %>% # cumulative within groups
  mutate(lag_cumprice = lag(cumprice)) %>%
  mutate(InitialValue = min(cumprice)) %>%
  mutate(rn = row_number()) %>%
  mutate(cumrn = cumsum(rn)) %>%
  mutate(lag_cumrn = lag(cumrn)) %>%
  filter(!is.na(lag_cumprice)) %>%
  select(cut, cumprice, lag_cumprice, lag_cumrn, x, InitialValue)

silly_model2 <- glm(formula = cumprice ~ x + lag_cumprice + lag_cumrn, family = 'poisson', data = my_diamonds2)
### Stuck after here ###
How can I modify the function accPrice() above so that it accumulates two variables, both lag_cumprice and lag_cumrn, rather than just lag_cumprice as before?
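One possible direction, sketched here under assumptions rather than verified against silly_model2: let the accumulated value be a small named vector holding both running quantities, and pull out the piece of interest afterwards. The coefficients in coefs, the vector x, the starting values and the update rule for the second quantity are all hypothetical placeholders.

```r
library(purrr)

# Hypothetical coefficients standing in for coef(silly_model2)
coefs <- c("(Intercept)" = 0.1, x = 0.2, lag_cumprice = 0.5, lag_cumrn = 0.01)
x <- c(1, 2, 3)

# The accumulated state is a named vector carrying both running values
step2 <- function(acc, cur) {
  eta <- coefs[["(Intercept)"]] +
    coefs[["x"]] * x[cur] +
    coefs[["lag_cumprice"]] * acc[["price"]] +
    coefs[["lag_cumrn"]] * acc[["rn"]]
  # Placeholder update for the second quantity; replace with the real rule
  c(price = eta, rn = acc[["rn"]] + cur)
}

# Each state has length 2, so accumulate() returns a list of states
states <- accumulate(seq_along(x), step2, .init = c(price = 4, rn = 1))

# Extract only the price component from each state
predicted <- map_dbl(states, function(s) s[["price"]])
```

Inside mutate(), the same idea would presumably require the extraction step to be wrapped around the accumulate() call so that a plain numeric column comes back.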