Hierarchical Time Series Forecasts Suggestions for Improvements

Hello everyone, this is my first post. I've been using R for several years and have been using it to forecast the enrollment funnel for a school's programs. It's relatively accurate, however I still thought it would be good to ask if someone here has done something similar or has any suggestions for making it more accurate that I might not have thought of.

The basis of any marketing funnel is that not everyone moves from one stage of the funnel to the next. For a school we look at it in terms of applications. So our marketing funnel looks like this:

Started Applications >= Completed Applications >= Admits >= Deposits >= Enrollments

I approached this problem with a hierarchical time series model, with each data point of the time series being a snapshot of the funnel on the first of the month. Because it's a hierarchical time series, the monthly observations are from the bottom level of the time series where they are broken down by enrollments, deposits but not enrolleds, admits but not deposited, etc. and the sum of each row is the total number of applications at all stages of the funnel.

Here is how one year of data looks:

Enrollments Deposits Not enrolled Non-deposits Enrolled Admits Not deposited Non adm1 Non adm2 Non adm3 Applied Not admitted Non app1 Non app2 Non app3 Non app4 Non app5 Non app6 Non app7 Started app Not applied
0 16 0 28 0 0 0 31 0 0 0 0 0 0 0 96
0 18 0 26 0 0 0 44 0 0 0 0 0 0 0 254
0 23 0 38 0 0 0 95 0 0 0 0 0 0 0 549
0 27 0 53 0 0 0 146 0 0 0 0 0 0 0 857
0 28 0 173 0 0 0 296 0 0 0 0 0 0 0 987
0 51 0 361 0 0 0 353 0 0 0 0 0 0 0 1036
16 111 0 529 0 0 0 376 0 0 0 0 0 0 0 938
116 128 0 512 0 0 0 388 0 0 0 0 0 0 0 941
247 89 0 491 0 0 0 377 0 0 0 0 0 0 0 987
295 55 0 492 0 0 0 387 0 0 0 0 0 0 0 1043
317 56 0 499 0 0 0 385 0 0 0 0 0 0 0 1059
369 37 0 496 0 0 0 379 0 0 0 0 0 0 0 1053

Here's the code I use to read in the data and make a forecast:

CASts <- ts(CAS_Time_Series, start = c(2014, 1), end = c(2019, 8), frequency = 12)
nodes <- list(2, rep(2, 2), rep(2, 4), rep(2, 8))
y <- hts(CASts, nodes = nodes)
allf <- forecast(y, h = 4, FUN = function(x) ets(x))
fcdf <- as.data.frame(allts(allf))
write.table(fcdf, file="MAY_forecastCAS_ets.csv", sep=",", row.names=FALSE)

I then have a function that repeats the process for the following prediction methods

  1. hw
  2. stlf
  3. auto.arima
  4. nnetar

The final forecast is an average of the five methods.

My question is this: is there a better, more accurate, or more elegant way to do this? I'm looking to make sure that I'm making the most accurate time series forecast that I can produce. I appreciate all feedback!

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.