Forecasting panel data

Good day!

I am trying to forecast my dependent variable 9 periods ahead from 25 years of history. I have panel data with 34 countries and 25 years for each country, 850 observations in total. I am struggling to make predictions from my models (pooled OLS and time-specific fixed effects) because each country has its own historical values of the dependent variable. I want to use the results of my models to make those predictions 9 periods ahead.

The model: y = x1 + x2 + x3

pooled <- plm(log(Depvar+1)~lag(x1,8)+lag(x2,8)+lag(x3,8), index=c("country","year"), data=RMB2, model="pooling")

The code for forecasting (I tried numerous combinations, but the best result so far is the same predicted value for all 9 periods ahead):

fc <- forecast(pooled[["model"]][, 1], h = 9)

Probably I need to introduce a training set, but I am not sure how.
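
A hold-out split is one way to introduce a training set: fit on the earlier years and score the forecasts on the last few. The sketch below runs on simulated data (the original data was not shared), builds the 8-year lags by hand, and uses `lm()` as a stand-in for `plm(..., model = "pooling")`, since both give the same pooled OLS coefficients. The cutoff year, the helper `lag_by()`, and all column names are assumptions for illustration.

```r
set.seed(1)

# Simulated panel: 34 countries x 25 years (stand-in for the real data)
panel <- expand.grid(year = 1996:2020, country = paste0("C", 1:34),
                     stringsAsFactors = FALSE)
panel$x1 <- rnorm(nrow(panel))
panel$x2 <- rnorm(nrow(panel))
panel$x3 <- rnorm(nrow(panel))

# Hypothetical helper: lag a variable by k periods within each country.
# Assumes rows are sorted by year within country (true for expand.grid above).
lag_by <- function(x, group, k) {
  ave(x, group, FUN = function(v) c(rep(NA, k), head(v, -k)))
}
panel$x1_l8 <- lag_by(panel$x1, panel$country, 8)
panel$x2_l8 <- lag_by(panel$x2, panel$country, 8)
panel$x3_l8 <- lag_by(panel$x3, panel$country, 8)

# Simulated response driven by the lagged regressors
panel$y <- 1 + 0.5 * panel$x1_l8 - 0.3 * panel$x2_l8 + 0.2 * panel$x3_l8 +
  rnorm(nrow(panel), sd = 0.5)

# Training set: years up to 2015; test set: the last 5 years
train <- subset(panel, year <= 2015)
test  <- subset(panel, year >  2015)

# Pooled OLS on the training years (rows with missing lags are dropped)
fit  <- lm(y ~ x1_l8 + x2_l8 + x3_l8, data = train)
pred <- predict(fit, newdata = test)

# Out-of-sample accuracy on the held-out years
rmse <- sqrt(mean((test$y - pred)^2))
rmse
```

The same split can be repeated with different cutoff years to see how stable the error is.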

Hope someone may help me.


Referred here by Forecasting: Principles and Practice, by Rob J Hyndman and George Athanasopoulos

Hi @Rostislav, can you share the data you are using? It will make it easier for people to help out.

It is basically a simple panel dataset, nothing special: one dependent variable and three independent ones. I just need the code to correctly build a forecast 9 periods ahead from a plm model.
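
Since the data was not shared, here is only a rough sketch on simulated data of how such a forecast could be built. Because every regressor enters with lag 8, the x values already observed in the last 8 years are the predictors for the next 8 years, so those periods can be forecast without any future inputs. The ninth period would need x for the first future year, which has to be forecast or assumed separately. `lm()` stands in for a pooling `plm` model here (same coefficients), and all names are made up for illustration.

```r
set.seed(42)

# Simulated panel: 34 countries, 1996-2020
panel <- expand.grid(year = 1996:2020, country = paste0("C", 1:34),
                     stringsAsFactors = FALSE)
panel$x1 <- rnorm(nrow(panel))
panel$x2 <- rnorm(nrow(panel))
panel$x3 <- rnorm(nrow(panel))

# Shift the regressors 8 years forward: x observed in year t
# is the predictor for year t + 8 (so rows run from 2004 to 2028).
design <- data.frame(country = panel$country,
                     year    = panel$year + 8,
                     x1_l8   = panel$x1,
                     x2_l8   = panel$x2,
                     x3_l8   = panel$x3)

# Simulated outcome; unknown for the future years
design$y <- 1 + 0.5 * design$x1_l8 - 0.3 * design$x2_l8 +
  0.2 * design$x3_l8 + rnorm(nrow(design), sd = 0.5)
design$y[design$year > 2020] <- NA

history <- subset(design, year <= 2020)  # rows with observed y
future  <- subset(design, year >  2020)  # 2021-2028, predictors already known

# Pooled OLS on the history, then predict each country's future rows
fit <- lm(y ~ x1_l8 + x2_l8 + x3_l8, data = history)
future$y_hat <- predict(fit, newdata = future)

head(future[, c("country", "year", "y_hat")])
```

If the response was modelled as `log(Depvar + 1)`, the predictions are on that scale, and `exp(y_hat) - 1` maps them back (ignoring retransformation bias).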

I understand it might be simple to you, but for others (including myself) it is a tremendous help if you provide a full reproducible example.

Hello Rostislav, Hello Nate,

I'm new here, nice to meet you.

I am currently working on a forecast and have exactly the same question. I am an absolute beginner in R and am currently trying my luck with panel data analysis. Nice to see that I am on the right track. Unfortunately my data is strictly confidential, otherwise I would be happy to provide it. I hope someone can help us.

BR, Jim

Good day, Jim.

I actually have the same problem - I am not sure if I can share my data due to our company's compliance rules. But I guess the general mechanics should be the same for plm models, where each unit has its own unique values over time. By the way, all four variables are stationary in my case according to the Dickey–Fuller test.

BR, Rostislav.

Hello,
I have anonymized a part of my data set and can now share this file with you. It is economic data for 147 countries for the period 2002-2018, with one dependent variable (Y) and several independent variables (X).
The goal is a forecast of the variable Y for the years 2019 to 2025.
Am I correct in assuming that a panel data analysis with the plm function in R is the right way to do this? I would apply all models (pooled, fixed & random) and finally check which method is the most suitable using the B-P LM test and Hausman test.
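
For reference, the comparison described above could look roughly like this. It is only a sketch on the `Grunfeld` example data that ships with `plm`, not on the shared file, so the variable names (`inv`, `value`, `capital`, `firm`) are placeholders for Y, the X variables, and the country index.

```r
library(plm)

# Example panel shipped with plm: 10 firms, 1935-1954
data("Grunfeld", package = "plm")

form <- inv ~ value + capital
idx  <- c("firm", "year")

# The three candidate models
pooled <- plm(form, data = Grunfeld, index = idx, model = "pooling")
fe     <- plm(form, data = Grunfeld, index = idx, model = "within")
re     <- plm(form, data = Grunfeld, index = idx, model = "random")

# Breusch-Pagan LM test: pooled OLS vs. random effects
bp_test <- plmtest(pooled, type = "bp")

# F test: pooled OLS vs. fixed effects
f_test <- pFtest(fe, pooled)

# Hausman test: fixed vs. random effects
hausman <- phtest(fe, re)

bp_test
f_test
hausman
```

A small p-value in the LM or F test speaks against plain pooling, and a small p-value in the Hausman test speaks for fixed over random effects.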

I still have some questions:

  1. Are there (as with multiple linear regression) certain requirements that my data must fulfill before I can perform a panel data analysis? Unfortunately, I have not found any information on this.
  2. In the end I get a regression equation after all, right? To forecast the variable Y for the years 2019-2025, I then need the X values for 2019-2025. Once I have them, I simply put them into the regression equation (Y = ax1+bx2... etc.) and can thus predict Y? Or is there any other way to do this?
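
To make point 2 concrete, here is a sketch of plugging future X values into the fitted equation by hand. It assumes a fixed effects ("within") model, where the equation for a unit is its own intercept plus the common slopes times X. It again uses the `Grunfeld` example data, and the future values for firm 1 are invented purely for illustration.

```r
library(plm)

data("Grunfeld", package = "plm")

fe <- plm(inv ~ value + capital, data = Grunfeld,
          index = c("firm", "year"), model = "within")

beta  <- coef(fe)   # common slopes (the within model has no overall intercept)
alpha <- fixef(fe)  # one estimated intercept per firm, named by firm id

# Hypothetical future regressor values for firm 1 (made up for illustration)
future_x <- data.frame(firm    = c(1, 1),
                       year    = c(1955, 1956),
                       value   = c(5600, 5800),
                       capital = c(2300, 2400))

# y_hat = alpha_i + beta_value * value + beta_capital * capital
future_x$inv_hat <-
  as.numeric(alpha[as.character(future_x$firm)]) +
  as.numeric(as.matrix(future_x[, names(beta)]) %*% beta)

future_x
```

For a pooled model the same idea applies with a single common intercept instead of `alpha`. Either way, the future X values have to come from somewhere: lagged regressors, separate forecasts, or scenarios.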

I hope with my data we can find a solution together.

Thanks a lot for your help!!
Best regards,
Jim

Panel_data: Panel_Data

My case is quite similar, though I have lagged variables and can use them for predictions. But there should be a method, something like exponential smoothing, that does not require future inputs and works with several independent variables, rather than relying only on autocorrelation with the lagged dependent variable.
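
For comparison, this is roughly what plain exponential smoothing per country would look like. Note the limitation: it uses only each country's own history of y and ignores the independent variables entirely, so it is a baseline rather than the multi-variable method asked about. The sketch uses simulated data and base R's `HoltWinters()` with `gamma = FALSE` (Holt's linear trend method); the helper `forecast_one()` and all column names are assumptions.

```r
set.seed(7)

# Simulated panel: 5 countries, 1996-2020, each with its own level and a trend
years     <- 1996:2020
countries <- paste0("C", 1:5)
panel <- expand.grid(year = years, country = countries,
                     stringsAsFactors = FALSE)
level   <- rep(seq(10, 50, length.out = 5), each = length(years))
panel$y <- level + 0.5 * (panel$year - 1996) + rnorm(nrow(panel))

# Hypothetical helper: Holt's exponential smoothing for one country's series
forecast_one <- function(y, start_year, h = 9) {
  series <- ts(y, start = start_year)
  fit <- HoltWinters(series, gamma = FALSE)  # level + trend, no seasonality
  as.numeric(predict(fit, n.ahead = h))
}

# Fit and forecast 9 periods ahead, one country at a time
fc_list <- lapply(split(panel, panel$country), function(d) {
  d <- d[order(d$year), ]
  data.frame(country = d$country[1],
             year    = max(d$year) + 1:9,
             y_hat   = forecast_one(d$y, min(d$year)))
})
fc <- do.call(rbind, fc_list)

head(fc)
```

Comparing these univariate forecasts with the regression-based ones on a hold-out period is one way to check whether the independent variables actually add forecasting value.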