When I perform the Ljung-Box test with the checkresiduals() function from the forecast package, I do not understand why we would want to set the "lag" option. Suppose I have fitted three simple AR models to the same time series:
- AR(1)
- AR(2)
- AR(3)
Then for the first model I can use
checkresiduals(AR1$residuals, lag = 10, test = "LB")
I can also use
checkresiduals(AR1$residuals, lag = 30, test = "LB")
Depending on the number I choose for the "lag" argument, the test statistic and the p-value will differ. But WHY wouldn't I want to include the entire time series when performing the test? That is, what is the point of setting lag to a number lower than 50 if I have 50 observations?
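To make the dependence on lag concrete, here is a minimal sketch of what I mean. The data are simulated and made up purely for illustration (the series y, the seed, and the AR coefficient 0.5 are my own assumptions). I use Box.test() from base R's stats package here instead of checkresiduals(), so that the example runs without the forecast package; as far as I understand it computes the same kind of Ljung-Box statistic, but I have not checked that the two functions agree on the degrees of freedom.

```r
set.seed(1)

# Simulated series of 50 observations (made-up data, only for illustration)
y <- arima.sim(model = list(ar = 0.5), n = 50)

# Fit an AR(1) model with base R's arima()
AR1 <- arima(y, order = c(1, 0, 0))

# Ljung-Box test on the residuals with two different choices of lag
lb10 <- Box.test(AR1$residuals, lag = 10, type = "Ljung-Box")
lb30 <- Box.test(AR1$residuals, lag = 30, type = "Ljung-Box")

# Compare the test statistics and p-values side by side
print(c(lag10 = unname(lb10$statistic), lag30 = unname(lb30$statistic)))
print(c(lag10 = lb10$p.value, lag30 = lb30$p.value))
```

The two calls use exactly the same residuals, yet they report different statistics and p-values, which is the behaviour I am asking about.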
Also, how should I think about choosing this "lag" argument for my different models? Do I choose the same number for the AR(1) model as for the AR(3) model? Or should I choose the same number of lags as I have in the model, so that for an AR(3) model I pick lag = 3?
I have a hard time finding an answer to this. I have looked in the online book by Hyndman and Athanasopoulos, Forecasting: Principles and Practice (3rd ed), section 5.4 "Residual diagnostics", but it does not answer my question. They only set lag=10 in all examples.
Any explanation appreciated!