dlm package - Kalman filter with mean reversion

yaji · January 5, 2021, 11:49pm

I like to estimate the model below with dlm package.

The unknown parameters are a , b and variances of ε, e. These have to be estimated by Maximum likelihood method.
The second equation can be rewritten as the third equation.
I am not sure how I should apply dlm package for these equations as dlmModReg function seems not to be appropriate.

If someone can help me out, that would be very appreciated.

gueyenono · January 6, 2021, 4:45am

Hi @yaji and welcome to the RStudio Community

I think I may be able to help; however, I have three questions for you:

\alpha_t is a dynamic intercept (or dynamic level), but your system does not model it. Or do you just assume the standard model: \alpha_t = \alpha_{t-1} + v_t (where v_t is an i.i.d. error term with constant variance)?
Do you have more information about the parameters a and b?
Finally, do you have a time series (data) that you would like to fit this dynamic linear model to? If you do, you may want to share it so that the community will be able to use it to fit your model.

yaji · January 6, 2021, 10:21am

Thank you very much for replying.

This analysis is to estimate optimal hedge ratio for Australian wheat futures against US wheat futures. Therefore the main interest is in beta(optimal hedge ratio) rather than alpha. This is the reason I forgot to specify a model for alpha. I am happy to use your iid model for alpha.
Because b is the long term mean of optimal hedge ratio, it would be between 0 and 1. This is all I have now.
Absolutely. However, I do not know how to upload csv file on this chat.

gueyenono · January 6, 2021, 4:08pm

@yaji You could upload it to any cloud platform of your choice and then share a link here (e.g. Google Drive, Onedrive, ...)

yaji · January 6, 2021, 9:47pm

Hi @gueyenono,

Thanks for your instruction.

Please see below link.
https://1drv.ms/u/s!AmCJKMB6FP7JgakHyo9RhGTmWvLWaQ?e=GvdRor

The data contains Australian wheat futures(ASX) and US wheat futures(HRW) price series both in USD.
y_t is the diffirence in price of ASX between time t and t-1.
x_t is the difference in price of HRW between time t and t-1.

gueyenono · January 7, 2021, 1:51am

@yaji It seems like the file is corrupted. I am not able to open it
I suggest you convert your file into a csv file. Also, just send me a download link so that I can directly download it rather than access the file online.

yaji · January 7, 2021, 1:57am

@gueyenono Sorry for that. Does this link work for you?

https://1drv.ms/u/s!AmCJKMB6FP7JgakITvPepc1bDOUSLQ?e=lnm4Vv

gueyenono · January 7, 2021, 2:00am

Yes it does! @yaji I'll look into the question for you and get back to you.

yaji · January 7, 2021, 2:19am

Thanks. I really appreciate your help!

gueyenono · January 7, 2021, 4:09am

I wasn't really getting why you called b the long-term mean of \beta_t, the optimal hedge ratio. I figured it out. \beta_t is an AR(1) process and assuming that it is stationary (i.e. 0 < a <1), its long-term mean is computed as:

\beta_t = ab + (1-a)\beta_{t-1} + e_t \\ E(\beta_t) = \frac{ab}{1 - (1-a)} \\ E(\beta_t) = \frac{ab}{a} \\ E(\beta_t) = b

where ab is the constant and (1-a) is the coefficient of the AR(1) process. So this is MY take on how to tackle your problem. Let me first start with the theoretical side of things. Here is the general matrix representation of dynamic linear models:

y_t = F_t \theta_t + \varepsilon_t \\ \theta_t = G \theta_{t-1} + u_t

where \varepsilon_t and u_t are normally distributed mean zero processes of constant variance. The system under consideration here is a dynamic linear model with a dynamic level (\alpha_t) and a dynamic slope (\beta_t):

y_t = \alpha_t + \beta_t x_t + \epsilon_t \\ \alpha_t = \alpha_{t-1} + v_t \\ \beta_t = ab + (1-a) \beta_{t-1} + w_t

where \epsilon_t, v_t and w_t are normally distributed mean zero processes of constant variance. This system cannot be directly put into the DLM matrix form without a bit of transformation. This is because of the constant term ab in the equation for \beta_t. But it is possible to get rid of it by demeaning \beta_t.

We define \tilde{\beta}_t = \beta_t - b, so the system can be rewritten:

y_t = \alpha_t + \tilde{\beta}_t x_t + \tilde{\epsilon}_t \\ \alpha_t = \alpha_{t-1} + v_t \\ \tilde{\beta}_t = (1-a) \tilde{\beta}_{t-1} + \tilde{w}_t

Note that the constant ab is no longer present. Given our new system, we have:

\theta_t = [\alpha_t \ \ \tilde{\beta}_t]' \\ F_t = [1 \ \ x_t]' \\ G = \begin{bmatrix} 1 & 0 \\ 0 & 1-a \end{bmatrix}

In traditional models such as this one, matrix G is slightly different:

G = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}

So we will have to manually modify the G matrix in the code.

Now that the system has been identified, it is time to do some coding. You mentioned earlier that the data you uploaded was in first differences (y_t - y_{t-1}). I'm having a hard time to accept that because the average values for each of the two price series are: \$217.39 for ASX and \$166.45 for HRW. Whether the currency is US dollars or Australian dollars, it is hard to believe that these prices change by over $100 daily! So I am going to assume that your data is the actual prices, not the first differences.

First, the HRW column has some NA values, which are problematic for the regression model we want to fit. So I am going to impute them with the mice package.

# Load packages
library(dlm)
library(readxl)
library(mice)

# Import the dataset
dat <- read_xlsx(here::here("kalman_filter/Futures202011.xlsx"))

# Impute the NA values
dat[, -1] <- complete(mice(data = dat[, -1]))

# Function for estimating the maximum likelihood parameters
build_model <- function(u){
  dlmModReg(X = dat$HRW_c1, dV = exp(u[1]), dW = exp(u[2:3]))
}

mod_mle <- dlmMLE(y = dat$ASX_c1_us, parm = rep(0, 3), build = build_model) # Actual estimation of maximum likelihood parameters

# Build dynamic linear model with MLE parameters
mod <- build_model(u = mod_mle$par)

# Currently matrix G is an identity matrix of size 2...
mod$GG

     [,1] [,2]
[1,]    1  0.0
[2,]    0  1

# So we edit it to suit our needs, let's assume a = 0.2 so that 1 - a = 0.8
mod$GG[2, 2] <- 0.8
mod$GG

     [,1] [,2]
[1,]    1  0.0
[2,]    0  0.8

# Apply the Kalman filter
mod_filtered <- dlmFilter(y = dat$ASX_c1_us, mod)

The estimated level and slope coefficients can be obtained with:

mod_filtered$m

The first column is for the level and the second column is for the slope. This model does a pretty good job fitting the data. In the following visualization, I plot the actual series against the estimated levels (in red):

plot(dat$ASX_c1_us, type = "l")
lines(mod_filtered$m[-(1:2), 1], col = "red")

yaji · January 7, 2021, 4:46am

@gueyenono
Thank you very much. You fully understand my strugle which is the constant ab in the equation of β. Your interpretation of long term mean and the price series is also correct. Sorry for my lack of explanation.

Could I please ask some questions to your answer?

In your transformation of β, can I understand b is absorbed in α in the equation of y?
What we would like to know is the value of β(theoritically this is the optimal hedge ratio). Because we get rid of b in the transformation, we do not know β. We only know β^hat. Do you know the way we somehow estimate b as well?
I did not realize we could modify GG like that so this is very helpful. Is it possible to estimate a by Maximum likelihood method like the one you did for variances?

gueyenono · January 7, 2021, 6:25am

@yaji

You're very welcome.

The answer is no. It can be shown that if a stationary AR(1) process is of the form:

y_t = \alpha + \beta y_{t-1} + \varepsilon_t

Then the mean is:

\mu = E(y_t) = \frac{\alpha}{1 - \beta}

If you demean the series, then the constant disappears:

y_t - \mu = \beta( y_{t-1} - \mu) + \varepsilon_t

So the constant ab is not absorbed. If I were to fully write the new system explicitly, it would be:

y_t = \alpha_t + (\beta_t - b) x_t + \epsilon_t \\ \alpha_t = \alpha_{t-1} + v_t \\ \beta_t - b = (1-a)(\beta_{t-1} - b) + w_t

I'll look into this and get back to you whenever possible.
The main difference between classical regressions and dynamic linear models is that in classical regressions, you seek to estimate the system parameters (i.e. coefficients); however, in DLMs we're are more interested in the hyperparameters (i.e. the observation and state variances). Based on this fact, I really do not think that a model parameter such as a can be estimated using the dlmMLE() function. You may want to estimate the coefficient 1-a and then derive the value of a as a first step using a classical method. Then you can plug it into the G matrix in the DLM. If you want to use maximum likelihood, check the answer of this question: https://stackoverflow.com/questions/18685629/manual-maximum-likelihood-estimation-of-an-ar-model-in-r

yaji · January 7, 2021, 6:58am

@gueyenono
Thanks a lot.

I completely understand 1. and 3. thanks to you.
You have already been a great help but if you could provide me with something in relatino to 2., that would be appreciated.

gueyenono · January 7, 2021, 7:55am

@yaji

The only thing I can think of now regarding point number 2 is that you can keep the same estimation process in R with a slightly different narrative regarding the model for \beta_t. We can just assume that the data generating process for \beta_t has no ab constant term by adding it to the error term:

\beta_t = (1-a)\beta_{t-1} + z_t

where z_t = ab + w_t. Again, the estimation in R remains the same, but the narrative is different. This way the estimated slope is indeed \beta_t and not the demeaned version. Also, here is how it converges to this value in the long-run:

plot(mod_filtered$m[, 2], type = "l", col = "blue")

Rplot2
By the way, I did not mention it earlier (you may already know it), but the optimal hedge ratio according to this model is:

mod_filtered$m[nrow(mod_filtered$m), 2]

[1] 0.05429177

5.43\%

yaji · January 7, 2021, 8:28am

Hi @gueyenono, thank you for your reply again and again.

I think the system you gave me last time is the best. We cannot identify optimal hedge ratio itself (=β) but we get to know how β moves in time from β - b.

As you may be aware, your last model is hard to interprete as the equation can be rewritten:

This equation does not look like mean reversion model.

Anyway, thank you so much for your great help!

gueyenono · January 7, 2021, 11:57pm

@yaji

I am not sure why you think that the last model is hard to interpret. Also, it definitely exhibits a mean-reverting process so long as 0 < a < 1. This quick simulation here can show it to you:

n <- 2000
z <- rnorm(n)
a <- 0.5
b <- 0.8

y <- vector(length = n)
y[1] <- 0

for(t in 2:n){
  y[t] <- (1-a)*y[t-1] + a*b + z[i]
}

z2 <- z[-(1:(n/2))]

plot(z2, type = "l")

Rplot2

Finally, I am glad I was able to help you. Consider marking my answer as the solution to the question for people who may look for help in the future.

system · January 14, 2021, 11:57pm

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.