Correlation between two time series to fill in NAs

Hi everyone,

I am stuck on the analysis of my data. I am working with an 18-year time series of daily temperature measurements (4 measurements per day, at 00, 06, 12, and 18 UTC). The series has several missing values.

I also have a time series of the same variable from a nearby point, covering the same period and the same measurement times.

For this reason, I would like to compute the correlation between them to see how similar they are, and whether it is viable to use the second one to fill in the missing data in the first (which is my series of interest).

Is this logic correct? And if so, what kind of correlation should I do and how should I perform it?
Any conceptual contribution will help me to better understand my problem.

Thank you very much!

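One simple possibility is a linear regression: fit your series of interest (x1) on the nearby series (x2) using the time steps where both are observed, then predict x1 from x2 wherever x1 is missing.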

library(tidyverse)

# define data set with some missing values in x1
data <- tibble(id = 1:100,
               x1 = runif(100),
               x2 = x1 + 10 + runif(100)*0.2) %>%
  mutate(x1 = if_else(0 == id %% 7, NA_real_, x1))
         
# fit a model
model <- lm(x1 ~ x2, data = data)
print(summary(model))  
#> 
#> Call:
#> lm(formula = x1 ~ x2, data = data)
#> 
#> Residuals:
#>       Min        1Q    Median        3Q       Max 
#> -0.111235 -0.044490  0.003034  0.046862  0.100762 
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)    
#> (Intercept) -9.53580    0.23622  -40.37   <2e-16 ***
#> x2           0.94763    0.02227   42.55   <2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 0.05603 on 84 degrees of freedom
#>   (14 observations deleted due to missingness)
#> Multiple R-squared:  0.9557, Adjusted R-squared:  0.9551 
#> F-statistic:  1811 on 1 and 84 DF,  p-value: < 2.2e-16

# impute for missing values of x1
data <- data %>%
  mutate(
    imputed = is.na(x1),
    x1_imputed = if_else(imputed, predict(model, data), x1)
  )

# plot
ggplot(data) +
  aes(x2, x1_imputed, color = imputed) +
  geom_point()

Created on 2021-05-31 by the reprex package (v1.0.0)
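
To address the correlation part of the question directly, here is a minimal sketch in base R. It computes the Pearson correlation over the time steps where both series are observed, then hides some of the known values and predicts them from the nearby series to see how large the imputation error might be. The vectors `target` and `near` are simulated stand-ins (an assumption, not your real data), and the 20% hold-out fraction is an arbitrary choice.

```r
# Simulated stand-ins for the two stations (hypothetical data, not the real series)
set.seed(1)
n <- 400
hour <- rep(c(0, 6, 12, 18), length.out = n)
near <- 15 + 5 * sin(2 * pi * hour / 24) + rnorm(n)   # nearby station
target <- 1.5 + 0.9 * near + rnorm(n, sd = 0.5)       # series of interest
target[sample(n, 40)] <- NA                           # knock out some values

# Pearson correlation using only the time steps where both series are observed
r <- cor(target, near, use = "complete.obs")

# Hold-out check: hide 20% of the observed values, predict them, measure the error
obs <- which(!is.na(target))
test_idx <- sample(obs, size = round(0.2 * length(obs)))
train <- data.frame(target = target, near = near)[setdiff(obs, test_idx), ]
fit <- lm(target ~ near, data = train)
pred <- predict(fit, newdata = data.frame(near = near[test_idx]))
rmse <- sqrt(mean((pred - target[test_idx])^2))

print(c(correlation = r, holdout_rmse = rmse))
```

A high correlation together with a hold-out RMSE that is small relative to the precision you need would suggest the nearby series is a reasonable basis for filling the gaps. With real temperature data it may also be worth repeating the check separately for each measurement time, since the relationship between the two points could differ between day and night.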
