Pearson Correlation between years on climate data

Hello everyone,

I need some hints on how to run a Pearson correlation (COR) test between years.

I want to find analog years using a climate index, so I want to correlate the last six months (Sep 2019 to March 2020) with the same period in each previous year from 1950 to 2019 and choose the years with the highest correlation. What would be the best way to do this?

The data is this: https://tropical.colostate.edu/colorado-state-university-amo/

I have a big Excel sheet that does the calculation, but I'm looking for a new, faster way using R.

Thanks a lot

If you can get the individual sequences of six months into a list, you can use one of the map functions from the purrr package to iterate along the list and calculate the correlations. In the example below, I invented some data to illustrate the idea.

Data2020 <- 1:6
set.seed(1)
#invent some data
ListOfYears <- list(Data2017 = 1:6 + rnorm(6), 
                    Data2018 = 1:6 + rnorm(6, 0, 2), 
                    Data2019 = 1:6 + rnorm(6, 0, 4), 
                    Data2020 = 1:6)
GetR <- function(x) cor(Data2020, x)
Rvalues <- purrr::map_dbl(ListOfYears, GetR)
Rvalues
#>  Data2017  Data2018  Data2019  Data2020 
#> 0.9031604 0.8602935 0.7674164 1.0000000
sort(Rvalues)
#>  Data2019  Data2018  Data2017  Data2020 
#> 0.7674164 0.8602935 0.9031604 1.0000000

Created on 2020-04-21 by the reprex package (v0.3.0)
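
For the first step, getting the six-month sequences into a list, here is a rough sketch in base R. It assumes the index has already been read into a long data frame with year, month, and value columns; the data frame name amo, its column names, and the values themselves are all invented for illustration and are not the real AMO series. The window (September through February, six values) is also an assumption, so adjust win_months to match the period you actually want.

```r
# Invented monthly index values, 1950-01 to 2020-03 (NOT the real AMO data)
set.seed(42)
dates <- seq(as.Date("1950-01-01"), as.Date("2020-03-01"), by = "month")
amo <- data.frame(
  year  = as.integer(format(dates, "%Y")),
  month = as.integer(format(dates, "%m")),
  value = rnorm(length(dates))
)

# Assumed window: September through February (six values per window)
win_months <- c(9:12, 1:2)

# Label each row by the year in which its window ends, so that
# Sep-Dec 2019 and Jan-Feb 2020 share the label 2020
amo$season <- ifelse(amo$month >= 9, amo$year + 1L, amo$year)
in_win <- amo[amo$month %in% win_months, ]

# One numeric vector per window, values kept in chronological order
ListOfYears <- split(in_win$value, in_win$season)

# Drop incomplete windows (for example, the one starting before 1950)
ListOfYears <- ListOfYears[lengths(ListOfYears) == length(win_months)]

# Correlate the most recent window with every earlier one
target  <- ListOfYears[["2020"]]
earlier <- ListOfYears[names(ListOfYears) != "2020"]
Rvalues <- vapply(earlier, function(x) cor(target, x), numeric(1))

# Five candidate analog years with the highest correlation
head(sort(Rvalues, decreasing = TRUE), 5)
```

vapply is used here only to keep the sketch free of package dependencies; purrr::map_dbl would do the same job on the resulting list.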


Woww FJCC!!

It gave me a great idea of how to start working. I'm going to work on that. If I have a new or more specific question, I'll ask you again.

Thanks!!!
