Evaluate correlation between two values for multiple columns


Version: RStudio 2022.07.0+548 "Spotted Wakerobin" Release (34ea3031089fa4e38738a9256d6fa6d70629c822, 2022-07-06) for macOS Mozilla/5.0 (Macintosh; Intel Mac OS X 12_4_0) AppleWebKit/537.36 (KHTML, like Gecko) QtWebEngine/5.12.10 Chrome/69.0.3497.128 Safari/537.36

Hi,

I have a data set containing dates in Julian days and snow precipitation over different time periods (e.g. hebd_04_a_snow). My goal is to determine whether there is a relationship between a date and its corresponding snow precipitation value. I know how to get my R-squared value, my p-value, etc. with the summary() function:

regression <- lm (date_test$datesJulian ~ datesponte$hebd_04_a_snow)

summary(regression)

Call: lm(formula = date_test$datesJulian ~ datesponte$hebd_04_a_snow)

Residuals:
     Min       1Q   Median       3Q      Max 
-15.4346  -3.0864  -0.8205   1.1795  16.5654 

Coefficients:
                          Estimate Std. Error t value Pr(>|t|)    
(Intercept)               133.4346     1.7879  74.632   <2e-16 ***
datesponte$hebd_04_a_snow   0.7199     0.7663   0.939    0.355    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 6.005 on 29 degrees of freedom
Multiple R-squared:  0.02953,   Adjusted R-squared:  -0.003934 
F-statistic: 0.8824 on 1 and 29 DF,  p-value: 0.3553
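
The values printed by summary() can also be read from the summary object directly, which helps when the same check has to be repeated for many columns. A minimal sketch, assuming the two sample columns shared in this post sit in a single data frame (the names dat, fit, s, r_squared and slope_p are illustrative and not part of the original code):

```r
# Two columns copied from the dput() sample shared in this post.
dat <- data.frame(
  datesJulian = c(151, 132, 132, 132, 131, 132, 134, 137, 131, 135, 134,
                  130, 133, 140, 135, 132, 132, 137, 140, 138, 135, 137,
                  140, 131, 150, 133, 134, 140, 128, 118, 134),
  hebd_04_a_snow = c(3.14285714285714, 0.914285714285714, 2.02857142857143,
                     3.31428571428571, 3.31428571428571, 2.28571428571429,
                     0, 3.31428571428571, 0.914285714285714, 3.31428571428571,
                     2.02857142857143, 0.314285714285714, 3.31428571428571,
                     2.4, 2.2, 3.14285714285714, 3.14285714285714,
                     3.31428571428571, 3.31428571428571, 3.31428571428571,
                     2.4, 3.31428571428571, 0, 0, 0, 2.94285714285714,
                     0, 0, 0, 0, 0)
)

# Same model as above, written with `data =` so the formula uses column names.
fit <- lm(datesJulian ~ hebd_04_a_snow, data = dat)
s <- summary(fit)

# R-squared and the p-value of the slope, taken from the summary object.
r_squared <- s$r.squared
slope_p <- s$coefficients["hebd_04_a_snow", "Pr(>|t|)"]

print(c(r_squared = r_squared, p_value = slope_p))
```

If the posted sample is the data the model above was fitted on, these two numbers should agree with the Multiple R-squared and p-value lines of the printed summary.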

The thing is that I'll have more than 150 climate columns just like «hebd_04_a_snow», and for each one I need to determine whether there is a strong relationship or not.

In the following lines, I provide columns of climate values just like the ones I'm using:

structure(list(datesJulian = c(151L, 132L, 132L, 132L, 131L, 
132L, 134L, 137L, 131L, 135L, 134L, 130L, 133L, 140L, 135L, 132L, 
132L, 137L, 140L, 138L, 135L, 137L, 140L, 131L, 150L, 133L, 134L, 
140L, 128L, 118L, 134L), hebd_04_a_snow = c(3.14285714285714, 
0.914285714285714, 2.02857142857143, 3.31428571428571, 3.31428571428571, 
2.28571428571429, 0, 3.31428571428571, 0.914285714285714, 3.31428571428571, 
2.02857142857143, 0.314285714285714, 3.31428571428571, 2.4, 2.2, 
3.14285714285714, 3.14285714285714, 3.31428571428571, 3.31428571428571, 
3.31428571428571, 2.4, 3.31428571428571, 0, 0, 0, 2.94285714285714, 
0, 0, 0, 0, 0), hebd_05_a_snow = c(0, 0, 0, 0, 0, NA, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0), bihe_04_a_snow = c(1.81428571428571, 0.6, 1.9, 2.12857142857143, 
2.22857142857143, 2.34285714285714, 0, 2.12857142857143, 0.6, 
2.12857142857143, 1.9, 0.157142857142857, 2.12857142857143, 1.27142857142857, 
2, 1.81428571428571, 1.81428571428571, 2.12857142857143, 2.22857142857143, 
2.12857142857143, 1.27142857142857, 2.22857142857143, 0.0571428571428571, 
0, 0.0571428571428571, 1.47142857142857, 0.0571428571428571, 
0, 0, 0, 0), bihe_05_a_snow = c(0, 0, 0, 0.214285714285714, 0, 
NA, 0, 0.214285714285714, 0, 0.214285714285714, 0, 0, 0.214285714285714, 
0, 0, 0, 0, 0.214285714285714, 0, 0.214285714285714, 0, 0, 0.228571428571429, 
0, 0.228571428571429, 0, 0.228571428571429, 0, 0, 0, 0), mens_4_snow = c(0.907142857142857, 
0.3, 0.95, 1.17142857142857, 1.22142857142857, 1.49285714285714, 
0.407142857142857, 1.17142857142857, 0.3, 1.17142857142857, 0.95, 
0.792857142857143, 1.17142857142857, 1.30714285714286, 1.07142857142857, 
0.907142857142857, 0.907142857142857, 1.17142857142857, 1.22142857142857, 
1.17142857142857, 1.30714285714286, 1.22142857142857, 0.371428571428571, 
0.407142857142857, 0.371428571428571, 0.735714285714286, 0.371428571428571, 
0.171428571428571, 0.171428571428571, 0.171428571428571, 0.171428571428571
)), row.names = c(NA, -31L), class = c("tbl_df", "tbl", "data.frame"
))

In summary, is there an efficient way to figure out which of these climate columns are correlated with my date values?
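
One possible approach is to fit the same simple regression once per climate column and collect the results in a table. The sketch below is only an illustration under stated assumptions: screen_columns() is a hypothetical helper (not an existing function), the demo data frame holds just the first 10 rows of four columns from the sample above, and the Benjamini-Hochberg adjustment is one choice among several for handling 150+ tests.

```r
# Hypothetical helper: regress `response` on every other column, one at a
# time, and return R-squared and the slope p-value for each column.
screen_columns <- function(dat, response = "datesJulian") {
  predictors <- setdiff(names(dat), response)
  rows <- lapply(predictors, function(p) {
    x <- dat[[p]]
    y <- dat[[response]]
    ok <- complete.cases(x, y)  # drop rows where either value is NA
    # A constant column (e.g. all zeros) has no slope to estimate.
    if (sum(ok) < 3 || length(unique(x[ok])) < 2) {
      return(data.frame(predictor = p, n = sum(ok),
                        r_squared = NA_real_, p_value = NA_real_))
    }
    s <- summary(lm(y[ok] ~ x[ok]))
    data.frame(predictor = p, n = sum(ok),
               r_squared = s$r.squared,
               p_value = s$coefficients[2, "Pr(>|t|)"])
  })
  out <- do.call(rbind, rows)
  # Many tests inflate false positives, so also report adjusted p-values.
  out$p_adjusted <- p.adjust(out$p_value, method = "BH")
  out[order(out$p_value), ]  # smallest p-values first, NA rows last
}

# First 10 rows of four columns from the dput() sample in this post.
demo <- data.frame(
  datesJulian = c(151, 132, 132, 132, 131, 132, 134, 137, 131, 135),
  hebd_04_a_snow = c(3.14285714285714, 0.914285714285714, 2.02857142857143,
                     3.31428571428571, 3.31428571428571, 2.28571428571429,
                     0, 3.31428571428571, 0.914285714285714, 3.31428571428571),
  hebd_05_a_snow = c(0, 0, 0, 0, 0, NA, 0, 0, 0, 0),
  mens_4_snow = c(0.907142857142857, 0.3, 0.95, 1.17142857142857,
                  1.22142857142857, 1.49285714285714, 0.407142857142857,
                  1.17142857142857, 0.3, 1.17142857142857)
)

result <- screen_columns(demo)
print(result)
```

Columns such as hebd_05_a_snow, which contain only zeros and a missing value, come back with NA because no slope can be estimated for them. On the full data set the call would be screen_columns(your_data), with your_data standing in for the real data frame.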
