# Coefficient of determination for each month

I have two datasets, as in the link.
I am trying to calculate the coefficient of determination (R^2) for each month, i.e. January to December, for Tmax.
How can I do this?
Thanks

``````
library(readr)
df <- read_csv("df.csv", col_names = TRUE)
``````
``````
# A tibble: 6 x 4
  datetime             Tmax month  year
  <dttm>              <dbl> <dbl> <dbl>
1 1976-09-24 15:00:00 20.3      9  1976
2 1976-10-06 09:15:00 15.2     10  1976
3 1976-11-27 16:00:00 17.8     11  1976
4 1976-12-06 15:00:00  2.54    12  1976
5 1977-01-09 20:45:00  7.62     1  1977
6 1977-02-24 00:30:00 20.3      2  1977
``````
``````
df1 <- read_csv("df1.csv", col_names = TRUE)
``````
``````
# A tibble: 6 x 4
  DateTime             Tmax month  year
  <dttm>              <dbl> <dbl> <dbl>
1 1976-09-24 15:45:00  17.2     9  1976
2 1976-10-06 09:45:00  11.0    10  1976
3 1976-11-28 11:00:00  15      11  1976
4 1976-12-06 15:00:00   4      12  1976
5 1977-01-14 08:00:00   9       1  1977
6 1977-02-24 00:15:00  12       2  1977
``````

This is more of a methodological problem than a coding problem. There are at least two issues.

1. Six points may not be very informative.
2. Sequentially ordered observations *may* be autocorrelated, which would violate an assumption of the linear model used to obtain R^2.

``````
x <- 1:6
y <- c(17.2, 11.0, 15, 4, 9, 12)
plot(x, y)
``````
``````
fit <- lm(x ~ y)
summary(fit)
#>
#> Call:
#> lm(formula = x ~ y)
#>
#> Residuals:
#>       1       2       3       4       5       6
#> -1.3348 -1.5732  0.2258 -0.9715  1.0273  2.6265
#>
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)
#> (Intercept)   5.7705     2.1244   2.716   0.0532 .
#> y            -0.1998     0.1751  -1.141   0.3177
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 1.817 on 4 degrees of freedom
#> Multiple R-squared:  0.2454, Adjusted R-squared:  0.05676
#> F-statistic: 1.301 on 1 and 4 DF,  p-value: 0.3177
par(mfrow = c(2,2))
plot(fit)
``````
``````
gvlma::gvlma(fit)
#>
#> Call:
#> lm(formula = x ~ y)
#>
#> Coefficients:
#> (Intercept)            y
#>      5.7705      -0.1998
#>
#>
#> ASSESSMENT OF THE LINEAR MODEL ASSUMPTIONS
#> USING THE GLOBAL TEST ON 4 DEGREES-OF-FREEDOM:
#> Level of Significance =  0.05
#>
#> Call:
#>  gvlma::gvlma(x = fit)
#>
#>                     Value p-value                Decision
#> Global Stat        3.0222  0.5541 Assumptions acceptable.
#> Skewness           0.3768  0.5393 Assumptions acceptable.
#> Kurtosis           0.2369  0.6264 Assumptions acceptable.
#> Link Function      1.6805  0.1949 Assumptions acceptable.
#> Heteroscedasticity 0.7279  0.3936 Assumptions acceptable.
``````
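
The checks listed above cover skewness, kurtosis, link function and heteroscedasticity, so the autocorrelation concern in point 2 is not among them. A rough way to look at it, sketched here in base R only, is to inspect the residuals directly. The Durbin-Watson statistic is computed by hand from its textbook definition rather than through a package, and the names `res`, `lag1` and `dw` are just illustrative. With six points neither number tells you much, so treat this as a pattern to reuse on the full series.

```r
# Same toy data and model as in the reprex above
x <- 1:6
y <- c(17.2, 11.0, 15, 4, 9, 12)
fit <- lm(x ~ y)

# Residuals in their original (time) order
res <- residuals(fit)

# Lag-1 sample autocorrelation of the residuals
# (acf() returns lag 0 first, so lag 1 is the second element)
lag1 <- acf(res, lag.max = 1, plot = FALSE)$acf[2]

# Durbin-Watson statistic, computed by hand:
# sum of squared successive differences over the residual sum of squares.
# Values near 2 suggest little first-order autocorrelation.
dw <- sum(diff(res)^2) / sum(res^2)

lag1
dw
```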

Created on 2020-09-29 by the reprex package (v0.3.0.9001)
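
Coming back to the original question, here is a minimal sketch of one way to get an R^2 per calendar month once the full data are loaded. It assumes the two files can be paired by `year` and `month` with one row per year-month in each, which is my assumption and not something stated in the question. It also relies on the fact that the R^2 of a simple linear regression equals the squared Pearson correlation. The data frames are rebuilt by hand from the six rows shown, and `paired`, `r2_by_month`, `Tmax_a` and `Tmax_b` are made-up names.

```r
# Stand-ins for the two files, typed in from the six rows shown in the question
df <- data.frame(
  year  = c(1976, 1976, 1976, 1976, 1977, 1977),
  month = c(9, 10, 11, 12, 1, 2),
  Tmax  = c(20.3, 15.2, 17.8, 2.54, 7.62, 20.3)
)
df1 <- data.frame(
  year  = c(1976, 1976, 1976, 1976, 1977, 1977),
  month = c(9, 10, 11, 12, 1, 2),
  Tmax  = c(17.2, 11.0, 15, 4, 9, 12)
)

# Pair the two series by year and month (assumption: one row per year-month)
paired <- merge(df, df1, by = c("year", "month"), suffixes = c("_a", "_b"))

# One R^2 per calendar month: squared correlation of the paired Tmax values.
# Months with fewer than three pairs get NA, since R^2 is not meaningful there.
r2_by_month <- sapply(split(paired, paired$month), function(d) {
  if (nrow(d) < 3) return(NA_real_)
  cor(d$Tmax_a, d$Tmax_b)^2
})
r2_by_month

# Sanity check on all pairs pooled: should agree with summary(lm())$r.squared
r2_all <- cor(paired$Tmax_a, paired$Tmax_b)^2
```

With only the six rows shown, each month has a single pair, so every entry of `r2_by_month` is `NA`. With the full multi-year files, each month would have one pair per year and get its own value, subject to the two caveats above.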


``````
lm(mtcars$mpg ~ mtcars$drat)