Regression in R, indexing of statistics

I am running simple linear regressions in a nested for loop, fitting all ys against one particular x at a time.
I am new to R and there is something wrong with the statistics code. I am getting this error:

Error in summary1[i, j] <- summary(lm(y[, j] ~ x[, i])) :
incorrect number of subscripts on matrix

Here is the code:

X <-
  mtcars[, c("mpg",
             "cyl",
             "disp",
             "hp")]

x = data.matrix(X)

Y <-
  mtcars[, c("drat",
             "wt",
             "qsec",
             "vs",
             "am")]

y = data.matrix(Y)

for (i in 1:ncol(x)) {
  summary1 <- list()
  summary2 <- list()
  correlation <- list()
  confidence <- list()
  
  for (j in 1:ncol(y)) {
    summary1[j,i] <- summary(lm(y[, j] ~ x[, i]))
    summary2[j,i] <- summary.aov(lm(y[, j] ~ x[, i]))
    correlation[j,i] <- cor(y[, j], x[, i])
    confidence[j,i] <- confint(lm(y[, j] ~ x[, i]))
  }
} 

summary1
summary2
correlation
confidence

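You are trying to index empty lists as if they were two-dimensional objects, which is what triggers the error. The lists are also re-created on every pass of the outer loop, so the results for earlier x columns would be lost even if the assignment worked. You should look into the difference between indexing a list with [ ] and with [[ ]]. I suggest you use nested lists as shown below. I named the list elements, which is not required, but I think it makes the output easier to understand.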

X <-
  mtcars[, c("mpg",
             "cyl",
             "disp",
             "hp")]

x = data.matrix(X)
Xnames <- colnames(x)
Y <-
  mtcars[, c("drat",
             "wt",
             "qsec",
             "vs",
             "am")]

y = data.matrix(Y)
Ynames <- colnames(y)

summary1 <- list()
summary2 <- list()
correlation <- list()
confidence <- list()
for (i in seq_len(ncol(x))) {
  # Inner lists collect the results of every y against the current x
  summary1_j <- list()
  summary2_j <- list()
  correlation_j <- list()
  confidence_j <- list()
  for (j in seq_len(ncol(y))) {
    # Fit the model once and reuse it for all the statistics
    fit <- lm(y[, j] ~ x[, i])
    summary1_j[[Ynames[j]]] <- summary(fit)
    summary2_j[[Ynames[j]]] <- summary.aov(fit)
    correlation_j[[Ynames[j]]] <- cor(y[, j], x[, i])
    confidence_j[[Ynames[j]]] <- confint(fit)
  }
  # Store the inner lists under the name of the current x
  summary1[[Xnames[i]]] <- summary1_j
  summary2[[Xnames[i]]] <- summary2_j
  correlation[[Xnames[i]]] <- correlation_j
  confidence[[Xnames[i]]] <- confidence_j
}

summary1[["mpg"]][["drat"]]
#> 
#> Call:
#> lm(formula = y[, j] ~ x[, i])
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -0.7163 -0.2712 -0.0238  0.2491  0.8827 
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)  2.38249    0.24841   9.591 1.20e-10 ***
#> x[, i]       0.06043    0.01186   5.096 1.78e-05 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 0.3979 on 30 degrees of freedom
#> Multiple R-squared:  0.464,  Adjusted R-squared:  0.4461 
#> F-statistic: 25.97 on 1 and 30 DF,  p-value: 1.776e-05

Created on 2020-04-12 by the reprex package (v0.3.0)
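
To make the [ ] versus [[ ]] point concrete, here is a minimal sketch on a throwaway list. The names lst and res and the model mpg ~ wt are only illustrative assumptions, not part of the original question.

```r
# A small list to experiment with (contents are arbitrary)
lst <- list()

# [[ ]] assigns or extracts a single element, which can be any object
lst[["fit"]] <- summary(lm(mpg ~ wt, data = mtcars))
lst[["r"]] <- cor(mtcars$mpg, mtcars$wt)

class(lst["fit"])    # [ ] returns a sub-list containing the element
class(lst[["fit"]])  # [[ ]] returns the element itself

# A plain list has no dim attribute, so two subscripts raise an error.
# tryCatch() captures the message instead of stopping the script.
res <- tryCatch({
  lst[1, 2] <- 1
  "ok"
}, error = function(e) conditionMessage(e))
res
```

The last part reproduces the kind of failure from the question: a list created with list() is one-dimensional, so there is no [j, i] position to assign into.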

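
If you would rather keep the [j, i] style of indexing from the question, one possible alternative (a sketch, not the approach used in the answer above) is to give the list a dim attribute so it behaves like a matrix of objects. The name fits is a hypothetical choice. The sketch also assumes that plain correlations are all you need from cor(), in which case it can take both matrices at once.

```r
x <- data.matrix(mtcars[, c("mpg", "cyl", "disp", "hp")])
y <- data.matrix(mtcars[, c("drat", "wt", "qsec", "vs", "am")])

# cor() accepts two matrices and returns all pairwise correlations:
# rows correspond to columns of y, columns to columns of x
correlation <- cor(y, x)

# A list with dimensions (a "list-matrix") supports [[j, i]] indexing
fits <- matrix(vector("list", ncol(y) * ncol(x)),
               nrow = ncol(y), ncol = ncol(x),
               dimnames = list(colnames(y), colnames(x)))

for (i in seq_len(ncol(x))) {
  for (j in seq_len(ncol(y))) {
    fits[[j, i]] <- summary(lm(y[, j] ~ x[, i]))
  }
}

# Elements can be looked up by name; for a simple regression the
# R-squared equals the squared correlation
fits[["drat", "mpg"]]$r.squared
correlation["drat", "mpg"]^2
```

Nested lists are arguably easier to inspect with str(), while the list-matrix keeps the row and column layout you had in mind, so which one fits better depends on how you plan to use the results.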