Loop regressions (by geographic area)

clauser · June 16, 2020, 2:16am

Hello,
I'm new to R. I have a data set with 245 variables and 32 regions (geographic areas). I need to obtain the OLS model coefficients for each of the 32 areas.
I thought this code would work:

y<- c(MMSI_2016$escacu_inf) #dependent variable
x1<- c(MMSI_2016$escacu_pad) #independent variable
x2<- c(MMSI_2016$escacu_mad) #independent variable
edo<- c(MMSI_2016$ubica_geo_enh) #This variable goes from 1:32
predictorlist<- list("x1","x2")

for (i in edo){
model[i] <- lm(y[i] ~ x1[i] + x2[i])
summary(model_[i])
}

It runs, but it returns the result for the whole data set rather than for each region (1:32).
I'd be very grateful if someone could help me.
Thanks

FJCC · June 16, 2020, 2:50am

You can do this cleanly with the map function from the purrr package.
See this vignette for an example.

library(dplyr)
library(purrr)

###Inventing some data
DF <- data.frame(pad = runif(100, min = 0, max = 10),
                 mad = runif(100, min = 0, max = 10),
                 noise = runif(100),
                 ubica = rep(1:10, each = 10))
#Calculate the inf column with known coef
DF <- DF %>% mutate(inf = 3 * pad - 2 * mad + 2.3 + noise)
###


FITS <- DF %>% split(.$ubica) %>% 
  map(~lm(inf ~ pad + mad, data = .)) %>% 
  map(summary)
FITS$`1`
#> 
#> Call:
#> lm(formula = inf ~ pad + mad, data = .)
#> 
#> Residuals:
#>      Min       1Q   Median       3Q      Max 
#> -0.47000 -0.19139  0.03641  0.20225  0.49864 
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)  2.90177    0.36210   8.014 9.02e-05 ***
#> pad          2.97840    0.05238  56.866 1.36e-10 ***
#> mad         -2.01318    0.04507 -44.670 7.36e-10 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 0.3606 on 7 degrees of freedom
#> Multiple R-squared:  0.9986, Adjusted R-squared:  0.9983 
#> F-statistic:  2570 on 2 and 7 DF,  p-value: 9.28e-11

Created on 2020-06-15 by the reprex package (v0.3.0)
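
If only the coefficients are needed rather than the full summaries, something like the following sketch might work. It rebuilds the invented DF from the example (the seed is an added assumption, only there to make the numbers repeatable) and collects coef() from each per-region fit into one data frame. The names fits, coef_list and COEFS are purely illustrative.

```r
library(purrr)

# Invented data as in the example above; set.seed() is an assumption
# added here so that the result is reproducible
set.seed(1)
DF <- data.frame(pad = runif(100, min = 0, max = 10),
                 mad = runif(100, min = 0, max = 10),
                 noise = runif(100),
                 ubica = rep(1:10, each = 10))
DF$inf <- 3 * DF$pad - 2 * DF$mad + 2.3 + DF$noise

# Fit one model per region: split() gives a list of data frames keyed by ubica
fits <- map(split(DF, DF$ubica), function(d) lm(inf ~ pad + mad, data = d))

# Keep only the coefficient vector of each fit
coef_list <- map(fits, coef)

# Stack the vectors into one row per region and record the region id
COEFS <- as.data.frame(do.call(rbind, coef_list))
COEFS$ubica <- names(coef_list)
COEFS
```

With the real data, the same pattern should apply by splitting MMSI_2016 on ubica_geo_enh and using escacu_inf ~ escacu_pad + escacu_mad as the formula, which would give one row of coefficients for each of the 32 regions.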

clauser · June 16, 2020, 2:59am

Thanks a lot. I'll try it!

