 # Loop regressions (by geographic area)

Hello,
I'm new to R. I have a data set with 245 variables and 32 regions (geographic areas). I need to obtain the OLS model coefficients for each of the 32 areas.
I thought this code was going to work:

y <- c(MMSI_2016$escacu_inf) # dependent variable
edo <- c(MMSI_2016$ubica_geo_enh) # this variable goes from 1:32
predictorlist <- list("x1", "x2")

for (i in edo){ model[i] <- lm(y[i] ~ x1[i] + x2[i]) summary(model_[i]) }

It runs, but it returns the result for the whole data set, not for each region (1:32).
I'd be very grateful if someone could help me.
Thanks
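A note on why the loop above does not split by region: `for (i in edo)` visits every element of `edo` (one value per row), and `y[i]` picks the single observation at position `i`, not the rows that belong to region `i`. A minimal base R sketch of a per-region loop follows. The data frame `dat` and its columns `region`, `x1`, `x2`, and `y` are hypothetical stand-ins for `MMSI_2016` and its variables, so adapt the names before using it.

```r
# Hypothetical stand-in for MMSI_2016; all column names here are assumptions
set.seed(1)
dat <- data.frame(
  region = rep(1:4, each = 10),
  x1 = runif(40),
  x2 = runif(40)
)
dat$y <- 1 + 2 * dat$x1 - dat$x2 + rnorm(40, sd = 0.1)

# Loop over the distinct region codes, not over every row
regions <- sort(unique(dat$region))
models <- vector("list", length(regions))
names(models) <- regions

for (r in regions) {
  # Fit the model on the rows of this region only
  models[[as.character(r)]] <- lm(y ~ x1 + x2, data = dat[dat$region == r, ])
}

# One row of coefficients per region
coef_table <- t(sapply(models, coef))
coef_table
```

The key change is subsetting the data with `dat$region == r` inside the loop, so each call to `lm()` only sees one region.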

You can do this cleanly with the map function from the purrr package.
See this vignette for an example.

``````
library(dplyr)
library(purrr)

### Inventing some data
DF <- data.frame(pad = runif(100, min = 0, max = 10),
                 mad = runif(100, min = 0, max = 10),
                 noise = runif(100),
                 ubica = rep(1:10, each = 10))
# Calculate the inf column with known coefficients
DF <- DF %>% mutate(inf = 3 * pad - 2 * mad + 2.3 + noise)
###

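# Split by region, fit one lm per piece, then summarise each fit
FITS <- DF %>% split(.$ubica) %>%
  map(~ lm(inf ~ pad + mad, data = .)) %>%
  map(summary)
FITS$`1`
#>
#> Call:
#> lm(formula = inf ~ pad + mad, data = .)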
#>
#> Residuals:
#>      Min       1Q   Median       3Q      Max
#> -0.47000 -0.19139  0.03641  0.20225  0.49864
#>
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)
#> (Intercept)  2.90177    0.36210   8.014 9.02e-05 ***
#> pad          2.97840    0.05238  56.866 1.36e-10 ***
#> mad         -2.01318    0.04507 -44.670 7.36e-10 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 0.3606 on 7 degrees of freedom
#> Multiple R-squared:  0.9986, Adjusted R-squared:  0.9983
#> F-statistic:  2570 on 2 and 7 DF,  p-value: 9.28e-11
``````

Created on 2020-06-15 by the reprex package (v0.3.0)
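Since the original question asks only for the coefficients, here is a hedged sketch of how they might be collected into one table with a row per region. It rebuilds invented data of the same shape as above so that it runs on its own. The seed and the names `MODELS` and `COEFS` are my own choices for illustration.

```r
library(purrr)

# Invented data with the same columns as the example above (values are arbitrary)
set.seed(42)
DF <- data.frame(pad = runif(100, min = 0, max = 10),
                 mad = runif(100, min = 0, max = 10),
                 noise = runif(100),
                 ubica = rep(1:10, each = 10))
DF$inf <- 3 * DF$pad - 2 * DF$mad + 2.3 + DF$noise

# Fit one model per region; the list is named by region code
MODELS <- map(split(DF, DF$ubica), ~ lm(inf ~ pad + mad, data = .x))

# Stack the coefficient vectors into a matrix with one row per region
COEFS <- do.call(rbind, map(MODELS, coef))
COEFS
```

With the real data, the same pattern should give 32 rows, one for each value of the region variable.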

Thanks a lot. I'll try it!