Hello! Any help with my question would be greatly appreciated, and I thank you for your time in advance. I am fairly new to R, so apologies if this is a simple question. Also, I checked for duplicate posts and found a similar post, but my question is slightly different because it deals with the data in a long format.
Here are screenshots of the data I am working with:
The columns I am interested in are 'dma', 'weekly_deaths', and 'gtrends'. I would like to regress gtrends on weekly_deaths separately for each unique 'dma'. For example, I would like to fit a simple linear regression model gtrends ~ weekly_deaths using only the rows with dma = 1, then do the same for dma = 2, and so on. The data is in long format, so the rows for each dma are stacked one after another, as seen in the second screenshot (the dma = 2 rows start right after the last dma = 1 row).
There are 210 dmas total, which is why I would like a loop to do the regressions for me instead of running 210 separate ones.
Ultimately, I would like the loop to give me the slope coefficient, its p-value, and the multiple R-squared for each dma's regression.
I've tried things like:
reg <- total_gtrends_deaths_df %>%
  group_by(total_gtrends_deaths_df$dma) %>%
  do(model = lm(total_gtrends_deaths_df$gtrends ~ total_gtrends_deaths_df$weekly_deaths))
and
for (i in 1:160) {
  reg <- lm(weekly_deaths ~ gtrends, data = subset(total_gtrends_deaths_df, dma == i))
}
to no avail (again, I am very new to R so apologies if these are bad attempts).
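To make the goal concrete, here is a minimal sketch of the kind of output described above: one row per dma with the slope, p-value, and R-squared. It uses base R only and a small made-up data frame, since the real data is only shown in screenshots. The names toy_df, fit_one_dma, and results are hypothetical; only the column names dma, weekly_deaths, and gtrends come from the data described above, and the model direction gtrends ~ weekly_deaths is an assumption taken from the description rather than from the loop attempt.

```r
# Toy stand-in for total_gtrends_deaths_df: made-up values, 3 dmas, long format.
set.seed(1)
toy_df <- data.frame(
  dma = rep(1:3, each = 20),
  weekly_deaths = rpois(60, lambda = 30)
)
toy_df$gtrends <- 10 + 0.5 * toy_df$weekly_deaths + rnorm(60, sd = 3)

# Fit gtrends ~ weekly_deaths on one dma's rows and return the three numbers
# of interest as a one-row data frame.
fit_one_dma <- function(d) {
  s <- summary(lm(gtrends ~ weekly_deaths, data = d))
  data.frame(
    dma = d$dma[1],
    slope = s$coefficients["weekly_deaths", "Estimate"],
    p_value = s$coefficients["weekly_deaths", "Pr(>|t|)"],
    r_squared = s$r.squared
  )
}

# Split the long data by dma, fit each piece, and stack the rows together.
results <- do.call(rbind, lapply(split(toy_df, toy_df$dma), fit_one_dma))
print(results)
```

Two things in the attempts above may explain why they did not produce this. In the group_by/do version, the `total_gtrends_deaths_df$...` references inside lm() point at the whole data frame, so every group fits the same model. In the for loop, `reg` is overwritten on every pass, so only the last dma's model is kept.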
Does anyone have any suggestions? Thank you so much! I really appreciate any and all help.