Hi all,
I have the following sample dataset, in which each stock is identified by a number (e.g. 10026), together with the Fama-French factors mktrf, hml and smb.
I am trying to perform the following regression:
lm(formula = "`10026` ~ mktrf + hml + smb", data = data, na.action = na.omit)
This works fine for a single stock, but I need a loop so that I don't have to rerun the regression by hand for every stock, especially since my real dataset is much bigger.
Once the regressions are done, I need to compute the standard deviation of the residuals for each month, so ideally I want a data frame of residuals with a date column and one column per stock number.
Here are my attempts at the loop so far, all of which failed:
# creating a list with dependent variables
depVarList <- names(x = retDailyMerged)[2:7670]
# loop
lapply(X = depVarList,
FUN = function(t) lm(formula = paste0("`", t, "` ~ mktrf + hml + smb"), data = retDailyMerged))
# alternative loop
for (i in depVarList)
{lm(formula = "i ~ mktrf + hml + smb", data = retDailyMerged, na.action = na.omit)}
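For reference, the `for` version looks for a column literally named `i`, because `"i ~ mktrf + hml + smb"` is a fixed string and the loop variable is never substituted into it. Both versions also discard the fitted models instead of storing them. A minimal sketch of one way around this is below; it is not necessarily the best approach for 7,000+ columns. The data frame `toy` and its values are made up purely so the example runs, and it stands in for `retDailyMerged`.

```r
# Hypothetical toy data, only to make the sketch runnable.
set.seed(1)
toy <- data.frame(mktrf = rnorm(20), hml = rnorm(20), smb = rnorm(20))
toy[["10026"]] <- 0.5 * toy$mktrf + rnorm(20, sd = 0.1)
toy[["10028"]] <- -0.2 * toy$hml + rnorm(20, sd = 0.1)

depVars <- c("10026", "10028")

fits <- list()
for (id in depVars) {
  # Paste the stock id into the formula; backticks are required because
  # names such as 10026 are not syntactic R names.
  f <- as.formula(paste0("`", id, "` ~ mktrf + hml + smb"))
  # Store each fit under the stock id instead of discarding it.
  fits[[id]] <- lm(f, data = toy, na.action = na.omit)
}

# One row of coefficients per stock.
coefs <- t(sapply(fits, coef))
```

Keeping the fits in a named list means `fits[["10026"]]` retrieves a single model, and `sapply()` or `lapply()` can pull out coefficients or residuals for all stocks at once.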
Sample dataset (dput output):
structure(list(date = structure(c(17533, 17534, 17535, 17536,
17539), tzone = "UTC", tclass = "Date", class = "Date"), `10026` = c(NA,
-0.00998786767581905, 0.0138127158236847, -0.00955052427703185,
0.000741739716790146), `10028` = c(NA, 0.0102061855670104, -0.0205122971731809,
-0.0103146488851844, 0.0631645436361723), `10032` = c(NA, 0.00130975769482644,
0.0122629169391759, 0.00492650621870472, 0.0466929197138954),
`93436` = c(NA, -0.0102330515084391, -0.00828999211977932,
0.00622970567668935, 0.0626382292829057), mktrf = c(0.0085,
0.0059, 0.0042, 0.0066, 0.0019), smb = c(0.0036, -0.0039,
-0.0026, -0.0034, -0.0016), hml = c(-0.0022, -0.0021, 0.0024,
-0.0026, 7e-04)), class = c("data.table", "data.frame"), row.names = c(NA,
-5L), sorted = "date")
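For the residual table and the monthly SD described above, a rough sketch under some assumptions follows. It uses a made-up data frame `toy` with the same column layout as the sample (the values are hypothetical), and it swaps `na.omit` for `na.exclude`. With `na.exclude`, `residuals()` pads the dropped rows with `NA`, so each residual vector has one entry per row and lines up with the `date` column.

```r
# Hypothetical toy data mirroring the layout of the sample dataset.
set.seed(42)
n <- 60
toy <- data.frame(
  date  = seq(as.Date("2018-01-02"), by = "day", length.out = n),
  mktrf = rnorm(n, sd = 0.01),
  smb   = rnorm(n, sd = 0.01),
  hml   = rnorm(n, sd = 0.01)
)
toy[["10026"]] <- 0.9 * toy$mktrf + rnorm(n, sd = 0.01)
toy[["10028"]] <- 1.1 * toy$mktrf + rnorm(n, sd = 0.01)
toy[1, c("10026", "10028")] <- NA  # first return missing, as in the sample

# Every column that is not the date or a factor is treated as a stock.
stocks <- setdiff(names(toy), c("date", "mktrf", "smb", "hml"))

# Residuals: a date column plus one column per stock.
resDF <- data.frame(date = toy$date)
for (id in stocks) {
  f <- as.formula(paste0("`", id, "` ~ mktrf + hml + smb"))
  fit <- lm(f, data = toy, na.action = na.exclude)
  resDF[[id]] <- unname(residuals(fit))  # length n, NA where rows were dropped
}

# SD of residuals per calendar month, ignoring the padded NAs.
month <- format(resDF$date, "%Y-%m")
monthlySD <- aggregate(resDF[stocks], by = list(month = month),
                       FUN = sd, na.rm = TRUE)
```

`monthlySD` then has one row per month and one column per stock. For a real dataset with thousands of stocks, reshaping to long format first may scale better, but that is beyond this sketch.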