Automate Regression in R - Calculate FamaFrench 3 Factor alpha

Every R problem can be thought of with advantage as the interaction of three objects: an existing object, x, a desired object, y, and a function, f, that will return a value of y given x as an argument. In other words, school algebra: f(x) = y. Any of the objects can be composites.

In this case, x is your database and y is your database augmented by an additional variable: an intercept value from a regression model. Both x and y are data frames. Each contains observations of an object of interest, crsp_fundno, arranged row-wise, along with variables, some of which will be used as arguments to lm. lm returns an object of class lm, call it fit, containing the value of interest, the intercept, fit$coefficients[1].

Using these pieces we can construct f.

The first thing to note is that functions are first-class objects, which means that they can be given as arguments to other functions. It is convenient to work inside outwards and to create an auxiliary function:

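get_intercept <- function(x) {
  # assumes mexret is a column of your_data; keep only the rows for fund x
  fit <- lm(mexret ~ Mkt_RF + SMB + HML,
            data = your_data[your_data$crsp_fundno == x, ])
  fit$coefficients[1]  # the intercept
}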

NB: syntactic variable names cannot contain blanks or operators (unless quoted with backticks), so Mkt-RF is changed to Mkt_RF. Also, we would normally parameterize your_data and the other arguments, rather than hardwiring them.
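As a rough sketch of the parameterization mentioned here, the data frame and the formula can be passed in as arguments. The helper name get_intercept_from, its arguments, and the simulated toy_data are illustrative assumptions, not the original data; only the column names (crsp_fundno, mexret, Mkt_RF, SMB, HML) follow the ones used in this thread.

```r
# Simulated stand-in for the real fund returns (illustration only)
set.seed(1)
toy_data <- data.frame(
  crsp_fundno = rep(c(64487, 97403), each = 24),
  Mkt_RF = rnorm(48),
  SMB = rnorm(48),
  HML = rnorm(48)
)
# Build excess returns with a known intercept of 0.5 plus a little noise
toy_data$mexret <- 0.5 + 1.1 * toy_data$Mkt_RF + 0.3 * toy_data$SMB -
  0.2 * toy_data$HML + rnorm(48, sd = 0.1)

# Hypothetical parameterized helper: nothing is hardwired except the fund id column
get_intercept_from <- function(fund, data,
                               formula = mexret ~ Mkt_RF + SMB + HML) {
  fund_rows <- data[data$crsp_fundno == fund, ]  # rows for this fund only
  fit <- lm(formula, data = fund_rows)
  unname(coef(fit)[1])                           # the intercept (alpha)
}

alpha_64487 <- get_intercept_from(64487, toy_data)
```

Because the data frame is an argument, the same helper can be tried on a small test object before it is pointed at the full database.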

get_intercept takes an argument, x (the crsp_fundno of interest, distinct from the nomenclature for the formal object x) and returns the value of a linear regression's intercept coefficient, which is the desired portion of fit to add to each selected crsp_fundno.

Thus

get_intercept(64487)

will return the value of the intercept to be placed, the Fama-French 3-factor alpha, which I'll call ff3fa. It is best to create this new variable beforehand.

your_data[, "ff3fa"] <- NA

Another helper function will make the placement

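# <<- is needed so the assignment reaches your_data itself rather than a local copy
place_intercept <- function(x) your_data[your_data$crsp_fundno == x, "ff3fa"] <<- get_intercept(x)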

We now have a way to place the intercept for a single crsp_fundno into y

place_intercept(64487)

An auxiliary object, fund_list, can be used to identify the specific crsp_fundno values to be so processed.

fund_list <- c(
	97403,62638,98168,92509,93172,69885,87073,51929,
	81727,64998,68432,87733,78200,92599,59821,59391,
	51450,56856,94761,65606,60274,94622,50572,65734,
	91201,59542,72588,87752,97495,62544,90312,81084,
	83960,84608,70966,80280,74213,98558,66360,61703,
	96572,98795,71403,94230,90321,81786,85710,92169
	)

From there

lapply(fund_list, place_intercept)

which leads to f and its application

add_intercepts <- function(x) lapply(x, place_intercept)
add_intercepts(fund_list)
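An alternative sketch, for comparison, collects the intercepts with vapply and joins them back by crsp_fundno, so that no function has to modify an object outside itself. The toy_data frame, the toy_funds vector and the alpha_one helper are hypothetical stand-ins built from simulated numbers; only the column names come from the discussion above.

```r
# Simulated stand-in data for three funds (illustration only)
set.seed(42)
toy_funds <- c(97403, 62638, 98168)
toy_data <- data.frame(
  crsp_fundno = rep(toy_funds, each = 36),
  Mkt_RF = rnorm(108),
  SMB = rnorm(108),
  HML = rnorm(108)
)
toy_data$mexret <- 0.2 + toy_data$Mkt_RF + rnorm(108, sd = 0.1)

# Hypothetical helper: intercept of the three-factor regression for one fund
alpha_one <- function(fund, data) {
  fit <- lm(mexret ~ Mkt_RF + SMB + HML,
            data = data[data$crsp_fundno == fund, ])
  unname(coef(fit)[1])
}

# One row per fund: the fund id and its intercept
alphas <- data.frame(
  crsp_fundno = toy_funds,
  ff3fa = vapply(toy_funds, alpha_one, numeric(1), data = toy_data)
)

# Join the intercepts back onto every observation of the matching fund
augmented <- merge(toy_data, alphas, by = "crsp_fundno")
```

Here vapply checks that each call returns a single numeric value, and merge produces the augmented data frame y as a new object rather than by changing x in place.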

See the FAQ "How to do a minimal reproducible example (reprex) for beginners" for why this specific code may not be reliable in the absence of a representative data object on which to test it. Also, I express no opinion as to the appropriateness of any intended application of the intercept in this case.
