I was hoping someone could help me with a problem.
I am writing my thesis in behavioral finance and am analyzing mutual fund flows during COVID to see whether investors exhibit risk-averse behavior.
My model is:
Fund Flow = beta1 * recent market return (SP500) + beta2 * implied market volatility (VIX) + beta3 * recent fund performance (last month return) + beta4 * long-term fund performance (5-year yield)
I want to run a regression to see which variable affects the fund flows most.
I have daily data (except weekends) for the fund flows, VIX and SP500.
The last month performance is the same for the whole month, and the 5-year yield is a constant number.
My question is: how can I run a regression with these time series? I don't need to forecast or plot. I only need the coefficients (hopefully statistically significant).
See this chapter from Hyndman's text for time-series specific application of linear regression that addresses such factors as trending and seasonality.
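As a rough sketch of what such a regression could look like for a single fund, assuming the data sits in one data frame with one row per trading day: the data here is simulated, and the column names (flow, sp500_ret, vix, month_ret) are made up for illustration, not taken from your dataset. Only the regressors that vary over time are included.

```r
# Simulated stand-in for one fund's daily data over roughly three months.
# All column names and numbers are hypothetical.
set.seed(1)
n <- 60
dat <- data.frame(
  sp500_ret = rnorm(n, mean = 0, sd = 0.01),          # daily S&P 500 return
  vix       = 20 + cumsum(rnorm(n, mean = 0, sd = 0.5)), # daily VIX level
  month_ret = rep(c(0.02, -0.05, 0.03), each = 20)     # last month's fund return, fixed within a month
)
dat$flow <- 0.5 * dat$sp500_ret - 0.01 * dat$vix + 2 * dat$month_ret +
  rnorm(n, mean = 0, sd = 0.1)

# Ordinary least squares; lm() adds the intercept automatically.
fit <- lm(flow ~ sp500_ret + vix + month_ret, data = dat)

# Coefficient table with columns Estimate, Std. Error, t value, Pr(>|t|).
coefs <- coef(summary(fit))
print(coefs)
```

Keep in mind that the standard errors from plain lm() assume uncorrelated errors, which daily flow data may well violate, so the p-values should be read with caution.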
I have already collected my data. I have the daily fund flows from Bloomberg for 70 funds, plus the monthly return, the 5-year yield, the daily return of the S&P500 and the daily VIX price.
I have it structured in five time frames of three months each.
That's the answer then. In each regression, long-term performance is a constant. lm() automatically includes a constant, so you have perfect multicollinearity. Nothing to do with R really. In your specification there's no way to separate the effect of long-term performance from an overall constant.
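To see this concretely, here is a small sketch with simulated data, assuming a regressor that takes the same value on every row. The column names (flow, sp500_ret, yield_5y) and the value 0.07 are hypothetical.

```r
# Simulated data: yield_5y is identical on every row, like a fund's
# 5-year yield within a single regression. Names and values are hypothetical.
set.seed(42)
n <- 60
dat <- data.frame(
  sp500_ret = rnorm(n),
  yield_5y  = rep(0.07, n)
)
dat$flow <- 1 + 0.5 * dat$sp500_ret + rnorm(n)

fit <- lm(flow ~ sp500_ret + yield_5y, data = dat)

# The constant column is perfectly collinear with the intercept,
# so lm() cannot estimate it and reports NA for that coefficient.
print(coef(fit))
print(fit$rank)  # rank of the design matrix: 2, although 3 coefficients were requested
```

The NA is lm()'s way of saying the column adds no information beyond the intercept. Whatever effect the 5-year yield has is absorbed into the estimated constant.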