Multi-factor time-series regression analysis

Hello everyone!

I was hoping someone could help me with a problem I am stuck on.
I am writing my thesis in behavioral finance and am analyzing mutual fund flows during COVID-19 to see whether investors exhibit risk-averse behavior.

My model is:
Fund Flow = beta1 * recent market return (SP500) + beta2 * implied market volatility (VIX) + beta3 * recent fund performance (last month's return) + beta4 * long-term fund performance (5-year yield)

I want to run a regression to see which factor has the largest effect on fund flows.
I have daily data (excluding weekends) for the fund flows, the VIX and the SP500.

Last month's performance is the same for every day of a given month, and the 5-year yield is a single constant number.

My question is: how can I run a regression with these time series? I don't need to forecast or plot anything. I only need the coefficients (hopefully statistically significant).

I really hope someone can help me.

See this chapter from Hyndman's text for time-series-specific applications of linear regression that address factors such as trend and seasonality.

The lm() function will do the regression. The only issue is setting up your data. What have you done so far?
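To make the "setting up your data" step concrete, here is a minimal sketch for a single fund. It assumes the goal is one row per trading day, with the monthly return repeated on every day of its month. The data are simulated stand-ins, and the names `daily`, `monthly` and `fund_data` are made up for illustration; only the column names follow the model in the question.

```r
# Simulated stand-ins for the real Bloomberg data; all values are hypothetical.
set.seed(1)
dates <- seq(as.Date("2020-03-02"), as.Date("2020-04-30"), by = "day")
dates <- dates[!as.POSIXlt(dates)$wday %in% c(0, 6)]  # drop Sundays (0) and Saturdays (6)

# Daily series: fund flow, S&P 500 return, VIX level.
daily <- data.frame(
  date     = dates,
  FundFlow = rnorm(length(dates)),
  SP500    = rnorm(length(dates), sd = 0.02),
  VIX      = runif(length(dates), 20, 80)
)

# One row per month holding the fund's previous-month return.
monthly <- data.frame(
  month             = c("2020-03", "2020-04"),
  last_month_return = c(-0.08, -0.12)
)

# Attach the monthly value to every trading day of that month.
daily$month <- format(daily$date, "%Y-%m")
fund_data   <- merge(daily, monthly, by = "month")
fund_data   <- fund_data[order(fund_data$date), ]

head(fund_data)
```

The result is a plain data frame that can be passed to `lm()` through its `data` argument.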

I have already collected my data. I have the daily fund flows from Bloomberg for 70 funds, plus each fund's monthly return and 5-year yield, the daily return of the S&P 500 and the daily VIX price.

I have it split into 5 time frames of 3 months each.

Why not just run
lm(FundFlow ~ SP500 + VIX + last_month_return + long_term_fund_performance)?
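As a rough sketch of what such a call returns when only the coefficients are needed, the example below fits the model on simulated data and pulls out the coefficient table. The data and the "true" effects are invented, and only the three regressors that vary over time are simulated here. Because the regressors are measured in different units, the sketch also refits the model on standardized variables, which is one common (not the only) way to compare effect sizes.

```r
# Simulated data for one fund; values and effect sizes are hypothetical.
set.seed(42)
n <- 60
fund_data <- data.frame(
  SP500             = rnorm(n, sd = 0.02),
  VIX               = runif(n, 20, 80),
  last_month_return = rep(c(-0.08, -0.12, 0.05), each = 20)
)
fund_data$FundFlow <- 1 + 10 * fund_data$SP500 - 0.02 * fund_data$VIX +
  3 * fund_data$last_month_return + rnorm(n, sd = 0.5)

# Ordinary least squares fit; no forecasting or plotting involved.
fit <- lm(FundFlow ~ SP500 + VIX + last_month_return, data = fund_data)

# Matrix with columns: Estimate, Std. Error, t value, Pr(>|t|).
coef_table <- coef(summary(fit))
print(coef_table)

# Refit on z-scored variables so the slopes are in comparable units
# (standard deviations of FundFlow per standard deviation of each regressor).
std_data <- as.data.frame(scale(fund_data))
std_fit  <- lm(FundFlow ~ SP500 + VIX + last_month_return, data = std_data)
print(coef(std_fit))
```

The `Pr(>|t|)` column holds the p-values; with daily data the residuals may be autocorrelated, so those default standard errors should be read with some caution.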

R doesn't accept long-term performance because it's just one value, while all the others are time series.

Is long-term performance one value for each firm or just one value overall?

one value for each fund

Are you running separate regressions for each fund?

Maybe it would help if you could show us a small sample of how the data you feed into lm() is organized.

Yes, I was going to run a regression for each fund.
But so far I've had little luck, because R2 is extremely low.

That's the answer then. In each regression, long-term performance is a constant. lm() automatically includes a constant, so you have perfect multicollinearity. Nothing to do with R really. In your specification there's no way to separate the effect of long-term performance from an overall constant.
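The point can be checked with a small simulation. The sketch below uses invented data and a hypothetical helper, `make_fund()`. It first fits one fund on its own, where long-term performance is constant and `lm()` reports `NA` for it, and then stacks three funds into one data frame, where the variable differs across funds and a coefficient can be estimated. Pooling funds like this is an assumption about how the analysis could be restructured, not something stated in the thread.

```r
# Simulated market data shared by all funds; values are hypothetical.
set.seed(7)
n_days <- 60
market <- data.frame(
  SP500 = rnorm(n_days, sd = 0.02),
  VIX   = runif(n_days, 20, 80)
)

# Hypothetical helper: build one fund's daily data with a constant long-term value.
make_fund <- function(id, long_term) {
  d <- cbind(fund = id, market, long_term_fund_performance = long_term)
  d$FundFlow <- 0.5 + 8 * d$SP500 - 0.01 * d$VIX + 2 * long_term +
    rnorm(n_days, sd = 0.3)
  d
}

# Single fund: the regressor is constant, so it is collinear with the intercept.
one_fund <- make_fund("A", 0.07)
fit_one  <- lm(FundFlow ~ SP500 + VIX + long_term_fund_performance, data = one_fund)
print(coef(fit_one))     # the long-term coefficient is NA

# Pooled funds: the regressor now varies across rows, so it can be estimated.
panel <- rbind(make_fund("A", 0.07), make_fund("B", 0.02), make_fund("C", 0.11))
fit_pooled <- lm(FundFlow ~ SP500 + VIX + long_term_fund_performance, data = panel)
print(coef(fit_pooled))
```

Note that adding a separate intercept per fund to the pooled model, for example with `factor(fund)`, would bring the same collinearity back, because the fund dummies fully determine long-term performance.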