Why do we use random effects in models?

I have a stupid question that unfortunately stems from a lack of knowledge. I have data on heart failure patients and am creating a model for them with both fixed and random effects. What is a random effect in general? And why was it necessary to use it for such data?

I will try to answer this but keep in mind that I am not an expert; I am just some guy on the internet. Fixed effects are factors in the data whose level you control and limit to certain values. Random effects are factors that contribute to the outcome but whose levels are not fully sampled or even, perhaps, understood. For example, in a medical study you might be measuring the concentration of some blood component and you have a fixed effect with two levels:

  1. treat with new drug
  2. do not treat with new drug

A random effect might be the identity of the hospital where each patient is treated. It is not practical to sample all possible hospitals, and you might not even know why different hospitals give different results. The label "hospital" might be a stand-in for differences in funding, patient population, medical culture, or many other things.

When you look at the raw results without taking hospitals into account, you might find a very wide spread in the concentration at both levels of the fixed effect, so wide that you cannot distinguish any treatment effect from the variance in the data. After adding the "random effect" of the hospital, you might find that patients at different hospitals start with different blood concentrations and that there is a clear difference between the two fixed effect levels within each hospital. The random effect of the hospital accounts for much of the observed raw variance, and accounting for it allows you to see the fixed effect.
Does that help?
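
To make the hospital example concrete, here is a small simulation sketch in plain Python. Everything in it is made up for illustration: the number of hospitals, the baseline of 50, the drug effect of 2, and the function names are all assumptions, not anything from real data. It also does not fit a mixed model; subtracting each hospital's mean is only a crude stand-in for what a random intercept absorbs.

```python
import random
import statistics

def simulate(n_hospitals=8, n_per_arm=10, hospital_sd=5.0,
             noise_sd=1.0, drug_effect=2.0, seed=1):
    """Return (hospital, treated, concentration) rows of invented data."""
    rng = random.Random(seed)
    rows = []
    for h in range(n_hospitals):
        # Each hospital gets its own baseline level (the "random effect").
        baseline = rng.gauss(50.0, hospital_sd)
        for treated in (0, 1):
            for _ in range(n_per_arm):
                y = baseline + drug_effect * treated + rng.gauss(0.0, noise_sd)
                rows.append((h, treated, y))
    return rows

def arm_values(rows, treated):
    """Outcomes for one treatment arm."""
    return [y for _, t, y in rows if t == treated]

def center_within_hospital(rows):
    """Subtract each hospital's mean to remove between-hospital differences."""
    hospitals = {h for h, _, _ in rows}
    means = {h: statistics.fmean([y for hh, _, y in rows if hh == h])
             for h in hospitals}
    return [(h, t, y - means[h]) for h, t, y in rows]

rows = simulate()
centered = center_within_hospital(rows)

raw_sd = {t: statistics.stdev(arm_values(rows, t)) for t in (0, 1)}
centered_sd = {t: statistics.stdev(arm_values(centered, t)) for t in (0, 1)}
effect = (statistics.fmean(arm_values(centered, 1))
          - statistics.fmean(arm_values(centered, 0)))

print("SD per arm, raw:", raw_sd)
print("SD per arm, hospital means removed:", centered_sd)
print("treated minus untreated:", effect)
```

In the raw data the spread within each arm is dominated by the hospital baselines, which is why the drug effect is hard to see. Once the hospital-to-hospital differences are taken out, the spread within each arm shrinks to roughly the patient-level noise and the difference between arms stands out.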

You might want to look at the Wikipedia article "Fixed effects model".

Fixed effects are factors in the data whose level you control and limit to certain values. Random effects are factors that contribute to the outcome but whose levels are not fully sampled or even, perhaps, understood.

Not sure I agree. For example, if there were only 4 hospitals in the study, you would probably include them as fixed effects. In my opinion, it often comes down to practical issues rather than a philosophical difference between the two approaches:

  • How many levels the variable has. With many levels, including them as fixed effects uses up the model's degrees of freedom and can make the model useless. The decision to use random effects is most often driven by this (at least as far as I've seen).
  • If and how you want to compare the levels. Fixed effects give you a lot more ability to do this.
  • How the models are fit. Fitting random effects uses a clever application of maximum likelihood. For some reason this is always glossed over in how the subject is taught, but without understanding how it is done, you won't truly understand random effects.
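
A rough way to see the degrees-of-freedom point in the first bullet is to count parameters. The sketch below assumes the simplest case: a grouping factor entered either as dummy-coded fixed effects (one coefficient per level beyond a reference level) or as a random intercept (one variance for the whole set of levels). The function name `parameters_used` is made up for illustration.

```python
def parameters_used(n_levels, as_random):
    """Parameters a grouping factor adds on top of the overall intercept."""
    if as_random:
        # Random intercept: a single variance describes all the levels.
        return 1
    # Fixed effects: one coefficient per level beyond the reference level.
    return n_levels - 1

for k in (4, 40, 400):
    print(k, "levels -> fixed:", parameters_used(k, as_random=False),
          "random:", parameters_used(k, as_random=True))
```

With 4 hospitals the difference is small, which fits the point that you would probably just use fixed effects there. With hundreds of levels the fixed-effects count grows with the data while the random-intercept count stays at one.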

For the record, I'm also a random guy on the internet. :smiley:

The way that mixed models are traditionally taught can be confusing. It caused me a lot of head scratching.

The random effect term allows each level to have its own intercept. This is really confusing because this is also true of fixed effects.
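
One concrete difference that may help with this confusion (it is not spelled out elsewhere in the thread): a fixed-effect intercept for a level is essentially that level's own mean, while a random intercept is pulled toward the overall mean, and pulled harder when the level has few observations. Below is a minimal sketch of that "partial pooling" for a one-way random-intercept model. It assumes the two variance components are already known, which a real fit would have to estimate, and it uses the plain pooled mean as the overall mean, which is a simplification. The data and names are invented.

```python
import statistics

def shrunken_means(groups, between_var, within_var):
    """Partially pooled group means for a one-way random-intercept model.

    groups: dict mapping a level name to its list of observations.
    between_var, within_var: variance components, assumed known here.
    """
    grand = statistics.fmean([y for ys in groups.values() for y in ys])
    result = {}
    for name, ys in groups.items():
        # Weight on the group's own mean: near 1 for large groups,
        # near 0 for small or noisy ones.
        weight = between_var / (between_var + within_var / len(ys))
        result[name] = grand + weight * (statistics.fmean(ys) - grand)
    return grand, result

groups = {"A": [48.0, 52.0, 50.0, 50.0], "B": [60.0], "C": [41.0, 39.0]}
grand, shrunk = shrunken_means(groups, between_var=4.0, within_var=16.0)

for name, ys in groups.items():
    print(name, "raw mean:", statistics.fmean(ys), "shrunken:", shrunk[name])
```

Level B has a single observation, so its estimate moves most of the way back to the overall mean, whereas a fixed effect would take that one observation at face value.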

And I feel like there's a lot of beating around the bush on the topic of justifying a random effects term. Almost a denial or rationalization that it's not all about degrees of freedom :innocent:.

Hope I'm not totally off-base - when it comes to stats, I'm a pure practitioner.

If you are talking about a regression model, you can test for fixed vs. random effects using a Hausman test. See Chapter 6, "Fixed or random effects", of An Introduction to R, LaTeX, and Statistical Inference.

Thank you guys :slight_smile: I will look through all the information you gave me.

Just to add to the general confusion.
https://statmodeling.stat.columbia.edu/2005/01/25/why_i_dont_use/

I would suggest consulting someone with more expertise to help with your modeling. For help with learning the nuances of statistical modeling, I really like Frank Harrell's work (https://www.fharrell.com/), and he has lots of helpful resources.

He also runs a helpful discussion forum that is a great place to ask questions: https://discourse.datamethods.org/

If your data analysis has important consequences or is for a published paper, I would see if you can get a statistician to help you.
