Visualizing interaction terms


#1

Hi there,

I'm trying to plot an interaction effect using the effect and plot functions described in the following blog post: http://data.library.virginia.edu/visualizing-the-effects-of-proportional-odds-logistic-regression/

My model consists of an ordinal (0,1,2,3) dependent variable (SOICAP), two independent variables (TRA and REL) and a couple of controls. The independent variables are discrete, taking values from 0 to 5. Theoretically speaking, imagine that these two independent variables are two mechanisms that are each individually supposed to increase the outcome. However, I also want to test whether they always complement each other or whether they sometimes act as substitutes in their effect on the outcome. I ran the polr function and here is the result:

Coefficients:
                      Value  Std. Error  t value
SIZE_centered        0.7102      0.2345   3.0282**
ASSIM_centered       0.6353      0.2840   2.2366*
PRIOR_centered       6.8723      2.1280   3.2295***
SECTORindustrial    -0.1389      0.5833  -0.2382
SECTORnatural stone  0.4087      0.6829   0.5985
SECTORconstruction   2.6778      1.1434   2.3419
TRA_centered         0.9997      0.3001   3.3317***
REL_centered         0.6590      0.2559   2.5756**
TRA_REL_centered    -0.5858      0.2319  -2.5263**

Intercepts:
                      Value  Std. Error  t value
0|1                 -0.6726      0.3805  -1.7676
1|2                  1.7666      0.4140   4.2670
2|3                  4.2472      0.6224   6.8243

Residual Deviance: 155.6056
AIC: 179.6056

So, as the coefficients show, the interaction is negative, and we could assume a substitution effect between the two mechanisms. But the plot (attached), as far as I understand it, shows a positive interaction, since the probability of SOICAP=3 increases dramatically when both TRA and REL increase.

Could anyone please help me figure out what's going on? Am I interpreting the plot correctly?

Best,
Babak


#2

Could you please turn this into a self-contained reprex (short for reproducible example)? It will help us help you if we can be sure we're all working with/looking at the same stuff.

Right now the best way to install reprex is:

# install.packages("devtools")
devtools::install_github("tidyverse/reprex")

If you've never heard of a reprex before, you might want to start by reading the tidyverse.org help page. The reprex dos and don'ts are also useful.

What to do if you run into clipboard problems

If you run into problems with access to your clipboard, you can specify an outfile for the reprex, and then copy and paste the contents into the forum.

reprex::reprex(input = "fruits_stringdist.R", outfile = "fruits_stringdist.md")

For pointers specific to the community site, check out the reprex FAQ, linked to below.


#3

I am not positive what that plot is doing, so it would be helpful to include the code for the function that creates it (in a reprex, as @mara asked).

I believe the odds do increase when TRA_centered goes up, because the interaction coefficient is smaller in magnitude than the TRA_centered coefficient. On the log-odds scale, a one-unit increase in TRA_centered adds .9997 but subtracts .5858 for each unit of REL_centered, so at REL_centered = 1 the net change is still positive. Does that make sense?
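To make the arithmetic above explicit, here is the conditional effect written out with the coefficients reported in the first post. It assumes that TRA_REL_centered is exactly the product of TRA_centered and REL_centered, and it describes the latent (log-odds) scale of the model, not the probabilities shown in the plot.

```latex
\begin{aligned}
\eta &= 0.9997\,\mathrm{TRA} + 0.6590\,\mathrm{REL} - 0.5858\,\mathrm{TRA}\cdot\mathrm{REL} + \dots \\
\frac{\partial \eta}{\partial\,\mathrm{TRA}} &= 0.9997 - 0.5858\,\mathrm{REL} \\
\frac{\partial \eta}{\partial\,\mathrm{REL}} &= 0.6590 - 0.5858\,\mathrm{TRA}
\end{aligned}
```

Under that assumption, the effect of TRA stays positive only while REL_centered is below about 1.71 (0.9997 / 0.5858), and the effect of REL stays positive only while TRA_centered is below about 1.12 (0.6590 / 0.5858).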


#4

Thank you so much for your reply @Stephen.

Here is the code I used to visualize the effects with the Effect function:

library(effects)  # provides Effect() and its plot method
eff <- Effect(focal.predictors = c("REL_centered", "TRA_centered"), mod = model26)  # other predictors are held at typical values
eff  # print the fitted probabilities
plot(eff, rug = FALSE)

Your explanation is very clear. So can we conclude whether REL and TRA generate a synergistic or an antagonistic effect on the dependent variable? As far as I understood, also from your explanation, the effect of REL on the dependent variable increases with any increase in the value of TRA. Is this statement correct? If it is, then why is the coefficient negative? Or maybe the plot is not conclusive, since REL and TRA interact differently at different levels of the outcome variable.


#5

Thank you so much for your reply. Please see my reply to @Stephen below.

Looking forward to hearing your thoughts.


#6

"So can we conclude whether REL and TRA generate synergistic or antagonistic effect on the dependent variable?"

I would call it "mitigated", instead of "antagonistic".

"the effect of REL on the dependent variable increases by any increase in the value of TRA."

Generally, REL's effect decreases when TRA increases. But the predicted probability of a higher SOICAP still goes up when you increase REL, for any value of TRA. The size of that increase shrinks as TRA rises, but the direction stays positive.

To answer your question, in the top row, at SOICAP=3, the slope of the line--the effect of increasing REL for a given value of TRA--gets steeper as you increase TRA. So the average (the coefficient reported in your model) is not what you observe at SOICAP=3, and it appears to be an example of synergy.

But the coefficient of the interaction term in your model is negative because increasing TRA (by moving left to right) decreases the effect of increasing REL on average. That is, the slope is less negative at SOICAP=[0,1] and it's less positive at SOICAP=2. Generally, when you hold the other independent variables at "typical" values, the effect of increasing REL on the log-odds is diminished by .5858 for each one-unit increase in TRA.
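A minimal sketch of what the reported coefficients imply for P(SOICAP = 3), computed by hand in base R. It assumes the controls sit at zero (their centered means) with SECTOR at its reference level, that the product term is recomputed at every grid point, and that -2, 0 and 2 are plausible centered values; none of these choices comes from the original data, and p_top is a hypothetical helper name.

```r
# Coefficients and the top cutpoint as reported in the first post
b_tra <- 0.9997
b_rel <- 0.6590
b_int <- -0.5858
zeta_2_3 <- 4.2472

# polr parameterisation: logit P(Y <= j) = zeta_j - eta,
# so P(Y = 3) = 1 - plogis(zeta_{2|3} - eta)
p_top <- function(tra, rel) {
  eta <- b_tra * tra + b_rel * rel + b_int * tra * rel
  1 - plogis(zeta_2_3 - eta)
}

vals <- c(-2, 0, 2)  # illustrative centered values (assumption)
p3 <- outer(vals, vals, p_top)  # rows: TRA, columns: REL
dimnames(p3) <- list(TRA = as.character(vals), REL = as.character(vals))
print(round(p3, 3))
```

In this sketch the probability rises with REL in the low-TRA row and falls with REL in the high-TRA row, which is the pattern a negative interaction implies. A plot may look different if the plotting function treats TRA_REL_centered as a separate predictor instead of recomputing it.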

This could be a function of having a model that isn't well specified or just not enough data to show it. I wouldn't necessarily take this model and make any bold predictions about different scenarios. That is often the case with ordered logistic regression, though.


#7

Wonderful explanation @Stephen. I will discuss it with my colleague to decide how we should continue with the data analysis and the paper. I think the simplest solution would be that we drop the interaction effect from the model.


#8

In general, one cannot interpret the coefficient estimates of interaction (multiplicative) terms directly, so it's great that you created plots to visualise the interaction effects. It seems that you created a multiplicative term TRA_REL_centered that is the product of TRA_centered and REL_centered. Another way to do this is to tell R to multiply the two variables: TRA_centered * REL_centered.
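A minimal sketch of the two specifications, using simulated data in place of the original. The sample size, data-generating coefficients and cutpoints are made up for illustration, and the sketch assumes the MASS package for polr().

```r
library(MASS)  # polr()

set.seed(1)
n <- 300
# Simulated stand-ins for the thread's variables (not the original data)
d <- data.frame(
  TRA_centered = sample(0:5, n, replace = TRUE) - 2.5,
  REL_centered = sample(0:5, n, replace = TRUE) - 2.5
)
eta <- 1.0 * d$TRA_centered + 0.7 * d$REL_centered -
  0.6 * d$TRA_centered * d$REL_centered
latent <- eta + rlogis(n)
d$SOICAP <- cut(latent, breaks = c(-Inf, -0.7, 1.8, 4.2, Inf),
                labels = c("0", "1", "2", "3"), ordered_result = TRUE)

# Option A: hand-made product column
d$TRA_REL_centered <- d$TRA_centered * d$REL_centered
fit_a <- polr(SOICAP ~ TRA_centered + REL_centered + TRA_REL_centered,
              data = d, Hess = TRUE)

# Option B: let the formula build the interaction
fit_b <- polr(SOICAP ~ TRA_centered * REL_centered, data = d, Hess = TRUE)

print(coef(fit_a))
print(coef(fit_b))
```

On identical data both fits use the same design matrix, so the estimates agree and only the name of the interaction term differs. The formula version has the practical advantage that other functions can tell the term is built from the two variables.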

While the plots show the predicted probability of falling into one of the four categories as the values of TRA_centered and REL_centered change, they do not show whether these effects are statistically significant. For that, one package I use frequently in my research is interplot (https://cran.r-project.org/web/packages/interplot/vignettes/interplot-vignette.html). interplot plots the marginal effect of one predictor (x) as the value of a moderator variable changes. In your case, it would give you the estimated marginal effect of TRA on SOICAP at each value of REL. The effect is statistically significant wherever the confidence interval does not cross zero. It is good practice to also plot the inverse, that is, the estimated effect of REL on SOICAP as TRA changes.
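For reference, the quantity such a plot displays can be written as follows on the log-odds scale of the ordinal model. This is the standard delta-method expression for a linear interaction; the subscripts T and TR are labels introduced here for the TRA and TRA-by-REL coefficients, and interplot may obtain its intervals differently (for example by simulation).

```latex
\begin{aligned}
\widehat{\mathrm{ME}}_{\mathrm{TRA}}(\mathrm{REL}) &= \hat\beta_{T} + \hat\beta_{TR}\,\mathrm{REL} \\
\widehat{\mathrm{Var}}\big[\widehat{\mathrm{ME}}_{\mathrm{TRA}}(\mathrm{REL})\big] &= \widehat{\mathrm{Var}}(\hat\beta_{T}) + \mathrm{REL}^{2}\,\widehat{\mathrm{Var}}(\hat\beta_{TR}) + 2\,\mathrm{REL}\,\widehat{\mathrm{Cov}}(\hat\beta_{T},\hat\beta_{TR}) \\
\text{95\% CI} &= \widehat{\mathrm{ME}}_{\mathrm{TRA}}(\mathrm{REL}) \pm 1.96\,\sqrt{\widehat{\mathrm{Var}}\big[\widehat{\mathrm{ME}}_{\mathrm{TRA}}(\mathrm{REL})\big]}
\end{aligned}
```

The interval depends on REL through both the squared term and the covariance, which is why an interaction coefficient can be significant overall while the marginal effect is significant only over part of the moderator's range.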

If you're interested in learning more about interactions, here are a couple of references:

Improving Tests of Theories Positing Interaction (http://mattgolder.com/files/research/jop2.pdf)

Understanding Interaction Models: Improving Empirical Analysis (http://mattgolder.com/files/research/pa_final.pdf)

One other thing to think about: there is an assumption here that the effects of the moderator are linear across its range. This may or may not be true, depending on your theory and substantive knowledge. Here's a recent paper discussing this (R package included):

How Much Should We Trust Estimates from Multiplicative Interaction Models (https://www.researchgate.net/publication/315530437_How_Much_Should_We_Trust_Estimates_from_Multiplicative_Interaction_Models_Simple_Tools_to_Improve_Empirical_Practice)


#9

Thanks a bunch for your explanation. Greatly appreciated!

As for your first point, I ran the model with TRA_centered * REL_centered, and observed (almost) the same coefficients. BUT the new plot (attached) is quite different (particularly for SOICAP=3) from that with TRA_REL as a multiplicative term. How could this be possible?

Coefficients (when I used TRA*REL):

SIZE_centered 0.8434
ASSIM_centered 0.5333
PRIOR_centered 7.1007
TRA_centered 0.7901
REL_centered 0.6990
TRA_centered:REL_centered -0.5274

Coefficients (when I used the multiplicative term):

SIZE_centered 0.7723
ASSIM_centered 0.7808
PRIOR_centered 6.9093
TRA_centered 0.7389
REL_centered 0.7383
TRA_centered:REL_centered -0.5212

Anyway, I think the new plot is less ambiguous when it comes to the direction of the interaction effect. For example, in the uppermost row of the plot, the probability of SOICAP increases with REL up to a point, after which it decreases as a result of higher TRA. So we can conclude that TRA and REL interact negatively, both from the coefficient and from the effect plot.

As you suggested, I used interplot to illustrate the marginal effects of REL and TRA. Berry et al.'s paper was quite useful. As far as I understand from the marginal effect plots, the marginal effect of REL on SOICAP is positive and significant at the minimum value of TRA, and negative but not significant at its maximum. This is also true if we look at the inverse plot with REL as the moderator. So the conclusions would be:

1) The interaction between REL and TRA is negative, since the slope of the lines in the marginal effect plots is negative, but it is not significant, since the line crosses zero.
2) The marginal effect of REL on SOICAP is positive and significant for low to medium values of TRA, and vice versa.
3) Even if the marginal effects turn non-significant for high values of each predictor variable, we can still accept the support for a negative interaction, since the distribution of the moderator variables in the marginal effect plots shows that the percentage of observations falling into the insignificant area is low.

Are these statements correct?

New plot (TRA*REL instead of the multiplicative term)

Marginal effect plot when TRA is the moderator

Marginal effect plot when REL is the moderator


#10

I'm glad you found my post helpful, and tried out interplot! I'm not sure why the results are different when you use REL_centered * TRA_centered as opposed to creating and using a multiplicative term TRA_REL_centered. The coefficient estimates are similar though, and I suspect their standard errors are as well.
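One possible explanation, offered as an assumption and not as something confirmed in the thread: Effect() varies only the focal predictors and holds every other predictor at a typical value, so a hand-made TRA_REL_centered column would stay fixed while TRA_centered and REL_centered move. The minimal sketch below uses the coefficients from the first post to compare the slope of the linear predictor in REL when the product is recomputed with the slope when it is frozen. The function names and the TRA values are hypothetical.

```r
# Coefficients as reported in the first post
b_rel <- 0.6590
b_int <- -0.5858

# (a) Product recomputed from TRA and REL, as a formula interaction would be:
#     the REL slope depends on TRA.
slope_recomputed <- function(tra) b_rel + b_int * tra

# (b) Product column frozen at one value: it only shifts the intercept,
#     so the REL slope is the same at every TRA.
slope_frozen <- function(tra) rep(b_rel, length(tra))

tra <- c(-2, 0, 2)  # illustrative centered values (assumption)
slopes <- data.frame(
  TRA_centered = tra,
  recomputed = slope_recomputed(tra),
  frozen = slope_frozen(tra)
)
print(slopes)
```

If this is what happened, the first plot would carry no information about the interaction, while the plot from the TRA_centered * REL_centered model would, which could account for the two looking so different despite similar coefficients.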

The first plot shows that the marginal effect of REL on SOICAP is positive, decreasing and statistically significant at TRA values below approximately 0.5. At higher values of TRA, however, this effect is not significant. The second plot shows a similar effect of TRA on SOICAP: decreasing and significant effect of TRA at low levels of REL, but not at higher levels. That both marginal effects plots show similar patterns is good (there are times when they are different).

To answer question (3), yes, you can still say that there is a negative interaction effect of TRA/REL on SOICAP as REL/TRA changes, but that it is significant only at lower values of REL/TRA.

In a sense, the marginal effects plots are consistent with your predicted probability plots. See how wide the CIs are at higher values of REL_centered? The histograms that interplot creates show that there are relatively few observations at high values of REL_centered (and TRA_centered).

One final thing you might try: are REL_centered and TRA_centered categorical (factor) variables? If so, you could tell interplot to create a dot-and-whisker plot instead.


#11

No, they are not factors, but they have fewer than 10 distinct values, so the plots were originally dot-and-whisker. But when I included the histogram they turned into line-and-ribbon plots.

One final question, probably a silly one! What do the dots in the marginal effect plots represent? Are they the mean of the effects of REL on SOICAP at each level of REL? And then what should I consider when assessing the significance, the lines or the CI curves (whether they cross zero)?


#12

If they are not factors, then I think the plots you currently have are fine.

The dot in the dot-and-whisker plot indicates the estimated marginal effect of the predictor at that specific value of the moderator. For instance, if your predictor is TRA_centered, then the dot at REL_centered = -1 is the marginal effect of TRA_centered on SOICAP at REL_centered = -1. For the dot-and-whisker plots, the estimate is statistically significant where the whiskers (lines) do not cross zero.
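As a rough illustration of what each dot and whisker represents, the sketch below computes the marginal effect of TRA_centered on the log-odds at each observed REL_centered value, with a delta-method 95% interval. It uses simulated data and MASS::polr(); the data-generating values are made up, this is not interplot's own code, and interplot's intervals may differ somewhat.

```r
library(MASS)  # polr()

set.seed(42)
n <- 400
# Simulated stand-ins for the thread's variables (not the original data)
d <- data.frame(
  TRA_centered = sample(0:5, n, replace = TRUE) - 2.5,
  REL_centered = sample(0:5, n, replace = TRUE) - 2.5
)
latent <- 1.0 * d$TRA_centered + 0.7 * d$REL_centered -
  0.6 * d$TRA_centered * d$REL_centered + rlogis(n)
d$SOICAP <- cut(latent, breaks = c(-Inf, -0.7, 1.8, 4.2, Inf),
                labels = c("0", "1", "2", "3"), ordered_result = TRUE)

fit <- polr(SOICAP ~ TRA_centered * REL_centered, data = d, Hess = TRUE)

b <- coef(fit)
V <- vcov(fit)  # ordered as c(coefficients, intercepts)
i <- match("TRA_centered", names(b))               # main effect
k <- match("TRA_centered:REL_centered", names(b))  # interaction

rel <- sort(unique(d$REL_centered))  # one dot per moderator value
me <- b[[i]] + b[[k]] * rel          # the dot: effect of TRA at that REL
se <- sqrt(V[i, i] + rel^2 * V[k, k] + 2 * rel * V[i, k])

out <- data.frame(
  REL_centered = rel,
  effect = me,
  lower = me - 1.96 * se,  # the whisker ends
  upper = me + 1.96 * se
)
out$significant <- out$lower > 0 | out$upper < 0
print(out)
```

Each row corresponds to one dot: the effect column is the dot itself, lower and upper are the whisker ends, and the significant column is TRUE where the whisker does not cross zero.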

The dot-and-whisker plot is the discrete equivalent of the marginal effects plot you created (for continuous moderators). Hope this helps!


#13

Thanks again for your guidance!