# Finding Full- and Reduced-Model Residuals

I'm trying to illustrate how a simple ANOVA works.

I have mercury levels (`Mercury`) for three types of lakes (`Lake_Type`): Eutropic, Mesotropic, and Oligotropic. The ANOVA compares residuals from a full model and from a reduced model. The reduced model assumes the mean mercury level is the same for all lakes; the full model assumes a different mean for each lake type. I need to find the residuals for each model.
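In symbols (the notation here is my own, not from any particular textbook), let $y_{ij}$ be the $j$-th mercury measurement in lake type $i$, with $k = 3$ lake types and $n$ observations in total. A sketch of the two sets of residuals and the usual F statistic built from them:

$$
e^{\text{red}}_{ij} = y_{ij} - \bar{y}_{\cdot\cdot}, \qquad
e^{\text{full}}_{ij} = y_{ij} - \bar{y}_{i\cdot}
$$

$$
F = \frac{(\mathrm{SSE}_{\text{red}} - \mathrm{SSE}_{\text{full}})/(k-1)}{\mathrm{SSE}_{\text{full}}/(n-k)},
\qquad \mathrm{SSE} = \sum_{i,j} e_{ij}^2
$$

Here $\bar{y}_{\cdot\cdot}$ is the overall mean and $\bar{y}_{i\cdot}$ is the mean for lake type $i$.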

A column containing reduced model residuals is not hard to calculate:

```r
YY <- mean(dt$Mercury)
dt <- cbind(dt, "Reduced" = dt$Mercury - YY)
```

I had a little trouble figuring out how to calculate the mean for each lake type. I can do

```r
YG <- with(dt, tapply(Mercury, Lake_Type, mean))
```

to get a named vector (strictly, a one-dimensional array with names):

```
   Eutropic  Mesotropic Oligotropic
  0.5527551   0.4523488   0.3826316
```

which lets me access the means as `YG["Eutropic"]` if I want.

Alternatively, I can `aggregate()` over the list of names

```r
YG <- aggregate(dt$Mercury, list(LT = dt$Lake_Type), mean)
```

which gives me a data frame

```
           LT         x
1    Eutropic 0.5527551
2  Mesotropic 0.4523488
3 Oligotropic 0.3826316
```

You may have already guessed the punch line here. I want to compute the residuals for the full model; that is, for each row in the data table, find the difference between `dt$Mercury` and one of the three means, chosen according to the column `dt$Lake_Type`. Said another way, for the rows in `dt` where `Lake_Type == "Eutropic"`, I want to subtract the Eutropic mean found using the `with()` or `aggregate()` methods above.

I'm just blocked and can't seem to figure out how to construct this column. My only restriction is that I can't use pipes, but this shouldn't be driving me this crazy. Anyone want to suggest something?

Wait: Can I use

```r
dt$Mercury - YG[dt$Lake_Type]
```

where `YG` is the `with()` version? The result carries the lake type as a name on each element, along with the difference from the appropriate mean, so it may be what I'm looking for.

```
  Mesotropic     Eutropic     Eutropic     Eutropic     Eutropic     Eutropic
 0.627651163 -0.527755102  0.017244898  0.217244898  0.237244898  0.197244898
  Mesotropic     Eutropic     Eutropic     Eutropic  Oligotropic     Eutropic
-0.182348837 -0.372755102  0.497244898 -0.242755102  0.427368421  0.027244898
```
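To convince myself the lookup really produces full-model residuals, here is a minimal sketch on made-up data. The `toy` data frame and its values are hypothetical stand-ins for the real `dt`, and the column names `Full` and `Full_ave` are my own. It indexes the group means by name, repeats the calculation with base R's `ave()`, and compares both against the residuals of the matching `lm()` fit.

```r
# Hypothetical toy data standing in for the real `dt` (values are made up)
toy <- data.frame(
  Lake_Type = c("Eutropic", "Eutropic", "Mesotropic",
                "Mesotropic", "Oligotropic", "Oligotropic"),
  Mercury   = c(0.50, 0.70, 0.40, 0.60, 0.30, 0.50)
)

# Group means as a named array, as in the tapply() call above
YG <- with(toy, tapply(Mercury, Lake_Type, mean))

# Look up each row's group mean by name. as.character() makes the lookup
# go by name even if Lake_Type is a factor; as.vector() drops the names
# and the array dimension so the new column is a plain numeric vector.
toy$Full <- toy$Mercury - as.vector(YG[as.character(toy$Lake_Type)])

# ave() returns each row's group mean directly, so no lookup is needed
toy$Full_ave <- toy$Mercury - ave(toy$Mercury, toy$Lake_Type)

# Cross-check against the residuals of the corresponding linear model
fit <- lm(Mercury ~ Lake_Type, data = toy)
check_lm  <- all.equal(toy$Full, unname(residuals(fit)))
check_ave <- all.equal(toy$Full, toy$Full_ave)
```

Both checks should come back `TRUE` if the lookup is doing what I think. The `ave()` version has the small advantage of not needing `YG` at all.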

Edit: That's indeed what I wanted. Sorry for taking time on the Community. Can I delete this?

I would merge the `YG` you get from `aggregate()` with `dt` like this:

```r
dt2 <- merge(dt, YG, by.x = "Lake_Type", by.y = "LT")
```

Or use `inner_join()` from dplyr:

```r
library(dplyr)

dt2 <- inner_join(dt, YG, by = c("Lake_Type" = "LT"))
```

I hope I didn't make any mistakes working without example data.
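To finish the calculation from the merged table, something along these lines should work. This is only a sketch: the `toy` data frame and its values are invented since I don't have the real `dt`, and the column name `Full` is my own choice. It assumes `YG` comes from the `aggregate()` call in the question, so the group mean ends up in a column named `x`.

```r
# Hypothetical toy data standing in for the real `dt` (values are made up)
toy <- data.frame(
  Lake_Type = c("Mesotropic", "Eutropic", "Oligotropic",
                "Eutropic", "Mesotropic", "Oligotropic"),
  Mercury   = c(0.40, 0.50, 0.30, 0.70, 0.60, 0.50)
)

# Group means as a data frame with columns LT and x, as in the question
YG <- aggregate(toy$Mercury, list(LT = toy$Lake_Type), mean)

# Attach each row's group mean. merge() sorts rows by the key by default,
# so dt2 is generally not in the original row order.
dt2 <- merge(toy, YG, by.x = "Lake_Type", by.y = "LT")

# The full-model residual is the observation minus its group mean (column x)
dt2$Full <- dt2$Mercury - dt2$x
```

Because `merge()` reorders the rows, keep the residuals in `dt2` rather than copying the `Full` column back into the original data frame by position.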
