# Need to compute Graph of Averages

Can't get out of my own way here.

I'm trying to describe what regression is by using (what I always called) the Graph of Averages: find the average y-value for each unique x-value, and connect those points.

No tidyverse allowed, unfortunately.

This seems like a task for `aggregate()`. But let me get a reprex going.

Here's the data table you need:

``````
# Import Reaction data set
theURL <- "http://lib.stat.cmu.edu/datasets/Andrews/T30.1"
theNames <- c("Table", "Number", "Row", "Experiment", "Temperature", "Concentration", "Time", "Unchanged", "Converted", "Unwanted")
# Read the table (assumes the file has no header row)
Reaction <- read.table(theURL, header = FALSE, col.names = theNames)
View(Reaction)

# Remove the first four useless columns
Reaction <- Reaction[-c(1:4)]
``````

I'm looking at the bivariate relationship Temperature v Converted. I can get unique values of Temperature...

``````
> unique(Reaction$Temperature)
[1] 162 172 167 177 157 160
``````

...which I think should be my `by` variable. So I aggregate thusly:
`aggregate(Reaction$Converted, by=unique(Reaction$Temperature), mean)`

...but am told that the by-variable must be a list. No problem!

``````
> aggregate(Reaction$Converted, as.list(unique(Reaction$Concentration)), mean)
Error in aggregate.data.frame(as.data.frame(x), ...) :
  arguments must have same length
``````

I haven't the foggiest what that's trying to tell me, but the list looks quite complicated:

``````
> as.list(unique(Reaction$Concentration))
[[1]]
[1] 23

[[2]]
[1] 30

[[3]]
[1] 25

[[4]]
[1] 27.5

[[5]]
[1] 32.5

[[6]]
[1] 22.5

[[7]]
[1] 20

[[8]]
[1] 34
``````

So I try converting to a factor instead, which does no better.

``````
> aggregate(Reaction$Converted, as.factor(unique(Reaction$Concentration)), mean)
Error in aggregate.data.frame(as.data.frame(x), ...) :
  arguments must have same length
``````

I'm not sure why I thought a factor would be a better idea:

``````
> as.factor(unique(Reaction$Concentration))
[1] 23   30   25   27.5 32.5 22.5 20   34
Levels: 20 22.5 23 25 27.5 30 32.5 34
``````

So now I'm losing the will to live. Sure could use some help.

I'm not sure if I understand you correctly but I think this is what you want

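``````
theURL <- "http://lib.stat.cmu.edu/datasets/Andrews/T30.1"
theNames <- c("Table", "Number", "Row", "Experiment", "Temperature", "Concentration", "Time", "Unchanged", "Converted", "Unwanted")
# Read the table (assumes the file has no header row)
Reaction <- read.table(theURL, header = FALSE, col.names = theNames)
Reaction <- Reaction[-c(1:4)]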
aggregate(Converted ~ Temperature, data = Reaction, mean)
#>   Temperature Converted
#> 1         157    46.900
#> 2         160    60.300
#> 3         162    53.875
#> 4         167    55.680
#> 5         172    57.400
#> 6         177    59.800
``````

Created on 2019-03-07 by the reprex package (v0.2.1)
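
In case it helps with the Graph of Averages itself, here is a minimal base-R sketch of drawing it once the group means exist. It uses a small made-up data frame called `toy` (a stand-in I invented so the example runs without the download, not the real Reaction values); the column names simply mirror the ones above.

```r
# Toy data standing in for Reaction (values are invented for illustration)
toy <- data.frame(
  Temperature = c(157, 162, 162, 167, 167, 167, 172, 172, 177),
  Converted   = c(47, 52, 56, 54, 55, 59, 56, 59, 60)
)

# Mean of y for each unique x (rows come back sorted by Temperature)
avgs <- aggregate(Converted ~ Temperature, data = toy, FUN = mean)

# Scatterplot of the raw points
plot(Converted ~ Temperature, data = toy, col = "grey40")

# Graph of Averages: the group means joined by line segments
lines(avgs$Temperature, avgs$Converted, type = "b", pch = 19, col = "red")

# Least-squares line for comparison with the averages
abline(lm(Converted ~ Temperature, data = toy), lty = 2)
```

Swapping `toy` for `Reaction` should give the same picture for the real data, assuming the import step above worked.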

That's exactly what I want. Thanks!

Where did I go wrong in explaining what I wanted? I never seem to hit the Goldilocks zone with stating what I want and what I've tried.

To demystify the use of `by` in `aggregate()`: it takes a list of grouping vectors, each with one value per row of the data.

``````
# Import Reaction data set
theURL <- "http://lib.stat.cmu.edu/datasets/Andrews/T30.1"
theNames <- c("Table", "Number", "Row", "Experiment", "Temperature", "Concentration", "Time", "Unchanged", "Converted", "Unwanted")
# Read the table (assumes the file has no header row)
Reaction <- read.table(theURL, header = FALSE, col.names = theNames)
View(Reaction)

# Remove the first four useless columns
Reaction <- Reaction[-c(1:4)]

aggregate(Reaction, by = list(Reaction$Temperature), FUN = mean)
#>   Group.1 Temperature Concentration Time Unchanged Converted Unwanted
#> 1     157         157          27.5  6.5    37.600    46.900 14.70000
#> 2     160         160          34.0  7.5    17.750    60.300 20.70000
#> 3     162         162          26.5  6.0    31.175    53.875 12.77500
#> 4     167         167          27.5  6.5    19.700    55.680 22.38000
#> 5     172         172          27.5  6.5    12.850    57.400 25.32500
#> 6     177         177          22.5  6.5    11.900    59.800 24.83333
``````
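
If the `Group.1` column and the extra averaged columns get in the way, one option is to name the element of the `by` list and pass only the column you care about. The sketch below uses an invented data frame `toy` (not the real Reaction data) so it runs on its own; it should give the same table as the formula interface.

```r
# Toy data standing in for Reaction (values are invented for illustration)
toy <- data.frame(
  Temperature = c(157, 162, 162, 167, 167),
  Converted   = c(47, 52, 56, 54, 58)
)

# Formula interface: group Converted by Temperature
by_form <- aggregate(Converted ~ Temperature, data = toy, FUN = mean)

# by interface: a *named* list gives the grouping column a readable name,
# and toy["Converted"] keeps only the column being averaged
by_list <- aggregate(toy["Converted"],
                     by = list(Temperature = toy$Temperature),
                     FUN = mean)

print(by_form)
print(by_list)
```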

Thank you, kindly.

I guess I would have hit it myself had I tried `list()` instead of `as.list()`.

No, I'm oversimplifying this. How did you know that `list()` would work for the by-variable? All the help I read seemed to say you needed a list of unique values, which led me to `unique()`, then to `as.list()`, etc. Used alone, `list()` just gives all the values of a column, including repeats, in a list. I'm not sure why you don't need `unique()` around that.
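
As far as I can tell, the short version is that `by` wants one group label per row of the data, so `aggregate()` can tell which group each row belongs to; it works out the unique groups itself. A small sketch with an invented data frame `toy` (not the real Reaction data) shows the length mismatch behind the error message:

```r
# Toy data standing in for Reaction (values are invented for illustration)
toy <- data.frame(
  Temperature = c(157, 162, 162, 167, 167),
  Converted   = c(47, 52, 56, 54, 58)
)

# One label per row: same length as the data, repeats included
length(toy$Temperature)            # 5, matches nrow(toy)
# unique() drops the repeats, so the labels no longer line up with the rows
length(unique(toy$Temperature))    # 3

# list() wraps the whole vector as ONE grouping variable of length 5
length(list(toy$Temperature))      # 1
# as.list() splits the vector into one list element per value,
# i.e. 5 separate "grouping variables" of length 1 each
length(as.list(toy$Temperature))   # 5

# Works: the single grouping vector has one entry per row
ok <- aggregate(toy$Converted, by = list(toy$Temperature), FUN = mean)
print(ok)

# Fails: each list element has length 1, but the data has 5 rows
res <- try(
  aggregate(toy$Converted, by = as.list(unique(toy$Temperature)), FUN = mean),
  silent = TRUE
)
inherits(res, "try-error")         # TRUE
```

So "arguments must have same length" is comparing the length of each element of `by` against the number of rows being aggregated.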

You did nothing wrong per se. I'm not a native English speaker and sometimes I don't trust my mental translations, but maybe you want to keep your vocabulary as simple as possible to maximize your chances of getting help.