Can't get out of my own way here.
I'm trying to describe what regression is by using (what I always called) the Graph of Averages: find the average of the y-values for each unique x-value, and connect those points.
No tidyverse allowed, unfortunately.
This seems like a task for aggregate(). But let me get a reprex going.
Here's the data table you need:
# Import Reaction data set
theURL <- "http://lib.stat.cmu.edu/datasets/Andrews/T30.1"
theNames <- c("Table", "Number", "Row", "Experiment", "Temperature", "Concentration", "Time", "Unchanged", "Converted", "Unwanted")
Reaction <- read.table(theURL, header = FALSE, col.names = theNames)
View(Reaction)
# Remove the first four useless columns
Reaction <- Reaction[-c(1:4)]
I'm looking at the bivariate relationship Temperature vs. Converted. I can get the unique values of Temperature...
> unique(Reaction$Temperature)
[1] 162 172 167 177 157 160
...which I think should be my by variable. So I aggregate thusly:
aggregate(Reaction$Converted, by=unique(Reaction$Temperature), mean)
...but I'm told that the by variable must be a list. No problem!
> aggregate(Reaction$Converted, as.list(unique(Reaction$Concentration)), mean)
Error in aggregate.data.frame(as.data.frame(x), ...) :
arguments must have same length
I haven't the foggiest what that's trying to tell me, but the list looks quite complicated:
> as.list(unique(Reaction$Concentration))
[[1]]
[1] 23
[[2]]
[1] 30
[[3]]
[1] 25
[[4]]
[1] 27.5
[[5]]
[1] 32.5
[[6]]
[1] 22.5
[[7]]
[1] 20
[[8]]
[1] 34
So I try converting to a factor instead, which does no better.
> aggregate(Reaction$Converted, as.factor(unique(Reaction$Concentration)), mean)
Error in aggregate.data.frame(as.data.frame(x), ...) :
arguments must have same length
I'm not sure why I thought a factor would be a better idea:
> as.factor(unique(Reaction$Concentration))
[1] 23 30 25 27.5 32.5 22.5 20 34
Levels: 20 22.5 23 25 27.5 30 32.5 34
So now I'm losing the will to live. Sure could use some help.
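For what it's worth, here is a minimal sketch of how the Graph of Averages could be computed in base R. It rests on the documented requirement that `by` be a list of grouping vectors, each as long as the data being aggregated: one group label per row, not one per distinct value. That is why passing unique() values fails with "arguments must have same length". The data frame `toy` and the result names `avgs` and `avgs2` are made up for illustration; the values are not the Reaction data.

```r
# Toy data: hypothetical values, NOT the Reaction data set
toy <- data.frame(
  Temperature = c(162, 162, 172, 172, 167, 167),
  Converted   = c(50, 54, 60, 62, 55, 59)
)

# `by` is a list holding the full-length grouping column,
# so every observation carries its own group label.
avgs <- aggregate(toy$Converted,
                  by = list(Temperature = toy$Temperature),
                  FUN = mean)
names(avgs)[2] <- "MeanConverted"

# The formula interface does the same grouping in one step.
avgs2 <- aggregate(Converted ~ Temperature, data = toy, FUN = mean)

# Graph of Averages: raw points, then the group means joined by a line.
plot(toy$Temperature, toy$Converted,
     xlab = "Temperature", ylab = "Converted")
lines(avgs$Temperature, avgs$MeanConverted, type = "b", pch = 19)
```

With the real data, the same pattern would presumably be aggregate(Converted ~ Temperature, data = Reaction, FUN = mean), assuming the column names set in the import step above.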