I am trying to lay a linear regression line over this scatterplot using geom_smooth and I am getting a wonky output in return. Any help will be appreciated.
library(UsingR)
install.packages("reshape"); library(reshape)
data(galton)
library(reshape)
long <- melt(galton)
library(ggplot2)
library(dplyr)
y <- galton$child
x <- galton$parent
freqData <- as.data.frame(table(galton$child, galton$parent))
names(freqData) <- c('child', 'parent', 'freq')
freqData$child <- as.numeric(as.character(freqData$child))
freqData$parent <-as.numeric(as.character(freqData$parent))
g <- ggplot(filter(freqData, freq > 0), aes(x = parent, y = child)) +
  scale_size(range = c(2, 20), guide = 'none') +
  geom_point(color = 'grey50', aes(size = freq + 2, show_guide = FALSE)) +
  geom_point(aes(colour = freq, size = freq)) +
  scale_color_gradient(low = 'lightblue', high = 'pink') +
  geom_smooth(method = 'loess', formula = y ~ x)
g
Why are you running
long <- melt(galton)
or
install.packages("reshape")
and what are
y <- galton$child
x <- galton$parent
doing?
That said, those results are wonky.
I think a major problem is that you are using reshape. It was replaced by reshape2 several years ago. If we use reshape2 we get something that still looks strange but is, I think, a lot closer to what you want.
Try
library(UsingR)
library(reshape2)
data(galton)
library(ggplot2)
library(dplyr)
freqData <- as.data.frame(table(galton$child, galton$parent))
names(freqData) <- c('child', 'parent', 'freq')
freqData$child <- as.numeric(as.character(freqData$child))
freqData$parent <-as.numeric(as.character(freqData$parent))
g <- ggplot(filter(freqData, freq > 0), aes(x = parent, y = child)) +
  scale_size(range = c(2, 20), guide = 'none') +
  # show.legend is a layer argument, not an aesthetic, so it goes outside aes()
  geom_point(aes(size = freq + 2), color = 'grey50', show.legend = FALSE) +
  geom_point(aes(colour = freq, size = freq)) +
  scale_color_gradient(low = 'lightblue', high = 'pink') +
  geom_smooth(method = 'loess', formula = y ~ x)
g
I have the feeling there are better ways to do whatever it is you want, but I don't see one immediately.
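For what it's worth, here is a minimal sketch of one possible approach, assuming the goal really is a straight least-squares line rather than a loess curve. It swaps method = 'loess' for method = 'lm' and adds a weight = freq aesthetic so that each plotted point counts as many times as it was observed. Building freqData with dplyr::count() instead of table() is my own substitution, not something from the code above, and neither reshape nor reshape2 is needed here.

```r
library(UsingR)   # provides the galton data set
library(ggplot2)
library(dplyr)

data(galton)

# One row per distinct (parent, child) pair, with the number of observations.
# count() only returns combinations that occur, so no freq > 0 filter is needed.
freqData <- dplyr::count(galton, parent, child, name = "freq")

g <- ggplot(freqData, aes(x = parent, y = child)) +
  scale_size(range = c(2, 20), guide = "none") +
  geom_point(aes(size = freq + 2), colour = "grey50", show.legend = FALSE) +
  geom_point(aes(colour = freq, size = freq)) +
  scale_colour_gradient(low = "lightblue", high = "pink") +
  # method = "lm" fits a straight line; weight = freq weights each
  # (parent, child) pair by how often it appears in galton
  geom_smooth(aes(weight = freq), method = "lm", formula = y ~ x)
g
```

With the frequency weights, the fitted line has the same intercept and slope as lm(child ~ parent, data = galton) on the original rows. Without them, every distinct pair would count equally no matter how many families it represents.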