Hello,
I am working with scatterplots, for example something like this:
library(ggplot2)
#generate 100 random x and y values between 0 and 10
set.seed(10)
x <- runif(100, 0, 10)
y <- runif(100, 0, 10)
df <- data.frame(x = x, y = y)
m1 <- lm(y ~ x, data = df)
#plot
p <- ggplot(df, aes(x = x, y = y)) +
  geom_point(size = 3, colour = "blue") +
  geom_abline(intercept = 0, slope = 1) #add line with slope of 1
print(p)
Now I want to get two things out of this:
- the number of points above and below the line, respectively, and
- the distance of each point from the line along the y-axis, summarized as one value (a sort of summarized residual), separately for the points above and the points below the line.
Does anyone know an easy way to achieve this?
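To make the two quantities concrete, here is a minimal base R sketch of what I mean. It assumes the reference line is y = 0 + 1 * x, as in the geom_abline call, and that "summarized" means a plain sum of the vertical distances. The names d, n_above, n_below, dist_above and dist_below are only illustrative.

```r
# Same data as in the example above
set.seed(10)
x <- runif(100, 0, 10)
y <- runif(100, 0, 10)

# Reference line (assumed to match geom_abline(intercept = 0, slope = 1))
intercept <- 0
slope <- 1

# Signed vertical distance of each point from the line:
# positive = above the line, negative = below the line
d <- y - (intercept + slope * x)

# Number of points above and below the line
n_above <- sum(d > 0)
n_below <- sum(d < 0)

# Vertical distances summed per side (assumption: "summarized" = sum)
dist_above <- sum(d[d > 0])
dist_below <- sum(abs(d[d < 0]))

print(c(n_above = n_above, n_below = n_below))
print(c(dist_above = dist_above, dist_below = dist_below))
```

Swapping sum() for mean() would give an average distance per side instead, and points lying exactly on the line (d == 0) are counted on neither side here.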