I am working with scatterplots, for example something like this:
    library(ggplot2)

    # generate 100 random x and y values between 0 and 10
    set.seed(10)
    x <- runif(100, 0, 10)
    y <- runif(100, 0, 10)
    df <- data.frame(x = x, y = y)

    # scatterplot with a reference line of slope 1 through the origin
    p <- ggplot(df, aes(x = x, y = y)) +
      geom_point(size = 3, colour = "blue") +
      geom_abline(intercept = 0, slope = 1)
    print(p)
Now I want to get two things out of this:
- the number of points above and below the line, respectively, and
- the vertical (y-axis) distance of each point from the line, summarized as one value (a sort of summed residual), separately for the points above and the points below the line.
Does anyone know an easy way to achieve this?
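To make the two requested quantities concrete, here is a minimal base R sketch of one way they might be computed. It assumes "distance on the y-axis" means the signed vertical offset `y - (intercept + slope * x)` and that "summarized as one value" means a sum (a mean is shown alongside). The names `intercept`, `slope`, `d`, and `result` are introduced only for this illustration.

```r
# Same simulated data as in the question
set.seed(10)
x <- runif(100, 0, 10)
y <- runif(100, 0, 10)

# Reference line y = intercept + slope * x (here the 1:1 line from geom_abline)
intercept <- 0
slope <- 1

# Signed vertical distance: positive above the line, negative below
d <- y - (intercept + slope * x)

# Number of points on each side (points exactly on the line count for neither)
n_above <- sum(d > 0)
n_below <- sum(d < 0)

# Total vertical distance on each side, reported as positive values
dist_above <- sum(d[d > 0])
dist_below <- sum(abs(d[d < 0]))

# Assumption: sum and mean are the summaries of interest
result <- data.frame(
  side = c("above", "below"),
  n = c(n_above, n_below),
  total_distance = c(dist_above, dist_below),
  mean_distance = c(dist_above / n_above, dist_below / n_below)
)
print(result)
```

For a different reference line, only `intercept` and `slope` need to change. If some other summary is wanted (for example a sum of squares), it can be computed from `d` in the same way.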