Residuals from abline in ggplot

Hello,

I am working with scatterplots, for example something like this:

#generate 100 random x and y values between 0 and 10
set.seed(10)
x = runif(100,0,10)
y = runif(100,0,10)
m1 = lm(y ~ x)

#plot
library(ggplot2)
plot <- ggplot(data.frame(x, y), aes(x = x, y = y)) +
  geom_point(size = 3, colour = "blue") +
  geom_abline(intercept = 0, slope = 1) #add reference line with slope of 1

print(plot)

Now I want to get two things out of this:

  1. The number of points above the line and the number of points below it, and
  2. the vertical (y-axis) distance of each point from the line, summarized as one value for the points above the line and one value for the points below it (so a sort of summarized residual).

Does anyone know an easy way to achieve this?

Did I understand you correctly?

y_hat <- x * 1 + 0  #slope * x + intercept; change these values if the slope != 1 or intercept != 0
Resid <- y - y_hat

PosResid <- Resid[Resid >= 0]
NegResid <- Resid[Resid < 0]

#Number of points in each population
length(PosResid)
length(NegResid)

#Sums in each population
sum(PosResid)
sum(NegResid)
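
If you need this for several different lines, the same idea could be wrapped in a small function. This is only a sketch: `summarise_about_line` and its arguments are names I made up for illustration, not something from ggplot2. Unlike the snippet above, it reports points lying exactly on the line separately instead of counting them as positive.

```r
# Hypothetical helper (an assumption, not part of any package): summarise
# points relative to a reference line y = intercept + slope * x.
summarise_about_line <- function(x, y, slope = 1, intercept = 0) {
  resid <- y - (intercept + slope * x)  # signed vertical distance to the line
  above <- resid > 0
  below <- resid < 0
  data.frame(
    side  = c("above", "below", "on line"),
    n     = c(sum(above), sum(below), sum(resid == 0)),
    total = c(sum(resid[above]), sum(resid[below]), 0)
  )
}

# Same data as in the question
set.seed(10)
x <- runif(100, 0, 10)
y <- runif(100, 0, 10)

res <- summarise_about_line(x, y, slope = 1, intercept = 0)
print(res)
```

The `total` column holds the summed signed distances, so the value for the points below the line is negative; wrap it in `abs()` if you want a plain distance.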

Yes, you did. It's exactly what I was looking for, thank you!
