Error: Aesthetics must be either length 1 or the same as the data (1020)

Hey, I'm fairly new to R and came across an error I cannot seem to solve. My data has a fair amount of missing values, so when constructing a new column of the residuals using:

linear <- lm(LN_ratio_CD3_Blast ~ ID, data = CMI_LNratioCD3)

CMI_LNratioCD3$residual <- residuals(linear)

I get the following error: Error in `$<-.data.frame`(`*tmp*`, residual, value = c(`317` = -1.27035435466668, : replacement has 244 rows, data has 1020

I've tried adding na.exclude, na.omit, na.pass, na.fill etc. but I seem to be doing something wrong...

To work around this problem, I tried skipping the residual column and passing the residuals directly to the Q-Q plot:

residual <- residuals(linear)

g5 <- ggplot(data = CMI_LNratioCD3,
             aes(sample = residual)) +
  geom_qq() +
  geom_qq_line(colour = "red") +
  labs(title = "Quantile plot of residuals")

However, this results in the following error: Error: Aesthetics must be either length 1 or the same as the data (1020): sample

Can someone please tell me what might be going on? And how I can solve this?

Kind regards,

Marjory

Could you please turn this into a self-contained reprex (short for reproducible example)? It will help us help you if we can be sure we're all working with/looking at the same stuff.

install.packages("reprex")

If you've never heard of a reprex before, you might want to start by reading the tidyverse.org help page. The reprex dos and don'ts are also useful.

There's also a nice FAQ on how to do a minimal reprex for beginners, below:

What to do if you run into clipboard problems

If you run into problems with access to your clipboard, you can specify an outfile for the reprex, and then copy and paste the contents into the forum.

reprex::reprex(input = "fruits_stringdist.R", outfile = "fruits_stringdist.md")

For pointers specific to the community site, check out the reprex FAQ.

After running reprex, I got a more detailed error message and figured it out myself. In short:
I should have told lm() how to handle the missing data (NA) by passing na.action = na.exclude. With that setting, residuals() returns one value per row of the original data (NA for the rows that were dropped) instead of only the 244 complete rows.

linear <- lm(LN_ratio_CD3_Blast ~ ID, data = CMI_LNratioCD3, na.action = na.exclude)
CMI_LNratioCD3$residual <- residuals(linear)
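
For anyone landing here with the same error, here is a minimal sketch of the difference on made-up data. The data frame `toy` and its columns `id` and `y` are invented for illustration and only stand in for CMI_LNratioCD3; the point is just the length of what residuals() returns under the default na.action versus na.exclude.

```r
# Toy data standing in for CMI_LNratioCD3 (names and values are made up).
toy <- data.frame(
  id = factor(rep(c("a", "b", "c"), each = 4)),
  y  = c(1.2, NA, 0.8, 1.1, NA, 2.3, 2.0, NA, 3.1, 2.9, 3.3, NA)
)

# Default na.action (na.omit): rows with NA are dropped before fitting,
# so residuals() only covers the complete rows.
fit_omit <- lm(y ~ id, data = toy)
n_omit <- length(residuals(fit_omit))   # fewer values than nrow(toy)

# na.exclude: same fit, but residuals() is padded with NA
# so it has one entry per row of the original data.
fit_excl <- lm(y ~ id, data = toy, na.action = na.exclude)
n_excl <- length(residuals(fit_excl))   # equals nrow(toy)

# The assignment that failed before now works, with NA in the dropped rows.
toy$residual <- residuals(fit_excl)
print(c(rows = nrow(toy), omit = n_omit, exclude = n_excl))
```

Once the padded residuals are a column of the data frame, aes(sample = residual) has a column of the right length to work with. As far as I know, geom_qq() then drops the NA rows with a warning rather than an error.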
