Can't use the lm function / na.omit doesn't remove the NAs

Hey,

I just started working with RStudio and I am currently trying my own analysis for the first time.
For that, I use the dataset of the ESS.
I tried to use the lm function to run a regression analysis between two variables. First I tried it without clearing out the answers I don't need (Refusal, Don't know, etc.). They are coded as numbers. That worked fine, but it distorts my output.
After I tried to remove them, I could not get the lm function to work. I get this error:

Error in quantile.default(resid) : factors are not allowed
In addition: Warning message:
In Ops.factor(r, 2) : ‘^’ not meaningful for factors

Using str(mydata$hmsacld), I get output suggesting that the variable is numeric.
Because I suspected that the NAs are the cause, I tried to remove them using the na.omit function, but I can still see the NAs with the summary command or the is.na command.

If you have any tips, I would be very grateful.

mydata$hmsacld[mydata$hmsacld=="7"] <- NA
mydata$hmsacld[mydata$hmsacld=="8"] <- NA
mydata$hmsacld[mydata$hmsacld=="9"] <- NA
mydata$hmsacld[mydata$hmsacld=="NA's"] <- NA
mydata$hmsacld <- factor(mydata$hmsacld)
mydata$hmsacld <- na.omit(mydata$hmsacld)
mydata$hmsacld <- as.numeric(mydata$hmsacld)
summary(mydata$hmsacld)


Germany <- subset(mydata,cntry=="DE")
Merkel <- subset(Germany,prtvede2=="1")
Schulz <- subset(Germany,prtvede2=="2")
Habeck <- subset(Germany,prtvede2=="4")
Gauland <- subset(Germany,prtvede2=="6")
regCDU <- lm (hmsacld ~ rlgblg,data = Merkel, na.rm=TRUE)

Hey Chesse, welcome to the community!
It is hard to help with this limited amount of information. Could you please make a reproducible example?

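I am not completely sure, but I don't think na.omit works in the context you are using it. It returns a new, shorter object with the NA entries dropped, so assigning its result back into one column of the original data.frame does not remove those rows. To me, it is more of an option for dealing with NAs inside models, like in nlme, than a tool for cleaning a single column during data processing.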
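A quick way to see what na.omit does is to try it on a toy data.frame. This is only a sketch: the column names are borrowed from the question, but the values are made up and `toy`, `short` and `clean` are hypothetical names.

```r
# Toy data: column names from the question, values invented for illustration
toy <- data.frame(hmsacld = c(1, 2, NA, 4, NA),
                  rlgblg  = c(1, 2, 1, NA, 2))

# On a single column, na.omit returns a shorter vector;
# the data.frame itself is untouched
short <- na.omit(toy$hmsacld)
length(short)   # 3
nrow(toy)       # still 5

# On the whole data.frame, it drops every row that contains an NA
clean <- na.omit(toy)
nrow(clean)     # 2
```

So if the goal is to drop incomplete rows, the result has to replace the whole data.frame, not one column of it.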

You should do something like:

mydata <- mydata[is.finite(mydata$hmsacld), ]

Unless na.omit is defined in a specific package that you are using and behaves differently in that context. Note that I have used is.finite rather than !is.na because it removes NA, NaN, -Inf and Inf alike, all of which are a source of problems when fitting models, especially for novices.
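To make the difference between the two filters concrete, here is a minimal sketch on made-up values (`x`, `toy`, `kept_na` and `kept_fin` are hypothetical names, not from the original data):

```r
# A vector with every kind of "problem" value
x <- c(1, NA, NaN, Inf, -Inf, 5)

is.na(x)       # TRUE for NA and NaN only
is.finite(x)   # TRUE only for 1 and 5

toy <- data.frame(hmsacld = x, rlgblg = c(1, 2, 1, 2, 1, 2))

# !is.na keeps the infinite values, is.finite drops them too
kept_na  <- toy[!is.na(toy$hmsacld), ]
kept_fin <- toy[is.finite(toy$hmsacld), ]

nrow(kept_na)    # 4
nrow(kept_fin)   # 2
```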

Anyway, your original error points to a factor problem, not to NAs. If you use quantile with NAs, it will throw something like "missing values and NaN's not allowed if 'na.rm' is FALSE". This is most likely because some values are mistyped characters, like an o where there should be a 0. You can check that with is.finite, but please provide a reproducible example so we can help you more.
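Without the real data this is a guess, but one common way to end up with the factor problem is converting a factor with as.numeric, which returns the internal level codes rather than the original values. A minimal sketch on made-up answers, assuming 7, 8 and 9 are the codes to discard as in the question (`raw`, `vals`, `toy` and `fit` are hypothetical names):

```r
# Made-up answers stored as a factor, as can happen after an import
raw <- factor(c("1", "3", "7", "2", "8", "5"))

# as.numeric() on a factor gives the level codes, not the labels
as.numeric(raw)                        # 1 3 5 2 6 4

# Going through as.character() recovers the original values
vals <- as.numeric(as.character(raw))  # 1 3 7 2 8 5

# Recode the unwanted answers to NA in one step
vals[vals %in% c(7, 8, 9)] <- NA

toy <- data.frame(hmsacld = vals, rlgblg = c(1, 2, 1, 2, 2, 1))

# With R's default na.action option, lm drops rows with NA itself,
# so the response only needs to be numeric
fit <- lm(hmsacld ~ rlgblg, data = toy)
nobs(fit)                              # 4
summary(fit)
```

As far as I know, na.rm is not an argument of lm, so passing it there should not be needed.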
cheers
Fer
