I want to get a linear model from almost 900 variables, but lm throws an error because many columns have NAs. Any advice?

Hi everybody,
This is my first post and I am a novice in this field. My first assignment is to fit a linear model with almost 900 variables. The data are in wide format, and I want to predict price from the other variables.
I tried to run lm, but it throws this error:
Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
0 (non-NA) cases

See this post

Consider starting more simply.

fit <- lm(mpg ~ drat, data = mtcars)

before moving on to trying to assess all variables

fit2 <- lm(mpg ~ ., data = mtcars)
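As for the error itself: by default lm() drops every row that has an NA in any variable used in the formula, so with hundreds of columns it is easy to end up with no complete rows at all, which is what "0 (non-NA) cases" reports. Here is a minimal sketch of how one might check for that and drop the worst columns. It uses mtcars with NAs injected by hand as a stand-in for the real data, and the 20% cutoff and the names df, na_share, keep and fit_kept are illustrative assumptions, not a recommendation.

```r
# Stand-in data: mtcars with NAs injected so that no row is complete
df <- mtcars
df$hp[seq(1, nrow(df), by = 2)] <- NA  # odd rows lose hp
df$wt[seq(2, nrow(df), by = 2)] <- NA  # even rows lose wt

# Rows with no missing value in any column; lm() keeps only these by default
n_complete <- sum(complete.cases(df))

# With zero complete rows, lm() fails; capture the message instead of stopping
err <- tryCatch(lm(mpg ~ ., data = df), error = function(e) conditionMessage(e))

# Share of missing values per column
na_share <- colMeans(is.na(df))

# Keep columns with at most 20% missing (arbitrary cutoff for illustration)
keep <- names(na_share)[na_share <= 0.2]
df_kept <- df[, keep, drop = FALSE]

# Refit on the remaining columns
fit_kept <- lm(mpg ~ ., data = df_kept)
```

On the real data, sort(colMeans(is.na(your_data)), decreasing = TRUE) would show which columns are responsible for most of the missing rows. Whether to drop those columns, drop rows, or impute depends on why the values are missing.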

Among 900 variables, some will be irrelevant, some redundant, and others antagonistic. There is also no practicable way to explore all possible combinations. With 900 variables, a large number of observations is needed to avoid the curse of dimensionality, and feature engineering or dimensionality reduction will probably be required. See Chapter 4 of Regression Modeling Strategies: With Applications to Linear Models, Logistic and Ordinal Regression, and Survival Analysis, by Frank E. Harrell, Jr. (2015).
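To make "dimensionality reduction" concrete, here is a rough sketch of one common option, regression on principal components. It is only one of several approaches and not necessarily the one the book would recommend for your data. It again uses mtcars as a stand-in, the 90% variance threshold is an arbitrary assumption, and prcomp() needs predictors without NAs, so missing values have to be handled first.

```r
# Predictors only; mtcars stands in for the real data and has no NAs
X <- mtcars[, setdiff(names(mtcars), "mpg")]

# Principal components of the centered and scaled predictors
pca <- prcomp(X, center = TRUE, scale. = TRUE)

# Cumulative share of predictor variance explained by the components
var_share <- cumsum(pca$sdev^2) / sum(pca$sdev^2)

# Smallest number of components reaching 90% (arbitrary threshold)
k <- which(var_share >= 0.9)[1]

# Regress the response on the first k component scores
scores <- as.data.frame(pca$x[, seq_len(k), drop = FALSE])
scores$mpg <- mtcars$mpg
fit_pcr <- lm(mpg ~ ., data = scores)
```

The components are chosen without looking at the response, so they summarize the predictors but are not guaranteed to be the directions most useful for predicting price. The coefficients also refer to components rather than to the original variables, which makes them harder to interpret.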

Thank you!
I worked with a subset of columns; now I have 19 variables and it runs smoothly.
