My situation:
- I have to test 1700 variables
- Dependent variables: 3773 observations
- Independent variables: more than 1700 (the number of observations per variable ranges from 1400 to 4000)
- Most cases in the sample and most variables have missing values.
- Because of the large number of explanatory variables, I intend to use LASSO first to select variables and save time, then link the selected variables back to the literature and look for hidden dimensions that don't exist in the literature yet.
Besides LASSO, I will also run multiple regression and compare the results.
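
To make the workflow concrete, here is a minimal sketch of what I have in mind, on synthetic data rather than my real data. Everything in it is an assumption for illustration: the function names are made up, the penalty value is arbitrary, and the median imputation is only a placeholder for the missing-value step that I am asking about. The LASSO itself is a plain coordinate-descent version written out by hand so the example is self-contained; in practice I would use a statistics package.

```python
import random
import statistics

random.seed(0)

def impute_median(column):
    """Placeholder strategy: replace None with the median of the observed values."""
    observed = [v for v in column if v is not None]
    med = statistics.median(observed)
    return [med if v is None else v for v in column]

def standardize(column):
    """Center to mean 0 and scale to (population) standard deviation 1."""
    mean = statistics.fmean(column)
    sd = statistics.pstdev(column) or 1.0
    return [(v - mean) / sd for v in column]

def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_cd(columns, y, lam, n_iter=100):
    """Coordinate descent for (1/2n)*||y - Xb||^2 + lam*||b||_1.

    Assumes every column is standardized and y is centered.
    """
    n = len(y)
    beta = [0.0] * len(columns)
    resid = list(y)  # residual y - Xb, starting from b = 0
    for _ in range(n_iter):
        for j, col in enumerate(columns):
            # Correlation of column j with the residual that excludes its own fit
            rho = sum(c * (r + c * beta[j]) for c, r in zip(col, resid)) / n
            new = soft_threshold(rho, lam)
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - c * delta for c, r in zip(col, resid)]
                beta[j] = new
    return beta

# Synthetic data: 8 candidate variables, only columns 0 and 3 affect y
n_obs, n_vars = 200, 8
X_true = [[random.gauss(0, 1) for _ in range(n_obs)] for _ in range(n_vars)]
y = [3 * X_true[0][i] - 2 * X_true[3][i] + random.gauss(0, 1) for i in range(n_obs)]

# Knock out roughly 10% of the predictor values to mimic missing data
X_missing = [[None if random.random() < 0.1 else v for v in col] for col in X_true]

# Step 1: fill in missing values (placeholder), Step 2: LASSO selection
X_ready = [standardize(impute_median(col)) for col in X_missing]
y_mean = statistics.fmean(y)
y_centered = [v - y_mean for v in y]
beta = lasso_cd(X_ready, y_centered, lam=0.5)

selected = [j for j, b in enumerate(beta) if b != 0.0]
print("selected columns:", selected)
```

The variables in `selected` are the ones I would then put into the multiple regression and compare against the literature. The part I am unsure about is what should replace `impute_median`.
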
My question: How should I handle the missing values?
Thank you for your answer.