Hello All,
I have a question about removing insignificant levels of a factor variable or an ordered factor variable from a regression model in R.
For example:
1.) Normal regression model
a) I fit the model on the training data and get the summary below (I call this model1):
doc1.pdf (54.9 KB)
(Note: assume here that x1, x2 and x3 are all significant.)
b) I then use model1 to predict on the test data.
Doc2.pdf (34.1 KB)
Result: I get predicted values for all the data points.
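For anyone who wants to reproduce step 1 without the attachments, here is a minimal sketch. It assumes a plain linear model fitted with `lm()` (the same pattern applies with `glm()`), and the simulated data frame, the 150/50 split, and the names `train`, `test` and `pred1` are made up for illustration; they are not taken from the attached files.

```r
# Hypothetical simulated data standing in for the real training/test sets
set.seed(42)
n <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$y <- 1 + 2 * dat$x1 - dat$x2 + 0.5 * dat$x3 + rnorm(n)

# Assumed 150/50 train/test split
idx <- sample(n, 150)
train <- dat[idx, ]
test <- dat[-idx, ]

# Step 1a: fit the model on the training data and inspect the summary
model1 <- lm(y ~ x1 + x2 + x3, data = train)
summary(model1)

# Step 1b: predict on the test data
pred1 <- predict(model1, newdata = test)

# Count missing predictions; with complete numeric predictors none are expected
sum(is.na(pred1))
```

With numeric predictors and no missing values in the test data, every test row gets a prediction, which matches the result described above.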
2.) Factor variable (both unordered and ordered factor variables)
a) I convert x1 to a factor and then run a logistic regression on the training data.
I get the summary below for the model (I call this model2):
Doc3.pdf (57.5 KB)
Note: now assume that one level of the factored x1 variable is insignificant, e.g. x12 (variable x1, level 2), and that the remaining variables are significant.
b) I then run the model again (I call this model3); the summary is below:
Doc5.pdf (56.3 KB)
c) I then predict on the test data using model3 and get the following output:
doc 6.pdf (32.3 KB)
Result: I do not get predicted values for some of the data points; they show NA.
Main question: how should I exclude or otherwise handle the insignificant levels of a factor variable or ordered factor variable in a regression model (or any other model), and once they are excluded, how can I still get predictions for all the data points?
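To make the question concrete, here is a minimal sketch of one possible way to handle it: merging the insignificant level into the reference level, with the same recoding applied to both the training and the test data. This is an assumption about a workable approach, not what was done for model3 in the attachments, and it does not claim to explain where the NA values above come from. The simulated data and the names `make_data`, `collapse_level`, `x1c` and `model_collapsed` are hypothetical.

```r
# Hypothetical simulated data: x1 is a 3-level factor whose level 2
# has no effect relative to level 1 (the reference level)
set.seed(1)
make_data <- function(n) {
  x1 <- factor(sample(1:3, n, replace = TRUE), levels = 1:3)
  x2 <- rnorm(n)
  x3 <- rnorm(n)
  eta <- -0.5 + 1.2 * (x1 == "3") + 0.8 * x2 - 0.6 * x3
  y <- rbinom(n, 1, plogis(eta))
  data.frame(y, x1, x2, x3)
}
train <- make_data(300)
test <- make_data(100)

# Merge level `from` into level `to`; assigning a duplicate level
# name in base R combines the two levels into one
collapse_level <- function(f, from, to) {
  levels(f)[levels(f) == from] <- to
  f
}

# Apply the identical recoding to BOTH data sets, so the test data
# never contains a level the model was not fitted on
train$x1c <- collapse_level(train$x1, "2", "1")
test$x1c <- collapse_level(test$x1, "2", "1")

# Refit the logistic regression with the collapsed factor
model_collapsed <- glm(y ~ x1c + x2 + x3, data = train, family = binomial)
summary(model_collapsed)

# Predict probabilities for every test row and count missing predictions
pred <- predict(model_collapsed, newdata = test, type = "response")
sum(is.na(pred))
```

The design choice in this sketch is to keep all rows and recode the factor rather than drop the observations with level 2, so that every test row still has a valid level at prediction time.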
Thanks and Regards,
Karan Sehgal