I am developing a prognostic model to predict an outcome. I have a few binary variables with missing values (see the example below). Is it better to create dummy variables from the Gender variable below, or to keep it in the model as it is? I have 79 binary variables like this. Please let me know which approach is best.
One can go either way. If you have 79 binary variables and their missing values fall on different observations, you could end up with very little data if you drop every observation that has missing data. In that case you would be better off with three dummies for each binary variable: one for each of the two observed values and one for missing.
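To make the trade-off concrete, here is a minimal sketch in Python with pandas. The data frame, the `Smoker` column, and all values are made up for illustration; only `Gender` comes from the question. It compares dropping incomplete rows with coding "missing" as its own level via `pd.get_dummies(..., dummy_na=True)`.

```python
import pandas as pd

# Hypothetical toy data: two binary variables whose missing values
# fall on different observations (None marks a missing value).
df = pd.DataFrame({
    "Gender": ["M", "F", None, "F", "M", None],
    "Smoker": ["Yes", None, "No", "No", None, "Yes"],
})

# Option 1: complete-case analysis. Any row with a missing value is dropped.
complete = df.dropna()

# Option 2: keep every row and add an explicit "missing" indicator,
# giving three dummy columns per binary variable.
encoded = pd.get_dummies(df, dummy_na=True, dtype=int)

print(f"rows kept after dropna: {len(complete)} of {len(df)}")
print(encoded)
```

With only two variables, dropping incomplete rows already discards most of this toy data, and the loss would likely be worse with 79 variables. If the model includes an intercept, one of the three dummies per variable would normally be left out as the reference level to avoid perfect collinearity.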