dummyVars and predict function

Why is dummyVars always followed by a predict function? What are you predicting?


**example:**

```r
library(caret)  # provides dummyVars()

customers <- data.frame(
                id=c(10,20,30,40,50),
                gender=c('male','female','female','male','female'),
                mood=c('happy','sad','happy','sad','happy'),
                outcome=c(1,1,0,0,0))

# dummyVars() only builds the encoding; predict() applies it to the data
dmy <- dummyVars(" ~ .", data = customers, fullRank = TRUE)
trsf <- data.frame(predict(dmy, newdata = customers))
print(trsf)
##   id gender.male mood.sad outcome
## 1 10           1        0       1
## 2 20           0        1       1
## 3 30           0        0       0
## 4 40           1        1       0
## 5 50           0        0       0
```

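`predict()` is the generic that R uses to apply a fitted object to new data, and it is not limited to models that produce predictions in the statistical sense. `dummyVars()` does not transform anything by itself: it records an encoding from the data you give it (typically your training set), and `predict()` then applies that encoding to whatever you pass as `newdata`. So nothing is being "predicted" here; the call just produces the dummy-coded columns. Another verb, such as `apply()`, might read more naturally, but `apply()` is not a generic, so it cannot dispatch to a method for `dummyVars` objects.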
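To make the "encoding from your training set applied to new samples" point concrete, here is a minimal sketch. The `train` and `new_samples` data frames are made up for illustration and are not from the original post. The columns are built as factors explicitly, and the new samples reuse the training levels, on the assumption that this is the safest way to keep the two encodings aligned.

```r
library(caret)

# Hypothetical training data (illustration only)
train <- data.frame(
  gender = factor(c("male", "female", "female")),
  mood   = factor(c("happy", "sad", "happy"))
)

# Step 1: learn the encoding from the training data; no transformation happens yet
dmy <- dummyVars(~ ., data = train, fullRank = TRUE)

# Hypothetical new samples, built with the same factor levels as the training data
new_samples <- data.frame(
  gender = factor(c("male", "male"), levels = levels(train$gender)),
  mood   = factor(c("sad", "sad"),   levels = levels(train$mood))
)

# Step 2: apply the stored encoding to the new samples
encoded <- predict(dmy, newdata = new_samples)
print(encoded)
```

Even though `new_samples` contains only one value per column, the result has the same columns that the training data would produce, because the column layout comes from `dmy` and not from the new data.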
