Judging by my results, I don't think I have fully understood the tidymodels workflow.
I have successfully trained an xgboost model for binary classification. But when I call predict() on new data, I get an error asking for the target variable (disease, not disease):
predict(disease_wf_model, new_incoming_data[1,] )
Error: Can't subset columns that don't exist.
x Column `disease` doesn't exist.
The new data naturally does not contain that variable, so I am asking:
- Should I call prep() when defining the recipe or not? (Some examples do, others don't.)
- Should I apply the recipe to the new data myself before predicting?
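
For reference, here is a minimal sketch of the pattern these two questions are about, assuming the recipe is added to a workflow without calling prep() and the new data is passed to predict() without being baked first. The data, the predictor names (`x1`, `x2`) and the objects (`toy_train`, `toy_wf_fit`, `toy_new`) are made up for illustration and are not my real setup; it also assumes the xgboost package is installed.

```r
library(tidymodels)

# Hypothetical toy data: a two-level outcome and two numeric predictors
set.seed(1)
toy_train <- tibble(
  disease = factor(rep(c("disease", "not_disease"), each = 100)),
  x1 = rnorm(200),
  x2 = rnorm(200)
)

# Recipe is only defined here; prep() is not called explicitly
toy_rec <- recipe(disease ~ ., data = toy_train) %>%
  step_normalize(all_numeric_predictors())

toy_spec <- boost_tree(trees = 50) %>%
  set_engine("xgboost") %>%
  set_mode("classification")

# fit() on the workflow estimates the recipe and then trains the model
toy_wf_fit <- workflow() %>%
  add_recipe(toy_rec) %>%
  add_model(toy_spec) %>%
  fit(data = toy_train)

# New data without the outcome column; the workflow applies the recipe itself
toy_new <- tibble(x1 = 0.1, x2 = -0.3)
preds <- predict(toy_wf_fit, new_data = toy_new)
preds
```

In a sketch like this the fitted workflow takes care of both preprocessing steps internally, so neither prep() nor bake() appears in user code. Whether that matches what happens in my real workflow is exactly what I am unsure about.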
Thanks in advance