Hello,
I need to apply Weight of Evidence (WOE) encoding to both continuous and factor variables, and I'm trying to use the embed and recipes packages on my training data frame (x in the code below). Here is its str():
str(x)
'data.frame': 476383 obs. of 21 variables:
$ output : Factor w/ 2 levels "0","1": 2 1 2 2 1 2 1 2 1 2 ...
$ v1 : int 201701 201701 201701 201701 201701 201701 201701 201701 201701 201701 ...
$ v2 : num 9407 1322 1316 6442 0 ...
$ v3 : num 1 1 2 5 0 2 1 2 5 1 ...
$ v4 : num 9596 0 0 0 0 ...
$ v5 : num 8 200 6291 871 1117 ...
$ v6: int 32 40 88 74 148 138 137 36 25 64 ...
$ v7 : int 35 53 38 33 67 53 68 64 35 34 ...
$ v8 : Factor w/ 2 levels "F","M": 1 1 1 1 1 1 1 2 1 2 ...
$ v9 : Factor w/ 21 levels " 1",..: 5 16 7 15 20 16 21 7 7 15 ...
$ v10 : Factor w/ 13 levels "1","2","3","4",..: 13 13 13 13 13 13 13 13 13 13 ...
$ v11 : Factor w/ 13 levels "1","2","3","4",..: 13 13 13 13 13 13 13 13 13 13 ...
$ v12 : num 0.84 0 0.07 0.97 0 0.32 0 1 0.56 0.41 ...
$ v13 : num 42107 0 358058 145109 0 ...
$ v14 : Factor w/ 4 levels "CONTADO","INTERES",..: 2 4 2 2 4 2 4 2 2 2 ...
$ v15 : Factor w/ 2 levels "2017","2018": 1 1 1 1 1 1 1 1 1 1 ...
$ v16 : Factor w/ 12 levels "01","02","03",..: 1 1 1 1 1 1 1 1 1 1 ...
$ v17 : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ v18 : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ v19: Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 2 1 1 ...
$ v20: Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
As you can see, I have both factor and numeric variables. I need to discretize the numeric variables before I can compute the Weight of Evidence (WOE) for them. This is what I tried:
rec <- recipe(v1 ~ ., data = x) %>%
step_discretize(all_numeric) %>%
step_woe(all_nominal, outcome = v1)
woe_models <- prep(rec, training = x)
Error: This tidyselect interface doesn't support predicates yet.
i Contact the package author and suggest using eval_select().
I tried the explanation here, but it didn't work. Does anyone know how to solve this? Thanks!
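For reference, this tidyselect error typically appears when a selector is passed as a bare function name (all_numeric) instead of being called (all_numeric()). Below is a minimal sketch of how the recipe might look with the selectors called, assuming the binary factor output is the intended outcome rather than v1. The data frame x_demo is a small made-up stand-in for the real data, and the exact arguments may differ across recipes/embed versions.

```r
library(recipes)
library(embed)

set.seed(1)
n <- 500

# Hypothetical stand-in for the real data frame (not the original data)
x_demo <- data.frame(
  output = factor(rbinom(n, 1, 0.5)),             # binary outcome
  v2     = rnorm(n, mean = 5000, sd = 2000),      # numeric predictor
  v7     = sample(18:80, n, replace = TRUE),      # integer predictor
  v8     = factor(sample(c("F", "M"), n, replace = TRUE))
)

rec <- recipe(output ~ ., data = x_demo) %>%
  # selectors are functions and must be called with parentheses
  step_discretize(all_numeric(), -all_outcomes()) %>%
  # step_woe() expects the outcome wrapped in vars()
  step_woe(all_nominal(), -all_outcomes(), outcome = dplyr::vars(output))

woe_models <- prep(rec, training = x_demo)

# Training data with the predictors replaced by their WOE encodings
woe_data <- bake(woe_models, new_data = NULL)
head(woe_data)
```

Excluding the outcome with -all_outcomes() keeps output itself from being WOE-encoded, since all_nominal() would otherwise match it too.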