recipes step_nzv for numeric attributes

Hey,
I'm having trouble using step_nzv in recipes to filter out numeric attributes that have small variance but continuous values. It seems to me that the step only works for nominal values, since it calculates the number of unique values and the ratio of the most common value to the second most common. However, I have an attribute that is close to zero almost everywhere but never exactly zero. Do I have to bin it first (and wouldn't discretizing with equal-sized bins change everything)?
Here is a minimal example. I expect both columns, low_variance_num and low_variance_nom, to be filtered out:

library(tidymodels)

data <- tibble(num = seq(1000), rand = runif(1000)) %>% 
  mutate(low_variance_num = ifelse(num == 1, 1, rand/10000),
         low_variance_nom = ifelse(num == 1, 1, 0))

data
var(data$low_variance_num)
var(data$low_variance_nom)

recipe <- recipe(formula = num ~ ., data = data) %>% 
  update_role("num", new_role = "label") %>%
  step_nzv(all_predictors(), freq_cut = 995/5, unique_cut = 10) %>% # drop near-zero-variance predictors
  prep()
  prep()
summary(recipe)
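
To show why I think the numeric column slips through, here is a rough base-R sketch of the two statistics described above. The helpers freq_ratio and percent_unique are my own hypothetical names, and I am only assuming they mirror what step_nzv computes internally; the thresholds are the ones from my call (freq_cut = 995/5, unique_cut = 10).

```r
# Hand-rolled versions of the two statistics (hypothetical helpers,
# assumed to resemble what step_nzv thresholds on; not the package's code).
set.seed(1)
n <- 1000
rand <- runif(n)
low_variance_num <- ifelse(seq_len(n) == 1, 1, rand / 10000)
low_variance_nom <- ifelse(seq_len(n) == 1, 1, 0)

# Ratio of the most common value's count to the second most common's.
freq_ratio <- function(x) {
  counts <- sort(table(x), decreasing = TRUE)
  if (length(counts) < 2) return(Inf)  # a constant column has no runner-up
  as.numeric(counts[1] / counts[2])
}

# Distinct values as a percentage of all rows.
percent_unique <- function(x) 100 * length(unique(x)) / length(x)

# A column is flagged only if BOTH conditions hold.
is_flagged <- function(x, freq_cut = 995 / 5, unique_cut = 10) {
  freq_ratio(x) > freq_cut && percent_unique(x) < unique_cut
}

is_flagged(low_variance_nom)  # 999 zeros vs. one 1, only 2 distinct values
is_flagged(low_variance_num)  # nearly every value is distinct
```

If that is roughly right, a continuous column can never meet the unique_cut condition, no matter how small its variance is.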

Thanks!
P.S: Is there a way to use recipes without providing a formula? In this case the formula is nonsense.
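From what I can tell from the recipe() documentation, it may be possible to pass only the data frame and assign roles afterwards with update_role. This is a minimal sketch under that assumption; the data frame df and the column id are made up for illustration, and I have not checked it against every recipes version.

```r
library(recipes)

# Illustrative data (hypothetical names): an id column, one ordinary
# predictor, and one near-constant predictor.
set.seed(1)
df <- data.frame(id = seq_len(1000), rand = runif(1000))
df$low_variance_nom <- ifelse(df$id == 1, 1, 0)

# No formula: every column starts without a role, so roles are set explicitly.
rec <- recipe(df) %>%
  update_role(rand, low_variance_nom, new_role = "predictor") %>%
  update_role(id, new_role = "label") %>%
  step_nzv(all_predictors(), freq_cut = 995/5, unique_cut = 10) %>%
  prep()

summary(rec)
```

Here id only gets the "label" role so that all_predictors() skips it.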
