Tidymodels Get Started: Preprocess your data with recipes

I am going through the tidymodels Get Started article "Preprocess your data with recipes".
After the step

flights_rec <- 
  recipe(arr_delay ~ ., data = train_data) %>% 
  update_role(flight, time_hour, new_role = "ID") 

we can inspect the recipe with

summary(flights_rec)

which prints

variable	type	role	source
<chr>	<chr>	<chr>	<chr>
dep_time	numeric	predictor	original	
flight	numeric	ID	original	
origin	nominal	predictor	original	
dest	nominal	predictor	original	
air_time	numeric	predictor	original	
distance	numeric	predictor	original	
carrier	nominal	predictor	original	
date	date	predictor	original	
time_hour	date	ID	original	
arr_delay	nominal	outcome	original

But after adding further steps such as step_date(), step_holiday(), etc.:

flights_rec <-  
  recipe(arr_delay ~ ., data = train_data) %>% 
  update_role(flight, time_hour, new_role = 'ID') %>% 
  step_date(date, features = c('dow', 'month')) %>%
  step_holiday(
    date,
    holidays = timeDate::listHolidays('US'),
    keep_original_cols = FALSE
  ) %>%
  step_dummy(all_nominal_predictors()) %>% 
  step_zv(all_predictors())

nothing changes in the output of summary(flights_rec). There is no indication that the 'dow', 'month', and holiday columns have been added or that 'date' has been removed.
Is there some other function that shows the effects of the steps?

You will have to prep() the recipe so that it actually processes the data; only then is it known what gets added or removed. Many steps add or remove columns depending on the data you have (step_dummy() creates one column per factor level, step_zv() drops columns that turn out to be constant), so the steps have to be executed before the final set of columns is known.

If you want more detail, pass log_changes = TRUE to prep() to get a per-step report of what changed.
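
To make that concrete, here is a minimal sketch of the difference between an unprepped and a prepped recipe. It does not use the flights data: `toy` is a made-up data frame standing in for train_data, and I dropped step_holiday() and set keep_original_cols = FALSE on step_date() instead, just to keep the example small. Treat it as an illustration under those assumptions, not as the tutorial's recipe.

```r
library(recipes)

# Hypothetical toy data standing in for train_data (not the flights data)
toy <- data.frame(
  date      = as.Date("2013-01-01") + 0:9,
  carrier   = factor(rep(c("AA", "UA"), 5)),
  distance  = c(100, 250, 400, 120, 900, 310, 75, 660, 480, 205),
  arr_delay = factor(rep(c("late", "on_time"), 5))
)

rec <- recipe(arr_delay ~ ., data = toy) %>%
  # Assumption for this sketch: drop 'date' here instead of via step_holiday()
  step_date(date, features = c("dow", "month"), keep_original_cols = FALSE) %>%
  step_dummy(all_nominal_predictors()) %>%
  step_zv(all_predictors())

# Before prep(): only the original columns are listed
summary(rec)

# prep() estimates every step on the training data;
# log_changes = TRUE reports what each step added or removed
rec_prepped <- prep(rec, training = toy, log_changes = TRUE)

# After prep(): new columns appear (source "derived") and 'date' is gone
summary(rec_prepped)

# The processed training data itself
baked <- bake(rec_prepped, new_data = NULL)
names(baked)
```

Because all ten toy dates fall in January, the month dummy columns are constant and step_zv() removes them, which is exactly the kind of data-dependent result that cannot be known before prep() runs.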
