What is the best way to undo a transformation of a dependent variable with tidymodels? What if the transformation uses parameters estimated from the training data (e.g. mean centering or Yeo-Johnson)?
I read through a bunch of tidymodels resources and couldn't find anything. I saw that Max Kuhn and Julia Silge wrote this:
"It is best practice to analyze the predictions on the transformed scale (if one were used) even if the predictions are reported using the original units."
But after picking the "best" model with resampling, I sometimes want to back-transform my predictions to the original units. Here is a reprex. I want to use the sample mean of the training data to un-center the predictions.
library(tidymodels)
# create a recipe
mtcars_rec <- recipe(data = mtcars, formula = mpg ~ .) %>%
  step_center(mpg) %>%
  add_role(mpg, new_role = "response")
# create a model
dt_mod <- decision_tree() %>%
  set_engine("rpart") %>%
  set_mode("regression")
# create a workflow
mtcars_wflow <-
  workflow() %>%
  add_model(dt_mod) %>%
  add_recipe(mtcars_rec)
# estimate the model
mtcars_fit <-
  mtcars_wflow %>%
  fit(data = mtcars)
# make a new prediction (mtcars reused for convenience)
predict(mtcars_fit, mtcars)
#> # A tibble: 32 x 1
#> .pred
#> <dbl>
#> 1 -1.83
#> 2 -1.83
#> 3 6.57
#> 4 -1.83
#> 5 -1.83
#> 6 -1.83
#> 7 -6.68
#> 8 6.57
#> 9 6.57
#> 10 -1.83
#> # … with 22 more rows
Created on 2020-10-20 by the reprex package (v0.3.0)
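To make the goal concrete, here is a minimal sketch of the kind of manual back-transformation in question. It is not necessarily the recommended tidymodels approach. It assumes a recent workflows release where extract_recipe() is available (older versions used pull_workflow_prepped_recipe()), and that tidy() on the centering step returns the estimated mean in a `value` column. The names `wflow_fit`, `train_mean`, and `preds_original` are illustrative only.

```r
library(tidymodels)

# Refit the same kind of workflow as in the reprex so this block runs on its own
rec <- recipe(mpg ~ ., data = mtcars) %>%
  step_center(mpg)

wflow_fit <- workflow() %>%
  add_model(decision_tree() %>% set_engine("rpart") %>% set_mode("regression")) %>%
  add_recipe(rec) %>%
  fit(data = mtcars)

# Pull the trained recipe out of the fitted workflow and read off the mean
# that step_center() estimated from the training data
# (assumption: step_center is step number 1 in the recipe)
train_mean <- wflow_fit %>%
  extract_recipe() %>%
  tidy(number = 1) %>%
  filter(terms == "mpg") %>%
  pull(value)

# Add the training mean back to return predictions to the original mpg units
preds_original <- predict(wflow_fit, mtcars) %>%
  mutate(.pred = .pred + train_mean)
```

This works by hand for centering because the inverse is a simple shift. For something like Yeo-Johnson, the estimated lambda can be read from the trained step in the same way, but the inverse transformation would still have to be written manually.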