Thanks for the reply, Max.
I was wondering if you [or anybody else] could give me a bit of insight into what units the coefficients in my model are in, and possibly how to return the coefficients to their unscaled form, since I'm kinda confused by my current interpretation.
I set standardize = FALSE in my glmnet model, but I normalized my data in the recipe, so I'd assume my coefficients are only standardized now.
So to "unscale" the coefficient, I assume that I'd first have to multiply the coefficient by its unscaled standard deviation, and then add its unscaled mean, right? At least according to this it seems like that should be the case.
But when I do that, the coefficient makes no sense.
It's a bit tricky for me to post the data one would need to reproduce my results, but here's my code:
library(tidymodels)
library(tidyverse)
# preps data for model
myrecipe <- mydata %>%
  recipe(transactionrevenue ~ sessions + channelgrouping + month + new_user_pct + is_weekend) %>%
  step_novel(all_nominal(), -all_outcomes()) %>%
  step_dummy(month, channelgrouping, one_hot = TRUE) %>%
  step_zv(all_predictors()) %>%
  step_normalize(sessions, new_user_pct) %>%
  step_interact(terms = ~ sessions:starts_with("channelgrouping") + new_user_pct:starts_with("channelgrouping"))
# creates the model
mymodel <- linear_reg(penalty = 10, mixture = 0.2) %>%
  set_engine("glmnet", standardize = FALSE)
wf <- workflow() %>%
  add_recipe(myrecipe)
model_fit <- wf %>%
  add_model(mymodel) %>%
  fit(data = mydata)
# posts coefficients
tidy(model_fit)
If it would help, here's some information that might be useful:
The variable that I'm really focusing on is "sessions." In the model, the coefficient for sessions is 2543.094882, and the intercept for the model is 1963.369782. The penalty is also 10.
The unscaled mean for sessions is 725.2884 and the standard deviation is 1035.381.
When I multiply the coefficient of sessions by the standard deviation and then add back the mean, I get 2633797.41042, which makes no sense in terms of this situation -- it should probably be a number from 0 to 5.
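To make the arithmetic concrete, here is a minimal sketch in base R using only the numbers quoted above (the variable names are just for illustration). The first calculation reproduces what I did. The second is an assumption on my part rather than something I've confirmed: if a predictor is normalized as z = (x - mean) / sd, then algebraically b0 + b * z equals (b0 - b * mean / sd) + (b / sd) * x, which would suggest dividing the coefficient by the standard deviation and folding the mean into the intercept instead. It also ignores the interaction terms involving sessions, so it may not be the whole story.

```r
# Values quoted in this post
coef_sessions <- 2543.094882   # coefficient on the normalized sessions variable
mean_sessions <- 725.2884      # unscaled mean of sessions
sd_sessions   <- 1035.381      # unscaled standard deviation of sessions

# What I did: treat the coefficient like a data value and "un-normalize" it
my_attempt <- coef_sessions * sd_sessions + mean_sessions

# Assumed alternative: with z = (x - mean) / sd,
#   b0 + b * z  ==  (b0 - b * mean / sd) + (b / sd) * x
# so the slope per one original-unit session would be b / sd ...
slope_per_session <- coef_sessions / sd_sessions
# ... and the mean only shifts the intercept (sessions' share of that shift only)
intercept_shift <- -coef_sessions * mean_sessions / sd_sessions

print(my_attempt)
print(slope_per_session)
print(intercept_shift)
```

If that alternative is the right reading, the main-effect slope would land in the 0 to 5 range I was expecting, but I'd still want someone to confirm how the sessions:channelgrouping interactions change the interpretation.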
Hopefully I'm just doing something dumb, because this is driving me a bit cuckoo. Any insight would be very much appreciated.