tidymodels code for ridge, lasso, and elastic net using glmnet

jftnese · February 25, 2020, 6:24pm

I want to verify my code for specifying a ridge model, a lasso model, and an elastic net model with parsnip and glmnet, using the penalty and mixture arguments.

I am confused because the documentation states:

mixture : The proportion of L1 regularization in the model.
and
mixture : A number between zero and one (inclusive) that represents the proportion of regularization that is used for the L2 penalty (i.e. weight decay, or ridge regression) versus L1 (the lasso) ( glmnet and spark only).

So I am not sure whether mixture represents the proportion of L1 or of L2.

Is this the correct specification for a ridge model?

linear_reg(penalty = .10, mixture = 0) %>% # mixture = 0 meaning no L1 penalty 
  set_mode("regression") %>% 
  set_engine("glmnet") %>% 
  fit(y ~ ., data = dta)

Is this the correct specification for a lasso model?

linear_reg(penalty = .10, mixture = 1) %>% # mixture = 1 meaning no L2 penalty 
  set_mode("regression") %>% 
  set_engine("glmnet") %>% 
  fit(y ~ ., data = dta)

Is this the correct specification for an elastic net model?

linear_reg(penalty = .10, mixture = .6) %>% # this is a mixture of both L1 and L2. Is it 60% L1 or 60% L2?
  set_mode("regression") %>% 
  set_engine("glmnet") %>% 
  fit(y ~ ., data = dta)

joels · February 25, 2020, 7:32pm

Assuming mixture works the same as alpha in glmnet::glmnet, 0 is L2 (ridge) only, 1 is L1 (lasso) only, and anything in between is a proportional mixture of both. This would be consistent with the first description of mixture. Hopefully Max will drop by and provide a definitive answer. The documentation could probably be clarified, both to make the two descriptions of mixture consistent and to state explicitly which is which, e.g., 0 = L2 only and 1 = L1 only.
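
For reference, glmnet documents its penalty term as lambda * [ (1 - alpha)/2 * ||beta||_2^2 + alpha * ||beta||_1 ]. Below is a minimal base-R sketch of that formula, assuming penalty plays the role of lambda and mixture the role of alpha. The helper name elastic_net_penalty is made up for illustration and is not part of parsnip or glmnet.

```r
# Hypothetical helper (not a parsnip/glmnet function): evaluates the
# elastic net penalty term as documented for glmnet.
#   beta    : numeric vector of coefficients (intercept excluded)
#   penalty : total amount of regularization (glmnet's lambda)
#   mixture : proportion of L1 penalty (glmnet's alpha), in [0, 1]
elastic_net_penalty <- function(beta, penalty, mixture) {
  stopifnot(mixture >= 0, mixture <= 1, penalty >= 0)
  l1 <- sum(abs(beta))  # lasso part: sum of absolute values
  l2 <- sum(beta^2)     # ridge part: sum of squares
  penalty * ((1 - mixture) / 2 * l2 + mixture * l1)
}

beta <- c(1, -2)

# mixture = 0: the L1 term drops out, leaving only the ridge part
ridge_term <- elastic_net_penalty(beta, penalty = 0.1, mixture = 0)

# mixture = 1: the L2 term drops out, leaving only the lasso part
lasso_term <- elastic_net_penalty(beta, penalty = 0.1, mixture = 1)

# mixture = 0.6: weight 0.6 on the L1 part and 0.4 on the L2 part
enet_term <- elastic_net_penalty(beta, penalty = 0.1, mixture = 0.6)

print(c(ridge = ridge_term, lasso = lasso_term, elastic_net = enet_term))
```

Under this reading, mixture = .6 in the question would put weight 0.6 on the L1 term and 0.4 on the L2 term.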

jftnese · February 27, 2020, 11:41pm

Thanks for your response, I appreciate it. I believe you are correct, although a definitive answer would be great.

Max · February 29, 2020, 6:36pm

Yes, mixture = alpha, so mixture = 0 is pure ridge (L2) and mixture = 1 is pure lasso (L1). We're going to document these details a little more definitively.

You can also see this via the translate() function:

library(parsnip)

linear_reg(mixture = .1) %>% 
  set_engine("glmnet") %>% 
  translate()
#> Linear Regression Model Specification (regression)
#> 
#> Main Arguments:
#>   mixture = 0.1
#> 
#> Computational engine: glmnet 
#> 
#> Model fit template:
#> glmnet::glmnet(x = missing_arg(), y = missing_arg(), weights = missing_arg(), 
#>     alpha = 0.1, family = "gaussian")

Created on 2020-02-29 by the reprex package (v0.3.0)
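
Another way to check the direction of mixture is empirical: a pure lasso fit can shrink coefficients to exactly zero, while a pure ridge fit generally does not. The sketch below assumes the parsnip and glmnet packages are installed and uses the built-in mtcars data with an arbitrary penalty of 1. The helper count_zero_coefs is hypothetical, and reaching into the fitted object via $fit is just one way to get at the underlying glmnet model.

```r
library(parsnip)
library(glmnet)  # also attaches Matrix, needed to handle the sparse coefficients

# Hypothetical helper: fit a glmnet linear regression through parsnip and
# count how many slope coefficients are exactly zero at the given penalty.
count_zero_coefs <- function(mixture, penalty = 1) {
  mod <- linear_reg(penalty = penalty, mixture = mixture) %>%
    set_engine("glmnet") %>%
    fit(mpg ~ ., data = mtcars)

  # mod$fit is the underlying glmnet object; s selects the penalty value
  coefs <- as.matrix(coef(mod$fit, s = penalty))[-1, 1]  # drop the intercept
  sum(coefs == 0)
}

n_zero <- c(
  ridge = count_zero_coefs(mixture = 0),  # L2 only
  lasso = count_zero_coefs(mixture = 1)   # L1 only
)
print(n_zero)
```

If mixture maps to alpha as translate() shows, the ridge count should be zero and the lasso count should be positive.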

jftnese · March 2, 2020, 4:45pm

Thank you, Max! And thanks for sharing the translate() function, that is extremely helpful!
