This has to do with how we fit the glmnet
model in tidymodels. Even when you give a penalty, we let it compete the full path (see the backstory here). We do say in the help file that:
For glmnet
models, the full regularization path is always fit regardless of the value given to penalty
.
This changes the lambda
path in the objects:
> result1$lambda
[1] 0
> str(result2$fit$lambda)
num [1:100] 4060 3699 3370 3071 2798 ...
> summary(result2$fit$lambda)
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.406 4.063 40.640 456.932 406.297 4059.624
For this type of model, when a penalty value that is not included in the path is used, it does linear interpolation between the values. Since the path values do not bracket your value (zero), it does a poor job at interpolation. I don't think that they expected people to fit an unregularized model with this function.
You can get closer by expanding the range of penalty values using lambda.min.ratio
:
library(ggplot2)
library(tidymodels)
#> ── Attaching packages ──────────────────────────────────────────────────────────────── tidymodels 0.1.1 ──
#> ✓ broom 0.7.0 ✓ recipes 0.1.13.9001
#> ✓ dials 0.0.9.9000 ✓ rsample 0.0.8
#> ✓ dplyr 1.0.2 ✓ tibble 3.0.3
#> ✓ infer 0.5.2 ✓ tidyr 1.1.2
#> ✓ modeldata 0.0.2 ✓ tune 0.1.1.9000
#> ✓ parsnip 0.1.3.9000 ✓ workflows 0.2.1.9000
#> ✓ purrr 0.3.4 ✓ yardstick 0.0.7
#> ── Conflicts ─────────────────────────────────────────────────────────────────── tidymodels_conflicts() ──
#> x purrr::discard() masks scales::discard()
#> x dplyr::filter() masks stats::filter()
#> x dplyr::lag() masks stats::lag()
#> x recipes::step() masks stats::step()
library(glmnet)
#> Loading required package: Matrix
#>
#> Attaching package: 'Matrix'
#> The following objects are masked from 'package:tidyr':
#>
#> expand, pack, unpack
#> Loaded glmnet 4.0-2
df <- ggplot2::mpg
x <- model.matrix(cty ~ hwy + displ, df)[, -1]
y <- df$cty
result1 <- glmnet(x, y, alpha=0, lambda=0)
tidy(result1)
#> # A tibble: 3 x 5
#> term step estimate lambda dev.ratio
#> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 (Intercept) 1 4.73 0 0.924
#> 2 hwy 1 0.596 0 0.924
#> 3 displ 1 -0.528 0 0.924
result2 <- linear_reg(mixture=0, penalty = 0) %>%
set_engine('glmnet') %>%
fit(cty ~ hwy + displ, data=df)
summary(result2$fit$lambda)
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> 0.406 4.063 40.640 456.932 406.297 4059.624
tidy(result2)
#> # A tibble: 3 x 3
#> term estimate penalty
#> <chr> <dbl> <dbl>
#> 1 (Intercept) 7.62 0
#> 2 hwy 0.507 0
#> 3 displ -0.766 0
result3 <- linear_reg(mixture=0, penalty = 0) %>%
set_engine('glmnet', lambda.min.ratio = 1e-15) %>%
fit(cty ~ hwy + displ, data=df)
summary(result3$fit$lambda)
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> 0.019 0.406 8.746 350.197 188.431 4059.624
tidy(result3)
#> # A tibble: 3 x 3
#> term estimate penalty
#> <chr> <dbl> <dbl>
#> 1 (Intercept) 4.93 0
#> 2 hwy 0.590 0
#> 3 displ -0.545 0
Created on 2020-10-16 by the reprex package (v0.3.0)