How to collect in-sample metrics with tidymodels' tune_grid?

Hello modellers!

I understand that tidymodels' tune::tune_grid() function collects out-of-sample metrics via its metrics argument. However, how can I also collect in-sample metrics?

I not only want to collect metrics for out-of-sample predictions, but also to assess my model's fit on the data it was trained on. This is helpful when assessing whether a model overfits.

Thanks for your help!

This is my first post on this forum, so please be gentle and let me know how I can improve my questioning.


You would probably use the extract option in the control function to save the models, then re-predict on the training set. Alternatively, you could add an rsample::apparent() rsplit to the rsample object, but that would bias the metrics that are automatically produced.
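A rough, untested sketch of that first approach, using a hypothetical decision-tree workflow on mtcars (the model, data, and tuning grid are just placeholders for illustration):

```r
library(tidymodels)

set.seed(123)
folds <- vfold_cv(mtcars, v = 5)

tree_spec <- decision_tree(cost_complexity = tune()) |>
  set_engine("rpart") |>
  set_mode("regression")

wf <- workflow() |>
  add_formula(mpg ~ .) |>
  add_model(tree_spec)

# Save the fitted workflow from each resample via `extract`
ctrl <- control_grid(extract = function(x) x)

res <- tune_grid(wf, resamples = folds, grid = 5, control = ctrl)

# Re-predict each saved workflow on the analysis (training) portion
# of its split to get in-sample RMSE
in_sample <- purrr::map2_dfr(
  res$splits, res$.extracts,
  function(split, ex) {
    train_dat <- analysis(split)
    purrr::map_dfr(ex$.extracts, function(fit_wf) {
      augment(fit_wf, train_dat) |>
        rmse(truth = mpg, estimate = .pred)
    })
  }
)
```

These in-sample values can then be compared against the out-of-sample numbers from collect_metrics(res).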

If that sounds like a pain (and it does to me), that's due to our avoidance of these statistics. I can see wanting them for demonstration purposes, but we don't want to facilitate their use in people's ordinary workflows.


Thanks, @Max! Yes, it does sound like a pain, but a bearable one.

Just out of curiosity: what is the fundamental reason for not exposing an in-sample validation workflow in tune_grid()? Wouldn't it be considered good practice to check for overfitting during hyperparameter optimization? Knowing that a model overfits would let one adjust its parameters accordingly, potentially arriving at a robust model faster than without that check.

You check for overfitting with the out-of-sample results. With the in-sample results alone, you wouldn't know whether the model was overfitting or just a really good model.

The in-sample results can be horribly biased and the bias depends on the data and model. They don't have much utility beyond showing that they should not be used.

Thanks! To clarify, I am talking about detecting overfitting by comparing the model's fit on in-sample vs. out-of-sample data, not by looking at either in isolation. For example, if a model fits the in-sample data really well but shows considerably worse performance on out-of-sample data, that could be a signal of overfitting. It is in that regard that I see value in looking at in-sample predictions.
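For what it's worth, here is a minimal sketch of that comparison outside of tune_grid(), using a single train/test split and a deliberately flexible random forest (the model and data are illustrative assumptions, not a recommendation):

```r
library(tidymodels)

set.seed(42)
split <- initial_split(mtcars)
train <- training(split)
test  <- testing(split)

# A flexible model that may overfit a small dataset
fit <- rand_forest(mode = "regression", trees = 500) |>
  set_engine("ranger") |>
  fit(mpg ~ ., data = train)

# In-sample vs. out-of-sample RMSE
rmse_in  <- augment(fit, train) |> rmse(truth = mpg, estimate = .pred)
rmse_out <- augment(fit, test)  |> rmse(truth = mpg, estimate = .pred)

# A large gap between the two suggests overfitting
```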