case weight blog post discussion

We’ve published a blog post on case weight integration with tidymodels at:

We are excited to get community feedback, so please comment here with your questions, thoughts, or examples.


That is great to hear, Max - thank you! The requirement for importance weights was really holding us back from using {tidymodels} in some areas (we used {mlr} instead, since this allowed us to manually define weights.)

From what I have read, the basic tenets of the implementation seem well thought through. I especially like that you have explicitly differentiated the application scenarios for case weights, including the consequences for whether or not they are used in resampling.

Let me present a self-serving use case for how I'd use the new functionality; perhaps that can start the discussion and motivate others to present their own use cases.

For me the most relevant use case for using importance weights is addressing the class imbalance problem in (often multi-class) classification. As you have alluded to, addressing the imbalanced class size usually entails one or more of the following strategies:

  • Over- or undersampling classes using SMOTE and friends (this is not very elegant, and the artificially generated samples end up in resampling and predictions, which makes the code more complex)
  • Using a model with a (ideally multi-class) classification target/cost function that is robust to varying class sizes (often not available since target functions are usually hard-baked into the models)
  • Using importance case or class weights so that each class is weighted equally within the model during training (the most flexible and also the most elegant solution concerning speed and implementation effort, if the model supports it)
  • Depending on what one assumes the data looks like at inference time, one may also want to consider class imbalance during resampling (for instance, using class-stratified nested CV), or by using performance measures that are robust to class imbalance such as kappa, or, to a lesser degree, AUROC.

So, I guess what I think could be worth discussing in this context is:

  1. Whether your implementation could make working with importance weights for addressing class imbalances even easier (for instance, by offering a convenience function alongside add_case_weights() in the workflow that re-weights cases so as to yield balanced classes). Ideally, this would also work in the multi-class scenario. I know it is a trade-off between doing too much behind the scenes (thus preventing the user from making relevant decisions) and overwhelming casual users with complexity.

  2. Whether in the vignette, you could also detail how using importance weights interacts with performance measures that are robust to class imbalance, such as kappa or, to a lesser degree, AUROC in classification.

  3. Whether in the vignette, you could also add information about when to use stratified resampling strategies (assuming they are already supported) in such class-imbalanced scenarios, and when not. I'd assume that would depend on the expected class-size distribution and the sens/spec calibration one wants to achieve at inference time, and may also differ between the inner (model selection) and outer (test error estimation) resampling loops.
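To illustrate what stratification means in point 3, here is a minimal base R sketch that assigns cross-validation folds within each class, so every fold keeps roughly the original class proportions. The toy outcome `y` and the fold count `v` are made up for illustration. In tidymodels itself, rsample's `vfold_cv()` takes a `strata` argument for this purpose; the loop below only shows the idea and is not how rsample implements it.

```r
set.seed(1)

# Hypothetical imbalanced outcome: 45 common cases, 5 rare cases.
y <- factor(c(rep("common", 45), rep("rare", 5)))
v <- 5  # number of folds

# Assign fold ids separately within each class so that every fold
# receives its share of the rare class.
fold <- integer(length(y))
for (cls in levels(y)) {
  idx <- which(y == cls)
  idx <- idx[sample.int(length(idx))]        # shuffle rows of this class
  fold[idx] <- rep_len(seq_len(v), length(idx))
}

# Class counts per fold.
table(fold, y)
```

Without stratification, a plain random split of such data can easily produce folds with no rare cases at all, which makes per-fold metrics for that class undefined or very noisy.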

Thanks for the thoughtful post!

Yes, absolutely. I can add a function to parsnip that could produce weights that are inversely proportional to the frequencies (or a custom weighting scheme). Any other suggestions are very welcome.
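To make the idea concrete, here is a minimal base R sketch of an inverse-frequency scheme. The helper name `balanced_class_weights()` is hypothetical (it is not an existing parsnip function), and the `n / (k * n_class)` scaling is just one common convention, chosen here so that the weights sum to the number of rows.

```r
# Hypothetical helper (not part of parsnip): returns one weight per row,
# inversely proportional to the frequency of that row's class.
balanced_class_weights <- function(y) {
  y <- droplevels(as.factor(y))
  counts <- table(y)
  # n / (k * n_class): each class's weights sum to n / k,
  # and all weights together sum to n.
  per_class <- length(y) / (nlevels(y) * counts)
  as.numeric(per_class[as.character(y)])
}

# Toy multi-class outcome with 6, 3, and 1 cases.
y <- factor(c(rep("a", 6), rep("b", 3), "c"))
w <- balanced_class_weights(y)

# Total weight per class is now the same for every class.
tapply(w, y, sum)
```

Assuming the hardhat package is installed, a vector like `w` could then be wrapped with `hardhat::importance_weights()` and stored as a column in the training data, which is the column that `add_case_weights()` would point to.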

One thing that I hope to work more on this year is to make our interface to the various cost-sensitive learning models better/more consistent (this is unrelated to case weights though).

Importance weights, especially for class imbalances, should not directly interact with performance measures. Much like subsampling methods, the holdout data that is used to measure performance should represent the data as seen "in the wild". That means that they shouldn't be subsampled or impacted by methods to emphasize different data points.

The weights affect performance indirectly since the model is influenced by them.

However, it is important to make sure the right metric is being used (as you said above). If you want to strengthen the model for the minority class, optimizing on overall accuracy might obviate the effect of the case weights. We can do some documentation about that (and that's already on my list of updates for the second edition of Applied Predictive Modeling).
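As a small illustration of why the metric matters, the base R sketch below scores a degenerate "always predict the majority class" model on a made-up, unweighted holdout set. The data and the `cohen_kappa()` helper are hypothetical and written out by hand only to keep the example self-contained; yardstick provides `accuracy()` and `kap()` for real use.

```r
# Hypothetical holdout set: 95 "no" cases and 5 "yes" cases.
truth <- factor(c(rep("no", 95), rep("yes", 5)), levels = c("no", "yes"))

# A degenerate model that always predicts the majority class.
pred <- factor(rep("no", 100), levels = c("no", "yes"))

accuracy <- mean(pred == truth)

# Cohen's kappa: observed agreement corrected for the agreement
# expected by chance from the marginal class frequencies.
cohen_kappa <- function(truth, pred) {
  tab <- table(truth, pred)
  n <- sum(tab)
  p_obs <- sum(diag(tab)) / n
  p_exp <- sum(rowSums(tab) * colSums(tab)) / n^2
  (p_obs - p_exp) / (1 - p_exp)
}

kappa <- cohen_kappa(truth, pred)

c(accuracy = accuracy, kappa = kappa)
```

Accuracy looks excellent here even though the minority class is never predicted, while kappa shows no agreement beyond chance. A tuning process that optimizes accuracy could therefore favor candidates that undo what the case weights were meant to achieve.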

Very good point. It would be difficult to come up with a guideline for which approach should be used when, but we can try a few things and show, for specific data sets, whether there actually is a difference between, say, down-sampling and down-weighting (which might achieve the same effect). Again, more substrate for vignettes and APM.

@Max, unrelated to case weights, I can't wait for the second edition of APM! :partying_face:

Hi Max.

Can you pls confirm the case weights should be working on all engines in parsnip?

I have boost_tree[xgboost] and rand_forest[ranger] working, but I am receiving the error "Error in estimate_tune_results():" for mlp[nnet] and nearest_neighbor[kknn].

If they should indeed be working I shall attempt to send through a reprex for these models.

Hi @Max, this is a great feature! Thank you for your hard work. I was just wondering if importance weights can be implemented with the 'mlp' function with, e.g., the 'nnet' engine for classification?

I have tried incorporating importance weights with nnet but I get an error: check_case_weights(): Case weights are not enabled by the underlying model implementation. I would be happy to post a reprex. Thank you again for implementing all of this!

hah! just posted something very similar.


@dmcgrath @nealec and @kaladin_stormblessed (Bridge 4!)

They are not available for all models. If you look at the engine-specific documentation for each model, there is a heading for case weights that describes whether they can (or can't) be used. The majority of models can use them IIRC.

nnet and kknn do not accept them. nnet could but the author of that function does some things in the code that make it very difficult to make them work. We can try some more to engineer around it but the author is not particularly responsive (or civil) so it may be difficult to make progress.

PLS, as implemented in the mixOmics package, does not either. I wish it would so maybe we could file a GH issue.


Hi! This development is incredibly helpful to me, so thank you.
I have been playing with add_case_weights a bit and I have two questions for you:

  1. Is there any plan for this to be working with workflowsets as well?

  2. Have you noticed any issues with having case weights and tuning grids in the same workflow? From your example it seems to be working correctly, but I keep getting my R session to abort after a 2x3 grid on small chunks of data (which works without case weights). Any suggestion of why that could be happening is greatly appreciated!

For #1, I'll look into that. I think I tested this but I'm at home with covid and I'm a little slightly right now :face_with_thermometer: The main thing I remember was that, if a model in the set does not use case weights, we shouldn't fail (just issue a warning and move on).

For #2, can you give us a reprex? I've had some seg faults since moving to 4.2 on code that I've never had issues with previously. It probably depends on the model being used.

Thanks for your quick reply!
For #1, I had tried with something like workflow_map(fn = "add_case_weights", col = my_wt_col) but I may be misunderstanding how workflow_map works. No problem however, it was more of a curiosity.
For #2, I actually tried moving to 4.2 after reading your comment and now it works with no issue. I'm having trouble producing a reprex, but I'll post here if I can do it.

Thanks a lot and I wish you a fast recovery from covid :slight_smile:

I've added a discussion on the GitHub issues page. I suspect we will have to add another option to workflow_set(). I don't think it will be that complicated.
