MAE metric with weights in yardstick

Hi, I am looking for a way to implement the WMAE (weighted mean absolute error) metric in yardstick. More specifically, in the class Kaggle competition at the following URL:

they use one of the variables, Is_holiday, to determine the weights in the MAE, so that a model that performs well on holiday-week sales is the one that gets picked.
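For reference, a weighted MAE of this kind is commonly written as below. Here $y_i$ is the observed value, $\hat{y}_i$ the prediction, and $w_i$ the per-row weight; that $w_i$ is derived from Is_holiday is taken from the description above, and the actual weight values are set by the competition and not reproduced here.

```latex
\mathrm{WMAE} = \frac{\sum_{i=1}^{n} w_i \,\lvert y_i - \hat{y}_i \rvert}{\sum_{i=1}^{n} w_i}
```

With all weights equal, this reduces to the ordinary MAE.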

tune_result <- workflow %>%
  tune_grid(validation_split,
            grid = lambda_grid,
            metrics = metric_set(rmse, mae))

Is there a way to add weights to mae in the code above?

yardstick doesn't currently support weighted metrics. We hope to add them in the future.

If you really want this, you can implement a custom yardstick metric.

Hi, @davis

Thanks for the follow-up! This is what I came up with as a custom metric for weighted MAE. Is this the right way to create a custom metric?

library(rlang)
library(yardstick)
#> For binary classification, the first factor level is assumed to be the event.
#> Use the argument `event_level = "second"` to alter this as needed.

wmae_vec <- function(truth, estimate, weights, na_rm = TRUE, ...) {
  
  wmae_impl <- function(truth, estimate, weights) {
    sum(weights * abs(truth - estimate)) / sum(weights)
  }
  
  metric_vec_template(
    metric_impl = wmae_impl,
    truth = truth, 
    estimate = estimate,
    weights = weights,
    na_rm = na_rm,
    cls = "numeric",
    ...
  )
  
}

wmae <- function(data, ...) {
  UseMethod("wmae")
}

wmae <- new_numeric_metric(wmae, direction = "minimize")

wmae.data.frame <- function(data, truth, estimate, weights, na_rm = TRUE, ...) {
  
  metric_summarizer(
    metric_nm = "wmae",
    metric_fn = wmae_vec,
    data = data,
    truth = !! enquo(truth),
    estimate = !! enquo(estimate), 
    metric_fn_options = list(weights = weights),
    na_rm = na_rm,
    ...
  )
  
}

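data("solubility_test")
# Note: a scalar weight is recycled in the numerator but not in sum(weights),
# so this call returns the sum of absolute errors rather than their mean.
wmae_vec(
  truth = solubility_test$solubility,
  estimate = solubility_test$prediction,
  weights = 1
)
#> [1] 172.2424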
solubility_test$weights <- 2
wmae(solubility_test, 
     truth = solubility, 
     estimate = prediction, weights = solubility_test$weights)
#> # A tibble: 1 x 3
#>   .metric .estimator .estimate
#>   <chr>   <chr>          <dbl>
#> 1 wmae    standard       0.545

It seems to work, but I wonder whether I can make the custom metric callable like this:

wmae(solubility_test, 
     truth = solubility, 
     estimate = prediction, 
     weights = weights)

Moreover, how can I use this metric in tune_grid with an additional option (weights)?

Unfortunately, I don't think you will be able to do wmae(..., weights = weights). Arguments besides truth and estimate don't currently allow for any tidyselect semantics. Most of the time those optional arguments are scalar values, like f_meas(beta = 2). The fact that weights would correspond to another column in the data frame makes things a little challenging with the current system.

For a similar reason, you won't be able to tune with it either.

Unfortunately, I think you will just have to wait until we add weight support officially in yardstick.
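In the meantime, the weighted MAE can be computed outside yardstick on a data frame of predictions. The following is only a minimal base R sketch, not a yardstick API: the function name `weighted_mae`, the toy data, and the holiday weight of 5 are all made up for illustration.

```r
# Weighted MAE in base R (hypothetical helper, not part of yardstick).
# `weights` must supply one value per observation.
weighted_mae <- function(truth, estimate, weights) {
  stopifnot(length(truth) == length(estimate),
            length(truth) == length(weights))
  # Drop rows where any of the three inputs is missing
  keep <- stats::complete.cases(truth, estimate, weights)
  sum(weights[keep] * abs(truth[keep] - estimate[keep])) / sum(weights[keep])
}

# Toy data, invented for illustration only
preds <- data.frame(
  truth      = c(10, 20, 30, 40),
  estimate   = c(12, 18, 33, 40),
  is_holiday = c(FALSE, TRUE, FALSE, FALSE)
)
# Assumed weighting scheme: holiday rows count 5 times as much
preds$weights <- ifelse(preds$is_holiday, 5, 1)

weighted_mae(preds$truth, preds$estimate, preds$weights)
#> [1] 1.875
```

Requiring one weight per row avoids the silent recycling problem that a scalar weight would otherwise cause in the denominator.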
