Tidymodels error in `estimate_tune_results()`

I was using a workflow set to run three model specs:

lnreg_spec <- 
  linear_reg(penalty = tune(), mixture = tune()) %>% 
  set_engine("glmnet") %>% 
  set_mode("regression")

nn_spec <- 
  nearest_neighbor(neighbors = tune(), 
                   dist_power = tune(), 
                   weight_func = tune()) %>% 
  set_engine("kknn") %>% 
  set_mode("regression")

rf_spec <- 
  rand_forest(mtry = tune(), min_n = tune()) %>% 
  set_engine("ranger") %>% 
  set_mode("regression")

But when I called collect_metrics() on the grid results, I got this error:

Error in `dplyr::mutate()`:
! Problem while computing `metrics = purrr::map(result, collect_metrics,
  summarize = summarize)`.
Caused by error in `estimate_tune_results()`:
! All of the models failed. See the .notes column.
Backtrace:
  1. grid_results %>% collect_metrics()
  9. purrr::map(result, collect_metrics, summarize = summarize)
 11. tune:::collect_metrics.tune_results(.x[[i]], ...)
 12. tune::estimate_tune_results(x)

So I looked at the notes:

show_notes(.Last.tune.result)

unique notes:
------------------------------------------------------------------------------
36 samples were requested but there were 25 rows in the data. 25 will be used.
------------------------------------------------------------------------------
Error in Matrix::sparseMatrix(i = row_id, j = as.numeric(values), dims = c(nrow(x), : NA's in (i,j) are not allowed
------------------------------------------------------------------------------
31 samples were requested but there were 25 rows in the data. 25 will be used.
------------------------------------------------------------------------------
26 samples were requested but there were 25 rows in the data. 25 will be used.
------------------------------------------------------------------------------
124 columns were requested but there were 115 predictors in the data. 115 will be used., 38 samples were requested but there were 25 rows in the data. 25 will be used.
------------------------------------------------------------------------------
124 columns were requested but there were 116 predictors in the data. 116 will be used., 38 samples were requested but there were 25 rows in the data. 25 will be used.
------------------------------------------------------------------------------
124 columns were requested but there were 118 predictors in the data. 118 will be used., 38 samples were requested but there were 25 rows in the data. 25 will be used.
------------------------------------------------------------------------------
36 samples were requested but there were 26 rows in the data. 26 will be used.
------------------------------------------------------------------------------
31 samples were requested but there were 26 rows in the data. 26 will be used.
------------------------------------------------------------------------------
124 columns were requested but there were 118 predictors in the data. 118 will be used., 38 samples were requested but there were 26 rows in the data. 26 will be used.

What's wrong with my code? The note says there are NAs in (i,j), but I checked the data with anyNA() and it returned FALSE.

It's really helpful to get a reproducible example so that we have more information.

My only guess is that your data have some non-numeric predictors. For K-nearest neighbors, the predictors must all be numeric (and on the same scale). I suggest handling this in a recipe with step_dummy() and step_normalize().
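
To make that suggestion concrete, here is a minimal sketch of such a recipe. The original data were never posted, so it uses the built-in iris data as a stand-in (Sepal.Length as the outcome, Species as the one factor predictor). The step_zv() call is an extra assumption, added only so that any constant dummy columns are dropped before normalizing.

```r
library(recipes)

# Stand-in data (assumption): iris, with Species as the only factor predictor
rec <- recipe(Sepal.Length ~ ., data = iris) %>%
  # turn factor predictors into 0/1 indicator columns
  step_dummy(all_nominal_predictors()) %>%
  # drop zero-variance columns (assumed extra step, not from the thread)
  step_zv(all_predictors()) %>%
  # center and scale every numeric predictor, including the new dummies
  step_normalize(all_numeric_predictors())

# Estimate the steps and return the processed training data
baked <- rec %>% prep() %>% bake(new_data = NULL)

# Every column should now be numeric and free of NAs
all_numeric <- all(vapply(baked, is.numeric, logical(1)))
no_missing <- !anyNA(baked)
```

Checking the baked data rather than the raw data can be useful here, since anyNA() on the raw data says nothing about values created during preprocessing.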

I tried step_dummy() and step_normalize(), but the models still failed, so I had to remove all the non-numeric predictors to make it work. Thanks.
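
For anyone who ends up with the same workaround, this is a rough base R sketch of dropping the non-numeric columns and recording which ones were removed. The iris data and the variable names are illustrative only, not the data from this thread.

```r
# Stand-in data (assumption): iris has one non-numeric column, Species
df <- iris

# Keep only the numeric columns
is_num <- vapply(df, is.numeric, logical(1))
numeric_only <- df[, is_num, drop = FALSE]

# Record which columns were dropped so they can be inspected later
dropped <- setdiff(names(df), names(numeric_only))
```

Keeping the list of dropped columns makes it easier to go back and see which predictor was responsible for the failure.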

More information on that would be helpful (maybe there is a bug that we can fix).
