I am working from Julia Silge's blog post demonstrating sparse matrix models, but I am using a ranger random forest for classification rather than the lasso she uses. The workflow fits and predicts fine with non-sparse data. With sparse data, fit() succeeds but predict() fails with the error "cannot coerce class 'structure("dgCMatrix", package = "Matrix")' to a data.frame". Why does predict() fail when fit() accepts the sparse matrix? Thanks.
library(tidyverse)
library(tidymodels)
library(tidytext)
library(textrecipes)
library(stopwords)
library(hardhat)
# loads training_data and testing_data
data("small_fine_foods")
sparse_bp <- default_recipe_blueprint(composition = "dgCMatrix")
text_rec <-
  recipe(score ~ review, data = training_data) %>%
  step_tokenize(review) %>%
  step_stopwords(review) %>%
  step_tokenfilter(review, max_tokens = 1e3) %>%
  step_tfidf(review)
rf_model <- parsnip::rand_forest(trees = 100) %>%
  set_engine("ranger", importance = "impurity") %>%
  set_mode("classification")
wf_fat <-
  workflow() %>%
  add_recipe(text_rec) %>%
  add_model(rf_model)
wf_sparse <-
  workflow() %>%
  add_recipe(text_rec, blueprint = sparse_bp) %>%
  add_model(rf_model)
# fit works and...
fit_fat <- fit(wf_fat, training_data)
# predict works
summary(predict(fit_fat, training_data))
#> .pred_class
#> great:2609
#> other:1391
# fit works but...
fit_sparse <- fit(wf_sparse, training_data)
# predict fails
summary(predict(fit_sparse, training_data))
#> Error in as.data.frame.default(new_data): cannot coerce class 'structure("dgCMatrix", package = "Matrix")' to a data.frame
Created on 2023-03-31 with reprex v2.0.2