confusionMatrix error

Hello everyone!

I got this message when I ran confusionMatrix(): "Error in confusionMatrix.default(prediccion_1900, testing_1900$Genero) :
The data contain levels not found in the data."

According to str(), prediccion_1900 is a factor.
According to str(), testing_1900$Genero is a data.frame.

In testing_1900$Genero, I tried many times to remove the separator symbol with arguments like these, but it doesn't work: sep = "([|])", remove = F

I would be very thankful for your help.


Thanks a lot for your help. Here's the code:

library(dplyr)    # filter(), select(), mutate(), %>%
library(forcats)  # fct_expand()
library(caret)    # confusionMatrix()

testing_1900 = testing %>% filter(Id_usuario == 1900) %>%
  select(Indice, Ano, Genero, Marca_temporal, Promedio_Idbm, Numero_Votos) %>%
  mutate(Genero = factor(Genero))

testing_1900$Genero = fct_expand(testing_1900$Genero, levels(factor(testing$Genero))[2:20]) # more levels added to analyze the error

prediccion_1900 = predict(modelo, newdata = testing_1900, type = "class") # type = "class" so the number of levels is not reduced
confusionMatrix(prediccion_1900, testing_1900$Genero)

testing_1900$Genero contains values that combine several categories joined by |, whereas each value in prediccion_1900 is a single category. That might be an issue for your prediction.
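
To see whether the two factors really disagree on their levels, a quick check like the one below may help. This is only a sketch on made-up toy data (pred and truth are hypothetical stand-ins for prediccion_1900 and testing_1900$Genero), and it assumes the error comes from predicted levels that the reference does not have.

```r
# Hypothetical toy data, not the real objects from the question
pred  <- factor(c("Comedy", "Drama", "Comedy"))
truth <- factor(c("Comedy|Romance", "Drama", "Comedy|Romance"))

# Levels that appear in the prediction but not in the reference
missing_in_truth <- setdiff(levels(pred), levels(truth))
print(missing_in_truth)

# One way to give both factors the same level set before tabulating
all_levels    <- union(levels(truth), levels(pred))
pred_aligned  <- factor(pred,  levels = all_levels)
truth_aligned <- factor(truth, levels = all_levels)

# Base R cross-tabulation; both margins now share the same levels
print(table(pred_aligned, truth_aligned))
```

Aligning the levels only makes the table well-formed. If the reference still holds combined labels such as "Comedy|Romance", they will never match a single predicted category, so the labels themselves probably need to be split or simplified first.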

You probably have to escape | with \\ in the regex, as | is a special character.
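
As a small illustration of the escaping point, here is a base R sketch on a made-up string (x is hypothetical). It shows three ways that should treat | as a literal separator: escaping it, putting it in a character class, or switching regex matching off with fixed = TRUE.

```r
# Hypothetical example string
x <- "Comedy|Romance"

# Escaped pipe: "\\|" in R source is the regex \|
escaped <- strsplit(x, "\\|")[[1]]

# Character class: inside [ ] the pipe is taken literally
in_class <- strsplit(x, "[|]")[[1]]

# fixed = TRUE: the pattern is a plain string, not a regex
literal <- strsplit(x, "|", fixed = TRUE)[[1]]

print(escaped)
print(in_class)
print(literal)
```

An unescaped "|" on its own is regex alternation between two empty patterns, so it does not split on the pipe character the way you would expect.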

Here, I have split it into separate rows by using unnest; otherwise, it just splits into a list within the data frame.


library(tidyverse) # tibble(), mutate(), str_split(), unnest(), str_trim()

df <- tibble(id = c(1, 2),
             Genero = c("Comedy | Romance", "Action | Romance | Western")) %>% 
  mutate(g2 = str_split(Genero, "\\|")) %>% # \\ escapes the |; splits each value into a list element
  unnest(cols = c(g2)) %>% # puts the list elements into separate rows
  mutate(g2 = str_trim(g2)) # removes leading and trailing spaces

> df
# A tibble: 5 x 3
     id Genero                     g2     
  <dbl> <chr>                      <chr>  
1     1 Comedy | Romance           Comedy 
2     1 Comedy | Romance           Romance
3     2 Action | Romance | Western Action 
4     2 Action | Romance | Western Romance
5     2 Action | Romance | Western Western
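
For comparison, roughly the same reshaping can be sketched in base R, in case the tidyverse is not available. The names parts and long are hypothetical, and this is only meant to mirror the toy example above, not the real testing_1900 data.

```r
# Same toy data as above, as a plain data.frame
df <- data.frame(id = c(1, 2),
                 Genero = c("Comedy | Romance", "Action | Romance | Western"),
                 stringsAsFactors = FALSE)

# Split each value on the escaped pipe; the result is a list of character vectors
parts <- strsplit(df$Genero, "\\|")

# Repeat each original row once per piece, then flatten and trim the pieces
long <- data.frame(id     = rep(df$id, lengths(parts)),
                   Genero = rep(df$Genero, lengths(parts)),
                   g2     = trimws(unlist(parts)),
                   stringsAsFactors = FALSE)

print(long)
```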