match products in a list in R

Hi there! I have to classify a list of products like these:

product_list<-data.frame(product=c('banana from ecuador 1 unit', 'argentinian meat (1 kg) cow','chicken breast','noodles','salad','chicken salad with egg'))

Based on the words included in each element of this vector:

product_to_match<-c('cow meat','deer meat','cow milk','chicken breast','chicken egg salad','anana')

I need to check, for each entry in product_to_match, whether all of its words appear in each element of the data frame.

I am not sure what the best way to do this is. I want to classify each product in a new column, to end up with something like this:

product_list<-data.frame(product=c('banana from ecuador 1 unit', 'argentinian meat (1 kg) cow','chicken breast','noodles','salad','chicken salad with egg'),class=c(NA,'cow meat','chicken breast',NA,NA,'chicken egg salad'))

Notice that 'anana' did not match 'banana': even though its characters are contained in the string, 'anana' is not a whole word in it. I am not sure how to do this.

Thank you.

Can you please explain the logic more? It seems like with the "argentinian meat (1 kg) cow" example that's supposed to connect with "cow meat" even though the words are in a different order, but "banana" is not supposed to connect with "anana." Can you describe specifically what the rules are that should generate that output?

product_list <- data.frame(product = c("banana from ecuador 1 unit", "argentinian meat (1 kg) cow", "chicken breast", "noodles", "salad", "chicken salad with egg"))

product_to_match <- c("cow meat", "deer meat", "cow milk", "chicken breast", "chicken egg salad", "anana")

product_list$product[which (product_list$product %in% product_to_match)]
#> [1] "chicken breast"

This identifies the only exact match. If matching on substrings is required, consider creating a product_to_match set of stem words and working with {stringr}.
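
If the rule is word-level rather than exact, here is a minimal base R sketch. It assumes the rule implied by the desired output in the question: an entry of product_to_match applies when every one of its words occurs as a whole word in the product, in any order. The helpers tokenize() and classify_product() are hypothetical names made up for this example, and taking the first matching entry when several apply is also an assumption.

```r
product_list <- data.frame(
  product = c("banana from ecuador 1 unit", "argentinian meat (1 kg) cow",
              "chicken breast", "noodles", "salad", "chicken salad with egg"),
  stringsAsFactors = FALSE
)
product_to_match <- c("cow meat", "deer meat", "cow milk",
                      "chicken breast", "chicken egg salad", "anana")

# Hypothetical helper: split a string into lowercase words, treating any
# run of non-alphanumeric characters (spaces, parentheses, ...) as a separator.
tokenize <- function(x) {
  words <- strsplit(tolower(x), "[^[:alnum:]]+")[[1]]
  words[nzchar(words)]
}

# Hypothetical helper: return the first candidate whose words all occur as
# whole words in the product, or NA when no candidate qualifies.
# Assumption: if several candidates qualify, the first one wins.
classify_product <- function(product, candidates) {
  product_words <- tokenize(product)
  is_match <- vapply(
    candidates,
    function(candidate) all(tokenize(candidate) %in% product_words),
    logical(1)
  )
  if (any(is_match)) candidates[which(is_match)[1]] else NA_character_
}

product_list$class <- vapply(
  product_list$product, classify_product, character(1),
  candidates = product_to_match, USE.NAMES = FALSE
)

product_list$class
#> [1] NA                  "cow meat"          "chicken breast"
#> [4] NA                  NA                  "chicken egg salad"
```

Because whole words are compared instead of substrings, "anana" does not match "banana", while "cow meat" matches "argentinian meat (1 kg) cow" regardless of word order.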
