How to count variables from multiple columns?

Meowster · May 3, 2023, 11:41am

I am trying to analyze the correlation between a coffee's attributes and how it tastes.

Basically, how a coffee's varietal, process, origin, etc. influence its taste. I have a sizable data set, but the 'cupping notes' usually contain more than one flavor note.

I have split the flavor notes into a few columns. Here is some example data:

coffee_d <- data.frame(
  origin  = c("Africa", "Africa", "Asia", "Asia"),
  process = c("Washed", "Natural", "Washed", "Anaerobic"),
  flavor1 = c("Nut", "Chocolate", "Ripe Fruit", "Strawberry"),
  flavor2 = c("Chocolate", "Spice", "Chocolate", "Citrus")
)

Now, I would like to count every instance of the "Chocolate" note from "Africa" across both the flavor1 and flavor2 columns.
I would also like to use pairs.panels() to evaluate the correlation between coffee origin, process, and flavors.
I simply can't figure out how to collapse the flavor1 and flavor2 columns into a single flavors column that holds multiple values.

I hope I am making some sense.
Thanks in advance.

Leon · May 3, 2023, 12:23pm

Try playing around with this:

library("tidyverse")
coffee_d %>%
  pivot_longer(
    cols = contains("flavor"),
    names_to = "flavor_n",
    values_to = "flavor") %>%
  count(origin, flavor)
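
To get just the "Chocolate" count for "Africa" that the question asks about, one option is to add a filter() step before count(). This is only a minimal sketch: it rebuilds the example data so it runs on its own, loads dplyr and tidyr directly (both are part of the tidyverse), and the name choc_africa is made up for illustration.

```r
library(dplyr)
library(tidyr)

# Example data from the question
coffee_d <- data.frame(
  origin  = c("Africa", "Africa", "Asia", "Asia"),
  process = c("Washed", "Natural", "Washed", "Anaerobic"),
  flavor1 = c("Nut", "Chocolate", "Ripe Fruit", "Strawberry"),
  flavor2 = c("Chocolate", "Spice", "Chocolate", "Citrus")
)

# Stack flavor1/flavor2 into a single flavor column,
# keep only the African "Chocolate" rows, then count them.
# choc_africa is a hypothetical name, not from the original post.
choc_africa <- coffee_d %>%
  pivot_longer(
    cols = starts_with("flavor"),
    names_to = "flavor_n",
    values_to = "flavor") %>%
  filter(origin == "Africa", flavor == "Chocolate") %>%
  count(origin, flavor)

choc_africa
```

With the four example rows this should give a single row with n = 2, one match from flavor1 and one from flavor2.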

Meowster · May 3, 2023, 3:41pm

Thanks! This solves some problems!

Now I have to find a way to analyze the data using pairs.panels() without duplicating (or triplicating, or quadruplicating) some of the data.

I am not entirely sure how I would achieve this without creating a variable that means "Chocolate", "Nut", and "Strawberry" at the same time.

Leon · May 4, 2023, 8:33pm

Can you give an example of what you're trying to achieve?
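
If the goal is to keep one row per coffee and avoid repeated rows, one possible approach is a 0/1 indicator column per flavor, so no single variable has to mean several flavors at once. This is a sketch under assumptions, not necessarily what you are after: the id and present columns and the name flavor_wide are made up for illustration, and it assumes each row of coffee_d is one coffee.

```r
library(dplyr)
library(tidyr)

# Example data from the question
coffee_d <- data.frame(
  origin  = c("Africa", "Africa", "Asia", "Asia"),
  process = c("Washed", "Natural", "Washed", "Anaerobic"),
  flavor1 = c("Nut", "Chocolate", "Ripe Fruit", "Strawberry"),
  flavor2 = c("Chocolate", "Spice", "Chocolate", "Citrus")
)

flavor_wide <- coffee_d %>%
  mutate(id = row_number()) %>%      # hypothetical id: one per coffee
  pivot_longer(
    cols = starts_with("flavor"),
    names_to = "flavor_n",
    values_to = "flavor") %>%
  select(-flavor_n) %>%              # the flavor1/flavor2 slot no longer matters
  distinct() %>%                     # guard against the same note listed twice
  mutate(present = 1L) %>%           # hypothetical indicator column
  pivot_wider(
    names_from = flavor,
    values_from = present,
    values_fill = 0L)                # 0 where a coffee lacks that note

flavor_wide
```

The result has one row per coffee and one column per flavor note. Note that origin and process are still text here, so they would probably need to be converted to factors or numeric codes before being passed to a correlation plot.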
