I have a bigger version of the dataset below (the real version is loaded into R from Google Drive using googlesheets4). Each row represents an interview with someone in that state about which sports they play. (1 indicates the respondent plays the given sport, while 0 indicates they don't play it.) I want to use R to calculate how popular each sport is -- both globally across the entire study, and specifically within each state.
I'm a beginner with R so please correct me if I'm mistaken, but it seems like I have 3 problems here. First, I need to get this out of a binary format by creating a "Sport" column containing the sports each respondent mentioned on each row. But if I do that, I'll have multiple values in the Sport column on some rows (because some respondents play more than one sport). Then I'll need to count the total number of rows that mention each sport to get the total number of participants globally and (using group_by()) by state.
I don't know how to address those 3 problems or even whether I've correctly identified them. Ultimately I want to display the number of people playing each sports using ggplot or at least a printed table. I will highly appreciate any advice!
data <- tribble(
~State, ~Basketball, ~Baseball, ~Football,
"California", 0, 1, 1,
"Oregon", 1, 1, 1,
"Washington", 0, 1, 0,
"Washington", 1, 0, 0,
"California", 0, 1, 0,
"Washington", 1, 0, 1,
"Oregon", 0, 1, 0,
"Oregon", 0, 0, 0
)```