Wrangling binary data and plotting multiple values per column

I have a bigger version of the dataset below (the real version is loaded into R from Google Drive using googlesheets4). Each row represents an interview with someone in that state about which sports they play. (1 indicates the respondent plays the given sport, while 0 indicates they don't play it.) I want to use R to calculate how popular each sport is -- both globally across the entire study, and specifically within each state.

I'm a beginner with R so please correct me if I'm mistaken, but it seems like I have 3 problems here. First, I need to get this out of a binary format by creating a "Sport" column containing the sports each respondent mentioned on each row. But if I do that, I'll have multiple values in the Sport column on some rows (because some respondents play more than one sport). Then I'll need to count the total number of rows that mention each sport to get the total number of participants globally and (using group_by()) by state.

I don't know how to address those 3 problems or even whether I've correctly identified them. Ultimately I want to display the number of people playing each sports using ggplot or at least a printed table. I will highly appreciate any advice! :slight_smile:

data <- tribble(
  ~State, ~Basketball,  ~Baseball, ~Football,
  "California", 0,  1, 1,
  "Oregon", 1,  1, 1,
  "Washington", 0, 1, 0,
  "Washington", 1, 0, 0,
  "California", 0, 1, 0,
  "Washington", 1, 0, 1,
  "Oregon", 0, 1, 0,
  "Oregon", 0, 0, 0

Like this?


data %>% 
  pivot_longer(-State, names_to = "sport", values_to = "count") %>% 
  group_by(State, sport) %>% 
  summarise(count = sum(count))

# A tibble: 9 x 3
# Groups:   State [3]
  State      sport      count
  <chr>      <chr>      <dbl>
1 California Baseball       2
2 California Basketball     0
3 California Football       1
4 Oregon     Baseball       2
5 Oregon     Basketball     1
6 Oregon     Football       1
7 Washington Baseball       1
8 Washington Basketball     2
9 Washington Football       1

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.