I need to convert a 'Checkbox Grid' question type from Google Forms into a crosstabulation. However, when I download the data from Google Forms to a Google Spreadsheet or Excel, multiple responses are grouped into a single column separated by commas (,).
Below is one way to achieve the desired outcome, which uses
pivot_wider() to reshape the data and
separate_rows() to create a new row for each entry separated by a comma.
library(tidyverse) # sample data df = data.frame( q1 = c('Coca Cola', 'Coca Cola, Sprite', 'Fanta', 'Inca Kola', 'Coca Cola', 'Fanta', 'Sprite'), q2 = c('Fanta, Sprite', 'Coca Cola', 'Fanta', 'Coca Cola, Inca Kola', 'Fanta, Sprite', 'Fanta', 'Fanta, Sprite'), q3 = c('Fanta', 'Coca Cola, Fanta, Sprite, Inca Kola', 'Fanta, Sprite', 'Coca Cola', 'Coca Cola, Inca Kola', 'Fanta', 'Sprite'), q4 = c('Coca Cola, Fanta, Sprite, Inca Kola', 'Ningun', 'Sprite', 'Ningun', 'Sprite', 'Sprite', 'Fanta, Inca Kola') ) out = df |> mutate(row = row_number()) |> pivot_longer(-'row') |> separate_rows(value, sep = ', ') |> count(name, value) |> pivot_wider(names_from = value, values_from = n) |> select(`Coca Cola`, Fanta, Sprite, `Inca Kola`, `Ningun`) |> mutate_all(~replace_na(., 0)) out #> # A tibble: 4 × 5 #> `Coca Cola` Fanta Sprite `Inca Kola` Ningun #> <int> <int> <int> <int> <int> #> 1 3 2 2 1 0 #> 2 2 5 3 1 0 #> 3 3 4 3 2 0 #> 4 1 2 4 2 2
Created on 2023-06-19 with reprex v2.0.2
Thank you very much for your help, I was able to perform my analysis
This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.
If you have a query related to it or one of the replies, start a new topic and refer back with a link.