dplyr::count -- include a 0 for factor levels not in the data

Hi, I am summarizing responses to a Likert-style survey item. In some cases, there are item levels (which I coded as factors) that have no responses, but for purposes of summarizing I would like to include them in the resulting table as a 0 (or I suppose NA would be fine too). What might be a good approach for this?

Here is what I am envisioning:

library(tidyverse)

sampsurvey <- data.frame(rating = c("Agree","Strongly Agree", "Strongly Disagree", "Disagree",
                                    "Strongly Disagree", "Agree"))

# Assign factor levels -- 5 levels
sampsurvey$rating <- factor(sampsurvey$rating, levels = c("Strongly Disagree", 
                                                          "Disagree",
                                                          "Neutral",
                                                          "Agree",
                                                          "Strongly Agree"))
                            
                            
                            
# How do I get the "Neutral" level counted as a zero in this table?
sampsurvey %>% count(rating)

Here is my rather inelegant solution:

### I would think there would be a better way than this?

sampsurvey <- data.frame(rating = c("Agree","Strongly Agree", "Strongly Disagree", "Disagree",
                                    "Strongly Disagree", "Neutral"),
                         itemcount = c(1, 1, 1, 1, 1, NA))

sampsurvey$rating <- factor(sampsurvey$rating, levels = c("Strongly Disagree", 
                                                          "Disagree",
                                                          "Neutral",
                                                          "Agree",
                                                          "Strongly Agree"))
sampsurvey %>% 
  group_by(rating) %>% 
  summarise(nresponses = sum(itemcount))


I was certain this was how it worked out of the box, but it turns out the default setting still drops empty groups. You can get what you want with .drop = FALSE in count:

library(tidyverse)

sampsurvey <- data.frame(rating = c("Agree","Strongly Agree", "Strongly Disagree", "Disagree",
                                    "Strongly Disagree", "Agree"))

# Assign factor levels -- 5 levels
sampsurvey$rating <- factor(sampsurvey$rating, levels = c("Strongly Disagree", 
                                                          "Disagree",
                                                          "Neutral",
                                                          "Agree",
                                                          "Strongly Agree"))



# .drop = FALSE keeps the empty "Neutral" level and counts it as zero
sampsurvey %>% 
  count(rating, .drop = FALSE) 
#> # A tibble: 5 x 2
#>   rating                n
#>   <fct>             <int>
#> 1 Strongly Disagree     2
#> 2 Disagree              1
#> 3 Neutral               0
#> 4 Agree                 2
#> 5 Strongly Agree        1

Created on 2019-03-18 by the reprex package (v0.2.1)
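
For what it's worth, the same argument should also work on group_by(), and base R's table() keeps unused factor levels on its own. A minimal sketch, assuming dplyr 0.8.0 or later; the names by_group and base_counts are only there for illustration:

```r
library(dplyr)

sampsurvey <- data.frame(
  rating = factor(
    c("Agree", "Strongly Agree", "Strongly Disagree", "Disagree",
      "Strongly Disagree", "Agree"),
    levels = c("Strongly Disagree", "Disagree", "Neutral",
               "Agree", "Strongly Agree")
  )
)

# group_by() takes .drop = FALSE too, so empty levels survive summarise()
by_group <- sampsurvey %>%
  group_by(rating, .drop = FALSE) %>%
  summarise(n = n())

# Base R: table() reports every factor level, including unused ones
base_counts <- table(sampsurvey$rating)

by_group
base_counts
```

The group_by() route is handy when you need other summaries besides the count; table() is enough if a plain tally is all you want.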


The .drop argument was introduced in dplyr 0.8.0. The first iteration had .drop = FALSE by default, but that was quickly changed because it proved too disruptive.
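
To see the current default next to the explicit setting, here is a quick check. It assumes dplyr 0.8.0 or later, and the ratings data frame is a made-up example, not the original survey:

```r
library(dplyr)

# Hypothetical three-level item with no "Neutral" responses
ratings <- data.frame(
  rating = factor(c("Agree", "Disagree", "Agree"),
                  levels = c("Disagree", "Neutral", "Agree"))
)

# Default behaviour: the empty "Neutral" level is dropped
dropped <- ratings %>% count(rating)

# Explicit .drop = FALSE: "Neutral" is kept with n = 0
kept <- ratings %>% count(rating, .drop = FALSE)

nrow(dropped)
nrow(kept)
```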


Excellent & resolved!

