pivot a two-way frequency table with proportions

I would like to create a two-way frequency table containing both counts and row percentages, ideally using tidyverse functions. Here's what I have so far. In this example I want to get counts of Y or N RSVP by Sex:

library(dplyr)

df <- data.frame(Sex = c("M","F","M","M","F","F"),
                 RSVP = c("Y","N","N","Y","N","Y"),stringsAsFactors = FALSE)


df %>% 
  count(Sex, RSVP) %>% 
  group_by(Sex) %>% 
  mutate(prop = prop.table(n))

  Sex   RSVP      n  prop
  <chr> <chr> <int> <dbl>
1 F     N         2 0.667
2 F     Y         1 0.333
3 M     N         1 0.333
4 M     Y         2 0.667

This appears to calculate correctly but I am having a hard time re-shaping this output such that the values of RSVP become two columns Y and N.

2 Likes

You might want to look into the janitor package!

library(dplyr, warn.conflicts = FALSE)
library(janitor)

df <- tibble(Sex = c("M","F","M","M","F","F"),
             RSVP = c("Y","N","N","Y","N","Y"))

df %>%
  tabyl(Sex, RSVP) %>%
  adorn_percentages("row") %>%
  adorn_pct_formatting(digits = 1) %>%
  adorn_ns("front")
#>  Sex         N         Y
#>    F 2 (66.7%) 1 (33.3%)
#>    M 1 (33.3%) 2 (66.7%)

Created on 2019-04-03 by the reprex package (v0.2.1.9000)

5 Likes

This can be done by adding tidyr to the mix. Because tidyr::spread can only handle a single value column at a time, we'll use tidyr::gather to force the counts and proportions into a single column. After that, we create a new column with the intended column names and spread it out.

library(tidyr)

df %>% 
  count(Sex, RSVP) %>% 
  group_by(Sex) %>% 
  mutate(prop = prop.table(n)) %>%
  gather(key = "measure", value = "value", n, prop) %>%
  mutate(label = paste0(measure, "_", RSVP)) %>%
  select(-measure, -RSVP) %>%
  spread(key = "label", value = "value")
# # A tibble: 2 x 5
# # Groups:   Sex [2]
#   Sex     n_N   n_Y prop_N prop_Y
#   <chr> <dbl> <dbl>  <dbl>  <dbl>
# 1 F         2     1  0.667  0.333
# 2 M         1     2  0.333  0.667

And, just for completeness, this is how to get a similar result using base R:

counts <- table(df)
props <- prop.table(counts, margin = 1)
colnames(counts) <- paste0("n_", colnames(counts))
colnames(props) <- paste0("prop_", colnames(props))
cbind(counts, props)
#   n_N n_Y    prop_N    prop_Y
# F   2   1 0.6666667 0.3333333
# M   1   2 0.3333333 0.6666667
4 Likes

Thank you, I have been using that to add totals to my tables, great package! I should have added that I wanted to print the counts and percentages as distinct columns, from what I could tell with janitor they would print as one column.

Yes, that's definitely one of my (only!) qualms about janitor! Glad you found a solution though :slightly_smiling_face:

1 Like

Thanks very much, that was my issue precisely.

You can save a line of code with tidyr::unite(). Also, I've been trying to wrap my head around the new tidyr functions pivot_longer() and pivot_wider() for specifying more complex data reshaping in a single line of code, so I've included an example of that approach below as well.

# Need development version of tidyr for pivot_wider()
# devtools::install_github(tidyverse/tidyr)
library(tidyverse)

df <- data.frame(Sex = c("M","F","M","M","F","F"),
                 RSVP = c("Y","N","N","Y","N","Y"),stringsAsFactors = FALSE)

# Using unite() with @nwerth's answer 
df %>% 
  count(Sex, RSVP) %>% 
  group_by(Sex) %>% 
  mutate(prop = prop.table(n)) %>%
  gather(measure, value, n, prop) %>%
  unite(measure, measure, RSVP) %>% 
  spread(measure, value)
#> # A tibble: 2 x 5
#> # Groups:   Sex [2]
#>   Sex     n_N   n_Y prop_N prop_Y
#>   <chr> <dbl> <dbl>  <dbl>  <dbl>
#> 1 F         2     1  0.667  0.333
#> 2 M         1     2  0.333  0.667
# Using pivot_wider()
df %>% 
  count(Sex, RSVP) %>% 
  group_by(Sex) %>% 
  mutate(prop = prop.table(n)) %>% 
  pivot_wider(names_from=RSVP, values_from=c(n, prop))
#> # A tibble: 2 x 5
#>   Sex     n_N   n_Y prop_N prop_Y
#>   <chr> <int> <int>  <dbl>  <dbl>
#> 1 F         2     1  0.667  0.333
#> 2 M         1     2  0.333  0.667
4 Likes

This is really helpful, thank you. I had been hearing of unite() and the changes that would be coming with pivot_longer() and pivot_wider() so putting them into context here is very much appreciated.

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.