How do you convert a tidy data frame to a contingency table?

Below is an example tidy data frame, followed by the contingency table that I want to turn it into.

library(tibble)

gender <- c("M", "M", "F", "F")
handed <- c("R", "L", "R", "L")
freq <- c(43, 9, 44, 4)
df <- tibble(gender, handed, freq)

I would like to turn it into this.
image

And then this.
image

Also, any advice on packages or methods for working with contingency tables would be appreciated.

Thanks!!!

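Base R already provides the table() and ftable() functions (see their help pages for more details). Note that both count rows, so when they are applied directly to a data frame that already holds counts, freq is treated as a third variable: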

library(dplyr)
df <- tibble(gender = c("M", "M", "F", "F"),
             handed = c("R", "L", "R", "L"),
             freq = c(43, 9, 44, 4)
             )

df %>% ftable()
#>               freq 4 9 43 44
#> gender handed               
#> F      L           1 0  0  0
#>        R           0 0  0  1
#> M      L           0 1  0  0
#>        R           0 0  1  0

df %>% table()
#> , , freq = 4
#> 
#>       handed
#> gender L R
#>      F 1 0
#>      M 0 0
#> 
#> , , freq = 9
#> 
#>       handed
#> gender L R
#>      F 0 0
#>      M 1 0
#> 
#> , , freq = 43
#> 
#>       handed
#> gender L R
#>      F 0 0
#>      M 0 1
#> 
#> , , freq = 44
#> 
#>       handed
#> gender L R
#>      F 0 1
#>      M 0 0

Created on 2021-03-20 by the reprex package (v1.0.0)
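
One possible way to get the intended 2 x 2 counts out of table() is to expand the data to one row per observation before tabulating. This is only a minimal sketch in base R: `long` and `tab` are names introduced for this illustration, and a plain data.frame stands in for the tibble.

```r
df <- data.frame(
  gender = c("M", "M", "F", "F"),
  handed = c("R", "L", "R", "L"),
  freq   = c(43, 9, 44, 4)
)

# Repeat each row index freq times, keeping only the classifying columns
long <- df[rep(seq_len(nrow(df)), times = df$freq), c("gender", "handed")]

# table() now counts one row per person, giving the 2 x 2 layout
tab <- table(long)
tab

# ftable() shows the same counts in a flat layout
ftable(tab)
```

The expansion step is only needed because the counts are pre-aggregated; with raw, one-row-per-person data, table() can be called directly.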

You may also be interested in the janitor package, which contains several useful functions, such as tabyl() for cross-tabulation.
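
As a rough sketch of how that could look for the data in the question: tabyl() counts rows, so data that already holds frequencies needs expanding first. This assumes the tidyr and janitor packages are installed; `long` and `ct` are names introduced here for illustration.

```r
library(tidyr)
library(janitor)

df <- data.frame(
  gender = c("M", "M", "F", "F"),
  handed = c("R", "L", "R", "L"),
  freq   = c(43, 9, 44, 4)
)

# One row per person; uncount() drops the freq column by default
long <- uncount(df, freq)

# Two-way cross-tabulation of gender by handedness
ct <- tabyl(long, gender, handed)

# Add a Total row and a Total column
ct <- adorn_totals(ct, c("row", "col"))
ct
```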

It is not clear whether you want that exact formatting or just the data structure.

There are also functions that can do some of this from raw data, but you are starting with something close to a contingency table already:

library(dplyr)
library(tidyr)
library(forcats)
library(flextable)
library(janitor)

df <- tibble(gender = c("M", "M", "F", "F"),
             handed = c("R", "L", "R", "L"),
             freq = c(43, 9, 44, 4)
)

df %>%
    # replace the columns
    transmute(
       # make gender become Sex, recode M and F into full names (for row headers)
        Sex = fct_recode(gender, Male = "M", Female = "F"),

        # make handed R and L be recoded
        Handedness  = fct_recode(handed, `Right-handed` = "R", `Left-handed`="L"),

        #keep frequency
        freq = freq) %>%

    # Make the contingency layout
    pivot_wider(names_from = Handedness, values_from = freq) %>%

    # add row and column totals
    adorn_totals(c("row", "col")) %>%

    # create a nice table layout
    flextable()%>%

    # add a merged header
    add_header_row(values=c("","Handedness"), colwidths = c(1,3)  ) %>%

    # Apply a grid
    theme_box() %>%

    # make first column bold
    bold(j=1)

    # You can look up how to shade the boxes yourself. I don't think you can split a cell diagonally.

Thanks @CALUM_POLWART this is super helpful!!

xtabs and addmargins in base R seem to do what you want:

xtabs(freq ~ gender + handed, data = df) %>%
  addmargins()

      handed
gender   L   R Sum
   F     4  44  48
   M     9  43  52
   Sum  13  87 100
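
Since the question also asks about working with contingency tables, here is a hedged sketch of a few base R follow-ups on an xtabs() result. It passes data = df so the formula variables are looked up in the data frame; `tab`, `props` and `back` are names made up for this example.

```r
df <- data.frame(
  gender = c("M", "M", "F", "F"),
  handed = c("R", "L", "R", "L"),
  freq   = c(43, 9, 44, 4)
)

# Weighted cross-tabulation: freq supplies the cell counts
tab <- xtabs(freq ~ gender + handed, data = df)

# Row proportions: share of left- and right-handers within each gender
props <- prop.table(tab, margin = 1)
round(props, 3)

# Pearson's chi-squared test of independence
# (continuity-corrected by default for 2 x 2 tables)
chisq.test(tab)

# Back to a tidy layout: one row per cell, counts in a column named Freq
back <- as.data.frame(tab)
back
```

The as.data.frame() step is handy when a table needs to go back into a dplyr or ggplot2 workflow.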

Just for completeness.

library(tidyverse)
library(janitor)

df %>%
  pivot_wider(names_from = handed, values_from = freq) %>%
  adorn_totals(c("row", "col"))

 gender  R  L Total
      M 43  9    52
      F 44  4    48
  Total 87 13   100
