Creating a complex table from a data frame


#1

I have a dataframe in R, which I want to convert to a table using certain variables. The dataset is similar to the following...

Gender AgeBand Type of Ilness
Male 0 - 64 Ilness One
Male 0 - 64 Ilness One
Male 0 - 64 Illness Two
Male 65+ Ilness Three
Female 65+ Ilness Three
Male 0 - 64 Illness Four
Female 65+ Illness Four
Female 65+ Ilness Five
Female 0 - 64 Ilness Five
Female 65+ Illness Two

And I want it to look like this...
image

I've had a good look on the web and can't seem to find any code to do this. I've tried using the table function but it doesn't seem appropriate for this. I also want to save the table as it looks into a file so that the rownames become a column. Can anyone help please?


#2

Hi

You can use kable.
https://cran.r-project.org/web/packages/kableExtra/vignettes/awesome_table_in_html.html

Look at the section called Grouped Columns / Rows for your header design.


#3

Hi @Elle

Im not sure if this is helpful but if you are using this in an Rmarkdown document you could create table using Kable and kableextra. A link to the github repository is here

If you need to export this to excel for example you could potentially openxlsx which has a github page here

If you want to try and do this as a dataframe within R itself so that it can be accessed as a dataframe, you could create a new column in your data and concatenate Gender and ageband together and then use tidyr::seperate to spread the columns out. A good example of this is here. You would need to use the spread function conditioning on the type of illness to get what your after

I have provided the github links but as far as i know all these libraries exist on CRAN


#4

You can use xtabs() to make a multidimensional tabulation, and then flatten it for display using ftable().

library(tidyverse)

df <-  tribble(
~Gender,    ~AgeBand,   ~TypeOfIllness,
 "Male",      "0 - 64", "Illness One",
 "Male",      "0 - 64", "Illness One",
 "Male",      "0 - 64", "Illness Two",
 "Male",      "65+",      "Illness Three",
 "Female",  "65+",    "Illness Three",
 "Male",      "0 - 64", "Illness Four",
 "Female",  "65+",    "Illness Four",
 "Female",  "65+",    "Illness Five",
 "Female",  "0 - 64",   "Illness Five",
 "Female",  "65+",    "Illness Two"
)

df <- df %>% mutate(
  Gender = factor(Gender),
  AgeBand = factor(AgeBand, levels = c("0 - 64", "65+")),
  TypeOfIllness = factor(
    TypeOfIllness, 
    levels = c(
      "Illness One", "Illness Two", 
      "Illness Three", "Illness Four", "Illness Five"
    )
  )
)


xtabs(~ TypeOfIllness + AgeBand + Gender, data = df) %>% 
  addmargins(c(1, 3)) %>% 
  ftable(col.vars = c("Gender", "AgeBand"))
#>               Gender  Female       Male        Sum    
#>               AgeBand 0 - 64 65+ 0 - 64 65+ 0 - 64 65+
#> TypeOfIllness                                         
#> Illness One                0   0      2   0      2   0
#> Illness Two                0   1      1   0      1   1
#> Illness Three              0   1      0   1      0   2
#> Illness Four               0   1      1   0      1   1
#> Illness Five               1   1      0   0      1   1
#> Sum                        1   4      4   1      5   5

Created on 2018-05-22 by the reprex package (v0.2.0).

The only difficult part is getting the Total column exactly like the one in your example. How important is that to what you're doing?

I'm not sure I follow this part. What sort of file format are you trying to save as?


#5

Here's a kableExtra example if your goal is to output the table to a document.

rmarkdown document:

---
output: pdf_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo=FALSE, warning=FALSE)

library(knitr)    
library(kableExtra)
library(tidyverse)
library(english)
library(tools)
```

```{r}
  df <-  tribble(
    ~Gender,    ~AgeBand,   ~TypeOfIllness,
    "Male",      "0 - 64", "Illness One",
    "Male",      "0 - 64", "Illness One",
    "Male",      "0 - 64", "Illness Two",
    "Male",      "65+",      "Illness Three",
    "Female",  "65+",    "Illness Three",
    "Male",      "0 - 64", "Illness Four",
    "Female",  "65+",    "Illness Four",
    "Female",  "65+",    "Illness Five",
    "Female",  "0 - 64",   "Illness Five",
    "Female",  "65+",    "Illness Two"
  )

```

```{r}
d = df %>% 
  mutate(TypeOfIllness = factor(TypeOfIllness, levels=paste("Illness", c(toTitleCase(as.character(english(1:5))),"Total")))) %>% 
  group_by(Gender, AgeBand, TypeOfIllness) %>% tally %>% 
  unite(key, Gender, AgeBand) %>%
  spread(key, n, fill=0) 

d = d %>% bind_rows(d %>% 
                      summarise_if(is.numeric, sum) %>% 
                      mutate(TypeOfIllness="Total")) %>% 
  mutate(`All Ages` = rowSums(.[-1])) %>%
  rename(`Type of Illness`=TypeOfIllness) %>% 
  set_names(gsub(".*_","", names(.)))


kable(d, format="latex", booktabs=TRUE) %>% 
  kable_styling() %>%
  add_header_above(c(" " = 1, "Female" = 2, "Male" = 2, "Total" = 1)) %>% 
  row_spec(nrow(d) - 1, extra_latex_after="\\hline") %>% 
  column_spec(ncol(d), border_left=TRUE)
```

PDF output