Creating a neat table with kable()

Hi there,

I am in the process of working out how to create a neat table (and have never made a table at all before so just starting out!).

I am looking at the PHS open data for unintentional injuries and making a table of the total Scotland data for 2018. I have made a table which looks OK, but I would like to subdivide the rows by male and female to make it look neater, as well as change the heading font size and give the table a more attractive appearance. I have tried a couple of things in kableExtra but couldn't get them to work. Code below. Does anyone have any suggestions?

# Load the tidyverse, which provides read_csv(), the dplyr verbs, and pivot_wider()
library(tidyverse)
# Read in mortality data
orig_ui_deaths = read_csv("https://www.opendata.nhs.scot/dataset/b0135993-3d8a-4f3b-afcf-e01f4d52137c/resource/89807e07-fc5f-4b5e-a077-e4cf59491139/download/ui_deaths_2020.csv")
# Create table of number of deaths by injury type, age, and sex
# Pipe in orig_ui_deaths dataset
orig_ui_deaths %>%
  # Apply filters in the same way as in the plots: one year, whole of Scotland, and excluding "All" entries
  filter(Year == "2018", HBR == "S92000003", AgeGroup != "All" & Sex != "All", InjuryType != "Accidental exposure" & InjuryType != "All") %>%
  # Group the table according to injury type, age, and sex
  group_by(InjuryType, AgeGroup, Sex) %>%
  # Create a summary of total number of deaths for neater appearance to the table and demonstration of the figures
  summarise(total_deaths = sum(NumberOfDeaths)) %>%
  # Change the orientation of the table so that age groups become the variable headings
  pivot_wider(names_from = AgeGroup, values_from = total_deaths) %>%
  # Create table to present the data
  knitr::kable(caption = "Deaths from unintentional injuries in Scotland")
  

Many thanks!

"subdivision of rows to divide the results by male and female"

is unclear. It's simple enough to create separate tables by sex.
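
As a minimal sketch of the separate-tables approach, assuming a small made-up data frame (`toy_deaths` and its values are hypothetical, not the PHS data), base R's `split()` can divide the rows by `Sex`, and `knitr::kable()` can then render one table per group:

```r
library(knitr)

# Hypothetical stand-in for the summarised deaths table
toy_deaths <- data.frame(
  Sex = c("Female", "Female", "Male", "Male"),
  InjuryType = c("Falls", "Poisoning", "Falls", "Poisoning"),
  `0-4 years` = c(0, 1, 2, 0),
  `75plus years` = c(10, 3, 8, 4),
  check.names = FALSE
)

# split() returns a named list with one data frame per value of Sex
by_sex <- split(toy_deaths[, -1], toy_deaths$Sex)

# Build one kable per sex, using the list name in the caption
tables <- lapply(names(by_sex), function(sex) {
  kable(by_sex[[sex]], row.names = FALSE,
        caption = paste("Deaths from unintentional injuries:", sex))
})
names(tables) <- names(by_sex)

# Show the table for one group
tables$Female
```

In an R Markdown chunk, printing each element should give its own table; if the tables are printed from inside a loop, the chunk will probably need `results='asis'` and explicit `print()` calls.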

Editorial suggestion: remove the top and bottom pair of rows and add a note that there were no injuries of those types.
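
A rough sketch of that suggestion, using made-up rows (`toy_tab` and its counts are hypothetical): drop the injury types whose counts are all zero and name them in a table note via `kableExtra::footnote()`. The output format is set to HTML here only so the sketch runs on its own; for a PDF document it would be `"latex"`.

```r
library(knitr)
library(kableExtra)

# Hypothetical counts; the first and last injury types have no deaths
toy_tab <- data.frame(
  InjuryType = c("Crushing", "Falls", "Poisoning", "Scalds"),
  `0-4 years` = c(0, 2, 1, 0),
  `75plus years` = c(0, 18, 7, 0),
  check.names = FALSE
)

# Flag rows where every count column is zero (counts are never negative)
all_zero <- rowSums(toy_tab[, -1]) == 0
dropped <- toy_tab$InjuryType[all_zero]

# Keep the non-empty rows and name the dropped types in a table note
kable(toy_tab[!all_zero, ], format = "html", row.names = FALSE,
      caption = "Deaths from unintentional injuries in Scotland") %>%
  footnote(general = paste("No deaths were recorded for:",
                           paste(dropped, collapse = ", ")))
```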

kableExtra has an option to set the font size, via kable_styling():

kable(dt) %>%
  kable_styling(bootstrap_options = "striped", font_size = 7)

See this post

Thanks very much that's helpful. I'm trying to split the table so that the results can be seen by female and then male separately rather than just as subsequent rows in the table. I can't work out how to do this within the same table but wondered if there is a split function for the code so that two smaller tables could be created, one for female and one for male?

Many thanks again

Hi CM1,

One option is to leave the Sex column out when generating the table with knitr::kable, and then use Sex as a grouping index in kableExtra::pack_rows, so that the rows for each sex are grouped under their own heading.

EDITS:

  • Contents are now in an R Markdown file.
  • Changed kable output format from HTML to LaTeX.
  • Added the LaTeX package caption to the YAML header of the R Markdown document to set up larger captions.
---
title: "Table-kable"
output: pdf_document
geometry: margin=0.5in
header-includes:
- \usepackage{caption}
- \captionsetup{font=Large}
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE, message = FALSE)
```

```{r}
library("knitr")
library("kableExtra")
library("dplyr")
library("readr")
library("tidyr")
library("forcats")

orig_ui_deaths = read_csv("https://www.opendata.nhs.scot/dataset/b0135993-3d8a-4f3b-afcf-e01f4d52137c/resource/89807e07-fc5f-4b5e-a077-e4cf59491139/download/ui_deaths_2020.csv")

summary_tab <- orig_ui_deaths %>%
  # Apply filters in the same way as in the plots: one year, whole of Scotland, and excluding "All" entries
  filter(Year == "2018", HBR == "S92000003", AgeGroup != "All" & Sex != "All", InjuryType != "Accidental exposure" & InjuryType != "All") %>%
# Group the table according to injury type, age, and sex
  group_by(InjuryType, AgeGroup, Sex) %>%
# Create a summary of total number of deaths for neater appearance to the table and demonstration of the figures
  summarise(total_deaths = sum(NumberOfDeaths)) %>%
# Change the orientation of the table so that age groups become the variable headings
  pivot_wider(names_from = AgeGroup, values_from = total_deaths) %>%
  mutate(Sex = factor(Sex)) %>%
  arrange(Sex) %>%
  select(Sex, everything())

# Create table to present the data
Fig_2 <- kable(summary_tab[, -1],
  caption = "Deaths from unintentional injuries in Scotland",
    format = "latex", booktabs = TRUE) %>%
  kable_styling(font_size = 10) %>%
  # Group the rows under one heading per sex; index gives the row count per group
  pack_rows(colnum = 1,
    index = table(fct_inorder(summary_tab$Sex), useNA = "no"))
```
```{r, results='asis'}
Fig_2
```
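
To see what `pack_rows()` receives through `index`, here is a small sketch with made-up values (the `sex` vector is hypothetical). `index` expects a named vector of row counts, one entry per group, in the order the rows appear; calling `table()` on a factor whose levels follow first appearance produces that shape. Base R's `factor(x, levels = unique(x))` is used here as a stand-in for `forcats::fct_inorder()`:

```r
# Hypothetical Sex column after arrange(Sex): rows are already grouped
sex <- c("Female", "Female", "Female", "Male", "Male")

# Order the levels by first appearance, as fct_inorder() does
sex_f <- factor(sex, levels = unique(sex))

# Count rows per level; the names become the group labels
index <- table(sex_f, useNA = "no")
index

# The same information written out by hand as a plain named vector
manual_index <- c(Female = 3, Male = 2)
```

Passing either form to `pack_rows(index = ...)` should label the first three rows "Female" and the next two "Male", which is why the table has to be sorted by `Sex` first.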


Hi Jim,

This is absolutely brilliant, thanks so much for helping with this. Exactly what I was looking to do. Just wondered if you or anyone else might know how to make the font size of the title bigger?

Many thanks again,

CM

Hi Jim,

This is so helpful, thanks very much. I'm running it in RStudio and I can't seem to print the table. When I enter this script exactly (loading the packages too), it comes up with an empty space. I tried assigning it to an object with -> Fig_2 at the end and then printing it, and this also didn't work. I'm not quite sure why this is happening now, as the previous code printed?

Thanks so much again for all of your help,

Clare

Hi Clare,
I've updated the formatting in my answer. It should work if it is run as an R Markdown file, knitting to PDF output. You may have to install a LaTeX distribution, such as TinyTeX (installed via the R package tinytex), as well as the LaTeX package "caption" via tinytex::tlmgr_install("caption").

Jim
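
A sketch of that setup, wrapped in a helper so nothing is installed until it is called (the name `setup_pdf_toolchain()` is made up for illustration). It assumes no other LaTeX distribution is already installed and that the machine has internet access:

```r
# Hypothetical one-off helper: call it from the R console, not from the .Rmd
setup_pdf_toolchain <- function() {
  # Install the tinytex R package if it is missing
  if (!requireNamespace("tinytex", quietly = TRUE)) {
    install.packages("tinytex")
  }
  # Install the TinyTeX LaTeX distribution (a sizeable download)
  tinytex::install_tinytex()
  # Add the LaTeX package used for the larger caption font
  tinytex::tlmgr_install("caption")
}

# Run once from the console:
# setup_pdf_toolchain()
```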

Dear Jim,

Thanks again so much for your help with the table. I am trying to create a pdf document (needs to be pdf) to present the code chunks and working and print the final three figures in the pdf but the text seems to be outside the code margins of the boxes when printed in pdf. Also, I don't know how to change the font of the tables to the same as the text - maybe not possible?

Many thanks again,

CM1

title: 'Unintentional injuries in Scotland: admissions to hospital and deaths'
author: "CM1"
output:
  pdf_document: default
  html_document: default
  word_document: default
toc: yes
toc_float: yes
theme: flatly

r format(Sys.time(), '%d %B, %Y')

Overview

Unintentional injury is a common cause of emergency admission to hospital for adults and children.

Data Processing

# Set coding display settings to show code in HTML document, note cog settings switched to prevent warnings and messages from being presented in HTML document
knitr::opts_chunk$set(
	echo = TRUE,
	message = FALSE,
	warning = FALSE,
	dpi = 300)
# Load in R packages
library(tidyverse)
# Load the ggthemes package to allow extra themes and geoms for plotting
library(ggthemes)
# Load the janitor package for data-cleaning helpers
library(janitor)
# Set global options that affect how R computes and displays results. scipen is a penalty applied when deciding whether to print numeric values in fixed or exponential notation; positive values favour fixed notation
options(scipen = 9)
library(knitr)
library("kableExtra")
library("dplyr")
library("readr")
library("tidyr")
library("forcats")
library("formatR")
# Read in admissions data
orig_ui_admissions = read_csv("https://www.opendata.nhs.scot/dataset/b0135993-3d8a-4f3b-afcf-e01f4d52137c/resource/aee43295-2a13-48f6-bf05-92769ca7c6cf/download/ui_admissions_2020.csv")
# Review the nature and content of variable headings to assist with code writing for plotting
glimpse(orig_ui_admissions)
# Fix the size of the figure by adding comments into {r} command
# Pipe the original admissions data set
orig_ui_admissions %>% 
# Filter the data to include the most recent year of data and the whole of Scotland health board, as the comparison is then with a larger figure and focuses on the distribution of unintentional injuries by age and sex. Exclude data points that include "All" as these will confuse the creation of a table
  filter(FinancialYear == "2018/19", HBR == "S92000003" & AgeGroup != "All" & Sex != "All" & InjuryType != "All Diagnoses") %>% 
# Order age group categories in ascending order to provide clearer presentation and interpretation of the data
  mutate(AgeGroup = factor(AgeGroup, levels = c("0-4 years", "5-9 years", "10-14 years", "15-24 years", "25-44 years", "45-64 years", "65-74 years", "75plus years"))) %>%
# Group the data according to sex, age group, and type of injury
 group_by(Sex, AgeGroup, InjuryType)%>%
# Add all the admissions for each group, creating a cumulative total
  summarise(n = sum (NumberOfAdmissions)) %>%
# Create a new variable for total number of admissions for each type of injury for analysis of relative frequencies of each type of injury according to age and sex
  mutate(pct = n / sum(n)) %>%
# Set axis characteristics, and fill the columns according to injury type
  ggplot(aes(x = InjuryType, y = pct, fill = InjuryType)) +
# Specify style of plot
  geom_col()+
# Create a grid of tables grouped according to age and sex for easy visual comparison
  facet_wrap(~AgeGroup, nrow = 2) +
  facet_grid(Sex~AgeGroup)+
# Ensure that scales are continuous and range from 0 - 95% so that labels appear tidy on the plot
  scale_y_continuous(labels = scales::percent_format(), limits = c(0, 0.95)) +
# Specify limits of graph from 0 - 100% so that images are uniform and allow comparison
  expand_limits(y = c(0, 100)) +
# Remove padding around limits of plots before zero
  coord_cartesian(expand = FALSE) +
# Specify theme for presentation of plot
  theme_excel_new() +
# Specify position of the legend
  theme(axis.text.x = element_blank(), legend.position = "bottom") +
# Create title and subtitle for the plot and create an object name
  labs(title = "Admissions to hospital with unintentional injury", subtitle = "Grouped by age and sex", x = "", y = "Percentage of unintentional injury admissions") -> Fig_1

# Read in mortality data
orig_ui_deaths = read_csv("https://www.opendata.nhs.scot/dataset/b0135993-3d8a-4f3b-afcf-e01f4d52137c/resource/89807e07-fc5f-4b5e-a077-e4cf59491139/download/ui_deaths_2020.csv")
# Review the nature and content of variable headings 
glimpse(orig_ui_deaths)
library("knitr")
library("kableExtra")
library("dplyr")
library("readr")
library("tidyr")
library("forcats")

orig_ui_deaths = read_csv("https://www.opendata.nhs.scot/dataset/b0135993-3d8a-4f3b-afcf-e01f4d52137c/resource/89807e07-fc5f-4b5e-a077-e4cf59491139/download/ui_deaths_2020.csv")

summary_tab <- orig_ui_deaths %>%
  # Apply filters in the same way as in the plots: one year, whole of Scotland, and excluding "All" entries
  filter(Year == "2018", HBR == "S92000003", AgeGroup != "All" & Sex != "All", InjuryType != "Accidental exposure" & InjuryType != "All") %>%
# Group the table according to injury type, age, and sex
  group_by(InjuryType, AgeGroup, Sex) %>%
# Create a summary of total number of deaths for neater appearance to the table and demonstration of the figures
  summarise(total_deaths = sum(NumberOfDeaths)) %>%
# Change the orientation of the table so that age groups become the variable headings
  pivot_wider(names_from = AgeGroup, values_from = total_deaths) %>%
  mutate(Sex = factor(Sex)) %>%
  arrange(Sex) %>%
  select(Sex, everything())

# Create table to present the data
Fig_2 <- kable(summary_tab[, -1],
  caption = "Deaths from unintentional injuries in Scotland",
    format = "latex", booktabs = TRUE) %>%
  kable_styling(font_size = 10) %>%
  # Group the rows under one heading per sex; index gives the row count per group
  pack_rows(colnum = 1,
    index = table(fct_inorder(summary_tab$Sex), useNA = "no"))
# Join original admissions and deaths tibbles using full_join so that all admission and death variables are included in the tibble
ui_total1 = full_join(orig_ui_admissions, orig_ui_deaths)
# Review the nature and content of variable headings to assist with code writing for plotting
glimpse(ui_total1)
# Fix the size of the figure by adding comments into {r} command
# Pipe the joined data set
ui_total1 %>% 
# Filter the data to include the most recent year of data and the whole of Scotland health board, as the comparison is then with a larger figure and focuses on the distribution of unintentional injuries by age and sex. Exclude data points that include "All" as these will confuse the creation of a table, and drop rows with no death count
  filter(FinancialYear == "2018/19" & HBR == "S92000003" & AgeGroup != "All" & Sex != "All" & InjuryType != "All Diagnoses" & !is.na(NumberOfDeaths)) %>%
# Order age group categories in ascending order to provide clearer presentation and interpretation of the data
  mutate(AgeGroup = factor(AgeGroup, levels = c("0-4 years", "5-9 years", "10-14 years", "15-24 years", "25-44 years", "45-64 years", "65-74 years", "75plus years"))) %>% 
# Add all the admissions for each group, creating a cumulative total
  group_by(Sex, AgeGroup, InjuryType) %>%
# Add all the deaths for each group, creating a cumulative total
  summarise(n = sum(NumberOfDeaths), m = sum(NumberOfAdmissions)) %>%
# Create a new variable for proportion of number of deaths for admission for each type of injury for each type of injury according to age and sex
  mutate(pct = n / m) %>%
# Set axis characteristics, and fill the columns according to injury type
ggplot(aes(x = InjuryType, y = pct, fill = InjuryType)) +
# Specify style of plot
  geom_col()+
# Create a grid of tables grouped according to age and sex for easy visual comparison
  facet_wrap(~AgeGroup, nrow = 2) +
  facet_grid(Sex~AgeGroup)+
# Ensure that scales are continuous and range from 0 - 30% due to smaller numbers
  scale_y_continuous(labels = scales::percent_format(), limits = c(0, 0.3)) +
# Remove padding around limits of plots before zero
  coord_cartesian(expand = FALSE) +
# Specify theme for presentation of plot
  theme_excel_new() +
# Specify position of the legend
  theme(axis.text.x = element_blank(), legend.position = "bottom") +
# Create title and subtitle for the plot and create an object name
  labs(title = "Percentage of deaths among those admitted with unintentional injury", subtitle = "by age and sex", x = "", y = "number of patients") -> Fig_3

Results

Fig_1

Fig_2

Fig_3

Hi CM1,
For code, including comments, one rule of thumb is to wrap lines at 80 characters. With this limit, the code should fit nicely inside the boxes. In some cases, such as the URLs of the data sets, I broke the string up into separate pieces and joined them with paste0(), which inserts no space between them.

For producing books, etc, I think the recommended length is around 60 characters for readability.
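
One way to find the lines that need wrapping is to measure them. This is a rough sketch in base R; the helper name `long_lines()` and the sample text are made up for illustration:

```r
# Return the line numbers (and widths) of lines longer than `limit` characters
long_lines <- function(lines, limit = 80) {
  widths <- nchar(lines, type = "chars")
  too_long <- which(widths > limit)
  data.frame(line = too_long, width = widths[too_long])
}

# Made-up example: one short line and one over-long comment
sample_lines <- c(
  "x <- 1",
  paste0("# ", strrep("long comment ", 10))
)
res <- long_lines(sample_lines)
res

# For a real file, read it first (the file name here is a placeholder):
# long_lines(readLines("report.Rmd"))
```

knitr can also reformat code automatically: setting the chunk options `tidy = TRUE` and `tidy.opts = list(width.cutoff = 60)` hands the code to the formatR package, although long strings such as URLs are probably still left unbroken.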

As far as I can tell, the font in the tables is the same as in the main text, and in this example, set to Latin Modern.

Other than that, I've had to add \newpage between the figures and table so that they would appear in the correct order.

---
title: "Unintentional injuries in Scotland: admissions to hospital and deaths"
author: "CM1"
date: "`r format(Sys.time(), '%d %B, %Y')`"
output:
  pdf_document:
    fig_crop: no
toc: yes
toc_float: yes
geometry: margin=0.5in
header-includes:
- \usepackage{caption}
- \captionsetup{font=Large}
- \usepackage{lmodern}
- \usepackage[T1]{fontenc}
theme: flatly
---
# Overview

Unintentional injury is a common cause of emergency admission to hospital 
for adults and children.

# Data Processing
```{r setup, include=FALSE}
knitr::opts_chunk$set(
	echo = TRUE,
	message = FALSE,
	warning = FALSE,
	dpi = 300)
```

```{r ui-admissions}
library("knitr")
library("kableExtra")
library("dplyr")
library("readr")
library("tidyr")
library("forcats")
library("ggplot2")
library("ggthemes")

orig_ui_admissions = read_csv(paste0("https://www.opendata.nhs.scot/dataset/",
  "b0135993-3d8a-4f3b-afcf-e01f4d52137c/resource/",
  "aee43295-2a13-48f6-bf05-92769ca7c6cf/download/ui_admissions_2020.csv"))

# Fix the size of the figure by adding comments into {r} command
# Pipe the original admissions data set
orig_ui_admissions %>% 
# Filter the data to include the most recent year of data, the whole of Scotland
#   health board data as comparison is then with a larger figure and focuses on 
#   the distribution of unintentional injuries by age and sex. Exclude data 
#   points that include "All" as will confuse the creation of a table
  filter(FinancialYear == "2018/19", HBR == "S92000003" & AgeGroup != "All" & 
      Sex != "All" & InjuryType != "All Diagnoses") %>% 
# Order age group categories in ascending order to provide clearer presentation 
#   and interpretation of the data
  mutate(AgeGroup = factor(AgeGroup, levels = c("0-4 years", "5-9 years", 
    "10-14 years", "15-24 years", "25-44 years", "45-64 years", "65-74 years", 
    "75plus years"))) %>%
# Group the data according to sex, age group, and type of injury
 group_by(Sex, AgeGroup, InjuryType) %>%
# Add all the admissions for each group, creating a cumulative total
  summarise(n = sum(NumberOfAdmissions)) %>%
# Create a new variable for total number of admissions for each type of injury 
#   for analysis of relative frequencies of each type of injury according to age
#   and sex
  mutate(pct = n / sum(n)) %>%
# Set axis characteristics, and fill the columns according to injury type
  ggplot(aes(x = InjuryType, y = pct, fill = InjuryType)) +
# Specify style of plot
  geom_col() +
# Create a grid of tables grouped according to age and sex for easy visual
#   comparison
  #facet_wrap(~AgeGroup, nrow = 2) +
  facet_grid(Sex~AgeGroup) +
# Ensure that scales are continuous and range from 0 - 95% so that labels appear 
#   tidy on the plot
  scale_y_continuous(labels = scales::percent_format(), limits = c(0, 0.95)) +
# Specify limits of graph from 0 - 100% so that images are uniform and allow
#   comparison
  expand_limits(y = c(0, 100)) +
# Remove padding around limits of plots before zero
  coord_cartesian(expand = FALSE) +
# Specify theme for presentation of plot
  theme_excel_new() +
# Specify position of the legend
  theme(axis.text.x = element_blank(), legend.position = "bottom") +
# Create title and subtitle for the plot and create an object name
  labs(title = "Admissions to hospital with unintentional injury",
    subtitle = "Grouped by age and sex", x = "",
    y = "Percentage of unintentional injury admissions") -> Fig_1

```

```{r ui-deaths}
orig_ui_deaths = read_csv(paste0("https://www.opendata.nhs.scot/dataset/",
  "b0135993-3d8a-4f3b-afcf-e01f4d52137c/resource/",
  "89807e07-fc5f-4b5e-a077-e4cf59491139/download/ui_deaths_2020.csv"))

summary_tab <- orig_ui_deaths %>%
# Apply filters in the same way as in the plots: one year, whole of
#   Scotland, and excluding "All" entries
  filter(Year == "2018", HBR == "S92000003", AgeGroup != "All" & Sex != "All",
    InjuryType != "Accidental exposure" & InjuryType != "All") %>%
# Group the table according to injury type, age, and sex
  group_by(InjuryType, AgeGroup, Sex) %>%
# Create a summary of total number of deaths for neater appearance to the table 
#   and demonstration of the figures
  summarise(total_deaths = sum(NumberOfDeaths)) %>%
# Change the orientation of the table so that age groups become the variable 
#   headings
  pivot_wider(names_from = AgeGroup, values_from = total_deaths) %>%
  mutate(Sex = factor(Sex)) %>%
  arrange(Sex) %>%
  select(Sex, everything())

# Create table to present the data

Fig_2 <- kable(summary_tab[, -1],
  caption = "Deaths from unintentional injuries in Scotland",
    format = "latex", booktabs = TRUE) %>%
  kable_styling(font_size = 10) %>%
  # Group the rows under one heading per sex; index gives the row count per group
  pack_rows(colnum = 1,
    index = table(fct_inorder(summary_tab$Sex), useNA = "no"))

```

```{r ui-total}
# Join original admissions and deaths tibbles using full_join so that all 
#   admission and death variables are included in the tibble
ui_total1 = full_join(orig_ui_admissions, orig_ui_deaths)

# Fix the size of the figure by adding comments into {r} command
# Pipe the joined data set
ui_total1 %>% 
# Filter the data to include the most recent year of data, the whole of Scotland
#   health board data as comparison is then with a larger figure and focuses on
#   the distribution of unintentional injuries by age and sex. Exclude data
#   points that include "All" as will confuse the creation of a table, and drop
#   rows with no death count
  filter(FinancialYear == "2018/19" & HBR == "S92000003" & AgeGroup != "All" & 
    Sex != "All" & InjuryType != "All Diagnoses" & !is.na(NumberOfDeaths)) %>%
# Order age group categories in ascending order to provide clearer presentation 
#   and interpretation of the data
  mutate(AgeGroup = factor(AgeGroup, levels = c("0-4 years", "5-9 years", 
    "10-14 years", "15-24 years", "25-44 years", "45-64 years", "65-74 years", 
    "75plus years"))) %>% 
# Add all the admissions for each group, creating a cumulative total
  group_by(Sex, AgeGroup, InjuryType) %>%
# Add all the deaths for each group, creating a cumulative total
  summarise(n = sum(NumberOfDeaths), m = sum(NumberOfAdmissions)) %>%
# Create a new variable for proportion of number of deaths for admission for 
#   each type of injury for each type of injury according to age and sex
  mutate(pct = n / m) %>%
# Set axis characteristics, and fill the columns according to injury type
ggplot(aes(x = InjuryType, y = pct, fill = InjuryType)) +
# Specify style of plot
  geom_col() +
# Create a grid of tables grouped according to age and sex for easy visual
#   comparison
# facet_wrap(~AgeGroup, nrow = 2) +
  facet_grid(Sex~AgeGroup) +
# Ensure that scales are continuous and range from 0 - 30% due to smaller numbers
  scale_y_continuous(labels = scales::percent_format(), limits = c(0, 0.3)) +
# Remove padding around limits of plots before zero
  coord_cartesian(expand = FALSE) +
# Specify theme for presentation of plot
  theme_excel_new() +
# Specify position of the legend
  theme(axis.text.x = element_blank(), legend.position = "bottom") +
# Create title and subtitle for the plot and create an object name
  labs(title = "Percentage of deaths among those admitted with unintentional injury", 
    subtitle = "by age and sex", x = "", y = "number of patients") -> Fig_3
```


# Results

```{r Fig1, echo = FALSE}
Fig_1
```

\newpage

```{r, Fig2, results='asis', echo=FALSE}
Fig_2
```

\newpage

```{r, Fig3, echo=FALSE}
Fig_3
```

Hi Jim,

That is so helpful, thank you so much. Good to know about the lengths of comment in a row and to limit this. Also really helpful to know about the \newpage function for a cleaner look.

Many thanks again,
CM1


Hi CM,
I've edited my answer to address the issue of making the title larger.
