Adding Rows at Different Positions to a Contingency Table in R

I am using R. I generated a random data set and created the following contingency table from it:

library(memisc)
library(dplyr)

set.seed(123)

v1 <- c("2010-2011","2011-2012", "2012-2013", "2013-2014", "2014-2015") 
v2 <- c("A", "B", "C", "D", "E")
v3 <- c("Z", "Y", "X", "W" )
v4 <- c("data_1", "data_2", "data_3", "data_4" )


dates <- as.factor(sample(v1, 1000, replace=TRUE, prob=c(0.5, 0.2, 0.1, 0.1, 0.1)))

types <- as.factor(sample(v2,1000, replace=TRUE, prob=c(0.3, 0.2, 0.1, 0.1, 0.1)))

types2 <- as.factor(sample(v3, 1000, replace=TRUE, prob=c(0.3, 0.5, 0.1, 0.1)))

names <- as.factor(sample(v3, 1000, replace=TRUE, prob=c(0.3, 0.5, 0.1, 0.1)))

var = rnorm(1000,10,10)

problem_data = data.frame(var,dates, types, types2, names)


summary <- xtabs(~dates+names+types+types2, problem_data)
t = ftable(summary, row.vars=1, col.vars=2:4)

show_html(t)

If I wanted to add a row containing "Grand Totals" to the bottom of this table, I could do this as follows:

totals <- problem_data %>% group_by(names,  types, types2) %>% summarise(totals = n())
memisc::show_html(rbind(t, totals = totals$totals), varinfront = FALSE)

Is it possible to add "totals" at arbitrary positions in this contingency table?

For example, suppose I want to find the totals for the first two rows (2010-2011, 2011-2012) and then insert them into the table as a new third row. I can calculate the totals for the first two rows:

first_two_rows = subset(problem_data, dates %in% c("2010-2011","2011-2012"))

totals_first_two_rows <- first_two_rows %>% group_by(names,  types, types2) %>% summarise(totals = n())

But how can "totals_first_two_rows" be inserted at the third position of the contingency table? Following this Stack Overflow post (Add new row to dataframe, at specific row-index, not appended?), I tried the function provided in the answer:

insertRow <- function(existingDF, newrow, r) {
    existingDF[seq(r+1,nrow(existingDF)+1),] <- existingDF[seq(r,nrow(existingDF)),]
    existingDF[r,] <- newrow
    existingDF
}

insertRow(t, totals_first_two_rows, 3)

But this returns the following error:

Error in `[<-`(`*tmp*`, seq(r + 1, nrow(existingDF) + 1), , value = existingDF[seq(r,  : 
  subscript out of bounds

Can someone please show me how to fix this problem?

Thanks!
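
A note on the error, in case it helps: an ftable is stored as a matrix with extra attributes, not as a data frame, and R does not let you grow a matrix by assigning to a row index past its last row. The insertRow() function from that answer relies on data frames allowing exactly that. A minimal illustration with toy objects (m and df are made-up names for this sketch, not part of the original data):

```r
# Toy 3 x 2 matrix standing in for the ftable (which is a matrix underneath).
m <- matrix(1:6, nrow = 3)

# Assigning to row 4 of a 3-row matrix raises an error instead of extending it.
matrix_result <- tryCatch(
  {
    m[4, ] <- c(0L, 0L)
    "extended"
  },
  error = function(e) conditionMessage(e)
)
print(matrix_result)

# The same assignment on a data frame appends a new row.
df <- as.data.frame(m)
df[4, ] <- c(0L, 0L)
print(nrow(df))
```

Separately, totals_first_two_rows is a data frame with four columns (names, types, types2, totals), so even with a working insert it would first have to be reduced to a plain vector of counts with one entry per column of the table.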

The gt package might be an alternative solution for you to try.

Edit: also see janitor::tabyl ("tabyls: a tidy, fully-featured approach to counting things", from the janitor documentation).

Thank you! But is this possible without "gt"? Thanks!
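
It should be possible without gt. One way, sketched below in base R only, is to pull the counts out of the ftable as a plain matrix, rbind() the pieces back together with the subtotal in between, and then restore the ftable attributes. insert_total_row() is a hypothetical helper written for this sketch, it assumes a single row variable (as with row.vars = 1), and the example uses a smaller made-up data set rather than problem_data.

```r
set.seed(123)

# Small stand-in data set (assumption: same shape of problem, fewer variables).
v1 <- c("2010-2011", "2011-2012", "2012-2013", "2013-2014", "2014-2015")
d <- data.frame(
  dates = factor(sample(v1, 200, replace = TRUE), levels = v1),
  names = factor(sample(c("Z", "Y"), 200, replace = TRUE)),
  types = factor(sample(c("A", "B"), 200, replace = TRUE))
)

tab <- ftable(xtabs(~ dates + names + types, d), row.vars = 1, col.vars = 2:3)

# Hypothetical helper: sum the given rows column-wise and insert the result
# as a new row directly after row `after`. Only handles one row variable.
insert_total_row <- function(ft, rows, after, label) {
  rv <- attr(ft, "row.vars")
  cv <- attr(ft, "col.vars")
  stopifnot(length(rv) == 1, after >= 0, after <= nrow(ft))

  # Strip the ftable attributes to get an ordinary matrix of counts.
  counts <- matrix(as.vector(ft), nrow = nrow(ft))
  total  <- colSums(counts[rows, , drop = FALSE])

  top    <- counts[seq_len(after), , drop = FALSE]
  bottom <- counts[seq_len(nrow(counts) - after) + after, , drop = FALSE]
  out    <- unname(rbind(top, total, bottom))

  # Add the new row label and turn the matrix back into an ftable.
  rv[[1]] <- append(rv[[1]], label, after = after)
  attr(out, "row.vars") <- rv
  attr(out, "col.vars") <- cv
  class(out) <- "ftable"
  out
}

# Subtotal of the first two date rows, inserted as the new third row.
tab2 <- insert_total_row(tab, rows = 1:2, after = 2, label = "2010-2012")
print(tab2)
```

With the table from the question the call would be insert_total_row(t, rows = 1:2, after = 2, label = "2010-2012"). I have not checked how memisc::show_html() renders the modified object, so treat that part as untested. Computing the subtotal with colSums() also avoids one thing to watch for in the group_by()/summarise() version: as far as I know, dplyr drops combinations with zero rows by default, so totals$totals can end up shorter than the number of columns in the table.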
