Apply paired t test across multiple columns

Hello everyone,
I hope you all are doing well.

I am having trouble applying the function 'pairwise_t_test' across multiple columns.

t_test <- iris %>%
  group_by(Species) %>%
  rstatix::pairwise_t_test(column_name ~ Species, p.adjust.method = "bonferroni")

I don't know exactly how I can write a loop or a function to apply this code across all the remaining columns at once.

Thank you again.

First, your code doesn't seem to work. I don't think you should group_by() before applying the test; the grouping is already specified by the formula.

Then, to put it inside a function, you can build your formula as a string and convert it with as.formula():

library(dplyr)  # provides the %>% pipe

# col is the column name as a string, e.g. "Sepal.Length"
aaa <- function(dta, col){
  dta %>%
    rstatix::pairwise_t_test(as.formula(paste0(col, " ~ Species")),
                             p.adjust.method = "bonferroni")
}
aaa(iris, "Sepal.Length")

This works because as.formula() is a base R function that takes a plain string and doesn't use data masking. If you want to include a group_by(), you will need to take a look at this page that shows you how to call group_by({{ col }}).
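As a small aside on the as.formula() step, here is a minimal base-R sketch of what the string-to-formula conversion produces. The helper name make_formula and its group argument are made up for illustration; they are not part of rstatix.

```r
# Hypothetical helper (not from rstatix): build "<col> ~ <group>" as a formula
make_formula <- function(col, group = "Species") {
  as.formula(paste0(col, " ~ ", group))
}

f <- make_formula("Sepal.Length")
class(f)     # "formula"
all.vars(f)  # "Sepal.Length" "Species"
```

Column names that are not syntactically valid R names (spaces, leading digits) would need to be wrapped in backticks before pasting.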

So if you had 2 grouping variables:

# add another categorical column (add_column() comes from the tibble package)
library(tibble)
iris2 <- iris %>%
  add_column(other_cat = sample(1:5, nrow(iris), replace = TRUE))

aaa <- function(dta, col1, col2){
  dta %>%
    group_by({{col1}}) %>%
    rstatix::pairwise_t_test(as.formula(paste0(col2," ~ Species")), p.adjust.method = "bonferroni")
}
# these 2 commands produce the same result
iris2 %>%
  group_by(other_cat) %>%
  rstatix::pairwise_t_test(Sepal.Length ~ Species, p.adjust.method = "bonferroni")

aaa(iris2, other_cat, "Sepal.Length")

Of course, once you have a function doing what you want to a column, you can apply it to all columns with a map-family function.

Then you could use a function such as map() to automate the application of your function. Since rstatix::pairwise_t_test returns a data frame, you could imagine doing this:

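library(purrr)  # provides map_dfr()

# column: column name as a string; data: the data frame to test
process_col <- function(column, data){
  data %>%
    rstatix::pairwise_t_test(as.formula(paste0(column, " ~ Species")), p.adjust.method = "bonferroni")
}

# one pairwise test per measurement column, row-bound into a single data frame
map_dfr(names(iris)[1:4], process_col, data = iris)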

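Note that I swapped the order of the arguments so that the one that varies (the column name) comes first: map_dfr() passes each element of the vector as the first argument and forwards data = iris unchanged.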
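If you would rather not hard-code the column positions, the same idea can be sketched in base R only. Note this swaps in stats::pairwise.t.test() instead of rstatix (so it runs without extra packages) and returns a named list of test objects rather than one data frame; the names num_cols and results are just placeholders.

```r
# Pick every numeric column instead of hard-coding 1:4
num_cols <- names(iris)[sapply(iris, is.numeric)]

# Run one pairwise t test per column; the column name is the varying argument
results <- lapply(num_cols, function(col) {
  stats::pairwise.t.test(iris[[col]], iris$Species,
                         p.adjust.method = "bonferroni")
})
names(results) <- num_cols

# Matrix of Bonferroni-adjusted p-values for one column
results$Sepal.Length$p.value
```

With more than 10 columns nothing changes: num_cols simply gets longer, and each result stays accessible by column name.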

Hi @AlexisW,
my bad about the "group_by" statement. I realized that error while working on it.
Appreciate the information.

Please correct me if I am wrong, but do we have to apply aaa separately for each column? If so, it is a little impractical to use the "aaa" function for more than 10 columns.

aaa <- function(dta, col){
  dta %>%
    rstatix::pairwise_t_test(as.formula(paste0(col, " ~ Species")),
                             p.adjust.method = "bonferroni")
}
aaa(iris, "Sepal.Length")

Thanks @AlexisW. I will play around with this.