multiple box plot with posthoc test result as letters

Hello,

Could anyone please help me, how I can add posthoc test result as letters? I was able to plot multiple box plot by using below script and data.

below tutorial shows how to add posthoc result as letters, but I am not able to modify according to my data;

library(reshape2)
library(ggplot2)
library(multcompView)
library(tidyverse)
list.files()



bpdata <- data.frame(
  DO = c(4.24, 2.58, 2.85, 3.81, 2.64, 4.61, 5.74, 3.69, 4.24,
         2.66, 2.85, 3.81, 2.64, 4.61, 5.74, 3.69, 2.51, 2.66,
         3.14, 2.92, 1.83, 1.82, 3.02, 3.3, 2.51, 3.04, 2.85,
         2.92, 1.83, 1.82, 3.02, 3.3, 4.24, 3.04, 2.85, 3.81, 2.64,
         4.61, 5.74, 3.69, 4.24, 2.09, 2.98, 3.81, 1.83, 1.82,
         3.02, 3.3, 2.51, 2.09, 2.98, 8.17, 2.63, 7.69, 3.07, 3.7,
         2.51, 3.04, 2.18, 8.17, 2.63, 7.69, 3.07, 3.7, 2.58,
         2.09, 2.18, 2.16, 1.7, 6.8, 1.95, 4.28, 2.58, 2.09, 2.98,
         2.16, 1.7, 6.8, 1.95, 4.28, 2.66, 3.14, 2.18, 8.17, 2.63,
         7.69, 3.07, 3.7, 2.66, 3.14, 2.92, 2.16, 1.7, 6.8,
         1.95, 4.28),
  pH = c(9.94, 10.06, 9.98, 9.07, 8.99, 8.93, 9.24, 9.32, 9.94,
         10, 9.98, 9.07, 8.99, 8.93, 9.24, 9.32, 10.04, 10, 10,
         9.06, 8.99, 8.92, 9.18, 9.32, 10.04, 9.96, 9.98, 9.06,
         8.99, 8.92, 9.18, 9.32, 9.94, 9.96, 9.98, 9.07, 8.99,
         8.93, 9.24, 9.32, 9.94, 9.95, 10.09, 9.07, 8.99, 8.92,
         9.18, 9.32, 10.04, 9.95, 10.09, 9.04, 9, 9.53, 9.31, 9.14,
         10.04, 9.96, 9.97, 9.04, 9, 9.53, 9.31, 9.14, 10.06,
         9.95, 9.97, 8.96, 9.02, 9.51, 9.39, 8.85, 10.06, 9.95,
         10.09, 8.96, 9.02, 9.51, 9.39, 8.85, 10, 10, 9.97, 9.04, 9,
         9.53, 9.31, 9.14, 10, 10, 9.06, 8.96, 9.02, 9.51, 9.39,
         8.85),
  Temperature = c(24.48, 24.95, 24.01, 26.67, 24.01, 23.5, 24.46, 24.65,
                  24.48, 24.92, 24.01, 26.67, 24.01, 23.5, 24.46, 24.65,
                  24.35, 24.92, 24, 26.46, 23.99, 23.43, 24.14, 24.67,
                  24.35, 24.04, 24.01, 26.46, 23.99, 23.43, 24.14, 24.67,
                  24.48, 24.04, 24.01, 26.67, 24.01, 23.5, 24.46, 24.65,
                  24.48, 24.01, 24.36, 26.67, 23.99, 23.43, 24.14, 24.67,
                  24.35, 24.01, 24.36, 24.88, 24.7, 26.14, 24.19, 22.65,
                  24.35, 24.04, 24.4, 24.88, 24.7, 26.14, 24.19, 22.65, 24.95,
                  24.01, 24.4, 24.3, 24.72, 26.15, 23.9, 20.65, 24.95,
                  24.01, 24.36, 24.3, 24.72, 26.15, 23.9, 20.65, 24.92, 24,
                  24.4, 24.88, 24.7, 26.14, 24.19, 22.65, 24.92, 24,
                  26.46, 24.3, 24.72, 26.15, 23.9, 20.65),
  Turbidity = c(87.3, 95.2, 98.8, 24.9, 43.1, 49.1, 68.6, 69.8, 87.3,
                113, 98.8, 24.9, 43.1, 49.1, 68.6, 69.8, 87.7, 113,
                96.9, 39.7, 49, 47.6, 66.6, 74.1, 87.7, 98.5, 98.8, 39.7,
                49, 47.6, 66.6, 74.1, 87.3, 98.5, 98.8, 24.9, 43.1,
                49.1, 68.6, 69.8, 87.3, 95.9, 97.2, 24.9, 49, 47.6, 66.6,
                74.1, 87.7, 95.9, 97.2, 46.9, 37.2, 74.2, 69.9, 71.3,
                87.7, 98.5, 100.6, 46.9, 37.2, 74.2, 69.9, 71.3, 95.2, 95.9,
                100.6, 42, 36.1, 74.4, 44.4, 71.3, 95.2, 95.9, 97.2,
                42, 36.1, 74.4, 44.4, 71.3, 113, 96.9, 100.6, 46.9, 37.2,
                74.2, 69.9, 71.3, 113, 96.9, 39.7, 42, 36.1, 74.4, 44.4,
                71.3),
  ORP = c(97, 108.4, 102.9, 94, 63.9, 64.7, 40.3, 48.1, 97,
          111.9, 102.9, 94, 63.9, 64.7, 40.3, 48.1, 91.3, 111.9,
          108, 23.7, 64.1, 60.4, 40.6, 49.8, 91.3, 98.9, 102.9,
          23.7, 64.1, 60.4, 40.6, 49.8, 97, 98.9, 102.9, 94, 63.9,
          64.7, 40.3, 48.1, 97, 92.8, 106.9, 94, 64.1, 60.4, 40.6,
          49.8, 91.3, 92.8, 106.9, 90.4, 74.9, 67.7, 42, 37.1,
          91.3, 98.9, 102.7, 90.4, 74.9, 67.7, 42, 37.1, 108.4,
          92.8, 102.7, 53, 63.4, 73.9, 42, 37.5, 108.4, 92.8, 106.9,
          53, 63.4, 73.9, 42, 37.5, 111.9, 108, 102.7, 90.4, 74.9,
          67.7, 42, 37.1, 111.9, 111.9, 23.7, 53, 63.4, 73.9, 42,
          37.5),
  Ammonium = c(4.46, 4.62, 4.37, 5.6, 3.38, 3.21, 1.93, 2.21, 4.46,
               4.75, 4.37, 5.6, 3.38, 3.21, 1.93, 2.21, 4.5, 4.75,
               4.48, 7.64, 3.18, 3.09, 2.19, 2.14, 4.5, 3.97, 4.37, 7.64,
               3.18, 3.09, 2.19, 2.14, 4.46, 3.97, 4.37, 5.6, 3.38,
               3.21, 1.93, 2.21, 4.46, 4.06, 4.43, 5.6, 3.18, 3.09, 2.19,
               2.14, 4.5, 4.06, 4.43, 2.81, 4.2, 4.5, 1.99, 2.07, 4.5,
               3.97, 5.47, 2.81, 4.2, 4.5, 1.99, 2.07, 4.62, 4.06,
               5.47, 3.19, 3.97, 3.3, 3.38, 1.18, 4.62, 4.06, 4.43, 3.19,
               3.97, 3.3, 3.38, 1.18, 4.75, 4.48, 5.47, 2.81, 4.2, 4.5,
               1.99, 2.07, 4.75, 4.48, 7.64, 3.19, 3.97, 3.3, 3.38,
               1.18),
  Nitrates = c(3.6, 3.98, 3.57, 3.53, 2.55, 2.28, 0.71, 1.2, 3.6,
               3.21, 3.57, 3.53, 2.55, 2.28, 0.71, 1.2, 3.36, 3.21,
               3.76, 3.66, 2.44, 2.06, 0.67, 0.87, 3.36, 4.35, 3.57,
               3.66, 2.44, 2.06, 0.67, 0.87, 3.6, 4.35, 3.57, 3.53, 2.55,
               2.28, 0.71, 1.2, 3.6, 3.68, 3.34, 3.53, 2.44, 2.06,
               0.67, 0.87, 3.36, 3.68, 3.34, 2.25, 2.99, 1.3, 0.74, 0.68,
               3.36, 4.35, 4.11, 2.25, 2.99, 1.3, 0.74, 0.68, 3.98,
               3.68, 4.11, 2.1, 2.55, 1.11, 0.69, 0.69, 3.98, 3.68, 3.34,
               2.1, 2.55, 1.11, 0.69, 0.69, 3.21, 3.76, 4.11, 2.25,
               2.99, 1.3, 0.74, 0.68, 3.21, 3.76, 3.66, 2.1, 2.55, 1.11,
               0.69, 0.69),
  BGA = c(279133, 279113, 265304, 11603, 218471, 226658, 267389,
          267144, 279133, 270193, 265304, 11603, 218471, 226658,
          267389, 267144, 267716, 270193, 279119, 66608, 213497,
          242331, 267007, 261363, 267716, 279109, 265304, 66608,
          213497, 242331, 267007, 261363, 279133, 279109, 265304,
          11603, 218471, 226658, 267389, 267144, 279133, 279100,
          279099, 11603, 213497, 242331, 267007, 261363, 267716,
          279100, 279099, 246979, 169175, 272332, 269519, 272096,
          267716, 279109, 278137, 246979, 169175, 272332, 269519,
          272096, 279113, 279100, 278137, 174271, 159046, 271503,
          261124, 271303, 279113, 279100, 279099, 174271, 159046,
          271503, 261124, 271303, 270193, 279119, 278137, 246979,
          169175, 272332, 269519, 272096, 270193, 279119, 66608,
          174271, 159046, 271503, 261124, 271303),
  Chlrophyll = c(36.6, 38.9, 40, 44, 35, 32.8, 44, 44.8, 36.6, 39.8,
                 40, 44, 35, 32.8, 44, 44.8, 39.6, 39.8, 38.5, 45.1,
                 42.2, 50.2, 43.6, 45.1, 39.6, 43, 40, 45.1, 42.2, 50.2,
                 43.6, 45.1, 36.6, 43, 40, 44, 35, 32.8, 44, 44.8, 36.6,
                 50, 43, 44, 42.2, 50.2, 43.6, 45.1, 39.6, 50, 43, 34,
                 38.4, 58.5, 44.3, 40, 39.6, 43, 53.8, 34, 38.4, 58.5, 44.3,
                 40, 38.9, 50, 53.8, 50.3, 55.9, 51.3, 68.2, 34.2, 38.9,
                 50, 43, 50.3, 55.9, 51.3, 68.2, 34.2, 39.8, 38.5, 53.8,
                 34, 38.4, 58.5, 44.3, 40, 39.8, 38.5, 45.1, 50.3, 55.9,
                 51.3, 68.2, 34.2),
 Month = factor(c("July", "July", "July", "August", "August", "August",
                      "September", "September", "July", "July", "July",
                      "August", "August", "August", "September", "September",
                      "July", "July", "July", "August", "August", "August",
                      "September", "September", "July", "July", "July", "August",
                      "August", "August", "September", "September", "July",
                      "July", "July", "August", "August", "August", "September",
                      "September", "July", "July", "July", "August", "August",
                      "August", "September", "September", "July", "July",
                      "July", "August", "August", "September", "September",
                      "September", "July", "July", "July", "August", "August",
                      "September", "September", "September", "July", "July",
                      "July", "August", "August", "September", "September",
                      "September", "July", "July", "July", "August", "August",
                      "September", "September", "September", "July", "July", "July",
                      "August", "August", "September", "September",
                      "September", "July", "July", "August", "August", "August",
                      "September", "September", "September"),
                 levels = c("July", "August", "September"))
)


df.m <- melt(bpdata, id.var = "Month")

pdf("test.plot.pdf", width = 18, height = 12)


p <- ggplot(data = df.m, aes(x="", y=value)) +
theme_bw() +
theme(
      panel.grid.major=element_blank(),
      panel.grid.minor=element_blank(),
      plot.title = element_text(vjust = -8.5,hjust = 0.1),
      line=element_line(size=1),
     
      axis.text.x = element_text(size = 10, face ="bold", hjust = 0.5, colour = "black"),
      axis.text.y = element_text(size = 12, face ="bold", hjust = 0.5, colour = "black"),
      axis.ticks.x = element_blank()) +
       

theme(strip.background=element_rect(fill="white")) +
theme(strip.text=element_text(color="black",face="bold", size =14)) + 
geom_boxplot(aes(fill=Month),outlier.colour = NA) 
p + facet_wrap(.~variable, scales="free") 
dev.off();


#for posthoc test

model=lm(DO ~ Month, bpdata)
anova=aov(model)
summary(anova)




TukeyHSD_result <-TukeyHSD(anova)
capture.output(TukeyHSD_result, file = "HSD.DO.statistic.month.txt")




#for posthoc test

model=lm(pH ~ Month, bpdata)
anova=aov(model)
summary(anova)




TukeyHSD_result <-TukeyHSD(anova)
capture.output(TukeyHSD_result, file = "HSD.PH.statistic.month.txt")


#for posthoc test

model=lm(BGA ~ Month, bpdata)
anova=aov(model)
summary(anova)




TukeyHSD_result <-TukeyHSD(anova)
capture.output(TukeyHSD_result, file = "HSD.BGA.statistic.month.txt")



[test.plot.pdf|attachment](upload://6ANBkJIur1Ecif1LubRPbbybRuz.pdf) (7.0 KB)

Great reprex for the plotting part. Can you flesh it out with the HSD object that gives the post-hoc results to be plotted? Or is this a question about how to do an HSD?

@technocrat,

Thank you so much for your response. Actually I am not sure how to do HSD test for each of these parameters and then how to insert letters on top of the box plot?

I have tried below code HDS test but it is not making any sense?

#for posthoc test

model=lm(df.m$value ~ df.m$Month)
anova=aov(model)
summary(anova)




TukeyHSD_result <-TukeyHSD(anova)
capture.output(TukeyHSD_result, file = "HSD.statistic.month.txt")
Output;

  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = model)

$`df.m$Month`
                        diff        lwr       upr     p adj
August-July      -11951.0691 -27001.203  3099.065 0.1497724
September-July     -792.4927 -15975.063 14390.077 0.9917547
September-August  11158.5764  -4469.542 26786.695 0.2148850

Thanks,

Ok, let's focus on HSD, which I'll call f.

What it does is to determine the distance and confidence intervals of the differences in the means of some variable y depending on some other variables, x_i \dots x_n. In other words, f(x) = y.

In the context of your problem, y is clearly df.m$value and the three levels of df.m$Month represent x_1 \dots x_3, the so-called treatment variables. TukeyHSD_result captures the HSD measure of the differences between each of the pairs of months.

Now, ask: What is the unit of measurement of the differences of all the values for a given month?

It's not something like the parts per million of the log ratio of free-hydrogen to hydroxyl ions times degrees centigrade divided by the voltage of nephelometric turbidity units.

What do you need to do to df.m to make the units of means differences meaninful?

@technocrat,

Thank you so much for you time and quick response. I am still not able to understand. I will be thankful to you if you can help with some more detail code.

@technocrat,

I wanted to plot HSD test result of on box plot, and want to see how based on month; pH, temperature ,DO and other variables vary and whether there is statistically significant difference based on month.

Many thanks

Hi @bioinfonext. Sorry to be obscure. I meant that you can't do a difference analysis of all the different variables (now expressed as factors) at the same time. Fixing this will involve using bpdata, rather than df.m and making models for each of

[1] "DO"          "pH"          "Temperature" "Turbidity"   "ORP"        
[6] "Ammonium"    "Nitrates"    "BGA"         "Chlrophyll"

I've done that and now I'm working on doing the modeling/anova/TukeyHSD.

@technocrat,

Thank you so much for your help. Now with bpdata I was able to calculate the HSD test and also updated above R script but do I need to repeat the HSD test code for each of these variable and how I can represent HSD test result as latters in the above box plot?

#for posthoc test

model=lm(DO ~ Month, bpdata)
anova=aov(model)
summary(anova)




TukeyHSD_result <-TukeyHSD(anova)
capture.output(TukeyHSD_result, file = "HSD.DO.statistic.month.txt")




#for posthoc test

model=lm(pH ~ Month, bpdata)
anova=aov(model)
summary(anova)




TukeyHSD_result <-TukeyHSD(anova)
capture.output(TukeyHSD_result, file = "HSD.PH.statistic.month.txt")


#for posthoc test

model=lm(BGA ~ Month, bpdata)
anova=aov(model)
summary(anova)




TukeyHSD_result <-TukeyHSD(anova)
capture.output(TukeyHSD_result, file = "HSD.BGA.statistic.month.txt")

Many thanks

suppressPackageStartupMessages({
  library(ggplot2)
  library(glue)
  library(magrittr)
  library(multcompView)
  library(purrr)
})

get_var <- function(x, y) glue("Variable: ", names(x[y]))

pull_one <- function(x) {
  headline <- "Differences in mean levels of Month, 95% family-wise confidence level"
  ggplot(tibble::rownames_to_column(as.data.frame(collector[x][[1]]$Month), var = "Month"), aes(diff, Month)) +
    geom_pointrange(aes(xmin = lwr, xmax = upr)) +
    labs(title = headline, subtitle = get_var(collector, x)) +
    xlab("") +
    ylab("") +
    theme_minimal() +
    theme(
      panel.grid.major.y = element_blank(),
      panel.grid.minor.y = element_blank()
    )
}

bpdata <- data.frame(
  # moved to first column
  Month = as.factor(c(
    "July", "July", "July", "August", "August", "August",
    "September", "September", "July", "July", "July",
    "August", "August", "August", "September", "September",
    "July", "July", "July", "August", "August", "August",
    "September", "September", "July", "July", "July", "August",
    "August", "August", "September", "September", "July",
    "July", "July", "August", "August", "August", "September",
    "September", "July", "July", "July", "August", "August",
    "August", "September", "September", "July", "July",
    "July", "August", "August", "September", "September",
    "September", "July", "July", "July", "August", "August",
    "September", "September", "September", "July", "July",
    "July", "August", "August", "September", "September",
    "September", "July", "July", "July", "August", "August",
    "September", "September", "September", "July", "July", "July",
    "August", "August", "September", "September",
    "September", "July", "July", "August", "August", "August",
    "September", "September", "September"
  )),
  DO = c(
    4.24, 2.58, 2.85, 3.81, 2.64, 4.61, 5.74, 3.69, 4.24,
    2.66, 2.85, 3.81, 2.64, 4.61, 5.74, 3.69, 2.51, 2.66,
    3.14, 2.92, 1.83, 1.82, 3.02, 3.3, 2.51, 3.04, 2.85,
    2.92, 1.83, 1.82, 3.02, 3.3, 4.24, 3.04, 2.85, 3.81, 2.64,
    4.61, 5.74, 3.69, 4.24, 2.09, 2.98, 3.81, 1.83, 1.82,
    3.02, 3.3, 2.51, 2.09, 2.98, 8.17, 2.63, 7.69, 3.07, 3.7,
    2.51, 3.04, 2.18, 8.17, 2.63, 7.69, 3.07, 3.7, 2.58,
    2.09, 2.18, 2.16, 1.7, 6.8, 1.95, 4.28, 2.58, 2.09, 2.98,
    2.16, 1.7, 6.8, 1.95, 4.28, 2.66, 3.14, 2.18, 8.17, 2.63,
    7.69, 3.07, 3.7, 2.66, 3.14, 2.92, 2.16, 1.7, 6.8, 1.95, 4.28
  ),
  pH = c(
    9.94, 10.06, 9.98, 9.07, 8.99, 8.93, 9.24, 9.32, 9.94,
    10, 9.98, 9.07, 8.99, 8.93, 9.24, 9.32, 10.04, 10, 10,
    9.06, 8.99, 8.92, 9.18, 9.32, 10.04, 9.96, 9.98, 9.06,
    8.99, 8.92, 9.18, 9.32, 9.94, 9.96, 9.98, 9.07, 8.99,
    8.93, 9.24, 9.32, 9.94, 9.95, 10.09, 9.07, 8.99, 8.92,
    9.18, 9.32, 10.04, 9.95, 10.09, 9.04, 9, 9.53, 9.31, 9.14,
    10.04, 9.96, 9.97, 9.04, 9, 9.53, 9.31, 9.14, 10.06,
    9.95, 9.97, 8.96, 9.02, 9.51, 9.39, 8.85, 10.06, 9.95,
    10.09, 8.96, 9.02, 9.51, 9.39, 8.85, 10, 10, 9.97, 9.04, 9,
    9.53, 9.31, 9.14, 10, 10, 9.06, 8.96, 9.02, 9.51, 9.39,
    8.85
  ),
  Temperature = c(
    24.48, 24.95, 24.01, 26.67, 24.01, 23.5, 24.46, 24.65,
    24.48, 24.92, 24.01, 26.67, 24.01, 23.5, 24.46, 24.65,
    24.35, 24.92, 24, 26.46, 23.99, 23.43, 24.14, 24.67,
    24.35, 24.04, 24.01, 26.46, 23.99, 23.43, 24.14, 24.67,
    24.48, 24.04, 24.01, 26.67, 24.01, 23.5, 24.46, 24.65,
    24.48, 24.01, 24.36, 26.67, 23.99, 23.43, 24.14, 24.67,
    24.35, 24.01, 24.36, 24.88, 24.7, 26.14, 24.19, 22.65,
    24.35, 24.04, 24.4, 24.88, 24.7, 26.14, 24.19, 22.65, 24.95,
    24.01, 24.4, 24.3, 24.72, 26.15, 23.9, 20.65, 24.95,
    24.01, 24.36, 24.3, 24.72, 26.15, 23.9, 20.65, 24.92, 24,
    24.4, 24.88, 24.7, 26.14, 24.19, 22.65, 24.92, 24,
    26.46, 24.3, 24.72, 26.15, 23.9, 20.65
  ),
  Turbidity = c(
    87.3, 95.2, 98.8, 24.9, 43.1, 49.1, 68.6, 69.8, 87.3,
    113, 98.8, 24.9, 43.1, 49.1, 68.6, 69.8, 87.7, 113,
    96.9, 39.7, 49, 47.6, 66.6, 74.1, 87.7, 98.5, 98.8, 39.7,
    49, 47.6, 66.6, 74.1, 87.3, 98.5, 98.8, 24.9, 43.1,
    49.1, 68.6, 69.8, 87.3, 95.9, 97.2, 24.9, 49, 47.6, 66.6,
    74.1, 87.7, 95.9, 97.2, 46.9, 37.2, 74.2, 69.9, 71.3,
    87.7, 98.5, 100.6, 46.9, 37.2, 74.2, 69.9, 71.3, 95.2, 95.9,
    100.6, 42, 36.1, 74.4, 44.4, 71.3, 95.2, 95.9, 97.2,
    42, 36.1, 74.4, 44.4, 71.3, 113, 96.9, 100.6, 46.9, 37.2,
    74.2, 69.9, 71.3, 113, 96.9, 39.7, 42, 36.1, 74.4, 44.4,
    71.3
  ),
  ORP = c(
    97, 108.4, 102.9, 94, 63.9, 64.7, 40.3, 48.1, 97,
    111.9, 102.9, 94, 63.9, 64.7, 40.3, 48.1, 91.3, 111.9,
    108, 23.7, 64.1, 60.4, 40.6, 49.8, 91.3, 98.9, 102.9,
    23.7, 64.1, 60.4, 40.6, 49.8, 97, 98.9, 102.9, 94, 63.9,
    64.7, 40.3, 48.1, 97, 92.8, 106.9, 94, 64.1, 60.4, 40.6,
    49.8, 91.3, 92.8, 106.9, 90.4, 74.9, 67.7, 42, 37.1,
    91.3, 98.9, 102.7, 90.4, 74.9, 67.7, 42, 37.1, 108.4,
    92.8, 102.7, 53, 63.4, 73.9, 42, 37.5, 108.4, 92.8, 106.9,
    53, 63.4, 73.9, 42, 37.5, 111.9, 108, 102.7, 90.4, 74.9,
    67.7, 42, 37.1, 111.9, 111.9, 23.7, 53, 63.4, 73.9, 42,
    37.5
  ),
  Ammonium = c(
    4.46, 4.62, 4.37, 5.6, 3.38, 3.21, 1.93, 2.21, 4.46,
    4.75, 4.37, 5.6, 3.38, 3.21, 1.93, 2.21, 4.5, 4.75,
    4.48, 7.64, 3.18, 3.09, 2.19, 2.14, 4.5, 3.97, 4.37, 7.64,
    3.18, 3.09, 2.19, 2.14, 4.46, 3.97, 4.37, 5.6, 3.38,
    3.21, 1.93, 2.21, 4.46, 4.06, 4.43, 5.6, 3.18, 3.09, 2.19,
    2.14, 4.5, 4.06, 4.43, 2.81, 4.2, 4.5, 1.99, 2.07, 4.5,
    3.97, 5.47, 2.81, 4.2, 4.5, 1.99, 2.07, 4.62, 4.06,
    5.47, 3.19, 3.97, 3.3, 3.38, 1.18, 4.62, 4.06, 4.43, 3.19,
    3.97, 3.3, 3.38, 1.18, 4.75, 4.48, 5.47, 2.81, 4.2, 4.5,
    1.99, 2.07, 4.75, 4.48, 7.64, 3.19, 3.97, 3.3, 3.38,
    1.18
  ),
  Nitrates = c(
    3.6, 3.98, 3.57, 3.53, 2.55, 2.28, 0.71, 1.2, 3.6,
    3.21, 3.57, 3.53, 2.55, 2.28, 0.71, 1.2, 3.36, 3.21,
    3.76, 3.66, 2.44, 2.06, 0.67, 0.87, 3.36, 4.35, 3.57,
    3.66, 2.44, 2.06, 0.67, 0.87, 3.6, 4.35, 3.57, 3.53, 2.55,
    2.28, 0.71, 1.2, 3.6, 3.68, 3.34, 3.53, 2.44, 2.06,
    0.67, 0.87, 3.36, 3.68, 3.34, 2.25, 2.99, 1.3, 0.74, 0.68,
    3.36, 4.35, 4.11, 2.25, 2.99, 1.3, 0.74, 0.68, 3.98,
    3.68, 4.11, 2.1, 2.55, 1.11, 0.69, 0.69, 3.98, 3.68, 3.34,
    2.1, 2.55, 1.11, 0.69, 0.69, 3.21, 3.76, 4.11, 2.25,
    2.99, 1.3, 0.74, 0.68, 3.21, 3.76, 3.66, 2.1, 2.55, 1.11,
    0.69, 0.69
  ),
  BGA = c(
    279133, 279113, 265304, 11603, 218471, 226658, 267389,
    267144, 279133, 270193, 265304, 11603, 218471, 226658,
    267389, 267144, 267716, 270193, 279119, 66608, 213497,
    242331, 267007, 261363, 267716, 279109, 265304, 66608,
    213497, 242331, 267007, 261363, 279133, 279109, 265304,
    11603, 218471, 226658, 267389, 267144, 279133, 279100,
    279099, 11603, 213497, 242331, 267007, 261363, 267716,
    279100, 279099, 246979, 169175, 272332, 269519, 272096,
    267716, 279109, 278137, 246979, 169175, 272332, 269519,
    272096, 279113, 279100, 278137, 174271, 159046, 271503,
    261124, 271303, 279113, 279100, 279099, 174271, 159046,
    271503, 261124, 271303, 270193, 279119, 278137, 246979,
    169175, 272332, 269519, 272096, 270193, 279119, 66608,
    174271, 159046, 271503, 261124, 271303
  ),
  Chlrophyll = c(
    36.6, 38.9, 40, 44, 35, 32.8, 44, 44.8, 36.6, 39.8,
    40, 44, 35, 32.8, 44, 44.8, 39.6, 39.8, 38.5, 45.1,
    42.2, 50.2, 43.6, 45.1, 39.6, 43, 40, 45.1, 42.2, 50.2,
    43.6, 45.1, 36.6, 43, 40, 44, 35, 32.8, 44, 44.8, 36.6,
    50, 43, 44, 42.2, 50.2, 43.6, 45.1, 39.6, 50, 43, 34,
    38.4, 58.5, 44.3, 40, 39.6, 43, 53.8, 34, 38.4, 58.5, 44.3,
    40, 38.9, 50, 53.8, 50.3, 55.9, 51.3, 68.2, 34.2, 38.9,
    50, 43, 50.3, 55.9, 51.3, 68.2, 34.2, 39.8, 38.5, 53.8,
    34, 38.4, 58.5, 44.3, 40, 39.8, 38.5, 45.1, 50.3, 55.9,
    51.3, 68.2, 34.2
  )
)


collector <- bpdata[2:9] %>% map(~ TukeyHSD(aov(.x ~ Month, data = bpdata), "Month", ordered = TRUE))

pull_one(1)

pull_one(2)

pull_one(3)

pull_one(4)

pull_one(5)

pull_one(6)

pull_one(7)

pull_one(8)

@technocrat,

Thank you so much for your time and valuable help, can you please also help how we can plot these posthoc test result as letters in multiple box plot.

Many thanks,
Bioinfonext

I don’t understand what you mean by “letters.”

For boxplots, you’re looking to add the HSD difference and upper/lower CI?

Finally, is the ultimate question “can you tell the month of the year by looking at the other variables”?

@technocrat,
I mean can we add letter on top of the box plot, like in the above tutorial where the pair of month don't show difference in variable value will represented by same letter like a on the top of the box plot and which variable have difference based on month will be respresented by different letter like a , b.

Thanks

Here's an alternative that achieves the same result.

suppressPackageStartupMessages({
  library(multcompView)
})

bpdata <- data.frame(
  Month = as.factor(c(
    "July", "July", "July", "August", "August", "August",
    "September", "September", "July", "July", "July",
    "August", "August", "August", "September", "September",
    "July", "July", "July", "August", "August", "August",
    "September", "September", "July", "July", "July", "August",
    "August", "August", "September", "September", "July",
    "July", "July", "August", "August", "August", "September",
    "September", "July", "July", "July", "August", "August",
    "August", "September", "September", "July", "July",
    "July", "August", "August", "September", "September",
    "September", "July", "July", "July", "August", "August",
    "September", "September", "September", "July", "July",
    "July", "August", "August", "September", "September",
    "September", "July", "July", "July", "August", "August",
    "September", "September", "September", "July", "July", "July",
    "August", "August", "September", "September",
    "September", "July", "July", "August", "August", "August",
    "September", "September", "September"
  )),
  DO = c(
    4.24, 2.58, 2.85, 3.81, 2.64, 4.61, 5.74, 3.69, 4.24,
    2.66, 2.85, 3.81, 2.64, 4.61, 5.74, 3.69, 2.51, 2.66,
    3.14, 2.92, 1.83, 1.82, 3.02, 3.3, 2.51, 3.04, 2.85,
    2.92, 1.83, 1.82, 3.02, 3.3, 4.24, 3.04, 2.85, 3.81, 2.64,
    4.61, 5.74, 3.69, 4.24, 2.09, 2.98, 3.81, 1.83, 1.82,
    3.02, 3.3, 2.51, 2.09, 2.98, 8.17, 2.63, 7.69, 3.07, 3.7,
    2.51, 3.04, 2.18, 8.17, 2.63, 7.69, 3.07, 3.7, 2.58,
    2.09, 2.18, 2.16, 1.7, 6.8, 1.95, 4.28, 2.58, 2.09, 2.98,
    2.16, 1.7, 6.8, 1.95, 4.28, 2.66, 3.14, 2.18, 8.17, 2.63,
    7.69, 3.07, 3.7, 2.66, 3.14, 2.92, 2.16, 1.7, 6.8, 1.95, 4.28
  ),
  pH = c(
    9.94, 10.06, 9.98, 9.07, 8.99, 8.93, 9.24, 9.32, 9.94,
    10, 9.98, 9.07, 8.99, 8.93, 9.24, 9.32, 10.04, 10, 10,
    9.06, 8.99, 8.92, 9.18, 9.32, 10.04, 9.96, 9.98, 9.06,
    8.99, 8.92, 9.18, 9.32, 9.94, 9.96, 9.98, 9.07, 8.99,
    8.93, 9.24, 9.32, 9.94, 9.95, 10.09, 9.07, 8.99, 8.92,
    9.18, 9.32, 10.04, 9.95, 10.09, 9.04, 9, 9.53, 9.31, 9.14,
    10.04, 9.96, 9.97, 9.04, 9, 9.53, 9.31, 9.14, 10.06,
    9.95, 9.97, 8.96, 9.02, 9.51, 9.39, 8.85, 10.06, 9.95,
    10.09, 8.96, 9.02, 9.51, 9.39, 8.85, 10, 10, 9.97, 9.04, 9,
    9.53, 9.31, 9.14, 10, 10, 9.06, 8.96, 9.02, 9.51, 9.39,
    8.85
  ),
  Temperature = c(
    24.48, 24.95, 24.01, 26.67, 24.01, 23.5, 24.46, 24.65,
    24.48, 24.92, 24.01, 26.67, 24.01, 23.5, 24.46, 24.65,
    24.35, 24.92, 24, 26.46, 23.99, 23.43, 24.14, 24.67,
    24.35, 24.04, 24.01, 26.46, 23.99, 23.43, 24.14, 24.67,
    24.48, 24.04, 24.01, 26.67, 24.01, 23.5, 24.46, 24.65,
    24.48, 24.01, 24.36, 26.67, 23.99, 23.43, 24.14, 24.67,
    24.35, 24.01, 24.36, 24.88, 24.7, 26.14, 24.19, 22.65,
    24.35, 24.04, 24.4, 24.88, 24.7, 26.14, 24.19, 22.65, 24.95,
    24.01, 24.4, 24.3, 24.72, 26.15, 23.9, 20.65, 24.95,
    24.01, 24.36, 24.3, 24.72, 26.15, 23.9, 20.65, 24.92, 24,
    24.4, 24.88, 24.7, 26.14, 24.19, 22.65, 24.92, 24,
    26.46, 24.3, 24.72, 26.15, 23.9, 20.65
  ),
  Turbidity = c(
    87.3, 95.2, 98.8, 24.9, 43.1, 49.1, 68.6, 69.8, 87.3,
    113, 98.8, 24.9, 43.1, 49.1, 68.6, 69.8, 87.7, 113,
    96.9, 39.7, 49, 47.6, 66.6, 74.1, 87.7, 98.5, 98.8, 39.7,
    49, 47.6, 66.6, 74.1, 87.3, 98.5, 98.8, 24.9, 43.1,
    49.1, 68.6, 69.8, 87.3, 95.9, 97.2, 24.9, 49, 47.6, 66.6,
    74.1, 87.7, 95.9, 97.2, 46.9, 37.2, 74.2, 69.9, 71.3,
    87.7, 98.5, 100.6, 46.9, 37.2, 74.2, 69.9, 71.3, 95.2, 95.9,
    100.6, 42, 36.1, 74.4, 44.4, 71.3, 95.2, 95.9, 97.2,
    42, 36.1, 74.4, 44.4, 71.3, 113, 96.9, 100.6, 46.9, 37.2,
    74.2, 69.9, 71.3, 113, 96.9, 39.7, 42, 36.1, 74.4, 44.4,
    71.3
  ),
  ORP = c(
    97, 108.4, 102.9, 94, 63.9, 64.7, 40.3, 48.1, 97,
    111.9, 102.9, 94, 63.9, 64.7, 40.3, 48.1, 91.3, 111.9,
    108, 23.7, 64.1, 60.4, 40.6, 49.8, 91.3, 98.9, 102.9,
    23.7, 64.1, 60.4, 40.6, 49.8, 97, 98.9, 102.9, 94, 63.9,
    64.7, 40.3, 48.1, 97, 92.8, 106.9, 94, 64.1, 60.4, 40.6,
    49.8, 91.3, 92.8, 106.9, 90.4, 74.9, 67.7, 42, 37.1,
    91.3, 98.9, 102.7, 90.4, 74.9, 67.7, 42, 37.1, 108.4,
    92.8, 102.7, 53, 63.4, 73.9, 42, 37.5, 108.4, 92.8, 106.9,
    53, 63.4, 73.9, 42, 37.5, 111.9, 108, 102.7, 90.4, 74.9,
    67.7, 42, 37.1, 111.9, 111.9, 23.7, 53, 63.4, 73.9, 42,
    37.5
  ),
  Ammonium = c(
    4.46, 4.62, 4.37, 5.6, 3.38, 3.21, 1.93, 2.21, 4.46,
    4.75, 4.37, 5.6, 3.38, 3.21, 1.93, 2.21, 4.5, 4.75,
    4.48, 7.64, 3.18, 3.09, 2.19, 2.14, 4.5, 3.97, 4.37, 7.64,
    3.18, 3.09, 2.19, 2.14, 4.46, 3.97, 4.37, 5.6, 3.38,
    3.21, 1.93, 2.21, 4.46, 4.06, 4.43, 5.6, 3.18, 3.09, 2.19,
    2.14, 4.5, 4.06, 4.43, 2.81, 4.2, 4.5, 1.99, 2.07, 4.5,
    3.97, 5.47, 2.81, 4.2, 4.5, 1.99, 2.07, 4.62, 4.06,
    5.47, 3.19, 3.97, 3.3, 3.38, 1.18, 4.62, 4.06, 4.43, 3.19,
    3.97, 3.3, 3.38, 1.18, 4.75, 4.48, 5.47, 2.81, 4.2, 4.5,
    1.99, 2.07, 4.75, 4.48, 7.64, 3.19, 3.97, 3.3, 3.38,
    1.18
  ),
  Nitrates = c(
    3.6, 3.98, 3.57, 3.53, 2.55, 2.28, 0.71, 1.2, 3.6,
    3.21, 3.57, 3.53, 2.55, 2.28, 0.71, 1.2, 3.36, 3.21,
    3.76, 3.66, 2.44, 2.06, 0.67, 0.87, 3.36, 4.35, 3.57,
    3.66, 2.44, 2.06, 0.67, 0.87, 3.6, 4.35, 3.57, 3.53, 2.55,
    2.28, 0.71, 1.2, 3.6, 3.68, 3.34, 3.53, 2.44, 2.06,
    0.67, 0.87, 3.36, 3.68, 3.34, 2.25, 2.99, 1.3, 0.74, 0.68,
    3.36, 4.35, 4.11, 2.25, 2.99, 1.3, 0.74, 0.68, 3.98,
    3.68, 4.11, 2.1, 2.55, 1.11, 0.69, 0.69, 3.98, 3.68, 3.34,
    2.1, 2.55, 1.11, 0.69, 0.69, 3.21, 3.76, 4.11, 2.25,
    2.99, 1.3, 0.74, 0.68, 3.21, 3.76, 3.66, 2.1, 2.55, 1.11,
    0.69, 0.69
  ),
  BGA = c(
    279133, 279113, 265304, 11603, 218471, 226658, 267389,
    267144, 279133, 270193, 265304, 11603, 218471, 226658,
    267389, 267144, 267716, 270193, 279119, 66608, 213497,
    242331, 267007, 261363, 267716, 279109, 265304, 66608,
    213497, 242331, 267007, 261363, 279133, 279109, 265304,
    11603, 218471, 226658, 267389, 267144, 279133, 279100,
    279099, 11603, 213497, 242331, 267007, 261363, 267716,
    279100, 279099, 246979, 169175, 272332, 269519, 272096,
    267716, 279109, 278137, 246979, 169175, 272332, 269519,
    272096, 279113, 279100, 278137, 174271, 159046, 271503,
    261124, 271303, 279113, 279100, 279099, 174271, 159046,
    271503, 261124, 271303, 270193, 279119, 278137, 246979,
    169175, 272332, 269519, 272096, 270193, 279119, 66608,
    174271, 159046, 271503, 261124, 271303
  ),
  Chlorophyll = c(
    36.6, 38.9, 40, 44, 35, 32.8, 44, 44.8, 36.6, 39.8,
    40, 44, 35, 32.8, 44, 44.8, 39.6, 39.8, 38.5, 45.1,
    42.2, 50.2, 43.6, 45.1, 39.6, 43, 40, 45.1, 42.2, 50.2,
    43.6, 45.1, 36.6, 43, 40, 44, 35, 32.8, 44, 44.8, 36.6,
    50, 43, 44, 42.2, 50.2, 43.6, 45.1, 39.6, 50, 43, 34,
    38.4, 58.5, 44.3, 40, 39.6, 43, 53.8, 34, 38.4, 58.5, 44.3,
    40, 38.9, 50, 53.8, 50.3, 55.9, 51.3, 68.2, 34.2, 38.9,
    50, 43, 50.3, 55.9, 51.3, 68.2, 34.2, 39.8, 38.5, 53.8,
    34, 38.4, 58.5, 44.3, 40, 39.8, 38.5, 45.1, 50.3, 55.9,
    51.3, 68.2, 34.2
  )
)

# def use_py():
#   Vars = ["DO","pH","Temperature","Turbidity","ORP","Ammonium","Nitrates","BGA","Chlorophyll"]
#   part1 = "multcompBoxplot("
#   part2 = "~ Month, data=bpdata[,c(1,"
#   part3 = ")])"
#   for i in range(0,9): print(part1,Vars[i],part2,i+2,part3)

multcompBoxplot(DO ~ Month, data = bpdata[, c(1, 2)])

multcompBoxplot(pH ~ Month, data = bpdata[, c(1, 3)])

multcompBoxplot(Temperature ~ Month, data = bpdata[, c(1, 4)])
#> Error in xy.coords(x, y, xlabel, ylabel, log): 'x' is a list, but does not have components 'x' and 'y'

multcompBoxplot(Turbidity ~ Month, data = bpdata[, c(1, 5)])

multcompBoxplot(ORP ~ Month, data = bpdata[, c(1, 6)])

multcompBoxplot(Ammonium ~ Month, data = bpdata[, c(1, 7)])

multcompBoxplot(Nitrates ~ Month, data = bpdata[, c(1, 8)])

multcompBoxplot(BGA ~ Month, data = bpdata[, c(1, 9)])

multcompBoxplot(Chlorophyll ~ Month, data = bpdata[, c(1, 10)])

You can output the letters after you run the multiple comparisons. Then add them to the data set (I think the variable is called .group) or you can add a geom_text to the plot. I did not run your data but it might look like this

  • geom_text(data=plot_tukey, # data set containing the month group, maxy and group label
    aes(x= month, y=maxy, label= .group), # maxy is the maximum y value in the data
    size=5,
    nudge_x = c(.15, 0.25, -0.15, -0.15),
    nudge_y = c(0, 0, 0, 0.1), color = "black")

Here is some code for your data. You may have to drop jitter or adjust the letters a bit
library(multcomp)
library(emmeans)
bpdata$mon <- as.factor(bpdata$Month)
mod <-with(bpdata, lm(DO ~ mon))
lsms <- emmeans(mod, ~ mon)
CLD <- multcomp::cld(lsms, alpha = 0.05, Letters = LETTERS)
CLD # print the data set
str(CLD) # look at what is in the data set

CI with letters

ggplot(CLD, aes(x = mon, y = emmean, label = .group)) +
geom_point() +
geom_errorbar(aes(ymin = lower.CL, ymax = upper.CL)) +
geom_text(nudge_x = c(0.2, 0.2, 0.2, 0.2),
nudge_y = c(1, 1, 1, 1), color = "black")

boxplot with letters

first get the max value so we can plot in the range of the graph

add it to the data set

maxy <- c( by(bpdata$DO, bpdata$mon, max) )
maxs <- data.frame(maxy)
maxs$mon <- names(maxy)
plot_tukey <- merge(maxs, CLD, by="mon")
plot_tukey # print new data set note names of variables

now make the boxplots

boxes <- ggplot(bpdata, aes(x = mon, y = DO )) +
geom_boxplot(outlier.colour = "transparent" ) +
geom_jitter(colour = "black", width = 0.1, height = 0) + theme_bw() +
theme( plot.caption = element_text(hjust = 0)) +
ylab( "Dissolved Oxygen") +
xlab("Month") +
ggtitle ("Dissolved oxygen boxplots", subtitle = "Montly variation") +
labs(caption = paste0("Boxes indicate the least squares mean. \n", "Error bars indicate the 95% ",
"confidence interval of the mean. \n", "Means sharing a letter are not ", "significantly
different (Tukey-adjusted comparisons)."),
hjust=0.5)

boxes

add text - nudge_x _y moves the letters a bit. Pos = right of center,

Neg = left of center.

boxes + geom_text(data=plot_tukey,
aes(x= mon, y=maxy, label= .group),
size=5,
nudge_x = c(.15, 0.25, -0.15, -0.25),
nudge_y = c(0, 0, 0, 0.1), color = "black")

you want to drop the caption "Error bars indicate the 95% ",
"confidence interval of the mean. \n", unless you also plot those (I had deleted that code)

@technocrat,

Thank you so much for your help and valuable time. I just need to understand your code so it will be easier if you could compile the R function and box plot code in a single R script and then it will easy to understand which R packages I need to install as well.

Many thanks

That's what the code in my answer is. The only external dependency is the multcompView library. (It should install the multcomp library on its own.) The only tricky part is generating the dependent variable, as in this example for pH

multcompBoxplot(pH ~ Month, data = bpdata[, c(1, 3)])

I cheated and used the commented out python script to do this rather than using cut and paste or trying to create a function. Depending on the number of variables, this should not be hard to do. Note that it is also necessary to change the 3 in

 bpdata[, c(1, 3)])

to correspond to the column number of the variable (in this case, pH1). What this argument does is to identify a portion of the entire bdpdatadata set as the argument to themultcompBoxplotfunction, consisting of, (all rows) c(1,3)1, columns 1 (Month) and 3 (pH).

@technocrat,

Thank you so much, it work perfectly with reproducible code but I was trying to replicate it with original dataset and it showing some error; where 1st column is the factor like month and rest of the column are numerical variable; like pH, BGA....

suppressPackageStartupMessages({
  library(multcompView)
})

bpdata <- read.table("Box.plot.chemical.txt", sep="\t", header=TRUE)
str(bpdata)

'data.frame':   108 obs. of  10 variables:
 $ SampleType  : Factor w/ 12 levels "1D_B","1D_2B",..: 4 5 6 4 5 6 4 5 6 4 ...
 $ P      : num  2815 1839 2184 2761 2550 ...
 $ Mn     : num  23.7 18.8 20.5 26.6 26.2 ...
 $ Cu     : num  2.77 2.44 2.6 2.47 2.76 ...
 $ Zn     : num  14.5 13 13.1 14.1 14.2 ...
 $ Rb     : num  14.5 9.6 12.7 18 15.2 ...
 $ Cd     : num  0.04159 0.03088 0.03264 0.0035 0.00567 ...
 $ DMA    : num  0.01391 0.01713 0.01771 0.00732 0.0109 ...
 $ AsV    : num  0.1193 0.1244 0.1135 0.0616 0.0887 ...
 $ AsTotal: num  0.1333 0.142 0.1317 0.0693 0.0996 ...



pdf("test2.plot.pdf", width = 10, height = 8)

multcompBoxplot(SampleType ~ Cd, data = bpdata[, c(1, 6)])
dev.off();


Error;
Error in `[.data.frame`(data, , fm[-1]) : undefined columns selected
Calls: multcompBoxplot -> [ -> [.data.frame
Execution halted

Sorry for asking help again and again.
Many thanks,

I've been working with data frames. Not sure it makes a difference, but could you share enough that bpdata or Box.plot.chemical.txt to reproduce the issue?