How can I create a combined barplot like this image?

Hi friends, I have data from different countries for two groups of samples (control and test), and I have calculated three different indices: alpha, beta and gamma. All the data are in one large data frame. Here is a reproducible example of my data frame (df):

df <-
  structure(
    list(
      SampleID = c(
        "SRR490_profile", "SRR467268_profile", "SRR467265_profile", "SRR467264_profile",
        "SRR467263_profile", "SRR467258_profile", "SRR467250_profile", "SRR467244_profile",
        "SRR467242_profile", "SRR467227_profile", "ERR710432_profile", "ERR710430_profile",
        "ERR710429_profile", "SRR232646_profile", "SRR232622_profile", "SRR232619_profile",
        "SRR232617_profile", "SRR232616_profile", "SRR232614_profile", "SRR232612_profile",
        "HJ4386_profile", "HJ4382_profile", "HJ4380_profile", "HJ4379_profile",
        "HJ4377_profile", "HJ4375_profile", "HJ4374_profile", "HJ4373_profile",
        "HJ4372_profile", "HJ4370_profile", "KLM3246730_profile", "KLM3246729_profile",
        "KLM3246725_profile", "KLM3246722_profile", "KLM3246721_profile", "KLM3246714_profile",
        "KLM3246713_profile", "KLM3246712_profile", "KLM3246711_profile", "KLM3246705_profile"
      ),
      type = c(
        "test", "control", "control", "control", "control", "test", "test", "test",
        "control", "control", "control", "test", "control", "control", "control", "control",
        "control", "test", "control", "test", "control", "control", "control", "control",
        "control", "test", "control", "test", "test", "control", "control", "test",
        "test", "test", "control", "test", "control", "test", "test", "test"
      ),
      variable = c(
        "gamma", "alpha", "beta", "gamma", "alpha", "beta", "gamma", "alpha",
        "beta", "gamma", "alpha", "alpha", "beta", "gamma", "alpha", "beta",
        "gamma", "alpha", "beta", "gamma", "alpha", "alpha", "alpha", "beta",
        "beta", "gamma", "alpha", "beta", "gamma", "alpha", "beta", "gamma",
        "alpha", "beta", "gamma", "alpha", "beta", "gamma", "alpha", "beta"
      ),
      value = c(
        95, 90, 109, 112, 128, 109, 90, 87, 90, 99,
        129, 101, 113, 100, 111, 102, 99, 105, 85, 94,
        133, 117, 116, 83, 113, 126, 90, 128, 107, 121,
        88, 109, 82, 42, 67, 100, 88, 101, 127, 115
      ),
      StudyID = c(
        "swiss", "swiss", "swiss", "swiss", "swiss",
        "swiss", "swiss", "swiss", "swiss", "swiss",
        "australian", "australian", "australian", "australian", "australian",
        "australian", "australian", "australian", "australian", "australian",
        "thai", "thai", "thai", "thai", "thai",
        "thai", "thai", "thai", "thai", "thai",
        "english", "english", "english", "english", "english",
        "english", "english", "english", "english", "english"
      )
    ),
    row.names = c(NA, -40L),
    class = c("tbl_df", "tbl", "data.frame")
  )

Now, I want an image for each index containing barplots for the two groups (control and test), with the y-axis showing the StudyID names (i.e., country names), like the following (this image shows plots for two indices, alpha and beta):

Is this possible from my data? Can anyone give me a solution?

Thanks in advance

Given your initial data frame df above, try:

library(tidyverse)
df <-
  structure(
    list(
      SampleID = c(
        "SRR490_profile",
... SNIP ...
        "english",
        "english",
        "english"
      )
    ),
    row.names = c(NA, -40L),
    class = c("tbl_df", "tbl", "data.frame")
  )

# Mean and standard deviation of value for each index, study and group
summarized_df <- df %>% 
  group_by(variable, StudyID, type) %>% 
  summarise(mean = mean(value, na.rm = TRUE), 
            stdev = sd(value, na.rm = TRUE)) 
#> `summarise()` has grouped output by 'variable', 'StudyID'. You can override using the `.groups` argument.

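# One panel per index: points mark the group means,
# horizontal bars span mean +/- one standard deviation
ggplot(summarized_df) +
  geom_point(aes(x = mean, y = StudyID, color = type)) +
  geom_errorbarh(aes(xmin = mean - stdev, 
                     xmax = mean + stdev, 
                     y = StudyID, color = type)) +  
  facet_wrap(vars(variable))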
#> Warning: Removed 5 rows containing missing values (geom_errorbarh).

Created on 2021-05-19 by the reprex package (v2.0.0)
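
A note on the warning above: sd() returns NA when a group contains only one value, and geom_errorbarh() drops rows whose xmin or xmax is NA. In the example df, five variable/StudyID/type combinations have a single sample, which accounts for the 5 removed rows. Here is a minimal sketch of the effect on a made-up toy tibble (the names toy, toy_summary and single_groups and the values are only for illustration, not from your data):

```r
library(dplyr)

# Hypothetical toy data: group "b" has only one observation
toy <- tibble(
  group = c("a", "a", "b"),
  value = c(1, 3, 5)
)

toy_summary <- toy %>%
  group_by(group) %>%
  summarise(n = n(),
            mean = mean(value),
            stdev = sd(value))  # sd() of a single value is NA

# Groups that cannot get an error bar
single_groups <- toy_summary %>% filter(n == 1)
single_groups
```

On the real data, df %>% count(variable, StudyID, type) %>% filter(n == 1) should list the affected groups in the same way.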


Thanks a lot @Old_FORTRAN_Dog for your response. But it has two problems:

  1. The ranges of alpha, beta and gamma are not the same. For example, in the original dataset gamma ranges from 0 to 1, but here the x-axis of every panel runs from 0 to 150, so not all of the plots are readable. This figure, generated with your code, shows the problem:
    [image: plot365]
  2. The control and test points and error bars overlap. How can I separate them, as in the figure I provided in the question?

For 1, add scales = "free_x" to facet_wrap() so that each panel gets its own x-axis range.
For 2, dodge the points and error bars with position_dodge(), using the same width for both so they stay aligned.

ggplot(summarized_df) +
  # shift control and test apart along the y-axis
  geom_point(aes(x = mean, y = StudyID, color = type),
             position = position_dodge(width = 0.3)) +
  geom_errorbarh(aes(xmin = mean - stdev, 
                     xmax = mean + stdev,
                     y = StudyID, color = type),
                 position = position_dodge(width = 0.3),
                 height = 0.3) +  
  # let each index use its own x-axis range
  facet_wrap(vars(variable),
             scales = "free_x")
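
If you want actual bars, as in the title of the question, the same dodging idea should carry over to geom_col(). This is a minimal sketch, assuming a summary table with the same columns as summarized_df. The toy_summary tibble and its numbers are made up so that the block runs on its own, and the dodge width of 0.9 is simply chosen to match the default bar width:

```r
library(ggplot2)
library(tibble)

# Made-up summary values, only to make the sketch runnable;
# swap toy_summary for summarized_df to plot the real data
toy_summary <- tribble(
  ~variable, ~StudyID, ~type,     ~mean, ~stdev,
  "alpha",   "swiss",  "control", 109,   12,
  "alpha",   "swiss",  "test",     95,    8,
  "alpha",   "thai",   "control", 115,   15,
  "alpha",   "thai",   "test",    101,   10,
  "gamma",   "swiss",  "control", 0.60,  0.10,
  "gamma",   "swiss",  "test",    0.45,  0.08,
  "gamma",   "thai",   "control", 0.70,  0.12,
  "gamma",   "thai",   "test",    0.55,  0.09
)

# One shared position object keeps bars and error bars aligned
dodge <- position_dodge(width = 0.9)

p <- ggplot(toy_summary, aes(y = StudyID)) +
  # horizontal bars, control and test side by side within each study
  geom_col(aes(x = mean, fill = type), position = dodge) +
  # group = type so the error bars are dodged the same way as the bars
  geom_errorbarh(aes(xmin = mean - stdev, xmax = mean + stdev, group = type),
                 position = dodge, height = 0.3) +
  facet_wrap(vars(variable), scales = "free_x")
p
```

Because bars start at zero, this layout suits indices whose values are meaningful relative to zero; the point-and-error-bar version may be easier to read otherwise.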

