stacked barplot with samples in x-axis arranged according to “early” and “late” groups

Hi friends, I am a novice in R and struggling a lot. I'm working with the phyloseq
package to analyse 16S metagenomic data. I want to make a stacked barplot with abundance on the y-axis and samples on the x-axis. I have 10 samples, 4 "early" and 6 "late". How can I arrange them in the barplot? I mean the first four bars should be the controls and the next six the obese samples. I have tried the reorder function, but found that reorder only works with numeric values, not with non-numeric "early" and "late" values. Here's my code:

ggplot(mdata_class, aes(x = Sample, y = Abundance, fill = Class)) + 
  #facet_grid(time~.) +
  geom_bar(stat = "identity") +
  scale_fill_manual(values = class_colors) +
  scale_x_discrete(
    breaks = map$sample,
    labels = map$time,
    drop = FALSE
  )  +
  # Remove x axis title, and rotate sample labels
  theme(axis.title.x = element_blank(),
        axis.text.x=element_text(angle=90,hjust=1,vjust=0.5)) + 
  
  # additional stuff
  guides(fill = guide_legend(reverse = TRUE, keywidth = 1, keyheight = 1)) +
  ylab("Relative Abundance (Class > 1%) \n") +
  ggtitle("Class Composition of Mothur MiSeq SOP data per individual") 

This is as far as I've got. I want to arrange the "early" samples together and the "late" samples together.

Can anyone please help me out?

Thanks and Regards,
DC7

Without your data it's hard to advise you with confidence, because we can't test anything directly.
Can you share the data?
For example, I'm curious about the relationship between
mdata_class and map ($sample and $time).

Here's the output when I print mdata_class

        OTU    Sample  Abundance    sample    type  time bodyweight     Kingdom            Phylum                  Class
1  Otu00009 DRR046797 0.10213514 DRR046797 control early         20 k__Bacteria p__Actinobacteria      c__Actinobacteria
2  Otu00009 DRR046803 0.08912090 DRR046803   obese  late         45 k__Bacteria p__Actinobacteria      c__Actinobacteria
3  Otu00009 DRR046798 0.08548668 DRR046798 control early         40 k__Bacteria p__Actinobacteria      c__Actinobacteria
4  Otu00009 DRR046801 0.05865369 DRR046801   obese  late         45 k__Bacteria p__Actinobacteria      c__Actinobacteria
5  Otu00009 DRR046805 0.05629529 DRR046805   obese early         78 k__Bacteria p__Actinobacteria      c__Actinobacteria
6  Otu00009 DRR046800 0.05349115 DRR046800 control  late         45 k__Bacteria p__Actinobacteria      c__Actinobacteria
7  Otu00009 DRR046802 0.02630549 DRR046802   obese  late         45 k__Bacteria p__Actinobacteria      c__Actinobacteria
8  Otu00020 DRR046804 0.14788058 DRR046804   obese  late         45 k__Bacteria     p__Firmicutes             c__Bacilli
9  Otu00020 DRR046798 0.05880534 DRR046798 control early         40 k__Bacteria     p__Firmicutes             c__Bacilli
10 Otu00020 DRR046802 0.02007156 DRR046802   obese  late         45 k__Bacteria     p__Firmicutes             c__Bacilli
11 Otu00020 DRR046803 0.01713152 DRR046803   obese  late         45 k__Bacteria     p__Firmicutes             c__Bacilli
12 Otu00020 DRR046797 0.01035021 DRR046797 control early         20 k__Bacteria     p__Firmicutes             c__Bacilli
13 Otu00002 DRR046799 0.39862975 DRR046799 control  late         45 k__Bacteria  p__Bacteroidetes         c__Bacteroidia
14 Otu00002 DRR046797 0.36894936 DRR046797 control early         20 k__Bacteria  p__Bacteroidetes         c__Bacteroidia
15 Otu00002 DRR046800 0.35967465 DRR046800 control  late         45 k__Bacteria  p__Bacteroidetes         c__Bacteroidia
16 Otu00002 DRR046804 0.34913663 DRR046804   obese  late         45 k__Bacteria  p__Bacteroidetes         c__Bacteroidia
17 Otu00002 DRR046805 0.33464421 DRR046805   obese early         78 k__Bacteria  p__Bacteroidetes         c__Bacteroidia
18 Otu00002 DRR046798 0.33061613 DRR046798 control early         40 k__Bacteria  p__Bacteroidetes         c__Bacteroidia
19 Otu00002 DRR046803 0.26273936 DRR046803   obese  late         45 k__Bacteria  p__Bacteroidetes         c__Bacteroidia
20 Otu00002 DRR046801 0.26121922 DRR046801   obese  late         45 k__Bacteria  p__Bacteroidetes         c__Bacteroidia
21 Otu00002 DRR046806 0.18225234 DRR046806   obese early         32 k__Bacteria  p__Bacteroidetes         c__Bacteroidia
22 Otu00002 DRR046802 0.17431231 DRR046802   obese  late         45 k__Bacteria  p__Bacteroidetes         c__Bacteroidia
23 Otu00035 DRR046798 0.01992751 DRR046798 control early         40 k__Bacteria p__Proteobacteria  c__Betaproteobacteria
24 Otu00035 DRR046799 0.01478939 DRR046799 control  late         45 k__Bacteria p__Proteobacteria  c__Betaproteobacteria
25 Otu00035 DRR046803 0.01371867 DRR046803   obese  late         45 k__Bacteria p__Proteobacteria  c__Betaproteobacteria
26 Otu00035 DRR046804 0.01245559 DRR046804   obese  late         45 k__Bacteria p__Proteobacteria  c__Betaproteobacteria
27 Otu00035 DRR046801 0.01083689 DRR046801   obese  late         45 k__Bacteria p__Proteobacteria  c__Betaproteobacteria
28 Otu00001 DRR046806 0.66847252 DRR046806   obese early         32 k__Bacteria     p__Firmicutes          c__Clostridia
29 Otu00001 DRR046802 0.66746338 DRR046802   obese  late         45 k__Bacteria     p__Firmicutes          c__Clostridia
30 Otu00001 DRR046801 0.63904709 DRR046801   obese  late         45 k__Bacteria     p__Firmicutes          c__Clostridia
31 Otu00001 DRR046803 0.57818463 DRR046803   obese  late         45 k__Bacteria     p__Firmicutes          c__Clostridia
32 Otu00001 DRR046800 0.56657288 DRR046800 control  late         45 k__Bacteria     p__Firmicutes          c__Clostridia
33 Otu00001 DRR046799 0.56348693 DRR046799 control  late         45 k__Bacteria     p__Firmicutes          c__Clostridia
34 Otu00001 DRR046805 0.54092091 DRR046805   obese early         78 k__Bacteria     p__Firmicutes          c__Clostridia
35 Otu00001 DRR046797 0.47789350 DRR046797 control early         20 k__Bacteria     p__Firmicutes          c__Clostridia
36 Otu00001 DRR046798 0.47385673 DRR046798 control early         40 k__Bacteria     p__Firmicutes          c__Clostridia
37 Otu00001 DRR046804 0.44762754 DRR046804   obese  late         45 k__Bacteria     p__Firmicutes          c__Clostridia
38 Otu00014 DRR046802 0.06449178 DRR046802   obese  late         45 k__Bacteria p__Actinobacteria      c__Coriobacteriia
39 Otu00014 DRR046805 0.03359757 DRR046805   obese early         78 k__Bacteria p__Actinobacteria      c__Coriobacteriia
40 Otu00014 DRR046797 0.03141518 DRR046797 control early         20 k__Bacteria p__Actinobacteria      c__Coriobacteriia
41 Otu00014 DRR046803 0.03140499 DRR046803   obese  late         45 k__Bacteria p__Actinobacteria      c__Coriobacteriia
42 Otu00014 DRR046804 0.01792293 DRR046804   obese  late         45 k__Bacteria p__Actinobacteria      c__Coriobacteriia
43 Otu00014 DRR046806 0.01570523 DRR046806   obese early         32 k__Bacteria p__Actinobacteria      c__Coriobacteriia
44 Otu00014 DRR046801 0.01404050 DRR046801   obese  late         45 k__Bacteria p__Actinobacteria      c__Coriobacteriia
45 Otu00058 DRR046802 0.03083417 DRR046802   obese  late         45 k__Bacteria     p__Firmicutes     c__Erysipelotrichi
46 Otu00058 DRR046805 0.01128693 DRR046805   obese early         78 k__Bacteria     p__Firmicutes     c__Erysipelotrichi
47 Otu00031 DRR046806 0.09179629 DRR046806   obese early         32 k__Bacteria   p__Fusobacteria       c__Fusobacteriia
48 Otu00066 DRR046798 0.02033571 DRR046798 control early         40 k__Bacteria p__Proteobacteria c__Gammaproteobacteria

And here's the output when I print map

Sample Data:        [10 samples by 4 sample variables]:
             sample    type  time bodyweight
DRR046797 DRR046797 control early         20
DRR046798 DRR046798 control early         40
DRR046799 DRR046799 control  late         45
DRR046800 DRR046800 control  late         45
DRR046801 DRR046801   obese  late         45
DRR046802 DRR046802   obese  late         45
DRR046803 DRR046803   obese  late         45
DRR046804 DRR046804   obese  late         45
DRR046805 DRR046805   obese early         78
DRR046806 DRR046806   obese early         32

Thanks,
DC7

The x scale you have is based on sample, and its natural ordering is alphabetical. Whether a sample is late or early is independent of that alphabetical ordering, so we order the samples first by time, then by sample, and use the factor data type (which is stored internally as integers) to fix a definite ordering of the samples.
Note that I shared the data for mdata_class as directly reproducible text (dput can do this).
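
Before the full code, here is a minimal base R sketch of the idea on its own. The four-row data frame `toy` is made up for illustration (the IDs and time points are taken from the map printout above); it only shows how sorting and then fixing the factor levels controls the order a discrete axis would follow.

```r
# Toy data: four sample IDs with their time points (illustrative subset)
toy <- data.frame(
  sample = c("DRR046797", "DRR046799", "DRR046805", "DRR046801"),
  time   = c("early", "late", "early", "late"),
  stringsAsFactors = FALSE
)

# Sort rows by time first, then by sample ID within each time group
toy <- toy[order(toy$time, toy$sample), ]

# Fix the factor levels in the sorted order of appearance;
# a discrete axis follows the level order, not the alphabet
toy$sample <- factor(toy$sample, levels = unique(toy$sample))

levels(toy$sample)
# "DRR046797" "DRR046805" "DRR046799" "DRR046801"
```

Both "early" samples now come before both "late" ones, even though DRR046799 and DRR046801 sort before DRR046805 alphabetically.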


library(tidyverse)

# dput(mdata_class)
mdata_class <- structure(list(OTU = c("Otu00009", "Otu00009", "Otu00009", "Otu00009", 
                                      "Otu00009", "Otu00009", "Otu00009", "Otu00020", "Otu00020", "Otu00020", 
                                      "Otu00020", "Otu00020", "Otu00002", "Otu00002", "Otu00002", "Otu00002", 
                                      "Otu00002", "Otu00002", "Otu00002", "Otu00002", "Otu00002", "Otu00002", 
                                      "Otu00035", "Otu00035", "Otu00035", "Otu00035", "Otu00035", "Otu00001", 
                                      "Otu00001", "Otu00001", "Otu00001", "Otu00001", "Otu00001", "Otu00001", 
                                      "Otu00001", "Otu00001", "Otu00001", "Otu00014", "Otu00014", "Otu00014", 
                                      "Otu00014", "Otu00014", "Otu00014", "Otu00014", "Otu00058", "Otu00058", 
                                      "Otu00031", "Otu00066"), Sample = c("DRR046797", "DRR046803", 
                                                                          "DRR046798", "DRR046801", "DRR046805", "DRR046800", "DRR046802", 
                                                                          "DRR046804", "DRR046798", "DRR046802", "DRR046803", "DRR046797", 
                                                                          "DRR046799", "DRR046797", "DRR046800", "DRR046804", "DRR046805", 
                                                                          "DRR046798", "DRR046803", "DRR046801", "DRR046806", "DRR046802", 
                                                                          "DRR046798", "DRR046799", "DRR046803", "DRR046804", "DRR046801", 
                                                                          "DRR046806", "DRR046802", "DRR046801", "DRR046803", "DRR046800", 
                                                                          "DRR046799", "DRR046805", "DRR046797", "DRR046798", "DRR046804", 
                                                                          "DRR046802", "DRR046805", "DRR046797", "DRR046803", "DRR046804", 
                                                                          "DRR046806", "DRR046801", "DRR046802", "DRR046805", "DRR046806", 
                                                                          "DRR046798"), Abundance = c(0.10213514, 0.0891209, 0.08548668, 
                                                                                                      0.05865369, 0.05629529, 0.05349115, 0.02630549, 0.14788058, 0.05880534, 
                                                                                                      0.02007156, 0.01713152, 0.01035021, 0.39862975, 0.36894936, 0.35967465, 
                                                                                                      0.34913663, 0.33464421, 0.33061613, 0.26273936, 0.26121922, 0.18225234, 
                                                                                                      0.17431231, 0.01992751, 0.01478939, 0.01371867, 0.01245559, 0.01083689, 
                                                                                                      0.66847252, 0.66746338, 0.63904709, 0.57818463, 0.56657288, 0.56348693, 
                                                                                                      0.54092091, 0.4778935, 0.47385673, 0.44762754, 0.06449178, 0.03359757, 
                                                                                                      0.03141518, 0.03140499, 0.01792293, 0.01570523, 0.0140405, 0.03083417, 
                                                                                                      0.01128693, 0.09179629, 0.02033571), sample = c("DRR046797", 
                                                                                                                                                      "DRR046803", "DRR046798", "DRR046801", "DRR046805", "DRR046800", 
                                                                                                                                                      "DRR046802", "DRR046804", "DRR046798", "DRR046802", "DRR046803", 
                                                                                                                                                      "DRR046797", "DRR046799", "DRR046797", "DRR046800", "DRR046804", 
                                                                                                                                                      "DRR046805", "DRR046798", "DRR046803", "DRR046801", "DRR046806", 
                                                                                                                                                      "DRR046802", "DRR046798", "DRR046799", "DRR046803", "DRR046804", 
                                                                                                                                                      "DRR046801", "DRR046806", "DRR046802", "DRR046801", "DRR046803", 
                                                                                                                                                      "DRR046800", "DRR046799", "DRR046805", "DRR046797", "DRR046798", 
                                                                                                                                                      "DRR046804", "DRR046802", "DRR046805", "DRR046797", "DRR046803", 
                                                                                                                                                      "DRR046804", "DRR046806", "DRR046801", "DRR046802", "DRR046805", 
                                                                                                                                                      "DRR046806", "DRR046798"), type = c("control", "obese", "control", 
                                                                                                                                                                                          "obese", "obese", "control", "obese", "obese", "control", "obese", 
                                                                                                                                                                                          "obese", "control", "control", "control", "control", "obese", 
                                                                                                                                                                                          "obese", "control", "obese", "obese", "obese", "obese", "control", 
                                                                                                                                                                                          "control", "obese", "obese", "obese", "obese", "obese", "obese", 
                                                                                                                                                                                          "obese", "control", "control", "obese", "control", "control", 
                                                                                                                                                                                          "obese", "obese", "obese", "control", "obese", "obese", "obese", 
                                                                                                                                                                                          "obese", "obese", "obese", "obese", "control"), time = c("early", 
                                                                                                                                                                                                                                                   "late", "early", "late", "early", "late", "late", "late", "early", 
                                                                                                                                                                                                                                                   "late", "late", "early", "late", "early", "late", "late", "early", 
                                                                                                                                                                                                                                                   "early", "late", "late", "early", "late", "early", "late", "late", 
                                                                                                                                                                                                                                                   "late", "late", "early", "late", "late", "late", "late", "late", 
                                                                                                                                                                                                                                                   "early", "early", "early", "late", "late", "early", "early", 
                                                                                                                                                                                                                                                   "late", "late", "early", "late", "late", "early", "early", "early"
                                                                                                                                                                                          ), bodyweight = c(20L, 45L, 40L, 45L, 78L, 45L, 45L, 45L, 40L, 
                                                                                                                                                                                                            45L, 45L, 20L, 45L, 20L, 45L, 45L, 78L, 40L, 45L, 45L, 32L, 45L, 
                                                                                                                                                                                                            40L, 45L, 45L, 45L, 45L, 32L, 45L, 45L, 45L, 45L, 45L, 78L, 20L, 
                                                                                                                                                                                                            40L, 45L, 45L, 78L, 20L, 45L, 45L, 32L, 45L, 45L, 78L, 32L, 40L
                                                                                                                                                                                          ), Kingdom = c("k__Bacteria", "k__Bacteria", "k__Bacteria", "k__Bacteria", 
                                                                                                                                                                                                         "k__Bacteria", "k__Bacteria", "k__Bacteria", "k__Bacteria", "k__Bacteria", 
                                                                                                                                                                                                         "k__Bacteria", "k__Bacteria", "k__Bacteria", "k__Bacteria", "k__Bacteria", 
                                                                                                                                                                                                         "k__Bacteria", "k__Bacteria", "k__Bacteria", "k__Bacteria", "k__Bacteria", 
                                                                                                                                                                                                         "k__Bacteria", "k__Bacteria", "k__Bacteria", "k__Bacteria", "k__Bacteria", 
                                                                                                                                                                                                         "k__Bacteria", "k__Bacteria", "k__Bacteria", "k__Bacteria", "k__Bacteria", 
                                                                                                                                                                                                         "k__Bacteria", "k__Bacteria", "k__Bacteria", "k__Bacteria", "k__Bacteria", 
                                                                                                                                                                                                         "k__Bacteria", "k__Bacteria", "k__Bacteria", "k__Bacteria", "k__Bacteria", 
                                                                                                                                                                                                         "k__Bacteria", "k__Bacteria", "k__Bacteria", "k__Bacteria", "k__Bacteria", 
                                                                                                                                                                                                         "k__Bacteria", "k__Bacteria", "k__Bacteria", "k__Bacteria"), 
                              Phylum = c("p__Actinobacteria", "p__Actinobacteria", "p__Actinobacteria", 
                                         "p__Actinobacteria", "p__Actinobacteria", "p__Actinobacteria", 
                                         "p__Actinobacteria", "p__Firmicutes", "p__Firmicutes", "p__Firmicutes", 
                                         "p__Firmicutes", "p__Firmicutes", "p__Bacteroidetes", "p__Bacteroidetes", 
                                         "p__Bacteroidetes", "p__Bacteroidetes", "p__Bacteroidetes", 
                                         "p__Bacteroidetes", "p__Bacteroidetes", "p__Bacteroidetes", 
                                         "p__Bacteroidetes", "p__Bacteroidetes", "p__Proteobacteria", 
                                         "p__Proteobacteria", "p__Proteobacteria", "p__Proteobacteria", 
                                         "p__Proteobacteria", "p__Firmicutes", "p__Firmicutes", "p__Firmicutes", 
                                         "p__Firmicutes", "p__Firmicutes", "p__Firmicutes", "p__Firmicutes", 
                                         "p__Firmicutes", "p__Firmicutes", "p__Firmicutes", "p__Actinobacteria", 
                                         "p__Actinobacteria", "p__Actinobacteria", "p__Actinobacteria", 
                                         "p__Actinobacteria", "p__Actinobacteria", "p__Actinobacteria", 
                                         "p__Firmicutes", "p__Firmicutes", "p__Fusobacteria", "p__Proteobacteria"
                              ), Class = c("c__Actinobacteria", "c__Actinobacteria", "c__Actinobacteria", 
                                           "c__Actinobacteria", "c__Actinobacteria", "c__Actinobacteria", 
                                           "c__Actinobacteria", "c__Bacilli", "c__Bacilli", "c__Bacilli", 
                                           "c__Bacilli", "c__Bacilli", "c__Bacteroidia", "c__Bacteroidia", 
                                           "c__Bacteroidia", "c__Bacteroidia", "c__Bacteroidia", "c__Bacteroidia", 
                                           "c__Bacteroidia", "c__Bacteroidia", "c__Bacteroidia", "c__Bacteroidia", 
                                           "c__Betaproteobacteria", "c__Betaproteobacteria", "c__Betaproteobacteria", 
                                           "c__Betaproteobacteria", "c__Betaproteobacteria", "c__Clostridia", 
                                           "c__Clostridia", "c__Clostridia", "c__Clostridia", "c__Clostridia", 
                                           "c__Clostridia", "c__Clostridia", "c__Clostridia", "c__Clostridia", 
                                           "c__Clostridia", "c__Coriobacteriia", "c__Coriobacteriia", 
                                           "c__Coriobacteriia", "c__Coriobacteriia", "c__Coriobacteriia", 
                                           "c__Coriobacteriia", "c__Coriobacteriia", "c__Erysipelotrichi", 
                                           "c__Erysipelotrichi", "c__Fusobacteriia", "c__Gammaproteobacteria"
                              )), row.names = c(NA, -48L), class = c("tbl_df", 
                                                                                                                        "tbl", "data.frame"))

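# Sort rows by time, then by sample, so "early" samples come before "late" ones
mdata_class <- arrange(mdata_class, time, sample)
# as_factor() sets the levels in order of first appearance, which fixes the x-axis order
mdata_class$sample <- forcats::as_factor(mdata_class$sample)

# One row per sample, in the same order, used for the axis breaks and labels
map <- mdata_class %>% select(sample, time) %>% unique()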

ggplot(mdata_class, aes(x = sample, y = Abundance, fill = Class)) + 
  #facet_grid(time~.) +
  geom_bar(stat = "identity") +
  # scale_fill_manual(values = class_colors) +
  scale_x_discrete(
    breaks = map$sample,
    labels = map$time,
    drop = TRUE
  )  +
  theme(axis.title.x = element_blank(),
        axis.text.x=element_text(angle=90,hjust=1,vjust=0.5)) + 
  guides(fill = guide_legend(reverse = TRUE, keywidth = 1, keyheight = 1)) +
  ylab("Relative Abundance (Class > 1%) \n") +
  ggtitle("Class Composition of Mothur MiSeq SOP data per individual")

Hi @nirgrahamuk
Many, many thanks for your help. But I have a few points of confusion (as I'm very new to R, these questions may sound dumb; pardon me for that).
Firstly, I already have a map file, so why should I do this step? Is this step necessary?

map <- mdata_class %>% select(sample,time) %>% unique

Secondly, in mdata_class there are both "Sample" and "sample". Why don't we use "Sample" instead of "sample"? What is actually the difference between them?

Will you please help me clear up my understanding?

THANKS AGAIN,
DC7

I believe it was this way in your data. If the two columns are entirely the same, then it doesn't matter which one is used for this purpose.

On the map question, I judged it a good way to control which factor levels are present and how they are ordered, which should be consistent between both dataframes.
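
As a rough sketch of what "consistent between both dataframes" can look like in practice: derive the lookup table from the already-sorted plotting data and check that its sample column lists the same IDs, in the same order, as the factor levels. This uses base R only, and `plot_data` and `lookup` are made-up names with made-up rows, not objects from this thread.

```r
# Made-up long-format data, already sorted by time and then by sample
plot_data <- data.frame(
  sample = c("DRR046797", "DRR046797", "DRR046799"),
  time   = c("early", "early", "late"),
  stringsAsFactors = FALSE
)
plot_data$sample <- factor(plot_data$sample, levels = unique(plot_data$sample))

# One row per sample, derived from the same data that will be plotted
lookup <- unique(plot_data[, c("sample", "time")])

# The would-be breaks (lookup$sample) and the axis levels agree by construction
identical(as.character(lookup$sample), levels(plot_data$sample))
```

A separately loaded map may well agree too; deriving it from the plotting data just removes the need to check.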


Okay... I think I have understood. You mean that the map data should also be arranged the same way as mdata_class. Right?

Hi...
Regarding the "Sample" and "sample" question, I am seeing one strange thing.
When I use breaks = map$sample, it works fine. But when I use breaks = map$Sample, the labels along the x-axis are not shown. Something like this image:

Well, sample was changed to a factor and Sample wasn't.
But it could have been.
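
One more thing that may be contributing here, assuming map only has the lowercase column sample (as in the printout above and in the select(sample, time) step): column names in R are case-sensitive, so map$Sample does not exist and evaluates to NULL, and scale_x_discrete(breaks = NULL) draws no breaks or labels. The base R sketch below uses made-up stand-ins (`map_toy`, `dat_toy`) rather than the real objects, and also shows how Sample could be given the same factor levels as sample.

```r
# Hypothetical stand-in for map: only a lowercase `sample` column
map_toy <- data.frame(
  sample = c("DRR046797", "DRR046799"),
  time   = c("early", "late"),
  stringsAsFactors = FALSE
)

# Column names are case-sensitive: there is no `Sample` column here
is.null(map_toy$Sample)   # TRUE
is.null(map_toy$sample)   # FALSE

# Hypothetical stand-in for mdata_class, which has both columns
dat_toy <- data.frame(
  Sample = c("DRR046799", "DRR046797"),
  sample = factor(c("DRR046799", "DRR046797"),
                  levels = c("DRR046797", "DRR046799")),
  stringsAsFactors = FALSE
)

# Give `Sample` the same levels as `sample`, so either column
# would produce the same ordering on a discrete x-axis
dat_toy$Sample <- factor(dat_toy$Sample, levels = levels(dat_toy$sample))
identical(levels(dat_toy$Sample), levels(dat_toy$sample))   # TRUE
```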


Thanks a lot @nirgrahamuk. Now, I am getting my expected plot. Many many thanks for helping me out.
DC7
