I have a question about a relative frequency plot. I am trying to make a stacked bar plot of my microbiome data, but the y axis (relative frequency) goes far above 100. The relative frequencies of the bacteria sum to 100, so I don't understand why the plot goes so much higher than that. I have read other posts but couldn't figure it out. I would really appreciate any help.
The first 10 rows of the data are the following:
The code I am using is:
library(tidyverse)
library(readxl)
library(glue)
library(ggtext)
library(patchwork)
library(reshape2)

pc = read.csv("L2_16S_R2.csv", header = TRUE)

# convert data frame from a "wide" format to a "long" format
pcm = melt(pc, id = c("Vineyard"))
View(pcm)
str(pcm)

pcm %>%
  group_by(Vineyard, variable) %>%
  summarize(value = sum(value), .groups = "drop") %>%
  group_by(Vineyard, variable) %>%
  summarize(mean_value = mean(value), .groups = "drop") %>%
  mutate(variable = str_replace(variable, "(.*)_unclassified", "Unclassified *\\1*"),
         variable = str_replace(variable, "^(\\S*)$", "*\\1*")) %>%
  ggplot(aes(x = Vineyard, y = mean_value, fill = variable)) +
  geom_col() +
  labs(x = NULL, y = "Mean relative abundance (%)") +
  theme_classic() +
  theme(axis.text.x = element_markdown(),
        legend.text = element_markdown())
And the plot is:
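One possible explanation, sketched on made-up data: the first `summarize()` adds up each taxon across every sample in a vineyard, so with n samples per vineyard the stacked bar reaches roughly n × 100. The second `summarize()` then takes the mean of a single value per group, which changes nothing. The sketch below assumes each row of the CSV is one sample whose taxa sum to 100; the toy data frame and its taxon names are hypothetical and are not taken from `L2_16S_R2.csv`.

```r
library(dplyr)
library(reshape2)

# Hypothetical toy data: 2 vineyards x 3 samples each, two made-up taxa.
# Each row (sample) sums to 100, mimicking relative abundances in percent.
pc <- data.frame(
  Vineyard       = rep(c("A", "B"), each = 3),
  Proteobacteria = c(60, 50, 70, 40, 30, 20),
  Firmicutes     = c(40, 50, 30, 60, 70, 80)
)

# Wide to long, as in the post
pcm <- melt(pc, id.vars = "Vineyard")

# As in the post: summing within Vineyard x taxon scales with the sample count
summed <- pcm %>%
  group_by(Vineyard, variable) %>%
  summarize(value = sum(value), .groups = "drop")

# Alternative: average within Vineyard x taxon instead of summing
averaged <- pcm %>%
  group_by(Vineyard, variable) %>%
  summarize(mean_value = mean(value), .groups = "drop")

# Height of each stacked bar = total over taxa within a vineyard
bar_height_summed <- summed %>%
  group_by(Vineyard) %>%
  summarize(total = sum(value))

bar_height_averaged <- averaged %>%
  group_by(Vineyard) %>%
  summarize(total = sum(mean_value))

print(bar_height_summed)    # total is 300 for both vineyards (3 samples x 100)
print(bar_height_averaged)  # total is 100 for both vineyards
```

If that assumption holds for the real data, replacing the two `summarize()` steps with a single `summarize(mean_value = mean(value), .groups = "drop")` should keep the bars at 100. If the file has a different layout (for example, several rows per sample), the grouping would need to change accordingly.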