I want to order the bars of a stacked ggplot2 bar chart by the relative frequency of one level of the fill variable.
library(ggplot2)
library(tibble)
library(scales)
factor1 <- as.factor(c("ABC", "CDA", "XYZ", "YRO"))
factor2 <- as.factor(c("A", "B"))
set.seed(43)
data <- tibble(x = sample(factor1, 1000, replace = TRUE),
               z = sample(factor2, 1000, replace = TRUE))
One approach, taken from a Stack Overflow answer, is to use tapply:
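# Proportion of z == "B" within each level of x, sorted ascending
lvls <- names(sort(tapply(data$z == "B", data$x, mean)))
# Re-level x in that order so the bars are drawn by increasing share of "B"
ggplot(data = data, aes(factor(x, levels = lvls), fill = z)) +
  geom_bar(position = "fill") +
  scale_y_continuous(labels = percent)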
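To spell out what the tapply call above computes, here is a minimal sketch that splits it into steps. It re-creates the example data so it runs on its own. The base data.frame (instead of a tibble) and the intermediate name props are my own choices for illustration, not part of the original answer.

```r
# Re-create the example data; a base data.frame is used here (an assumption)
# so the sketch needs no packages
factor1 <- as.factor(c("ABC", "CDA", "XYZ", "YRO"))
factor2 <- as.factor(c("A", "B"))
set.seed(43)
data <- data.frame(x = sample(factor1, 1000, replace = TRUE),
                   z = sample(factor2, 1000, replace = TRUE))

# data$z == "B" is a logical vector, so its mean within each level of x
# is the proportion of "B" rows in that level
props <- tapply(data$z == "B", data$x, mean)

# sort() orders the proportions ascending; the names are the levels of x
# in the order the bars should appear
lvls <- names(sort(props))

print(props)
print(lvls)
```

Passing lvls to factor(x, levels = lvls) is what fixes the bar order in the plot. The ordering is ascending by the share of "B".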
This works, but I am wondering whether there is a tidier way to do it.
I am primarily interested in a way to do this that does not involve dplyr, but any suggestions are welcome.