Plot two columns in a dataframe in a stacked bar plot using ggplot()


#1

Hello,

My dataframe has two columns which I want to put on a stacked bar plot using ggplot():

  • There is a "CUST_REGION_DESCR" column.
  • There is a "Total_Sales" column.
  • Two other columns are a "Total_PL_Sales" and a "Total_NonPL_Sales"

How do I plot a bar plot with x = CUST_REGION_DESCR, y = Total_Sales, and each bar plot contains the raw dollars amount (not Percentage) of both Total_PL_Sales and Total_NonPL_Sales so that these two numbers add up to Total_Sales in each x-axis tick.

library(reprex)
library(tidyverse)

df9b = structure(list(CUST_REGION_DESCR = structure(c(1L, 2L, 2L, 2L, 
                                                      2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 
                                                      5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 7L, 
                                                      7L, 8L, 8L, 8L, 8L, 8L, 8L), .Label = c("CORPORATE REGION", "MOUNTAIN WEST REGION", 
                                                                                              "NORTH CENTRAL REGION", "NORTH EAST REGION", "OHIO VALLEY REGION", 
                                                                                              "SOUTH CENTRAL REGION", "SOUTH EAST REGION", "WESTERN REGION"
                                                      ), class = "factor"), cluster = structure(c(6L, 1L, 2L, 3L, 4L, 
                                                                                                  5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 
                                                                                                  3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 
                                                                                                  1L, 2L, 3L, 4L, 5L, 6L), .Label = c("1", "2", "3", "4", "5", 
                                                                                                                                      "6"), class = "factor"), Frequency = c(1L, 23790L, 198734L, 16039L, 
                                                                                                                                                                             704403L, 115804L, 9110L, 24844L, 304581L, 19967L, 866626L, 69866L, 
                                                                                                                                                                             9509L, 41705L, 379897L, 33726L, 891636L, 126722L, 13118L, 24015L, 
                                                                                                                                                                             232134L, 13232L, 604834L, 48994L, 5165L, 40381L, 395687L, 36907L, 
                                                                                                                                                                             947122L, 98568L, 14777L, 44788L, 442906L, 31482L, 1297810L, 105840L, 
                                                                                                                                                                             9171L, 27573L, 231626L, 22654L, 624062L, 119244L, 11321L), Total_Sales = c(146.25, 
                                                                                                                                                                                                                                                        3075492.94, 19920967.86, 1960786.48, 80336467.37, 17689175.85, 
                                                                                                                                                                                                                                                        1123670.6, 2857875.91, 31031641.02, 2441446.32, 106867486.15, 
                                                                                                                                                                                                                                                        9705276.69, 1114186.75, 4717704.19, 36854339.96, 4010670.78, 
                                                                                                                                                                                                                                                        104041068.18, 18982879.18, 1688525.28, 2572610.16, 22359811.92, 
                                                                                                                                                                                                                                                        1475061.98, 69658938.13, 7400010.31, 729931.94, 4519106.65, 37130978.44, 
                                                                                                                                                                                                                                                        3958571.89, 107905634.78, 13912376.39, 1621599.43, 4874605.01, 
                                                                                                                                                                                                                                                        41083818.15, 3446400.14, 145152439.67, 14911846.68, 1116550.76, 
                                                                                                                                                                                                                                                        3053504.72, 21516253.86, 2823231.22, 68596277.46, 17352856.35, 
                                                                                                                                                                                                                                                        1415456.47), Total_PL_Sales = c(0, 436916.35, 3369256.72, 252424.27, 
                                                                                                                                                                                                                                                                                        10759991.38, 927154.09, 131618.18, 453732.75, 5512526.74, 347807.94, 
                                                                                                                                                                                                                                                                                        13992250.48, 735852.54, 141106.62, 819124.7, 6774823.55, 560709.15, 
                                                                                                                                                                                                                                                                                        15171746.01, 1109749.55, 193021.25, 396658.3, 4254468.76, 247627.12, 
                                                                                                                                                                                                                                                                                        10375052.52, 464877.66, 102218.96, 744850.74, 6973926.53, 664834.55, 
                                                                                                                                                                                                                                                                                        15665048.93, 896169.59, 249149.15, 876154.9, 8147445.66, 531482.52, 
                                                                                                                                                                                                                                                                                        20745553.83, 996742.41, 133631.8, 411419.06, 3637467.97, 391509.06, 
                                                                                                                                                                                                                                                                                        9557386.45, 1003022.41, 151783.39), Total_NonPL_Sales = c(146.25, 
                                                                                                                                                                                                                                                                                                                                                  2638576.59, 16551711.14, 1708362.21, 69576475.99, 16762021.76, 
                                                                                                                                                                                                                                                                                                                                                  992052.42, 2404143.16, 25519114.28, 2093638.38, 92875235.67, 
                                                                                                                                                                                                                                                                                                                                                  8969424.15, 973080.13, 3898579.49, 30079516.41, 3449961.63, 88869322.17, 
                                                                                                                                                                                                                                                                                                                                                  17873129.63, 1495504.03, 2175951.86, 18105343.16, 1227434.86, 
                                                                                                                                                                                                                                                                                                                                                  59283885.61, 6935132.65, 627712.98, 3774255.91, 30157051.91, 
                                                                                                                                                                                                                                                                                                                                                  3293737.34, 92240585.85, 13016206.8, 1372450.28, 3998450.11, 
                                                                                                                                                                                                                                                                                                                                                  32936372.49, 2914917.62, 124406885.84, 13915104.27, 982918.96, 
                                                                                                                                                                                                                                                                                                                                                  2642085.66, 17878785.89, 2431722.16, 59038891.01, 16349833.94, 
                                                                                                                                                                                                                                                                                                                                                  1263673.08)), class = c("grouped_df", "tbl_df", "tbl", "data.frame"
                                                                                                                                                                                                                                                                                                                                                  ), row.names = c(NA, -43L), vars = "CUST_REGION_DESCR", drop = TRUE, indices = list(
                                                                                                                                                                                                                                                                                                                                                    0L, 1:6, 7:12, 13:18, 19:24, 25:30, 31:36, 37:42), group_sizes = c(1L, 
                                                                                                                                                                                                                                                                                                                                                                                                                       6L, 6L, 6L, 6L, 6L, 6L, 6L), biggest_group_size = 6L, labels = structure(list(
                                                                                                                                                                                                                                                                                                                                                                                                                         CUST_REGION_DESCR = structure(1:8, .Label = c("CORPORATE REGION", 
                                                                                                                                                                                                                                                                                                                                                                                                                                                                       "MOUNTAIN WEST REGION", "NORTH CENTRAL REGION", "NORTH EAST REGION", 
                                                                                                                                                                                                                                                                                                                                                                                                                                                                       "OHIO VALLEY REGION", "SOUTH CENTRAL REGION", "SOUTH EAST REGION", 
                                                                                                                                                                                                                                                                                                                                                                                                                                                                       "WESTERN REGION"), class = "factor")), class = "data.frame", row.names = c(NA, 
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  -8L), vars = "CUST_REGION_DESCR", drop = TRUE))

df9b %>% ggplot()+
  geom_bar(aes(x = CUST_REGION_DESCR, y = Total_Sales), stat = "identity") +
  coord_flip() + theme(legend.position = "none")

Created on 2018-07-09 by the reprex
package
(v0.2.0).

Additionally, it would be great if I can do a side-by-side bar plot. I assume I use

position = "dodge"

but correct me if I am wrong.
Thank you!


#2

Is this what you are after?

df9b %>% 
  select(-cluster, -Frequency, -Total_Sales) %>% 
  gather(variable, value, -CUST_REGION_DESCR) %>% 
  ggplot(aes(x = CUST_REGION_DESCR, y = value, fill = variable)) +
  geom_col() +
  coord_flip() + 
  theme(legend.position = "bottom")

#3

martin.R:
I have a few questions based on your code and yes, this is what I want. However, is there a way for me to show the dollar sign on the x-asix with comma as well? For instance, can I show one of the ticks on the x-asix as $1,000,000?

  • When you call gather() statement, there are three components: variable, value and -CUST_REGION_DESC. Where do these three variables come from?

Thank you.


#4

I think you need the scales package for changing the labels. scales::dollar_format() should work.

To plot the separate sales values, the data was transformed into a long format via tidyr::gather(). variable and value are just nominal names for the new columns and could be named anything else.


#5

@martin.R
Is there a way for me to reorder the bar plot based on the Total_PL_Sales value?

ggplot(aes(x = reorder(CUST_REGION_DESCR,-value), y = value, fill = Buckets))

This one does not give me what I want.
Thanks!


#6

You could turn CUST_REGION_DESCR into a factor and use forcats::fct_reorder() to reorder it based on Total_Sales before dropping that variable.


#7

@jasonparker
Jason:
CUST_REGION_DESCR is already a factor.