ggplot2 - Alter scale on pyramid plot

Hi Everyone - I have been practising plots - I am currently creating a pyramid plot. I've got stuck on the scales. I want to make the scales on both sides of the pyramid the same, i.e. 10, 20, 30, 40 etc. Can anyone advise how I would tweak the code below?

Many thanks :grinning:

# 1. Load Libraries -----------------------------------------------------------

library(tidyverse)
library(ggpol)

# 2. Create Dataset -----------------------------------------------------------

df <- tibble(
  Population = c(5, 8.7, 16.7, 24.8, 38, -4.6, -6.4, -16.1, -39.6, -55.3),
  Gender = c("Male", "Male", "Male", "Male", "Male", "Female", "Female", "Female", "Female", "Female"),
  AgeBand = c("65-69", "70-74", "75-79", "80-84", "85+", "65-69", "70-74", "75-79", "80-84", "85+")
  )

summary(df)

# 3. Create Pyramid Plot ------------------------------------------------------

ggplot(df, aes(x = AgeBand, y = Population, fill = Gender)) +
  geom_bar(stat = "identity") +
  facet_share(~Gender, dir = "h", scales = "free", reverse_num = TRUE) +
  coord_flip() +
  theme_minimal() +
  labs(y = "Count", x = "Age Band", title = " ") +
  scale_fill_manual(values = c("pink", "blue"))


# dev.off() # Clear screen

For future reference, reprex is especially helpful when you're asking questions about data visualization, since it automatically includes the output. :slightly_smiling_face:

This is a little bit tricky, since you're basically "duping" the chart into a pyramid shape by storing the female counts as negative values, even though they're really positive.

What creates the two different axis limits on either side in your original plot is the scales = "free" argument.

library(tidyverse)
library(ggpol)

df <- tibble(
  Population = c(5, 8.7, 16.7, 24.8, 38, -4.6, -6.4, -16.1, -39.6, -55.3),
  Gender = c("Male", "Male", "Male", "Male", "Male", "Female", "Female", "Female", "Female", "Female"),
  AgeBand = c("65-69", "70-74", "75-79", "80-84", "85+", "65-69", "70-74", "75-79", "80-84", "85+")
)

ggplot(df, aes(x = AgeBand, y = Population, fill = Gender)) +
  geom_bar(stat = "identity") +
  facet_share(~Gender, dir = "h", scales = "free", reverse_num = TRUE) +   # note: scales = "free"
  coord_flip() +
  theme_minimal() +
  labs(y = "Count", x = "Age Band", title = " ") +
  scale_fill_manual(values = c("pink", "blue"))
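As a rough illustration of why the two sides differ: with scales = "free", each facet's axis is fitted to its own data, so it helps to look at the largest bar per gender. This is a base R sketch that only inspects the data (no ggpol involved), and `largest_bar` is just a name I've made up here.

```r
# Same values as the dataset above, rebuilt as a plain data.frame
df <- data.frame(
  Population = c(5, 8.7, 16.7, 24.8, 38, -4.6, -6.4, -16.1, -39.6, -55.3),
  Gender = rep(c("Male", "Female"), each = 5)
)

# Largest bar length per gender, ignoring the sign used to flip the female side
largest_bar <- tapply(abs(df$Population), df$Gender, max)
largest_bar
```

The female bars reach 55.3 while the male bars only reach 38, so with free scales the female axis runs further than the male one.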

If you remove that argument, both facets share the same scale (note that they match in the plot below). However, the limits are then set from the range of all the values, so the axis runs all the way to negative 40.

ggplot(df, aes(x = AgeBand, y = Population, fill = Gender)) +
  geom_bar(stat = "identity") +
  facet_share(~Gender, dir = "h", reverse_num = TRUE) +
  coord_flip() +
  theme_minimal() +
  labs(y = "Count", x = "Age Band", title = " ") +
  scale_fill_manual(values = c("pink", "blue"))

If you manually set the scale (it's the y-axis, since you used coord_flip()), you'll lose your female bars: their values are stored as negatives, so they fall outside limits of 0 to 40 and get removed. The male bars turn pink, since the fill has pink first and male is the only data being plotted.



ggplot(df, aes(x = AgeBand, y = Population, fill = Gender)) +
  geom_bar(stat = "identity") +
  facet_share(~Gender, dir = "h", reverse_num = TRUE) +
  coord_flip() +
  theme_minimal() +
  labs(y = "Count", x = "Age Band", title = " ") +
  scale_fill_manual(values = c("pink", "blue")) +
  scale_y_continuous(limits = c(0, 40))
#> Warning: Removed 5 rows containing missing values (position_stack).

Created on 2018-09-25 by the reprex package (v0.2.1.9000)
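To see where the "Removed 5 rows" warning comes from: a continuous scale with explicit limits treats values outside those limits as missing by default. The base R sketch below only mimics that rule on the Population column (it doesn't call ggplot2), and `lower`, `upper`, and `out_of_bounds` are names made up for illustration.

```r
# Same Population values as in the reprex above
population <- c(5, 8.7, 16.7, 24.8, 38, -4.6, -6.4, -16.1, -39.6, -55.3)

# Limits passed to scale_y_continuous() in the last plot
lower <- 0
upper <- 40

# Mimic the default out-of-bounds rule: anything outside the limits is dropped
out_of_bounds <- population < lower | population > upper

population[out_of_bounds]  # the negative (female) values
sum(out_of_bounds)         # number of rows that would be dropped
```

All five flagged values are the negative female counts, which lines up with the five rows in the warning.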

I'm actually not sure what the best solution is here. Since facet_share() is from ggpol, I'd recommend taking a look at the source code there to see exactly how that's being accomplished, and/or checking whether anyone else has filed an issue about this or posted a workaround in the issues.

That, or someone else here might have a good idea! :blush:

Here's a more manual approach, relying on mapping the data onto one plot space and then fudging the x axis:

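ctr_width <- 20              # width of the centre gap that holds the age labels
brks <- seq(0, 60, by = 10)  # tick values shown on each side

df %>%
  # Female bars run left from 0; male bars run right from ctr_width
  mutate(start = if_else(Gender == "Female", 0, ctr_width),
         end = Population + if_else(Gender == "Female", 0, ctr_width),
         mid = (start + end)/2,
         wid = abs(end - start))  %>%
  ggplot(data = .) +
  geom_tile(aes(mid, AgeBand, fill = Gender, width = wid), height = 0.8) +
  # Age labels sit in the centre gap, one per age band
  geom_text(data = . %>% distinct(AgeBand),
            aes(ctr_width/2, AgeBand, label = AgeBand)) +
  # Same labels on both sides: left breaks are negated, right breaks are shifted
  scale_x_continuous(breaks = c(-brks, ctr_width + brks),
                     labels = c(brks, brks)) +
  theme_minimal() +
  theme(axis.text.y = element_blank())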

It could use a little more work, e.g. adding a white rectangle with annotate() behind the age labels to mask the grid lines, or changing/removing the axis labels, but you get the idea.
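For what it's worth, here's a rough sketch of those two tweaks, assuming the same data and the same ctr_width / brks setup as above. The intermediate `bars` and `p` objects and the "Count" axis title are my own choices for illustration, so adjust to taste.

```r
library(dplyr)
library(ggplot2)

df <- tibble(
  Population = c(5, 8.7, 16.7, 24.8, 38, -4.6, -6.4, -16.1, -39.6, -55.3),
  Gender = rep(c("Male", "Female"), each = 5),
  AgeBand = rep(c("65-69", "70-74", "75-79", "80-84", "85+"), times = 2)
)

ctr_width <- 20
brks <- seq(0, 60, by = 10)

# Same bar geometry as before, stored so it can be reused for the labels
bars <- df %>%
  mutate(start = if_else(Gender == "Female", 0, ctr_width),
         end = Population + start,
         mid = (start + end) / 2,
         wid = abs(end - start))

p <- ggplot(bars) +
  geom_tile(aes(mid, AgeBand, fill = Gender, width = wid), height = 0.8) +
  # White strip over the centre gap, drawn before the text so the labels stay on top
  annotate("rect", xmin = 0, xmax = ctr_width, ymin = -Inf, ymax = Inf,
           fill = "white") +
  geom_text(data = distinct(bars, AgeBand),
            aes(ctr_width / 2, AgeBand, label = AgeBand)) +
  scale_x_continuous(breaks = c(-brks, ctr_width + brks),
                     labels = c(brks, brks)) +
  # Rename the x axis and drop the y axis title
  labs(x = "Count", y = NULL) +
  theme_minimal() +
  theme(axis.text.y = element_blank())

p
```

The layer order matters here: the tiles go first so the y axis is set up as discrete, then the white strip, then the labels.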