How to display a subset of data in ggplot with axes/legends/layout displayed as if all data are present?

I'm using ggplot to generate plots for a talk I'll be giving in PowerPoint. Since the plot has a lot of information, I want to be able to walk people through it step by step, first having only the axes up (no data), then revealing each group of data one by one. I had planned on doing this by generating multiple plots that have the same dimensions/axes/legends and differ only in what data are displayed (each slide will then have a new image with progressively more information on it).

So for instance, if the final plot I want to show looks like this:

library(tidyverse)
mtcars %>% 
  ggplot(aes(x = hp, y = mpg, color = as.factor(cyl))) +
  geom_point(size = 2) +
  scale_color_manual(values = c(`4` = 'blue', `6` = 'black', `8` = 'grey'), drop = FALSE) +
  theme_bw()

I want to create some companion plots that have the same axis ranges as above, the full legend (with all 3 groups), and the same chart area, but without some or all of the groups present. For the chart without any data present, this gets me close:

mtcars %>% 
  ggplot(aes(x = hp, y = mpg, color = as.factor(cyl))) +
  geom_blank() +
  scale_color_manual(values = c(`4` = 'blue', `6` = 'black', `8` = 'grey'), drop = FALSE) +
  theme_bw()

But it doesn't include the legend.

For the plots that only have partial data, something like the following (obviously) doesn't work because it only has one data value on the legend and the axes are off. The axes are easy enough to change so that they match, but I can't get the legend to display more than the sole group (this isn't surprising since I've filtered out all the other data, so there's no way for the plot to know that there are other members of the group):

mtcars %>% 
  filter(cyl == 4) %>% 
  ggplot(aes(x = hp, y = mpg, color = as.factor(cyl))) +
  geom_point(size = 2) +
  scale_color_manual(values = c(`4` = 'blue', `6` = 'black', `8` = 'grey'), drop = FALSE) +
  theme_bw()
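
To make the cause concrete, here is a quick check using only base R (the object names `after_filter` and `full_data` are just for illustration). Converting to a factor after filtering leaves a single level, so `drop = FALSE` has no unused levels to keep:

```r
# Factor built AFTER filtering: only the surviving value becomes a level
after_filter <- as.factor(mtcars$cyl[mtcars$cyl == 4])
levels(after_filter)
# [1] "4"

# Factor built on the full column: all three groups are levels
full_data <- as.factor(mtcars$cyl)
levels(full_data)
# [1] "4" "6" "8"
```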

In other words, is there a way to display the legend and plot axes/layout for a plot as if all of the data are plotted, but actually only plot a subset of the data (or no data at all)? I feel like there's probably something simple that I'm not familiar with, but I couldn't find anything when I tried.

This is kind of a silly solution, but what if you just move the points so they fall outside the limits?

library(tidyverse)
# final plot
iris %>% 
    ggplot() + 
    aes(Sepal.Length, Sepal.Width, color = Species) + 
    geom_point() + 
    coord_cartesian(ylim = c(1.5, 5))

# intermediate plot
iris %>% 
    mutate(Sepal.Width = 0) %>%
    ggplot() + 
    aes(Sepal.Length, Sepal.Width, color = Species) + 
    geom_point() + 
    coord_cartesian(ylim = c(1.5, 5))

Created on 2021-05-31 by the reprex package (v1.0.0)

Hrmm, that's a creative solution. It doesn't feel like the most elegant way to go about it, especially since to create intermediate figures you need to do some conditional mutating like this:

iris %>% 
  mutate(Sepal.Width = if_else(Species != 'setosa', Sepal.Width, 0)) %>%
  ggplot() + 
  aes(Sepal.Length, Sepal.Width, color = Species) + 
  geom_point() + 
  coord_cartesian(ylim = c(1.5, 5))

But if there isn't a better solution, that can work. Thanks!
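
For what it's worth, here is a minimal sketch of a different approach that was not suggested in the replies above. It assumes that converting `cyl` to a factor on the full data (so every subset keeps all three levels), training the axes with `geom_blank()` on the full data, and drawing only the chosen subset in `geom_point()` is enough to hold the layout fixed. The names `cars`, `cyl_colors`, and `reveal_plot()` are hypothetical. `show.legend = TRUE` is included because recent ggplot2 releases (3.5.0 onward, as far as I know) otherwise draw legend keys only for values present in the layer's data.

```r
library(ggplot2)

# Convert cyl to a factor on the FULL data so every subset keeps all 3 levels
cars <- transform(mtcars, cyl = factor(cyl))
cyl_colors <- c(`4` = "blue", `6` = "black", `8` = "grey")

# reveal_plot() is a hypothetical helper; `shown` lists the cyl groups to draw
reveal_plot <- function(shown) {
  ggplot(cars, aes(x = hp, y = mpg, color = cyl)) +
    # Invisible layer on the full data: trains the x/y scales identically each time
    geom_blank() +
    # Visible layer on the subset only; show.legend = TRUE asks for all keys
    geom_point(data = cars[cars$cyl %in% shown, ], size = 2, show.legend = TRUE) +
    scale_color_manual(values = cyl_colors, drop = FALSE) +
    theme_bw()
}

p_axes  <- reveal_plot(character(0))      # axes and legend, no points
p_step1 <- reveal_plot("4")               # first group only
p_final <- reveal_plot(c("4", "6", "8"))  # everything
```

Saving each plot with the same `ggsave()` width and height should then give slides that line up, and no coordinates need to be mutated to hide a group.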
