scale_fill_manual after filter does not retain factor level assignment

I want to use specific colors for factor levels in a ggplot, and I pass these to scale_fill_manual.
However, after I filter the data set to remove some factor levels, the color values shift and are no longer aligned with the same levels as before.

reprex:

library(ggplot2)
library(dplyr)

mod.cars <- mtcars %>%  mutate(gear = as.factor(gear)) # just for demo purposes

# the colors i want to be assigned to the specific factor levels
gear.cols <- c("red", "green", "blue") 

ggplot(mod.cars, aes(x=hp, fill=gear)) + 
  geom_boxplot() + 
  scale_fill_manual(values=gear.cols)

This graph creates 3 boxes, where gear=3 is red, gear=4 is green, and gear=5 is blue.
This is what I want.

If I filter out a factor level and recreate the plot,

mod.cars %>%
  filter(gear %in% c(4,5)) %>%
  ggplot(aes(x=hp, fill=gear)) + 
  geom_boxplot() + 
  scale_fill_manual(values=gear.cols)

gear=4 is now red, and gear=5 is now green.
Basically, the data set doesn't seem to recognize that levels(gear) should still contain 3: even though there is no data for it now, gear=3 was a level before the filter.

How can I maintain the assigned colors for all factor levels, on a subset?
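
A note on the diagnosis above, as a minimal sketch rather than a definitive explanation: filtering rows does not remove factor levels, so levels(gear) still contains 3 after the filter. The shift most likely comes from the plot side, since ggplot2's discrete scales default to drop = TRUE, which leaves unused levels out of the scale, and an unnamed values vector is then handed out in order to the levels that remain. The names `cars` and `sub` below are made up for illustration, and the check uses base R only.

```r
# Base R sketch; dplyr::filter() keeps factor levels in the same way.
cars <- transform(mtcars, gear = factor(gear))
sub <- cars[cars$gear %in% c(4, 5), ]

levels(sub$gear)               # all three levels are still attached
table(sub$gear)                # level "3" is present but has zero rows
levels(droplevels(sub)$gear)   # only an explicit droplevels() removes it
```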

suppressPackageStartupMessages({
  library(dplyr)
  library(ggplot2)
})

mod.cars <- mtcars %>%  mutate(gear = as.factor(gear)) # just for demo purposes

# the colors i want to be assigned to the specific factor levels
gear.cols <- c("red", "green", "blue") 

ggplot(mod.cars, aes(x=hp, fill=gear)) + 
  geom_boxplot() + 
  scale_fill_manual(values=gear.cols) + 
  theme_minimal()

gear.cols <- c("green", "blue") 

mod.cars %>%
  filter(gear %in% c(4,5)) %>%
  ggplot(aes(x=hp, fill=gear)) + 
  geom_boxplot() + 
  scale_fill_manual(values=gear.cols) + 
  theme_minimal()

Hi @Norcalbiostat,
I would use a named vector when specifying my colors for factor levels:

gear.cols <- c("3" = "red", "4" = "green", "5" = "blue") 
library(ggplot2)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

mod.cars <- mtcars %>%  mutate(gear = as.factor(gear)) # just for demo purposes

# the colors i want to be assigned to the specific factor levels
gear.cols <- c("3" = "red", "4" = "green", "5" = "blue") 

ggplot(mod.cars, aes(x=hp, fill=gear)) + 
  geom_boxplot() + 
  scale_fill_manual(values=gear.cols)

mod.cars %>%
  filter(gear %in% c(4,5)) %>%
  ggplot(aes(x=hp, fill=gear)) + 
  geom_boxplot() + 
  scale_fill_manual(values=gear.cols)

Created on 2021-03-23 by the reprex package (v1.0.0)
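
The named vector works because scale_fill_manual matches named values to factor levels by name instead of by position. If the colors have to stay unnamed for some reason, one possible alternative (a sketch, not something tested beyond this toy data) is to pass drop = FALSE so that the unused level stays in the scale and the positions still line up. The object name `p` is only for illustration.

```r
library(ggplot2)
library(dplyr)

mod.cars <- mtcars %>% mutate(gear = as.factor(gear))
gear.cols <- c("red", "green", "blue")  # unnamed, in level order (3, 4, 5)

p <- mod.cars %>%
  filter(gear %in% c(4, 5)) %>%
  ggplot(aes(x = hp, fill = gear)) +
  geom_boxplot() +
  # drop = FALSE keeps the unused level "3" in the scale, so positions line up
  scale_fill_manual(values = gear.cols, drop = FALSE)

p
```

One trade-off: with drop = FALSE the unused level stays part of the scale, so it may still be listed in the legend. The named vector avoids that.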


Perfect, thanks @jrmuirhead - I needed it to be able to handle a dynamic filter.
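
For the dynamic-filter case, a rough sketch of how this could look: build the named palette once from the levels of the full data, then reuse it for whatever subset is selected. `plot_gears` and its `selected` argument are hypothetical names standing in for a Shiny input; no Shiny code is needed to try it.

```r
library(ggplot2)
library(dplyr)

mod.cars <- mtcars %>% mutate(gear = as.factor(gear))

# Build the named palette once from the full data, before any filtering.
gear.cols <- setNames(c("red", "green", "blue"), levels(mod.cars$gear))

# Hypothetical helper: `selected` stands in for a Shiny input value.
plot_gears <- function(selected) {
  mod.cars %>%
    filter(gear %in% selected) %>%
    ggplot(aes(x = hp, fill = gear)) +
    geom_boxplot() +
    scale_fill_manual(values = gear.cols)
}

plot_gears(c(4, 5))  # any subset reuses the same level-to-color mapping
plot_gears(5)
```

Because the colors are matched by name, the vector never has to be redefined when the selection changes.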

Thanks @technocrat - but I will be using this in a dynamic filter in a Shiny app, so I can't redefine the color vector each time.

@Norcalbiostat has a better solution, whether or not you are using Shiny.
