Before asking my question, I went through all the web discussions about scale_colour_manual(). My impression is that the problem is related to the new version of ggplot2 (version 3.3.5), as it worked for me before.
As far as I understand from the help page (and from the discussions on the web), scale_colour_manual() omits unused factor levels from the legend.
This does not work (any more) for me, even if I set drop = TRUE as illustrated in the example below. I am using a recent version of ggplot2 (3.3.5).
In this example the legend still contains a key for "Y" (red) although the factor (the variable Extra) in the dataset only contains "N".
library(ggplot2); library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
theme_set(theme_bw(base_size = 15))
## a local copy of the BOD dataset
## time points larger than five days are labelled as extra
tib_BOD <- BOD %>%
  as_tibble() %>%
  mutate(Extra = factor(ifelse(Time > 5, "Y", "N")))
tib_BOD
#> # A tibble: 6 x 3
#>    Time demand Extra
#>   <dbl>  <dbl> <fct>
#> 1     1    8.3 N
#> 2     2   10.3 N
#> 3     3   19   N
#> 4     4   16   N
#> 5     5   15.6 N
#> 6     7   19.8 Y
## a subset without Extra == "Y"
tib_BOD_noExtra <- tib_BOD %>% filter(Extra == "N")
## named vector to define the colours
fcol_Extra <- c(N = "blue", Y = "red")
## the legend contains the "Y" although it is not used
## drop = TRUE is not required as this is the default
ggplot(tib_BOD_noExtra) +
  geom_point(aes(x = Time, y = demand, colour = Extra), size = 5) +
  scale_colour_manual(values = fcol_Extra, drop = TRUE)
Hi @paulMT,
Thanks for posting a great reproducible example.
You had a small error in your character vector of colours; now it works.
suppressPackageStartupMessages(library(tidyverse))
tib_BOD <- read.table(header=TRUE, text ="
Time demand Extra
1 8.3 N
2 10.3 N
3 19 N
4 16 N
5 15.6 N
7 19.8 Y
")
tib_BOD
#> Time demand Extra
#> 1 1 8.3 N
#> 2 2 10.3 N
#> 3 3 19.0 N
#> 4 4 16.0 N
#> 5 5 15.6 N
#> 6 7 19.8 Y
## a subset without Extra == "Y"
tib_BOD_noExtra <- tib_BOD %>% filter(Extra == "N")
## named vector to define the colours
#fcol_Extra <- c(N = "blue", Y = "red") # This line was incorrect
fcol_Extra <- c("blue","red")
ggplot(tib_BOD_noExtra) +
  geom_point(aes(x = Time, y = demand, colour = Extra), size = 5) +
  scale_colour_manual(values = fcol_Extra)
ggplot(tib_BOD) +
  geom_point(aes(x = Time, y = demand, colour = Extra), size = 5) +
  scale_colour_manual(values = fcol_Extra)
Thank you for your reply. It solves the problem in my small example, but not in general. I really need to work with named colour vectors, as recommended in one of the help examples of scale_colour_manual(). If I run the (second) help example with the named vector (cols), I get 4 keys in the legend (ggplot2 version 3.3.5), while a colleague got only 3 keys with ggplot2 version 3.3.3. There, the orange key is dropped, as it is not used in the graph. So, as far as I can tell, something has changed compared with previous versions. All of a sudden, a lot of code no longer behaves as expected.
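One way to keep a named colour vector while showing only the keys that occur in the data is to subset the vector before passing it to scale_colour_manual(). The following is a minimal sketch under that assumption, not a confirmed fix from the ggplot2 maintainers. The names dat, dat_noExtra, used and fcol_used are made up for illustration, and base R is used instead of dplyr to keep the example self-contained.

```r
library(ggplot2)

## local copy of BOD with the same Extra factor as in the original example
dat <- transform(BOD, Extra = factor(ifelse(Time > 5, "Y", "N")))

## subset without Extra == "Y" (the factor still carries the level "Y")
dat_noExtra <- dat[dat$Extra == "N", ]

## named colour vector, as in the original post
fcol_Extra <- c(N = "blue", Y = "red")

## keep only the colours whose names actually occur in the plotted data
used <- unique(as.character(dat_noExtra$Extra))
fcol_used <- fcol_Extra[names(fcol_Extra) %in% used]

## the scale now only knows about "N", so only that key can be drawn
p <- ggplot(dat_noExtra) +
  geom_point(aes(x = Time, y = demand, colour = Extra), size = 5) +
  scale_colour_manual(values = fcol_used)
```

Because the subsetting happens before the scale is created, this does not depend on how a particular ggplot2 version derives the legend keys from a named vector.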
Thank you very much, this solves my problem.
I will do some further research, but I think that something similar holds for all scale_*_manual() functions.
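A possible way to check this for another aesthetic is sketched below with scale_fill_manual(). Passing a function as limits, here base R's force(), asks the scale to compute its limits from the values found in the data, so the legend should follow the data even when values is a named vector with extra entries. This assumes a ggplot2 version whose discrete scales accept a function for limits; dat, dat_noExtra and p_fill are illustrative names.

```r
library(ggplot2)

## same data preparation as in the original example, in base R
dat <- transform(BOD, Extra = factor(ifelse(Time > 5, "Y", "N")))
dat_noExtra <- dat[dat$Extra == "N", ]

## named colour vector containing an entry ("Y") that is not in the data
fcol_Extra <- c(N = "blue", Y = "red")

## limits = force: the limits function returns the data-derived values unchanged
p_fill <- ggplot(dat_noExtra) +
  geom_col(aes(x = Time, y = demand, fill = Extra)) +
  scale_fill_manual(values = fcol_Extra, limits = force)
```

The same limits = force argument can be tried with scale_colour_manual() and the other manual scales; whether it is needed depends on the installed ggplot2 version.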