defining colour for NA values in ggplot2

ggplot2

#1

So to be candid, I have a solution for my problem, but this is primarily a question about laziness... I have recently begun customising my ggplot2 graphs. One thing I've been doing is defining a different colour palette. Most of my plots revolve around three factors, so I've been specifying the colours as follows:

cols  <- c(
           "factorA" = "#CC6677",
           "factorB" = "#2DA17E",
           "factorC" = "#4477AA"
          )

I can then call those in ggplot, for example:

ggplot(data, aes(x = factor, y = values, colour = factor)) +
  geom_point() +
  scale_colour_manual(values = cols)

and that works perfectly fine; I've managed to customise over 40 plots using the above. But I've just realised that I now want to include NA values in a large minority of those, but I can't seem to define a colour for NA values as follows:

cols  <- c(
           "factorA" = "#CC6677",
           "factorB" = "#2DA17E",
           "factorC" = "#4477AA",
           na.value  = "#000000"
          )

From what I've just read you have to define it outside the values call (and this does work):

...+ scale_colour_manual(values = cols, na.value = "#000000") 

but I don't really want to go through and manually append the na.value call to those plots that need it, unless I have no choice... Is there a way I can edit my cols to make this less painful?


#2

I'm not sure if either of these is optimal, but here are two possibilities:

  1. Define a function:

    scm = function(palette=cols) {
      scale_color_manual(values=palette, na.value="#000000")
    }
    
    ggplot(mydata, aes(x, y)) + geom_line() + scm()
    
  2. Pass a list of arguments to scale_color_manual:

    library(purrr)
    
    pal = list(values=cols, na.value="#000000")
    
    ggplot(mydata, aes(x, y)) + geom_line() + invoke(scale_colour_manual, pal)
    

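    invoke is from the purrr package. The base R equivalent would be do.call(scale_colour_manual, pal). Note that purrr 1.0.0 and later deprecate invoke() in favour of exec(), so the do.call() form is the safer choice on current versions.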
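To make the base R route concrete, here is a minimal sketch of the do.call() variant. The toy data frame and the names `toy` and `na_scale` are made up for illustration; only `cols`, `pal`, and the NA colour come from the thread.

```r
library(ggplot2)

# Palette from the question
cols <- c(factorA = "#CC6677", factorB = "#2DA17E", factorC = "#4477AA")

# Bundle the palette and the NA colour into one list of arguments
pal <- list(values = cols, na.value = "#000000")

# Toy data (hypothetical) with one missing group
toy <- data.frame(
  x   = 1:4,
  y   = c(2, 4, 3, 5),
  grp = c("factorA", "factorB", "factorC", NA)
)

# do.call() passes the list elements to the scale as named arguments
na_scale <- do.call(scale_colour_manual, pal)

p <- ggplot(toy, aes(x, y, colour = grp)) +
  geom_point() +
  na_scale
```

Because the NA colour lives in `pal`, changing it there should carry over to every plot that builds its scale from the same list.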

There are some ways you can make it easier to deal with arbitrary numbers of colors. For example, you can define a palette generating function:

my_pal = colorRampPalette(cols)

my_pal(3)  # Gives you your original palette
my_pal(5)  # Same color range, with five colors
my_pal(5) %>% set_names(paste0("factor", LETTERS[1:5]))  # Add names to the color vector (%>% and set_names come from purrr)

This way, you can use the same palette for different numbers of factor levels with a single function.
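If you would rather not load purrr just for the naming step, the same idea can be sketched in base R with setNames(). The level names "factorA" to "factorE" and the object name `cols5` are assumptions for illustration.

```r
# Palette from the question
cols <- c(factorA = "#CC6677", factorB = "#2DA17E", factorC = "#4477AA")

# colorRampPalette() returns a function that interpolates n colours
# across the supplied palette
my_pal <- colorRampPalette(cols)

# Five interpolated colours, named with base R's setNames()
# (level names are hypothetical)
cols5 <- setNames(my_pal(5), paste0("factor", LETTERS[1:5]))
```

The named vector `cols5` can then be passed as `values` to scale_colour_manual() in the same way as the original `cols`.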

You can also create your own scale_color_*** function. The paletti package makes this easy:

#devtools::install_github("edwinth/paletti")
library(paletti)
scale_color_mypal = get_scale_color(get_pal(cols))

ggplot(mtcars, aes(mpg, wt, colour=factor(carb))) + 
  geom_point() + 
  scale_color_mypal()

#3

Thanks @joels, #1 works perfectly. I've now defined two functions, one for fill and one for colour which I can now call when appropriate.
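For anyone finding this later, a pair of wrappers along those lines might look like the sketch below. The function names `scm_colour` and `scm_fill`, the `na_col` argument, and the toy data are hypothetical, not the exact code used.

```r
library(ggplot2)

# Palette from the question
cols <- c(factorA = "#CC6677", factorB = "#2DA17E", factorC = "#4477AA")

# Wrapper for the colour aesthetic (hypothetical name and arguments)
scm_colour <- function(palette = cols, na_col = "#000000") {
  scale_colour_manual(values = palette, na.value = na_col)
}

# Matching wrapper for the fill aesthetic
scm_fill <- function(palette = cols, na_col = "#000000") {
  scale_fill_manual(values = palette, na.value = na_col)
}

# Toy data (made up) with one missing group
toy <- data.frame(
  x   = 1:4,
  y   = c(2, 4, 3, 5),
  grp = c("factorA", "factorB", "factorC", NA)
)

p_colour <- ggplot(toy, aes(x, y, colour = grp)) + geom_point() + scm_colour()
p_fill   <- ggplot(toy, aes(x, y, fill = grp)) + geom_col() + scm_fill()
```

Keeping the NA colour as a default argument means a single plot can still override it, for example `scm_colour(na_col = "grey50")`.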