How to set Fill value with new tidyeval in ggplot2

rmendels · July 5, 2018, 6:26pm

Okay, I am trying to understand how to use the new tidyeval with ggplot2. I have a function that makes maps; at present it does something like the following (except that my_frame is actually passed to the function):

  my_frame <- data.frame(amplitude = spatial_amp, longitude = longitude, latitude = latitude)
  myplot <- ggplot(data = my_frame, aes_string(x = names(my_frame)[2], y = names(my_frame)[3], fill = names(my_frame)[1])) +
    geom_raster(interpolate = FALSE, na.rm = TRUE) +
    geom_polygon(data = w, aes(x = long, y = lat, group = group), fill = "gray50") +
    theme_bw() + ylab("latitude") + xlab("longitude") +
    coord_fixed(1.3, xlim = xlim, ylim = ylim)

I am trying to follow the new methods, so changed it to:

  x_var = quo(names(my_frame)[2])
  y_var = quo(names(my_frame)[3])
  fill_var = quo(names(my_frame)[1])
  myplot <- ggplot(data = my_frame, aes(!!x_var, !!y_var, fill = !!fill_var) )+
    geom_raster(interpolate = FALSE, na.rm = TRUE) +
    geom_polygon(data = w, aes(x = long, y = lat, group = group), fill = "gray50") +
    theme_bw() + ylab("latitude") + xlab("longitude") +
    coord_fixed(1.3, xlim = xlim, ylim = ylim)

It picked up the x-values (longitude) and y-values (latitude) okay, but it did not set the Fill correctly. Any help appreciated.

mara · July 5, 2018, 7:45pm

Would you mind just tossing in some dummy data/a snippet of your data here so I can make sure I'm recreating your scenario correctly?

If you can, a reprex would be great.

rmendels · July 5, 2018, 9:11pm

The following should pretty much reproduce what I am trying to do, and it works:

library(ggplot2)
library(mapdata)
latitude <- seq(from = 48, to = -47.5, by = -1.)
longitude <- seq(from = -150, to = -70.5, by = 1.)
temp <- expand.grid( x = longitude, y = latitude)
xlim <- c(temp$x[1], temp$x[length(temp$x)])
ylim <- c(temp$y[length(temp$y)], temp$y[1])
longitude <- temp$x
latitude <- temp$y
no_loc <- length(longitude)
spatial_amp <- rnorm(no_loc,mean=0,sd=1) 
spatial_amp_frame <- data.frame(amplitude = spatial_amp, longitude = longitude, latitude = latitude)
w <- map_data("worldHires", ylim = ylim, xlim = xlim)
amplitude_map <- ggplot(data = spatial_amp_frame, aes_string(x = names(spatial_amp_frame)[2], y = names(spatial_amp_frame)[3],  fill = names(spatial_amp_frame)[1])) +
  geom_raster(interpolate = FALSE, na.rm = TRUE) +
  geom_polygon(data = w, aes(x = long, y = lat, group = group), fill = "gray50") +
  theme_bw() + ylab("latitude") + xlab("longitude") +
  coord_fixed(1.3, xlim = xlim, ylim = ylim)

The following would be the same but doesn't get the fill correctly:

  x_var = quo(names(spatial_amp_frame)[2])
  y_var = quo(names(spatial_amp_frame)[3])
  fill_var = quo(names(spatial_amp_frame)[1])
  myplot <- ggplot(data = spatial_amp_frame, aes(!!x_var, !!y_var, fill = !!fill_var) )+
    geom_raster(interpolate = FALSE, na.rm = TRUE) +
    geom_polygon(data = w, aes(x = long, y = lat, group = group), fill = "gray50") +
    theme_bw() + ylab("latitude") + xlab("longitude") +
    coord_fixed(1.3, xlim = xlim, ylim = ylim)

In fact I am generating a lot of maps, and have a function that helps with names. At present it is:

make_map <- function(my_frame, xlim, ylim, my_color, title = NA, limits = NA) {
  require(ggplot2)
  require(mapdata)
  w <- map_data("worldHires", ylim = ylim, xlim = xlim)
  myplot <- ggplot(data = my_frame, aes_string(x = names(my_frame)[2], y = names(my_frame)[3], fill = names(my_frame)[1])) +
    geom_raster(interpolate = FALSE, na.rm = TRUE) +
    geom_polygon(data = w, aes(x = long, y = lat, group = group), fill = "gray50") +
    theme_bw() + ylab("latitude") + xlab("longitude") +
    coord_fixed(1.3, xlim = xlim, ylim = ylim) 
  # all(is.na()) keeps the condition length-one when limits is c(min, max)
  if (!all(is.na(limits))) {
    myplot <- myplot + scale_fill_gradientn(colours = my_color, limits = limits, na.value = NA)
  } else {
    myplot <- myplot + scale_fill_gradientn(colours = my_color, na.value = NA)
  }
  if (!is.na(title)) {
    myplot <- myplot + ggtitle(title)
  }
  myplot
}

and what I have tried is:

make_map1 <- function(my_frame, xlim, ylim, my_color, title = NA, limits = NA) {
  require(ggplot2)
  require(mapdata)
  x_var = quo(names(my_frame)[2])
  y_var = quo(names(my_frame)[3])
  fill_var = quo(names(my_frame)[1])
  w <- map_data("worldHires", ylim = ylim, xlim = xlim)
  myplot <- ggplot(data = my_frame, aes(!!x_var, !!y_var, fill = !!fill_var)) +
    geom_raster(interpolate = FALSE, na.rm = TRUE) +
    geom_polygon(data = w, aes(x = long, y = lat, group = group), fill = "gray50") +
    theme_bw() + ylab("latitude") + xlab("longitude") +
    coord_fixed(1.3, xlim = xlim, ylim = ylim) 
  if (!all(is.na(limits))) {
    myplot <- myplot + scale_fill_gradientn(colours = my_color, limits = limits, na.value = NA)
  } else{
    myplot <- myplot + scale_fill_gradientn(colours = my_color, na.value = NA)
  }
  if (!is.na(title)) {
    myplot <- myplot + ggtitle(title)
  }
  myplot
}

aosmith · July 5, 2018, 9:34pm

I think you'll want sym instead of quo since you are using variable names as strings.

Try using:

x_var = sym(names(spatial_amp_frame)[2])
y_var = sym(names(spatial_amp_frame)[3])
fill_var = sym(names(spatial_amp_frame)[1])
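
To see why the two behave differently, here is a minimal sketch (assuming the rlang package, which provides quo() and sym(), is attached; mtcars stands in for the map data). It inspects what each function actually hands to aes(): quo() captures the code you typed without running it, while sym() runs its argument first and turns the resulting string into a variable name.

```r
library(rlang)

# quo() stores the unevaluated call names(mtcars)[1], not the column it names.
q <- quo(names(mtcars)[1])
quo_get_expr(q)
is.call(quo_get_expr(q))

# sym() evaluates names(mtcars)[1] to the string "mpg",
# then converts that string into the symbol mpg.
s <- sym(names(mtcars)[1])
is.symbol(s)
identical(s, quote(mpg))

# quo(mpg), with a bare name typed by hand, wraps that same symbol.
identical(quo_get_expr(quo(mpg)), s)
```

So quo() is suited to bare names typed directly, and sym() to names held in strings.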

rmendels · July 5, 2018, 10:20pm

Thanks. That does indeed seem to work (using "sym"). What has me confused, though, is that I was trying to follow an example in the blog:

x_var <- quo(wt)
y_var <- quo(mpg)
group_var <- quo(cyl)

ggplot(mtcars, aes(!!x_var, !!y_var)) + 
  geom_point() + 
  facet_wrap(vars(!!group_var))

Now I can just follow things by rote (and I will for this example), but I don't understand why "quo" works in the example above and not in my mapping example. I have had a lot of problems understanding the ins and outs of tidyeval. The main thing is that I understand "aes_string" is deprecated in the long run (discovering "aes_string" was a godsend for writing more general functions). I know it is probably a lot of work, and it is of course easy for me to volunteer work for others, but I, at least, would find it helpful to have more examples of how to do things in the new version of ggplot2 compared to the old version, for the features that are deprecated.

mara · July 6, 2018, 11:59am

It works for me! The only change I made was to pass the bare variable names to quo() directly for the !!, rather than as quo(names(df)[1]).

library(ggplot2)
library(mapdata)
#> Loading required package: maps
latitude <- seq(from = 48, to = -47.5, by = -1.)
longitude <- seq(from = -150, to = -70.5, by = 1.)
temp <- expand.grid( x = longitude, y = latitude)
xlim <- c(temp$x[1], temp$x[length(temp$x)])
ylim <- c(temp$y[length(temp$y)], temp$y[1])
longitude <- temp$x
latitude <- temp$y
no_loc <- length(longitude)
spatial_amp <- rnorm(no_loc,mean=0,sd=1) 
spatial_amp_frame <- data.frame(amplitude = spatial_amp, longitude = longitude, latitude = latitude)
w <- map_data("worldHires", ylim = ylim, xlim = xlim)
amplitude_map <- ggplot(data = spatial_amp_frame, aes_string(x = names(spatial_amp_frame)[2], y = names(spatial_amp_frame)[3],  fill = names(spatial_amp_frame)[1])) +
  geom_raster(interpolate = FALSE, na.rm = TRUE) +
  geom_polygon(data = w, aes(x = long, y = lat, group = group), fill = "gray50") +
  theme_bw() + ylab("latitude") + xlab("longitude") +
  coord_fixed(1.3, xlim = xlim, ylim = ylim) 

amplitude_map


x_var <- quo(longitude)
y_var <- quo(latitude)
fill_var <- quo(amplitude)
myplot <- ggplot(data = spatial_amp_frame, aes(!!x_var, !!y_var, fill = !!fill_var)) +
  geom_raster(interpolate = FALSE, na.rm = TRUE) +
  geom_polygon(data = w, aes(x = long, y = lat, group = group), fill = "gray50") +
  theme_bw() + ylab("latitude") + xlab("longitude") +
  coord_fixed(1.3, xlim = xlim, ylim = ylim)

myplot

Created on 2018-07-06 by the reprex package (v0.2.0.9000).

PS @aosmith and @rmendels have you both updated to ggplot2 3.0.0? If so, one of the big changes is that aes() now supports quosures, so quo() will be used in place of sym().

jcblum · July 6, 2018, 1:03pm

I think one thing that’s maybe a little confusing with this example is that you’re choosing the variables to plot by position — so you never actually needed aes_string() in the first place. You could do:

make_map <- function(my_frame, xlim, ylim, my_color, title = NA, limits = NA) {
  require(ggplot2)
  require(mapdata)
  w <- map_data("worldHires", ylim = ylim, xlim = xlim)
  # Identify variables by position
  myplot <- ggplot(data = my_frame, aes(x = my_frame[[2]], y =  my_frame[[3]], fill =  my_frame[[1]])) +
    geom_raster(interpolate = FALSE, na.rm = TRUE) +
    geom_polygon(data = w, aes(x = long, y = lat, group = group), fill = "gray50") +
    theme_bw() + 
    # Identify variable labels by position
    labs( y = "latitude", x = "longitude", fill = names(my_frame[1])) +
    coord_fixed(1.3, xlim = xlim, ylim = ylim) 
  # all(is.na()) keeps the condition length-one when limits is c(min, max)
  if (!all(is.na(limits))) {
    myplot <- myplot + scale_fill_gradientn(colours = my_color, limits = limits, na.value = NA)
  } else {
    myplot <- myplot + scale_fill_gradientn(colours = my_color, na.value = NA)
  }
  if (!is.na(title)) {
    myplot <- myplot + ggtitle(title)
  }
  myplot
}

Quosures (or the old aes_string()) become important when you want to write a function that’s more general than this one — say where the data frames differ in configuration so you don’t know what order the variables appear in, or where you want to make several plots with the same structure, but using different combinations of variables each time.
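
As a rough illustration of that more general case, here is a sketch of two small helpers. Both function names (scatter_by, scatter_by_name) are made up for this example, and it assumes ggplot2 3.0.0 or later plus rlang. The first takes bare column names and captures them with enquo(); the second takes column names as strings and converts them with sym().

```r
library(ggplot2)
library(rlang)

# Hypothetical helper: x, y and colour are typed as bare column names.
scatter_by <- function(df, x, y, colour) {
  x <- enquo(x)            # capture what the caller typed, unevaluated
  y <- enquo(y)
  colour <- enquo(colour)
  ggplot(df, aes(!!x, !!y, colour = !!colour)) +
    geom_point()
}

# Hypothetical variant: x, y and colour are column names held in strings.
scatter_by_name <- function(df, x, y, colour) {
  ggplot(df, aes(!!sym(x), !!sym(y), colour = !!sym(colour))) +
    geom_point()
}

# Same plot structure, different variables each time
p1 <- scatter_by(mtcars, wt, mpg, cyl)
p2 <- scatter_by_name(mtcars, "disp", "hp", "gear")
```

Either helper can be reused with any data frame that has the named columns, regardless of the order they appear in.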

I agree that more concrete examples and cookbook recipes would be helpful for internalizing how these new patterns work. Right now the core reference is the Metaprogramming section of Advanced R, 2nd ed., but I suspect its examples are a bit hard to relate to for the average ggplot user who wants to write some functions to reduce copying and pasting in their code. But knowing the ggplot community, I expect it won’t be long before there are lots more resources out there!

aosmith · July 6, 2018, 1:53pm

Yep, @mara I updated to the official release yesterday!

I haven't checked in with tidyeval for a few months, but I've used sym() whenever I wanted to work with the names of variables as strings instead of "bare" variable names. In this case I can't get quo() to work with strings.

Below is a (very basic) reprex. The big difference to what you did is that I'm pulling variables via names() as in the OP, so the results are strings.

library(ggplot2)

# Variables of interest, as strings
names(mtcars)[1]
#> [1] "mpg"
names(mtcars)[3]
#> [1] "disp"
names(mtcars)[2]
#> [1] "cyl"

Here's the "old way", working with aes_string() (I'm gonna miss this approach!)

# aes_string
ggplot(mtcars, aes_string(names(mtcars)[1], names(mtcars)[3], 
                          color = names(mtcars)[2]) ) +
    geom_point()

Things don't work right for me if I try to use quo() with variables-as-strings.

# try with quo
x_var = quo(names(mtcars)[1])
y_var = quo(names(mtcars)[3])
col_var = quo(names(mtcars)[2])

ggplot(mtcars, aes(!!x_var, !!y_var, 
                          color = !!col_var ) ) +
    geom_point()

Things work correctly with sym().

# try with sym
x_var = sym(names(mtcars)[1])
y_var = sym(names(mtcars)[3])
col_var = sym(names(mtcars)[2])

ggplot(mtcars, aes(!!x_var, !!y_var, 
                   color = !!col_var ) ) +
    geom_point()

Created on 2018-07-06 by the reprex package (v0.2.0).

mara · July 6, 2018, 1:58pm

Good to know! Yeah, it makes sense that quo() wouldn't handle names in the same way — though I'm still trying to figure out why it worked for latitude and longitude, but not for fill in the original post.

aosmith · July 6, 2018, 2:32pm

I think that things actually didn't work in the original post for the geom_raster() layer, it just looks like it did.

The polygon layer, which has its own coordinates defined in aes(), is what drew the background map/ocean. And then the limits were set via ylim/xlim. All of which made it look like the x and y mappings with quo() for the raster layer were working when they weren't.

The code only runs without error if the "problem" layer comes first. If you were to put the polygon layer above the raster layer you'd get an error about discrete values being supplied to a continuous scale. I guess because you can add continuous to discrete but not vice versa?
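
A quick way to check this, sketched with rlang's eval_tidy() on a tiny made-up stand-in for spatial_amp_frame (the three rows of values are invented for illustration): evaluating the quo() version against the data returns the column name as a single string, which ggplot2 treats as a discrete constant, while the sym() version returns the numeric column.

```r
library(rlang)

# Small stand-in for spatial_amp_frame (values are made up)
df <- data.frame(amplitude = c(0.1, -0.4, 1.2),
                 longitude = c(-150, -149, -148),
                 latitude  = c(48, 48, 48))

# quo() version: evaluates to the column *name*, one string for every row
x_quo <- quo(names(df)[2])
eval_tidy(x_quo, data = df)
#> [1] "longitude"

# sym() version: evaluates to the column itself
x_sym <- sym(names(df)[2])
eval_tidy(x_sym, data = df)
#> [1] -150 -149 -148
```

That single string is why the raster layer ends up on a discrete scale.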

rmendels · July 6, 2018, 2:44pm

Thanks to all for the comments and help. One comment asked about using names rather than just the indices of the frame. My memory is that when I tried that, it affected the labelling. But I wrote this a while ago and don't remember for certain.

At the same time, though, I am even more confused, not about what works or doesn't work, but about why one thing works and another does not, which is important for going beyond this specific case.

Thanks again to all.