Assign conditional line colors in ggplot custom stat

Hi everyone,

I'd like to create a stat similar to this example from the ggplot vignette that automatically colors regression lines based on either a positive or negative slope. For example, a regression line with negative slope would appear green, and a positively sloped line would appear red. How can I write that into my stat functions?

Here's the lm example:

StatLm <- ggproto("StatLm", Stat, 
  required_aes = c("x", "y"),
  
  compute_group = function(data, scales) {
    rng <- range(data$x, na.rm = TRUE)
    grid <- data.frame(x = rng)
    
    mod <- lm(y ~ x, data = data)
    grid$y <- predict(mod, newdata = grid)
    
    grid
  }
)

stat_lm <- function(mapping = NULL, data = NULL, geom = "line",
                    position = "identity", na.rm = FALSE, show.legend = NA, 
                    inherit.aes = TRUE, ...) {
  layer(
    stat = StatLm, data = data, mapping = mapping, geom = geom, 
    position = position, show.legend = show.legend, inherit.aes = inherit.aes,
    params = list(na.rm = na.rm, ...)
  )
}

ggplot(mpg, aes(displ, hwy)) + 
  geom_point() + 
  stat_lm()

What did you tried from the example of the ggplot vignette ?
You should try to modify the function to do what you want.

I think as you want to modify the color that you need to create a new Geom and not a new stat. You could see how it is done for geom_smooth that add by default a blue line for the regression.

1 Like

Thanks, @cderv, that put me on the right course. Here's a functioning example of what I'm looking for based on the first sample geom presented at the Extending ggplot2 site.

GeomSimpleReg <- ggproto("GeomSimpleReg", Geom,
                           required_aes = c("x", "y"),

                           
                           draw_key = draw_key_point,
                           
                           draw_panel = function(data, panel_params, coord) {
                             coords <- coord$transform(data, panel_params)
                             
                             #fit model and predict
                             rng <- c(coords$x, na.rm = TRUE)
                             grid <- data.frame(x = rng)
                             mod <- lm(y ~ x, data = coords)
                             grid$y <- predict(mod, newdata = grid)
                             
                             #get slope of line
                             slope <- coef(mod)[2]
                             
                             #set color
                             if (slope < 0){
                               def_col <- "purple"
                             } else {
                               def_col <- "orange"
                             }
                             
                             grid$colour <- def_col
                             
                             grid::linesGrob(
                               grid$x, grid$y,
                               gp = grid::gpar(col = grid$colour)
                             )
                           },
                           default_aes = aes(shape = 19)


)

geom_simple_reg <- function(mapping = NULL, data = NULL, stat = "identity",
                              position = "identity", na.rm = FALSE, show.legend = NA, 
                              inherit.aes = TRUE, ...) {
  layer(
    geom = GeomSimpleReg, mapping = mapping,  data = data, stat = stat, 
    position = position, show.legend = show.legend, inherit.aes = inherit.aes,
    params = list(na.rm = na.rm, ...)
  )
}

#Test positive slope
x <- 1:20
df <- data.frame(x = x,
                 y = 0.03*x + rnorm(20, sd = 0.1))

ggplot(df, aes(x = x, y = y)) +
  geom_point() +
  geom_simple_reg()

#test negative slope
df <- data.frame(x = x,
                 y = -0.03*x + rnorm(20, sd = 0.1))

ggplot(df, aes(x = x, y = y)) +
  geom_point() +
  geom_simple_reg()

This is only half a solution though -- I'd like to be able to feed the geom grouped data, but I can't seem to get it right. For example, using the code above:

x <- 1:20
df <- data.frame(x = c(x,x),
                 y = c(0.03*x + rnorm(20, sd = 0.1),  
                       -0.03*x + rnorm(20, sd = 0.1)),
                 group = rep(c("Var1","Var2"),each = 20))

ggplot(df, aes(x = x, y = y)) +
  geom_point(aes(color = group)) +
  geom_simple_reg()

This produces a single regression line for two distinct groups.

How can I get output for both groups, and keep the color scheme as is?

Thank you!

1 Like

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.