consistent generalised y-label breaks ggplot2 facets

Is there a way to ensure consistent y-axis breaks across faceted plots in ggplot2? It needs to be a general solution; I shouldn't have to specify the breaks by hand each time.

In both plots the y-axis labels run from 0 to 4000, but the breaks differ: the first plot has a break every 2000, while the second has one every 1000. In each case the maximum value exceeds the highest break.

# data --------------------------------
df1 <- data.frame(
  stringsAsFactors = FALSE,
  SYS_LOC_CODE = c("A",
                   "A","A","A","A",
                   "A","A","A","A",
                   "A","A","A","A",
                   "A","B","B","B",
                   "B","B","B",
                   "B","B","B","B",
                   "B","B","B","B",
                   "B"),
  SAMPLE_DATE = as.Date(c("2016-04-12","2016-12-09","2017-04-26",
                  "2018-04-12","2018-10-07",
                  "2019-05-05","2019-10-04",
                  "2020-06-08","2020-09-10",
                  "2021-04-22","2021-08-28",
                  "2022-05-06","2022-10-16","2023-05-19",
                  "2015-10-26","2016-04-12",
                  "2016-12-09","2017-04-26",
                  "2018-04-12","2018-10-07",
                  "2019-05-03","2019-10-02","2020-06-08",
                  "2020-09-12","2021-04-23",
                  "2021-08-24","2022-05-07",
                  "2022-10-16","2023-05-19")),
  REPORT_RESULT_VALUE = c(18.5,146,
                          15,15,110,22,22,100,22,
                          100,100,100,22,200,54.6,
                          130,484,298,578,430,670,
                          530,620,480,750,690,850,670,
                          5400)
)

df2 <- data.frame(
  stringsAsFactors = FALSE,
  SYS_LOC_CODE = c("A","A",
                   "A","A",
                   "A","A","A",
                   "A","A",
                   "A","A","A",
                   "A","A",
                   "B","B","B",
                   "B","B",
                   "B","B","B",
                   "B","B",
                   "B","B","B",
                   "B","B"),
  SAMPLE_DATE = as.Date(c("2016-04-12",
                  "2016-12-09",
                  "2017-04-26","2018-04-12",
                  "2018-10-07",
                  "2019-05-05",
                  "2019-10-04","2020-06-08",
                  "2020-09-10",
                  "2021-04-22","2021-08-28",
                  "2022-05-06",
                  "2022-10-16","2023-05-19",
                  "2015-10-26",
                  "2016-04-12",
                  "2016-12-09","2017-04-26",
                  "2018-04-12",
                  "2018-10-07","2019-05-03",
                  "2019-10-02",
                  "2020-06-08","2020-09-12",
                  "2021-04-23",
                  "2021-08-24",
                  "2022-05-07","2022-10-16",
                  "2023-05-19")),
  REPORT_RESULT_VALUE = c(0.42,5.59,
                          2.76,8.7,4,0.93,1,
                          0.33,0.31,0.31,
                          0.71,0.49,0.31,0.61,
                          270,163,439,
                          1040,3720,2100,1600,
                          2200,1900,1600,
                          850,710,660,590,
                          4400)
)

# graphs --------------------------------
library(ggplot2)
ggplot(df1, aes(SAMPLE_DATE, REPORT_RESULT_VALUE)) +
  geom_point() +
  facet_wrap(~SYS_LOC_CODE) +
  labs(title = "df1")

ggplot(df2, aes(SAMPLE_DATE, REPORT_RESULT_VALUE)) +
  geom_point() +
  facet_wrap(~SYS_LOC_CODE) +
  labs(title = "df2")

[image: df1 plot]
[image: df2 plot]

Any ideas?

I think this sort of does what I want, though there might be better ways to do it.

r - set upper limit in ggplot to include label greater than the maximum value - Stack Overflow

One way or another, to have common ranges (and, potentially, other characteristics, like breaks) I think you'll need to calculate those from the data and apply them to your plots. Depending on how often you'll want to do this and how many things might change between sets of plots, there are various approaches for streamlining the process. For example, using your two data frames:

library(tidyverse)
library(patchwork)

# List of data frames
# (we use the lst() function so that we have a named list, 
#  which can then be used to provide the title for each plot)
d = lst(df1, df2) 

# Maximum y-value across the two data frames
max.y = map_dbl(d, ~max(.x[["REPORT_RESULT_VALUE"]], na.rm=TRUE)) %>% max

n.breaks = 6

pl = d %>% 
  imap(
    ~ ggplot(.x, aes(SAMPLE_DATE, REPORT_RESULT_VALUE)) +
      geom_point() +
      scale_y_continuous(limits=c(0, max.y), 
                         breaks=scales::breaks_extended(n.breaks)) +
      facet_wrap(~SYS_LOC_CODE) +
      labs(title = .y)
  )

wrap_plots(pl)
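
Another option, if it suits your layout, is to stack the data frames into one and facet on both variables, since facets share the y scale by default. A minimal sketch, assuming small made-up data frames (`toy1` and `toy2` are placeholders standing in for `df1` and `df2`) and a `dataset` column created by `bind_rows()`:

```r
library(dplyr)
library(ggplot2)

# Made-up stand-ins for df1 and df2 (illustrative values only)
toy1 <- data.frame(
  SYS_LOC_CODE = c("A", "A", "B", "B"),
  SAMPLE_DATE = as.Date(c("2020-01-01", "2021-01-01", "2020-01-01", "2021-01-01")),
  REPORT_RESULT_VALUE = c(20, 150, 500, 5400)
)
toy2 <- data.frame(
  SYS_LOC_CODE = c("A", "A", "B", "B"),
  SAMPLE_DATE = as.Date(c("2020-01-01", "2021-01-01", "2020-01-01", "2021-01-01")),
  REPORT_RESULT_VALUE = c(0.4, 5.6, 270, 4400)
)

# Stack the data frames; .id records which list element each row came from
combined <- bind_rows(list(df1 = toy1, df2 = toy2), .id = "dataset")

# With the default scales = "fixed", every panel shares one y scale,
# so limits and breaks are identical across all panels
p <- ggplot(combined, aes(SAMPLE_DATE, REPORT_RESULT_VALUE)) +
  geom_point() +
  facet_grid(dataset ~ SYS_LOC_CODE)

p
```

The trade-off is that you get one figure rather than separate plots with their own titles, so this may or may not fit how you're laying things out.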

If you're going to do this a lot, you can create a function. Then you'll need to decide what things can stay the same across plots and what things will likely change or need to be adjustable. For example:

plot_func = function(data.list, yvar, n.breaks=5) {
  
  # Get maximum y-value
  max.y = map_dbl(data.list, ~max(.x[[yvar]], na.rm=TRUE)) %>% max
  
  # Return a list of plots
  pl = data.list %>% 
    imap(
      ~ ggplot(.x, aes(SAMPLE_DATE, .data[[yvar]])) +
        geom_point() +
        scale_y_continuous(limits=c(0, max.y), 
                           breaks=scales::breaks_extended(n.breaks)) +
        facet_wrap(~SYS_LOC_CODE) +
        labs(title = .y)
    )
  
  pl
}

plot_func(lst(df1,df2), "REPORT_RESULT_VALUE", 6) %>% 
  wrap_plots()
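
Note that `limits=c(0, max.y)` stops the axis at the largest value, so the top break can still fall below it. If you want the top break to sit at or above the maximum (which is what the Stack Overflow question linked earlier is after), one possibility is to compute the breaks with base R's `pretty()` and use their range as the limits. A rough sketch; `shared_breaks()` is a hypothetical helper name and the values are made up:

```r
library(ggplot2)

# Hypothetical helper: "round" breaks starting from 0 that cover the largest
# value. pretty() returns values spanning the range of its input, so the last
# break is at or above the maximum.
shared_breaks <- function(values, n = 5) {
  pretty(c(0, max(values, na.rm = TRUE)), n = n)
}

# Made-up values standing in for the pooled REPORT_RESULT_VALUE columns
toy <- data.frame(
  SAMPLE_DATE = as.Date(c("2020-01-01", "2021-01-01", "2022-01-01")),
  REPORT_RESULT_VALUE = c(18.5, 850, 5400)
)

brks <- shared_breaks(toy$REPORT_RESULT_VALUE)

# Use the same breaks and limits in every plot that should match
p <- ggplot(toy, aes(SAMPLE_DATE, REPORT_RESULT_VALUE)) +
  geom_point() +
  scale_y_continuous(limits = range(brks), breaks = brks)

p
```

To use this with several data frames, you'd pass the pooled y-values from all of them to `shared_breaks()` once and then reuse `brks` in each plot.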

Thanks. I'm sort of doing something similar, though there are other issues. It will do though.

patchwork::wrap_plots() looks interesting too.
