Assign 2 geom_lines to different y axes


#1

My data looks like this:

    Year REPPAY     REPP
   <dbl>  <dbl>    <dbl>
 1  1971      0  0.145  
 2  1973      0  0.208  
 3  1976      1 -1.16   
 4  1977      2 -0.739  
 5  1978      2  0.0267 
 6  1979      2  0.618  
 7  1980      4  1.10   
 8  1981      3  0.289  
 9  1982      4  0.630  
10  1983      5 -0.274  
11  1984     10  0.175  
12  1985     17  0.179  
13  1986     22  0.00116
14  1987     38  0.457  
15  1988    914 -0.0526 
16  1989    859 -0.0145 
17  1990    955 -0.0562 
18  1991    798  0.140  
19  1992    792  0.256  
20  1993    839  0.208

I want to draw a ggplot with Year on the x-axis, and REPPAY and REPP on two separate y-axes.
My code is as below:


ggplot(figure_1_sample, aes(x = Year)) + 
  geom_line(aes(y = REPPAY, colour = "REPPAY")) + 
  geom_line(aes(y = REPP, colour = "REPP")) +
  scale_y_continuous(sec.axis = sec_axis(~./5000, name = "Repurchase premium"))+ 
  ggtitle("Graph A: Number of repurchaser and repurchase premium ") +
  theme(legend.position = "bottom", plot.title = element_text(hjust = 0.5))

I got this plot:

My second geom_line still seems to use the left-side y-axis scale. How can I associate the second geom_line with the right-side y-axis, with a proper scale? I would also like to know how to change the label of the left-side y-axis.


#2

My understanding of this issue is a little out-of-date now. But as far as I'm aware, ggplot2 has always been fairly opinionated about not plotting two variables with different scales on the same panel.

I wasn't aware that sec_axis() had been introduced at some point, but my reading of its documentation is that it's designed to let you show the same geom against two axes. For example, if temperature is your y scale, you could have the temperature in °C on the primary y axis and in °F on the secondary y axis, or show prices in two different currencies.
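
To make the temperature case concrete, here is a minimal sketch of how I understand sec_axis() is meant to be used. The temps data frame and its values are made up purely for illustration; the only real part is the Celsius-to-Fahrenheit formula passed to sec_axis().

```r
library(ggplot2)

# Made-up temperatures, for illustration only (not from the question)
temps <- data.frame(day = 1:5, celsius = c(12, 15, 9, 20, 18))

p_temp <- ggplot(temps, aes(x = day, y = celsius)) +
  geom_line() +
  scale_y_continuous(
    name = "Temperature (\u00b0C)",
    # The secondary axis is a relabelling of the primary one: F = C * 9/5 + 32
    sec.axis = sec_axis(~ . * 9 / 5 + 32, name = "Temperature (\u00b0F)")
  )
p_temp
```

There is still only one line here; the right-hand axis just reads the same values in different units.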

Mapping one geom to a primary y axis and another geom to the secondary y axis seems to be prohibited by this design, and I think it probably reflects the opinion that doing this on one panel can be confusing. If you really want to make this happen in this case, I think you'll just have to also multiply your REPP figures by 5000 (e.g. aes(y = REPP * 5000, colour = "REPP")) to kind of fake it. Otherwise, you could use tidyr::gather() to combine REPPAY and REPP into a key-value pair of columns (i.e. each year has two rows: one for REPPAY and one for REPP) and then use faceting to vertically separate the two.
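
Here is a rough sketch of the faceting route, assuming the data frame is called figure_1_sample as in the question (I've retyped the values from the printed tibble). The column names variable and value, and the choice of free y scales, are my own picks rather than anything required.

```r
library(ggplot2)
library(tidyr)

# Data retyped from the question
figure_1_sample <- data.frame(
  Year   = c(1971, 1973, 1976:1993),
  REPPAY = c(0, 0, 1, 2, 2, 2, 4, 3, 4, 5,
             10, 17, 22, 38, 914, 859, 955, 798, 792, 839),
  REPP   = c(0.145, 0.208, -1.16, -0.739, 0.0267, 0.618, 1.10, 0.289,
             0.630, -0.274, 0.175, 0.179, 0.00116, 0.457, -0.0526,
             -0.0145, -0.0562, 0.140, 0.256, 0.208)
)

# Long format: two rows per year, one for REPPAY and one for REPP
long <- gather(figure_1_sample, key = "variable", value = "value", REPPAY, REPP)

# One panel per variable, stacked vertically, each with its own y scale
p_facet <- ggplot(long, aes(x = Year, y = value)) +
  geom_line() +
  facet_wrap(~ variable, ncol = 1, scales = "free_y")
p_facet
```

Newer versions of tidyr recommend pivot_longer() over gather(), but gather() should still work for this.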

I know that's not quite what you're looking for, @Peter_Griffin, but I hope it helps!


#3

Thanks for your reply. Is there any other plotting package that can do this? Is lattice still under maintenance?


#4

I'm not sure, unfortunately; I've never used lattice! I think that, in this specific case, it'd just be easiest to multiply your REPP figures and look into alternative plotting packages in the longer term. But maybe someone else can chime in with some alternatives.


#5

To add to @rensa's answer, you can have a second y axis, but it has to be a one-to-one transformation of the first y axis. Below is an example using your data. The general idea is to decide on the transformation for the second axis and then, as rensa noted, apply the inverse of that transformation to the data to be plotted against the secondary axis, so that the secondary axis ticks will correspond to the data values.

In this case, the desired transformation is linear. I've provided two potential transformations. The first one, which is commented out, sets the scale factor so that the two lines have the same maximum absolute value. The second one, which is actually used below, sets the scale factor to 1,000. This results in the axis ticks for the secondary axis coinciding with those of the primary axis. The code below also sets each axis color to match the color of the data line to which that axis corresponds.

# Load ggplot2 (needed for ggplot() and the scale/theme functions below)
library(ggplot2)

# Set color palette  
cols = hcl(c(15, 15+180), 100, 65)

# Set scale factor for second axis
#scl = with(figure_1_sample, max(abs(REPPAY))/max(abs(REPP)))
scl = 1000

ggplot(figure_1_sample, aes(x = Year)) + 
  geom_line(aes(y = REPPAY, colour = "REPPAY")) + 
  geom_line(aes(y = REPP*scl, colour = "REPP")) +
  scale_y_continuous(sec.axis = sec_axis(~./scl, name = "Repurchase premium"))+ 
  ggtitle("Graph A: Number of repurchaser and repurchase premium ") +
  theme(legend.position = "bottom",
        legend.margin=margin(-5,0,0,0),
        plot.title = element_text(hjust = 0.5),
        axis.text.y.right=element_text(colour=cols[1]),
        axis.ticks.y.right=element_line(colour=cols[1]),
        axis.title.y.right=element_text(colour=cols[1]),
        axis.text.y=element_text(colour=cols[2]),
        axis.ticks.y=element_line(colour=cols[2]),
        axis.title.y=element_text(colour=cols[2])) +
  scale_colour_manual(values=cols) +
  labs(colour="")
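
On the other part of the question, changing the left axis label: as far as I know this can be done with the name argument of scale_y_continuous(). Here is a stripped-down sketch that also uses the data-driven scale factor that was commented out above. The label text "Number of repurchasers" is just my guess at suitable wording, and the data is retyped from the question.

```r
library(ggplot2)

# Data retyped from the question
figure_1_sample <- data.frame(
  Year   = c(1971, 1973, 1976:1993),
  REPPAY = c(0, 0, 1, 2, 2, 2, 4, 3, 4, 5,
             10, 17, 22, 38, 914, 859, 955, 798, 792, 839),
  REPP   = c(0.145, 0.208, -1.16, -0.739, 0.0267, 0.618, 1.10, 0.289,
             0.630, -0.274, 0.175, 0.179, 0.00116, 0.457, -0.0526,
             -0.0145, -0.0562, 0.140, 0.256, 0.208)
)

# Scale factor chosen so both lines share the same maximum absolute value
scl <- with(figure_1_sample, max(abs(REPPAY)) / max(abs(REPP)))

p_named <- ggplot(figure_1_sample, aes(x = Year)) +
  geom_line(aes(y = REPPAY, colour = "REPPAY")) +
  # REPP is stretched by scl so it is drawn on the primary (left) scale
  geom_line(aes(y = REPP * scl, colour = "REPP")) +
  scale_y_continuous(
    name = "Number of repurchasers",  # left axis label (assumed wording)
    # Dividing by scl makes the right axis read in original REPP units
    sec.axis = sec_axis(~ . / scl, name = "Repurchase premium")
  ) +
  labs(colour = "")
p_named
```

With this scale factor the secondary axis ticks will generally not line up with the primary ones, which is the trade-off against the fixed factor of 1,000.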
