How to create a legend for ggplot

I have the following code, which gives me a density plot and runs okay. My question is: how can I add a legend inside the plot, in the upper right, for my two variables sim and dv?

#CODE THAT RUN GRAPH WITH NO LEGEND AND FAR X GOes TO X AXIS
ggplot() + geom_density(aes(x=sim), colour="red", linetype="longdash",data=data) +
geom_density(aes(x=dv,col=sim), colour="blue", linetype="solid",data=data) +
scale_y_continuous(limits = c(0,NA),expand=c(0,0)) +
scale_x_continuous(limits = c(0,13),expand=c(0,0)) +
xlab("Concerta Peak1 Cmax Distribution") +
ylab("Density")

You can add the code below to place the legend on the right side of the plot:

+ theme(legend.position="right")

The theme() documentation page has several examples showing how to place legends in different spots:
Modify components of a theme — theme • ggplot2 (legend stuff starts about halfway through the examples)

The legend.position keywords control where the legend appears outside the plot area. To put a legend inside the plot, you supply legend.position as coordinates on a relative scale that runs from [0,0] in the lower left to [1,1] in the upper right. You'll usually want to use legend.justification, too — this tells ggplot which part of the legend box should align with the coordinates. By default it's the center point of the box, which is not ideal if you want the legend all the way to one side of the plot area.

library(tidyverse)

ggplot(mtcars, aes(wt, mpg)) +
  geom_point(aes(color = factor(cyl))) +
  labs(
    x = "Weight (1000 lbs)",
    y = "Fuel economy (mpg)",
    color = "Cylinders"
  ) + theme(
    legend.position = c(0.95, 0.95),
    legend.justification = c("right", "top")
  )

Created on 2018-09-30 by the reprex package (v0.2.1)
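
Newer ggplot2 releases spell the inside placement differently. Here is a minimal sketch, assuming ggplot2 3.5.0 or later, where `legend.position = "inside"` is combined with `legend.position.inside` for the coordinates. The object name `p_inside` is only illustrative, and older versions will not recognise `legend.position.inside`.

```r
library(ggplot2)

# Assumes ggplot2 >= 3.5.0; earlier versions use the numeric
# legend.position = c(x, y) form shown in the example above.
p_inside <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point(aes(color = factor(cyl))) +
  labs(color = "Cylinders") +
  theme(
    legend.position = "inside",              # draw the legend inside the panel
    legend.position.inside = c(0.95, 0.95),  # relative coordinates, as above
    legend.justification = c("right", "top") # anchor the box by its top-right corner
  )

p_inside
```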

All the theme elements that modify legends begin with legend., so you can use autocomplete to explore them. Or you can read about them in the theme() documentation. Here's the summary:

legend.background
background of legend (element_rect; inherits from rect)

legend.margin
the margin around each legend (margin)

legend.spacing
the spacing between legends (unit)

legend.spacing.x
the horizontal spacing between legends (unit); inherits from legend.spacing

legend.spacing.y
the vertical spacing between legends (unit); inherits from legend.spacing

legend.key
background underneath legend keys (element_rect; inherits from rect)

legend.key.size
size of legend keys (unit)

legend.key.height
key background height (unit; inherits from legend.key.size)

legend.key.width
key background width (unit; inherits from legend.key.size)

legend.text
legend item labels (element_text; inherits from text)

legend.text.align
alignment of legend labels (number from 0 (left) to 1 (right))

legend.title
title of legend (element_text; inherits from title)

legend.title.align
alignment of legend title (number from 0 (left) to 1 (right))

legend.position
the position of legends ("none", "left", "right", "bottom", "top", or two-element numeric vector)

legend.direction
layout of items in legends ("horizontal" or "vertical")

legend.justification
anchor point for positioning legend inside plot ("center" or two-element numeric vector) or the justification according to the plot area when positioned outside the plot

legend.box
arrangement of multiple legends ("horizontal" or "vertical")

legend.box.just
justification of each legend within the overall bounding box, when there are multiple legends ("top", "bottom", "left", or "right")

legend.box.margin
margins around the full legend area, as specified using margin()

legend.box.background
background of legend area (element_rect; inherits from rect)

legend.box.spacing
The spacing between the plotting area and the legend box (unit)


I added the code and saw no legend; I got the same graph.

#CODE THAT RUN GRAPH WITH NO LEGEND AND FAR X GOes TO X AXIS
ggplot() + geom_density(aes(x=sim), colour="red", linetype="longdash",data=data) +
geom_density(aes(x=dv,col=sim), colour="blue", linetype="solid",data=data) +
scale_y_continuous(limits = c(0,NA),expand=c(0,0)) +
scale_x_continuous(limits = c(0,13),expand=c(0,0)) +
xlab("Concerta Peak1 Cmax Distribution") +
ylab("Density") +
theme(legend.position="right")

I have a question. The values in your legend are cylinder related (4, 6, 8) and come from the data. In my case my data looks like this:

subject  time  cmt  sim   dv    prev_subject  rep  peak
1        2     11   7.35  7.55  1             1    1

The values that I want in my legend are sim and dv, which correspond to my graph lines. How would these be picked up based upon your suggested code?

ggplot generates legends only when you create an aesthetic mapping inside aes. This is usually done by mapping a data column to an aesthetic, like colour, shape, or fill. ggplot is also set up to work most easily with data in "long" format. In your case, that would mean stacking the dv and sim columns and adding an additional column that marks whether a value came from dv or sim. Below, we'll do that with the gather function.

library(tidyverse)
theme_set(theme_classic())

# Fake data
set.seed(2)
dat = data.frame(othercolumn = sample(LETTERS, 100, replace=TRUE),
                 dv = rnorm(100, 10, 3),
                 sim = rnorm(100, 11, 2))

# convert data to long format
dat.l = gather(dat, key, value, dv, sim)

Note that we now have the numeric data in a single column called value and a categorical column called key that tells us where the data came from.

dat.l[c(1:5,101:105), ]
    othercolumn key     value
1             E  dv  7.485139
2             S  dv 16.198904
3             O  dv  8.313259
4             E  dv 13.827147
5             Y  dv  6.857282
101           E sim 12.951781
102           S sim 10.661154
103           O sim 12.444384
104           E sim  9.311163
105           Y sim 13.554587
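
The same reshaping can also be written with `pivot_longer`, which newer tidyr versions recommend over `gather`. This is a sketch, assuming tidyr 1.0.0 or later; the name `dat_long` is only illustrative. The rows come out interleaved (dv, sim, dv, sim, ...) rather than stacked, which makes no difference to the plot.

```r
library(tidyr)

# Same fake data as above
set.seed(2)
dat <- data.frame(othercolumn = sample(LETTERS, 100, replace = TRUE),
                  dv = rnorm(100, 10, 3),
                  sim = rnorm(100, 11, 2))

# Stack dv and sim into one "value" column, labelled by a "key" column
dat_long <- pivot_longer(dat, cols = c(dv, sim),
                         names_to = "key", values_to = "value")

head(dat_long)
```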

To plot the data, we set the x aesthetic to value (we could have done x=value, but x is first by default, so we can just type value) and the colour aesthetic to key inside aes, which generates a legend. We set custom colors using scale_colour_manual and we use theme to set a custom legend position.

ggplot(dat.l, aes(value, colour=key)) +
  geom_density() +
  labs(colour="Type", 
       x="Concerta Peak1 Cmax Distribution",
       y="Density") +
  scale_colour_manual(values=c("blue", "red")) +
  theme(legend.position=c(0.9, 0.9))

Instead of creating dat.l as a separate object, we could have converted the data to long format on the fly:

dat %>% 
  gather(key, value, dv, sim) %>%
  ggplot(aes(value, colour=key)) +
    geom_density() +
    labs(colour="Type", 
         x="Concerta Peak1 Cmax Distribution",
         y="Density") +
    scale_colour_manual(values=c("blue", "red")) +
    theme(legend.position=c(0.9, 0.9))

With your original data, to get two density plots, we need two calls to geom_density. We can also create a legend with artificial "dummy" aesthetics, which are done below with colour="dv" and colour="sim" (we could have used any strings instead of "dv" and "sim"). This "works", but requires more work and doesn't maintain a natural mapping between the data and the plot.

ggplot(dat) +
  geom_density(aes(dv, colour="dv")) +
  geom_density(aes(sim, colour="sim")) +
  labs(colour="Type", 
       x="Concerta Peak1 Cmax Distribution",
       y="Density") +
  scale_colour_manual(values=c("blue", "red")) +
  theme(legend.position=c(0.9, 0.9))

For all of these versions of the code, the plot looks like this:
[Image: Rplot27, density curves for dv (blue) and sim (red) with the legend in the upper right]


It's much easier to help you if you can either make a reprex, or, at the very least, format your code chunks as code! :slightly_smiling_face:

See FAQ here:


I plan to use reprex as soon as I get the procedure perfected and hopefully with your last suggestion, that will now occur.

Thanks for your help.


Just an FYI for the future: be careful posting copy-pasted fancy stuff like tables via e-mail. They tend to come through really garbled in the forum — for instance, your nice pretty table wound up looking like this on the site itself:

Not so easy to read! The only reliable way to post formatting on this forum is to use Markdown syntax. You can read more about that here: FAQ: How to format your code

That is astounding how much the table was edited.

When one uses reprex, is the true table structure of a data frame or fix statement captured correctly?

Amazing!

It worked and I understand the logic.

I had one follow-up question. In my original post, I had the code below in order to have the graphs asymptote with the Y and X axes respectively, but it didn’t work with your code.

scale_y_continuous(limits = c(0,NA),expand=c(0,0)) +

scale_x_continuous(limits = c(0,13),expand=c(0,0))

I got this error:

Error: Cannot add ggproto objects together. Did you forget to add this object to a ggplot object?

Can you tell me how to modify the code to get the graph to asymptote?
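
One common cause of this error is a broken `+` chain: if the line before the scales does not end in `+`, R evaluates the two scale calls on their own, and two scales cannot be added together without a plot on the left. Here is a sketch of the combined plot, assuming long-format fake data like that in the earlier answer. The names `dat_l` and `p_axes` are illustrative, and the x limit is widened to 20 so the fake values fit (the original post uses `c(0, 13)` for the real data).

```r
library(ggplot2)

# Fake data in long format: one value column and one key column
set.seed(2)
dat_l <- data.frame(key = rep(c("dv", "sim"), each = 100),
                    value = c(rnorm(100, 10, 3), rnorm(100, 11, 2)))

p_axes <- ggplot(dat_l, aes(value, colour = key)) +
  geom_density() +
  labs(colour = "Type",
       x = "Concerta Peak1 Cmax Distribution",
       y = "Density") +
  scale_colour_manual(values = c(dv = "blue", sim = "red")) +
  # Every line above ends in "+", so the scales attach to the plot
  scale_y_continuous(limits = c(0, NA), expand = c(0, 0)) +
  scale_x_continuous(limits = c(0, 20), expand = c(0, 0))

p_axes
```

Naming the colours (`dv = "blue"`, `sim = "red"`) keeps the colour assignment independent of the order of the levels.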

One more follow-up question.

Where in the script below can I change the legend from a box to a line?

ggplot(data.l, aes(value, colour=key)) + geom_density() +
  labs(colour="Type",
       x="Concerta Peak1 Cmax Distribution",
       y="Density")
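
One possible way to get a line in the legend is the `key_glyph` argument of the layer. This is a sketch, assuming ggplot2 3.2.0 or later and the same kind of long-format fake data as in the earlier answer; the names `dat_l` and `p_line_key` are illustrative.

```r
library(ggplot2)

# Fake data in long format: one value column and one key column
set.seed(2)
dat_l <- data.frame(key = rep(c("dv", "sim"), each = 100),
                    value = c(rnorm(100, 10, 3), rnorm(100, 11, 2)))

# key_glyph = "path" asks the layer to draw its legend key as a line
# instead of the default box used by geom_density
p_line_key <- ggplot(dat_l, aes(value, colour = key)) +
  geom_density(key_glyph = "path") +
  labs(colour = "Type",
       x = "Concerta Peak1 Cmax Distribution",
       y = "Density")

p_line_key
```

Replacing `geom_density()` with `geom_line(stat = "density")` should also give line keys, since the legend key follows the geom.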

Please ignore the attached E-mail, I figured out what I did wrong.

Hey @jacksonan1, if you still have a question about this, I think it would be better to start a new topic. You can start a new topic right from your post above by clicking the little :link: button at the bottom of the post and choosing :heavy_plus_sign: New Topic from the box that pops up.


No more questions on this topic.

Thanks

Great! :grin: Then can I trouble you with one more bit of housekeeping? If one of these posts answered your original question, would you mind choosing a solution? It helps other people see which questions still need help, or find solutions if they have similar problems. Here’s how to do it:


I checked the final post as a solution to my question related to legends.


2 posts were split to a new topic: add a legend to a ggplot2 plot