How to add legend to ggplot?

Hi all

I am making a plot in Shiny, based on 3 columns of my dataset. I want to add a legend, but I can't figure out how to do this.

This is my code. First I generate a data frame (low, mean, high) with the while loop, then I try to plot it. I want a legend that says blue is min, orange is mean and red is max.


 # Create an empty data frame with 0 rows and n columns
      columns = c("Hour","Low TAT", "Mean", "High TAT") 
      df_scatterplot = data.frame(matrix(nrow = 0, ncol = length(columns))) 
      colnames(df_scatterplot) = columns
      counter <- 0
      while(counter < 24){
        df_scatterplot[nrow(df_scatterplot) + 1,] = c(counter,
                                                      quantile(df1_filtered$TAT[df1_filtered$Hours == counter], probs = 0.01),
                                                      mean(df1_filtered$TAT[df1_filtered$Hours == counter]),
                                                      quantile(df1_filtered$TAT[df1_filtered$Hours == counter], probs = 0.99))
        
        counter = counter +1
      }
      
      update_busy_bar(99)
      update_busy_bar(100)
      ggplot(data = df_scatterplot) +
        geom_line(aes(x=df_scatterplot$Hour, y=df_scatterplot$`Low TAT`), color = "blue") +
        geom_point(aes(x=df_scatterplot$Hour, y=df_scatterplot$`Low TAT`), color = "blue") +
        geom_text(x=df_scatterplot$Hour, y=df_scatterplot$`Low TAT`, label = round(df_scatterplot$`Low TAT`, digits = 2), vjust = -1) +
        geom_line(aes(x=df_scatterplot$Hour, y=df_scatterplot$`High TAT`), color = "red") +
        geom_point(aes(x=df_scatterplot$Hour, y=df_scatterplot$`High TAT`), color = "red") +
        geom_text(x=df_scatterplot$Hour, y=df_scatterplot$`High TAT`, label = round(df_scatterplot$`High TAT`, digits = 2), vjust = -1) +
        theme(axis.text.x = element_text(face = "bold", color = "#993333", size = 15),
              axis.text.y = element_text(face = "bold", color = "#993333", size = 15),
              axis.line = element_line(color = "#993333", size = 1)) +
        scale_x_continuous(breaks=seq(0,23,1)) +
        xlab("Hours in a day") + ylab("TAT")

Thank you!

First, a note: avoid restating the source data frame name when you refer to your variables. Because you pass a data argument to ggplot(), ggplot already knows where to look for the variables you mention, as long as you use it conventionally.

It's considered bad practice, for both style and technical reasons, to use $ in this context unless you are doing something complex and there is no alternative. I hope you'll agree that the code looks nicer:

library(tidyverse)
# not recommended
ggplot(economics, aes(economics$date, economics$unemploy)) + geom_line()
# recommended
ggplot(economics, aes(date, unemploy)) + geom_line()
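
As an illustration of one technical reason, here is a minimal sketch: a column written with $ is taken from the full data frame in the calling environment, not from the data the plot is actually given, so the two go out of sync as soon as the plot receives a subset. The 10-row subset and the object names are only assumptions for the demonstration.

```r
library(ggplot2)

# Plot only the first 10 rows, but refer to the columns with `$`.
first_rows <- head(economics, 10)

# `economics$date` still holds every row of economics, so its length no longer
# matches the 10 rows of data the plot was given.
p_bad <- ggplot(first_rows, aes(economics$date, economics$unemploy)) + geom_line()
res_bad <- tryCatch(ggplot_build(p_bad), error = function(e) e)
inherits(res_bad, "error")  # building the plot fails

# Bare column names are looked up in whatever data the plot receives.
p_good <- ggplot(first_rows, aes(date, unemploy)) + geom_line()
built_good <- ggplot_build(p_good)
nrow(built_good$data[[1]])  # one row per plotted observation
```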

Aesthetics that are mapped inside aes() show up in the legend/guide, so I add color as a mapped aesthetic and arbitrarily name it 'my line'. I then add scale_color_manual() to specify which colour should be mapped to 'my line'; other, more automated approaches are possible.
Finally, I set the title of the color guide.
All together:

ggplot(economics, aes(x = date,
                      y = unemploy,
                      color="my line")) + geom_line() +
  scale_color_manual(values=c("my line"="#964B00")) + # brown as rgb hex #964B00
  guides(color=guide_legend(title="whatever title I want"))

Hi, thank you for your reply.

I cleaned the code up a bit, but unless I hard-code the data frame name in front of the column, I get an error saying that the column can't be found.

I added your code chunk, but it didn't work. Where should I place it in my code, since I have multiple aes() calls?

 ggplot(data = df_scatterplot, aes(x = df_scatterplot$Hour)) +
        geom_line(aes(y=df_scatterplot$`Low TAT`), color = "blue") +
        geom_point(aes(y=df_scatterplot$`Low TAT`), color = "blue") +
        geom_text(y=df_scatterplot$`Low TAT`, label = round(df_scatterplot$`Low TAT`, digits = 2), vjust = -1) +
        geom_line(aes(y=df_scatterplot$`Mean`), color = "orange") +
        geom_point(aes(y=df_scatterplot$`Mean`), color = "orange") +
        geom_text(y=df_scatterplot$`Mean`, label = round(df_scatterplot$`Mean`, digits = 2), vjust = -1) +
        geom_line(aes(y=df_scatterplot$`High TAT`), color = "red") +
        geom_point(aes(y=df_scatterplot$`High TAT`), color = "red") +
        geom_text(y=df_scatterplot$`High TAT`, label = round(df_scatterplot$`High TAT`, digits = 2), vjust = -1) +
        theme(axis.text.x = element_text(face = "bold", color = "#993333", size = 15),
              axis.text.y = element_text(face = "bold", color = "#993333", size = 15),
              axis.line = element_line(color = "#993333", size = 1)) +
        scale_x_continuous(breaks=seq(0,23,1)) +
        xlab("Hours in a day") + ylab("TAT (in mins)")

# Map color to a label inside aes() so that each series gets a legend entry;
# column references for geom_text() also have to go inside aes().
ggplot(data = df_scatterplot, aes(x = Hour)) +
  geom_line(aes(y = `Low TAT`, color = "Low TAT")) +
  geom_point(aes(y = `Low TAT`, color = "Low TAT")) +
  geom_text(aes(y = `Low TAT`, label = round(`Low TAT`, digits = 2)), vjust = -1) +
  geom_line(aes(y = `Mean`, color = "Mean")) +
  geom_point(aes(y = `Mean`, color = "Mean")) +
  geom_text(aes(y = `Mean`, label = round(`Mean`, digits = 2)), vjust = -1) +
  geom_line(aes(y = `High TAT`, color = "High TAT")) +
  geom_point(aes(y = `High TAT`, color = "High TAT")) +
  geom_text(aes(y = `High TAT`, label = round(`High TAT`, digits = 2)), vjust = -1) +
  theme(axis.text.x = element_text(face = "bold", color = "#993333", size = 15),
        axis.text.y = element_text(face = "bold", color = "#993333", size = 15),
        axis.line = element_line(color = "#993333", size = 1)) +
  scale_x_continuous(breaks=seq(0,23,1)) +
  xlab("Hours in a day") + ylab("TAT (in mins)") +
  scale_color_manual(values=c("Low TAT" = "blue",
                              "Mean" = "orange",
                              "High TAT" = "red")) +
  guides(color=guide_legend(title="whatever title I want"))
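
On the "can't find it" error: a column name is only looked up in the data when it appears inside aes(). Anything passed outside aes(), such as a y or label argument given directly to geom_text(), is evaluated as an ordinary R object, so a bare column name there is not found. A minimal sketch with made-up numbers (the data frame tat_demo is purely illustrative and not part of the original code):

```r
library(ggplot2)

# Hypothetical stand-in for df_scatterplot; check.names = FALSE keeps the
# space in the column name.
tat_demo <- data.frame(Hour = 0:3,
                       `Low TAT` = c(1.21, 2.48, 3.14, 4.76),
                       check.names = FALSE)

p <- ggplot(tat_demo, aes(x = Hour, y = `Low TAT`)) +
  geom_line(color = "blue") +                      # fixed value: outside aes()
  geom_text(aes(label = round(`Low TAT`, 1)),      # column-based: inside aes()
            vjust = -1)

# The labels computed for the text layer (second layer of the plot).
text_labels <- ggplot_build(p)$data[[2]]$label
```

The rule of thumb in this sketch: fixed values such as color = "blue" or vjust = -1 go outside aes(), and anything that varies with a column goes inside it.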

A legend will be added by default if you first reshape the data you're plotting. Below is an example using pivot_longer(), which results in a data frame that has 3 columns (Hour, name, value).

library(tidyverse)

# initial sample data
df_scatterplot = data.frame(
  Hour = 1:10,
  `Low TAT` = sample(1:10, 10, replace = TRUE),
  `Mean` = sample(11:20, 10, replace = TRUE),
  `High TAT` = sample(21:30, 10, replace = TRUE)
)

head(df_scatterplot, 5)
#>   Hour Low.TAT Mean High.TAT
#> 1    1       5   17       23
#> 2    2       7   11       23
#> 3    3       9   17       24
#> 4    4       6   15       29
#> 5    5       9   16       25

# reshape the data
df_scatterplot = df_scatterplot |>
  pivot_longer(-'Hour')

head(df_scatterplot, 5)
#> # A tibble: 5 × 3
#>    Hour name     value
#>   <int> <chr>    <int>
#> 1     1 Low.TAT      5
#> 2     1 Mean        17
#> 3     1 High.TAT    23
#> 4     2 Low.TAT      7
#> 5     2 Mean        11

By transforming your data into this structure, you only have to call your geoms once each, specifying the color to match the name column. Then, you can manually set the color using scale_color_manual().

ggplot(data = df_scatterplot) +
  geom_line(aes(x = Hour, y = value, group = name, color = name)) +
  geom_point(aes(x = Hour, y = value, color = name)) +
  theme(axis.text.x = element_text(face = "bold", color = "#993333", size = 15),
        axis.text.y = element_text(face = "bold", color = "#993333", size = 15),
        axis.line = element_line(color = "#993333", size = 1)) +
  scale_x_continuous(breaks=seq(0,23,1)) +
  scale_color_manual(values = c('Low.TAT' = 'blue',
                                'Mean' = 'orange',
                                'High.TAT' = 'red'),
                     name = 'My New Title'
                     ) +
  xlab("Hours in a day") + 
  ylab("TAT")
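
Note that data.frame() rewrites non-syntactic column names by default, which is why `Low TAT` appears as Low.TAT in the output above. Below is a sketch of the same reshape that keeps the original names and also fixes the legend order. The values are made up, and the column names series and tat are only illustrative choices for names_to and values_to.

```r
library(tidyr)
library(ggplot2)

# check.names = FALSE stops data.frame() from turning "Low TAT" into "Low.TAT".
wide <- data.frame(Hour = 0:2,
                   `Low TAT` = c(2, 3, 4),
                   Mean = c(12, 13, 14),
                   `High TAT` = c(22, 23, 24),
                   check.names = FALSE)

# One row per Hour/series combination.
long <- pivot_longer(wide, cols = -Hour, names_to = "series", values_to = "tat")

# Factor levels control the order of the legend entries.
long$series <- factor(long$series, levels = c("Low TAT", "Mean", "High TAT"))

p <- ggplot(long, aes(x = Hour, y = tat, color = series)) +
  geom_line() +
  geom_point() +
  scale_color_manual(values = c("Low TAT" = "blue",
                                "Mean" = "orange",
                                "High TAT" = "red"),
                     name = NULL)  # NULL drops the legend title
```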

Hi all

Thank you, it worked out!

This is my final code:


 # Create an empty data frame with 0 rows and n columns
      columns = c("Hour","Low TAT", "Mean TAT", "High TAT") 
      df_scatterplot = data.frame(matrix(nrow = 0, ncol = length(columns))) 
      colnames(df_scatterplot) = columns
      counter <- 0
      while(counter < 24){
        df_scatterplot[nrow(df_scatterplot) + 1,] = c(counter,
                                                      quantile(df1_filtered$TAT[df1_filtered$Hours == counter], probs = 0.01),
                                                      mean(df1_filtered$TAT[df1_filtered$Hours == counter]),
                                                      quantile(df1_filtered$TAT[df1_filtered$Hours == counter], probs = 0.99))
        
        counter = counter +1
      }
      
      update_busy_bar(99)
      # reshape the data before plot (automatic legend)
      df_scatterplot = df_scatterplot |>
        pivot_longer(-'Hour')
      
      update_busy_bar(100)
      ggplot(data = df_scatterplot) +
        geom_line(aes(x = Hour, y = value, group = name, color = name)) +
        geom_point(aes(x = Hour, y = value, color = name)) +
        geom_text(aes(x = Hour, y = value, group = name, color = name), label = round(df_scatterplot$value, digits = 2), vjust = -0.5) +
        theme(axis.text.x = element_text(face = "bold", color = "#993333", size = 15),
              axis.text.y = element_text(face = "bold", color = "#993333", size = 15),
              axis.line = element_line(color = "#993333", size = 1)) +
        scale_x_continuous(breaks=seq(0,23,1)) +
        guides(colour = guide_legend(override.aes = list(size = 1), shape = "-")) +
        theme(legend.position="bottom") +
        scale_color_manual(values = c('Low TAT' = 'blue',
                                      'Mean TAT' = 'orange',
                                      'High TAT' = 'red'),
                           name = '') +
        xlab("Hours in a day") + 
        ylab("TAT (in mins)")
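
The while loop that fills the data frame row by row could also be written as a grouped summary. This is only a sketch, assuming df1_filtered has numeric columns Hours and TAT as in the code above; the simulated data below is a placeholder so the example runs on its own.

```r
library(dplyr)
library(tidyr)

# Placeholder data: 50 simulated TAT values for each hour 0..23.
set.seed(1)
df1_filtered <- data.frame(Hours = rep(0:23, each = 50),
                           TAT = rexp(24 * 50, rate = 1 / 30))

df_scatterplot <- df1_filtered |>
  group_by(Hour = Hours) |>                       # one group per hour
  summarise(`Low TAT` = quantile(TAT, probs = 0.01),
            `Mean TAT` = mean(TAT),
            `High TAT` = quantile(TAT, probs = 0.99),
            .groups = "drop") |>
  pivot_longer(-Hour)                             # columns: Hour, name, value
```

The result has the same Hour, name and value columns as the reshaped data frame in the final code, so the ggplot() call can stay unchanged.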
