Observing wrong colours in ggplot line plots using two dataframes

I have two dataframes, each having one hundred 'x' and 'y' coordinates I wish to plot together on a single line plot. I can get the lines to plot, but am running into trouble trying to colour the lines and add a legend to the graph.

From my understanding, it might be better to combine the two dataframes into one and add a column with the two series names, but for now it's best if the two stay separate. I can replicate the problem using the mtcars dataset.

library(dplyr)
library(ggplot2)

df <- mtcars

df1 <- select(df, wt, qsec)
df2 <- select(df, wt, disp)

ggplot() +
  geom_line(data = df1, aes(x = wt, y = qsec, colour = "blue")) +
  geom_line(data = df2, aes(x = wt, y = disp, colour = "red"))

I get the following plot using the code; below that is the original problem with my data.

I have tried the scale_color functions below on my data, with no success:

scale_color_manual(values = c("Offset"="blue", "Initial"="red"))
scale_color_discrete(name="Geometry", labels = c("Initial", "Offset"))


Are you looking for something like this?

library(ggplot2)
library(dplyr)
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>     filter, lag
#> The following objects are masked from 'package:base':
#>     intersect, setdiff, setequal, union
df <- mtcars

df1 <- select(df, wt, qsec)
df2 <- select(df, wt, disp)
COLORS <- c("qsec" = "blue", "disp" = "red")
ggplot() +
  geom_line(data = df1, aes(x = wt, y = qsec, color = "qsec")) +
  geom_line(data = df2, aes(x = wt, y = disp, color = "disp")) +
  scale_color_manual(values = COLORS, name = "Something")

Created on 2020-06-18 by the reprex package (v0.3.0)


This is my first time using reprex(). I have pasted the reprex from the clipboard below.

Your code does differ from FJCC's (at least it did before you cropped your image).
You have colour = "blue" in the first case, whereas FJCC's code has color = "qsec",
which relies on qsec being a name in the COLORS vector.

You're right, my mistake. Replacing the color = value got my plot to look exactly like his.


So just to circle back, I tried the above solution on my original data with some different values and got it to plot a yellow and green line successfully (instead of red and blue). I also noticed:

  • When defining the colours vector in my code, I first used colors=c(...) instead of COLORS=c(...) and got errors; I'm assuming it's because colors (in lowercase) is a protected command in R.
  • The labels of the two series are still the y-values in the plot. Is there a way to rename these "Initial" and "Offset" in the plot legend only, or would renaming the series in my dataframe header be the easiest way?
  • Would it be easier to combine the two dataframes into one, then add a column (named 'condition') so that each x/y pair has a named condition (either initial or offset), and then have ggplot use the single dataframe and simply vary the colours by condition?
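The last two points above can be sketched with mtcars. This is only an illustration under some assumptions: qsec stands in for "Initial" and disp for "Offset", the vector name series_colours and the column name y are made up for the example, and base R subsetting is used in place of select() so that ggplot2 is the only package needed.

```r
library(ggplot2)

df1 <- mtcars[, c("wt", "qsec")]
df2 <- mtcars[, c("wt", "disp")]

# Hypothetical name: maps each legend label to the colour it should get.
series_colours <- c("Initial" = "blue", "Offset" = "red")

# Option A: keep the two data frames separate. The string given to colour
# inside aes() becomes the legend label, so the data frame headers can stay
# as they are.
p_separate <- ggplot() +
  geom_line(data = df1, aes(x = wt, y = qsec, colour = "Initial")) +
  geom_line(data = df2, aes(x = wt, y = disp, colour = "Offset")) +
  scale_colour_manual(values = series_colours, name = "Geometry")

# Option B: stack both series into one long data frame with a 'condition'
# column, then map colour to that column in a single layer.
combined <- rbind(
  data.frame(wt = df1$wt, y = df1$qsec, condition = "Initial"),
  data.frame(wt = df2$wt, y = df2$disp, condition = "Offset")
)

p_combined <- ggplot(combined, aes(x = wt, y = y, colour = condition)) +
  geom_line() +
  scale_colour_manual(values = series_colours, name = "Geometry")
```

Printing p_separate or p_combined draws the plot. Option B tends to scale better once there are more than two series, since adding one only means adding rows and one entry to the colour vector.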


Does it fail as provided, or on your real data?
It works as provided on my system.

The warning you posted implies you are using data with NA values in it.
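A quick way to check this is to count the missing values in each column. The sketch below uses the mtcars-based data frames from the question, built with base R subsetting; on real data, substitute your own data frames.

```r
df1 <- mtcars[, c("wt", "qsec")]
df2 <- mtcars[, c("wt", "disp")]

# Count missing values per column; all zeros means the data itself has no NAs.
na_counts <- list(
  df1 = colSums(is.na(df1)),
  df2 = colSums(is.na(df2))
)
print(na_counts)
```

If every count is zero, the missing values are possibly being introduced somewhere other than the data, so it is worth comparing the plotting code line by line against the version that works.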

Inside the aes() function you map variables to aesthetic elements. In your code you were mapping the colour aesthetic to the constant string "blue", which ggplot2 treats as a data value (a label for the legend), not as a colour. This is different from setting the line colour to blue, which you would do outside the aes() function.
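A minimal sketch of that difference, using mtcars as in the question (the object names p_set and p_map are only for illustration):

```r
library(ggplot2)

df1 <- mtcars[, c("wt", "qsec")]

# Setting: colour is a fixed property of the layer, so it goes outside aes().
# The line is drawn in blue and no legend is created.
p_set <- ggplot() +
  geom_line(data = df1, aes(x = wt, y = qsec), colour = "blue")

# Mapping: inside aes(), "blue" is treated as a data value, not a colour.
# ggplot2 assigns a colour from its default discrete palette and adds a
# legend whose single entry is labelled "blue".
p_map <- ggplot() +
  geom_line(data = df1, aes(x = wt, y = qsec, colour = "blue"))
```

Printing p_set and p_map side by side shows the effect: only the first line is actually blue.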

In order to help you with your issue, we would need a reprex that actually shows your problem, ideally with a sample of your data instead of mtcars.


It seems strange that the original code I used would not work, since my understanding is ggplot simply needs descriptors inside aes() to tell it what to plot.

The image you presented looks like a solution, but the code does not work; when I run it I get the following warnings:

Warning messages:
1: Removed 32 row(s) containing missing values (geom_path).
2: Removed 32 row(s) containing missing values (geom_path).

It could be that something's gone wrong in your environment. I suggest restarting R and trying again.

It does not work as provided. I looked at the dataframes in the environment and all of the variables are there, i.e. no NA values.