How to represent specific comparisons with specific colors in a scatter plot?

I'm working on my thesis project and need some help.
I posted a similar question before, but I have now simplified the data frame (DF), because the previous answer works but does not give the results I expected.
I'm now using a DF to make a scatter plot (SP) with ggplot2. This is part of my DF (9 columns and 723 rows):

 ID1            ID2      dN      dS      t       Label_ID1   Label_ID2       Group

 YP_009724393  AFR79253  0.5756  1.8977  2.7739  SARS2-CoV   229E-CoV        Inter
 YP_009724393  AFR79260  0.5756  1.8977  2.7739  SARS2-CoV   229E-CoV        Inter
 YP_009724393  AAG48596  0.5984  1.4520  2.5175  SARS2-CoV   229E-CoV        Inter
 YP_009724393  AVP78034  0.0166  0.3459  0.3185  SARS2-CoV   SARSr-bat-CoV   Inter
 YP_009724393  AVP78045  0.0166  0.3459  0.3185  SARS2-CoV   SARSr-bat-CoV   Inter
 YP_009724393  ATO98112  0.0172  0.7249  0.6923  SARS2-CoV   SARSr-bat-CoV   Inter

Previously, I made different SPs using only the Group column (color = Group, i.e. Intra or Inter). Now I'm looking for a specific comparison between the columns Label_ID1 and Label_ID2. The column Label_ID1 has only one label (SARS2-CoV) and the column Label_ID2 has 24 different labels.

My goal is to plot dN vs dS, dN vs t, and dS vs t for only some specific comparisons (Label_ID1 vs Label_ID2), each shown in a different color:

SARS2-CoV vs SARS1-CoV (red, for example)
SARS2-CoV vs MERS-CoV (blue...)
SARS2-CoV vs SARSr-bat-CoV (green...)
SARS2-CoV vs SARSr-bat-RaTG13-CoV (...)
SARS2-CoV vs SARSr-bat-BM4831-CoV
SARS2-CoV vs SARSr-Pangolin-CoV

I'm using RStudio (ggplot2) to plot this DF. My question is:
Is it possible to represent these comparisons? I tried different ways but nothing works for me. This is the part of my code that previously plotted the SP using only the Group column, with two different colors:

df_M %>%
  ggplot(aes(x = dN, y = t)) + 
  geom_point(aes(color = Group)) +
  scale_y_continuous(trans='log10') +
  scale_x_continuous(trans='log10') +
  labs(title = "Pairwise Comparison M Protein",
       subtitle = "Inter versus Intragroup",
       x = "dN rate",
       y = "t rate",
       color = "Group") +
  theme_gray() + 
  theme(axis.title = element_text())

Any ideas or help are welcome! Thanks!

We need more details: something we can run on our end.
Are you aware of this package?

Thank you for sharing the page (ggbetweenstats).

Specifically, which details do you think the post lacks?
Would you prefer that I share the entire data frame?

My goal is to make scatter plots of the values in the columns dN, dS and t for these comparisons: dN versus dS, dN versus t and dS versus t. However, I would like the plot to use the labels in the columns Label_ID1 and Label_ID2, where each label comparison has one specific color: for example, SARS2-CoV vs MERS-CoV is red in the plot, SARS2-CoV vs SARS1-CoV is green...

If you need more details, please tell me.
Thanks!

We need a runnable, reproducible example. As it is now, it is too much work, and too error-prone, to try to reproduce what is not working.

For example, with this code:

# Set the DataFrame
Wuhan_S_df_filter <- read.csv(file.choose(), header = T, sep = ",")
#Plot
Wuhan_S_df_filter %>%
  ggplot(aes(x = dN, y = dS)) + 
  geom_point(aes(color = Label_ID2)) +
  scale_y_continuous(trans='log10') +
  scale_x_continuous(trans='log10') +
  labs(title = "Pairwise Comparison M Protein",
       subtitle = "Wuhan versus pathogenic variants in humans",
       x = "dN rate",
       y = "dS rate",
       color = "Comparison") +
  scale_color_manual(labels=c(
    "SARS2-CoV vs SARS1-CoV", 
    "SARS2-CoV vs MERS-CoV", 
    "SARS2-CoV vs SARSr-bat-CoV",
    "SARS2-CoV vs SARSr-bat-RaTG13-CoV",
    "SARS2-CoV vs SARSr-bat-BM4831-CoV",
    "SARS2-CoV vs SARSr-Pangolin-CoV"), 
    values=c(
    "#0000FF", 
    "#800080", 
    "#00FFFF", 
    "#808000",
    "#00FF00",
    "#FF0000" )) +
  theme_gray() + 
  theme(axis.title = element_text()) 

My new (filtered) DF has this content (part of it):

Number, ID1, ID2, dN, dS, t, Label_ID1, Label_ID2, Group
163355,YP_009724390 , AVP78031 ,0.0429,0.9339,0.7181, SARS2-CoV , SARSr-bat-CoV , Inter
163356,YP_009724390 , AVP78042 ,0.0427,1.0791,0.8011, SARS2-CoV , SARSr-bat-CoV , Inter
163357,YP_009724390 , ATO98108 ,0.0577,1.5007,1.2546, SARS2-CoV , SARSr-bat-CoV , Inter
163358,YP_009724390 , ATO98120 ,0.0705,1.8456,1.5053, SARS2-CoV , SARSr-bat-CoV , Inter
163359,YP_009724390 , ATO98132 ,0.06,1.7172,1.3582, SARS2-CoV , SARSr-bat-CoV , Inter
163360,YP_009724390 , ATO98145 ,0.0777,1.3141,1.1472, SARS2-CoV , SARSr-bat-CoV , Inter
163361,YP_009724390 , ATO98157 ,0.0509,1.7169,1.3887, SARS2-CoV , SARSr-bat-CoV , Inter
163362,YP_009724390 , ATO98169 ,0.0572,1.4994,1.2218, SARS2-CoV , SARSr-bat-CoV , Inter
163363,YP_009724390 , ATO98181 ,0.0574,2.5492,2.005, SARS2-CoV , SARSr-bat-CoV , Inter
163364,YP_009724390 , ATO98193 ,0.0708,1.5942,1.3335, SARS2-CoV , SARSr-bat-CoV , Inter
163365,YP_009724390 , ATO98205 ,0.0504,1.5662,1.2405, SARS2-CoV , SARSr-bat-CoV , Inter
163366,YP_009724390 , ATO98218 ,0.0537,1.6191,1.2918, SARS2-CoV , SARSr-bat-CoV , Inter
163367,YP_009724390 , ATO98231 ,0.0604,1.6006,1.2974, SARS2-CoV , SARSr-bat-CoV , Inter
163368,YP_009724390 , ACU31032 ,0.0704,1.3902,1.1675, SARS2-CoV , SARSr-bat-CoV , Inter
163369,YP_009724390 , ABD75323 ,0.0767,1.4571,1.2149, SARS2-CoV , SARSr-bat-CoV , Inter
163370,YP_009724390 , ABD75332 ,0.0576,2.0601,1.6651, SARS2-CoV , SARSr-bat-CoV , Inter
163371,YP_009724390 , AAZ67052 ,0.0575,2.1861,1.753, SARS2-CoV , SARSr-bat-CoV , Inter
163372,YP_009724390 , QHR63300 ,0.006,0.1936,0.1445, SARS2-CoV , SARSr-bat-RaTG13-CoV , Inter
163373,YP_009724390 , YP_003858584 ,0.1156,2.9016,2.1998, SARS2-CoV , SARSr-bat-BM4831-CoV , Inter
163374,YP_009724390 , ADK66841 ,0.1156,2.9016,2.1998, SARS2-CoV , SARSr-bat-BM4831-CoV , Inter
163530,YP_009724390 , YP_009047204 ,0.5809,2.9256,3.3527, SARS2-CoV , MERS-CoV , Inter
163531,YP_009724390 , ANF29162 ,0.5809,2.9256,3.3527, SARS2-CoV , MERS-CoV , Inter
163532,YP_009724390 , ANF29173 ,0.5809,2.9256,3.3527, SARS2-CoV , MERS-CoV , Inter
163533,YP_009724390 , ANF29184 ,0.5809,2.9256,3.3527, SARS2-CoV , MERS-CoV , Inter
163534,YP_009724390 , ANF29195 ,0.5809,2.9256,3.3527, SARS2-CoV , MERS-CoV , Inter
163535,YP_009724390 , ANF29206 ,0.5809,2.9256,3.3527, SARS2-CoV , MERS-CoV , Inter

But this doesn't work for me:

>     "#0000FF", 
Error: unexpected ',' in "    "#0000FF","
>     "#800080", 
Error: unexpected ',' in "    "#800080","
>     "#00FFFF", 
Error: unexpected ',' in "    "#00FFFF","
>     "#808000",
Error: unexpected ',' in "    "#808000","
>     "#00FF00",
Error: unexpected ',' in "    "#00FF00","
>     "#FF0000" )) +
Error: unexpected ')' in "    "#FF0000" )"

I hope this helps you reproduce the example.

Please use the reprex package, and keep in mind that I do not have access to your file system to browse to your file.
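For instance, a few of the rows you posted can be embedded directly in the script, so nobody needs your CSV file. This is only a sketch: the names `csv_text` and `wuhan_df` are my own choice, and I copied just four of your rows. `strip.white = TRUE` removes the spaces around the values (for example " SARSr-bat-CoV "), which would otherwise become part of the labels.

```r
# Hypothetical object names; rows copied from the sample posted above
# (spaces in the header line removed).
csv_text <- "Number,ID1,ID2,dN,dS,t,Label_ID1,Label_ID2,Group
163355,YP_009724390 , AVP78031 ,0.0429,0.9339,0.7181, SARS2-CoV , SARSr-bat-CoV , Inter
163372,YP_009724390 , QHR63300 ,0.006,0.1936,0.1445, SARS2-CoV , SARSr-bat-RaTG13-CoV , Inter
163373,YP_009724390 , YP_003858584 ,0.1156,2.9016,2.1998, SARS2-CoV , SARSr-bat-BM4831-CoV , Inter
163530,YP_009724390 , YP_009047204 ,0.5809,2.9256,3.3527, SARS2-CoV , MERS-CoV , Inter"

# Read the embedded text instead of a file chosen with file.choose()
wuhan_df <- read.csv(text = csv_text, header = TRUE, strip.white = TRUE)
str(wuhan_df)
```

Another common way to share a sample is to paste the output of `dput(head(your_df, 20))`.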

library(ggplot2)
mpg$cyl <- as.factor(mpg$cyl)
ggplot(mpg, aes(x = displ, y = cty)) + 
  geom_point(aes(color = cyl), size = 3, alpha = 0.5) +
  scale_y_continuous(trans='log10') +
  scale_x_continuous(trans='log10') +
  scale_color_manual(labels=c(
    "4 aaa", 
    "5 blah", 
    "6",
    "8"), 
    values=c(
      "#0000FF", 
      "#800080", 
      "#00FFFF", 
      "#808000" )) +
  theme_gray() + 
  theme(axis.title = element_text()) 

Created on 2021-02-22 by the reprex package (v1.0.0)
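Adapting this to your data, one option (a sketch, not tested on your full file) is to build a single comparison column and give `scale_color_manual()` a named vector, so each colour is tied to a label by name rather than by position. With unnamed `values`, colours are assigned in the order of the factor levels (alphabetical by default), which is easy to get wrong. The column name `Comparison`, the object names, the colours, and the three-row toy data frame are placeholders I made up from the rows you posted.

```r
library(ggplot2)

# Toy data: three rows taken from the sample posted above.
df <- data.frame(
  dN = c(0.0429, 0.006, 0.5809),
  dS = c(0.9339, 0.1936, 2.9256),
  t  = c(0.7181, 0.1445, 3.3527),
  Label_ID1 = "SARS2-CoV",
  Label_ID2 = c("SARSr-bat-CoV", "SARSr-bat-RaTG13-CoV", "MERS-CoV")
)

# Colours keyed by Label_ID2; extend with the other labels you want to show.
label_cols <- c(
  "MERS-CoV"             = "#0000FF",
  "SARSr-bat-CoV"        = "#00FF00",
  "SARSr-bat-RaTG13-CoV" = "#FF0000"
)

# Keep only the comparisons of interest and build the legend text.
df_sub <- df[df$Label_ID2 %in% names(label_cols), ]
df_sub$Comparison <- paste(df_sub$Label_ID1, "vs", df_sub$Label_ID2)

# Same colours, renamed so the names match the Comparison column.
comparison_cols <- setNames(label_cols, paste("SARS2-CoV vs", names(label_cols)))

p <- ggplot(df_sub, aes(x = dN, y = dS)) +
  geom_point(aes(color = Comparison), size = 3, alpha = 0.5) +
  scale_x_continuous(trans = "log10") +
  scale_y_continuous(trans = "log10") +
  scale_color_manual(values = comparison_cols) +
  labs(x = "dN rate", y = "dS rate", color = "Comparison")
print(p)
```

For dN vs t or dS vs t, only the `aes(x = ..., y = ...)` mapping and the axis labels need to change.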

Thank you for your help!
