I'm using ggplot2 to make a scatter plot.
I have a data frame with 8 columns and approximately 255,000 rows. It looks like this:
    ID1       ID2        dN       dS       t        Label_ID1  Label_ID2  Group
    ARB07438  YP_173238   0.0202   2.0534   0.4971  HKU1-CoV   HKU1-CoV   Intra
    ARB07438  AZS52618   -0.0000   0.1115   0.0299  HKU1-CoV   HKU1-CoV   Intra
    ARB07438  AYN64561   -0.0000  -0.0000  -0.0000  HKU1-CoV   HKU1-CoV   Intra
    ARB07599  NP_073551   0.5332   2.5718   2.4730  HKU1-CoV   229E-CoV   Inter
    ARB07599  QJY77946    0.5234   2.5786   2.4587  HKU1-CoV   229E-CoV   Inter
    ...
I made scatter plots of dN vs dS, dN vs t, and dS vs t. The points are colored by the Group column, which has only two values (Intra and Inter):
That works fine. However, I want to make a scatter plot that is colored by the columns Label_ID1 and Label_ID2 instead. These columns have 24 different labels (HKU1-CoV, Bovine-CoV, 229E-CoV, SARSr-bat-CoV, SARSr-bat-RaTG13-CoV, SARSr-bat-BM4831-CoV, Camel-229E-CoV, SARSr-Civet-CoV, PEDV-CoV, HKU2-CoV, HKU3-CoV, HKU4-CoV, HKU5-CoV, HKU23-CoV, MERS-CoV, Murine-CoV, MERSr-bat-Neoromicia-CoV, NL63-CoV, OC43-CoV, SARSr-Pangolin-CoV, PEHV-CoV, SARS1-CoV, SARS2-CoV and PEDV-CoV).
I'm looking for something like this:
In the data frame, the columns Label_ID1 and Label_ID2 contain different combinations of labels, for example 229E-CoV with Bovine-CoV.
Is it possible to make this "versus" representation with different colors, so that the distribution of each label can be seen in the scatter plot?
I tried several approaches, but none of them worked, so I'm only pasting the code for the scatter plot colored by the two groups (Intra and Inter, the first scatter plot at the top).
    library(dplyr)    # provides the %>% pipe
    library(ggplot2)

    df_S %>%
      ggplot(aes(x = dN, y = t)) +
      geom_point(aes(color = Group)) +
      scale_y_continuous(trans = "log10") +
      scale_x_continuous(trans = "log10") +
      labs(title = "Pairwise Comparison S Protein",
           subtitle = "Inter versus Intragroup",
           x = "dN rate",
           y = "dS rate",
           color = "Group") +
      theme_gray() +
      theme(axis.title = element_text())
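One direction that might work is to combine Label_ID1 and Label_ID2 into a single pair column and map that column to color. The following is only a rough sketch under several assumptions: `make_pair_label` and the `Pair` column are hypothetical names introduced for illustration, the small data frame is a stand-in built from rows of the sample above, and sorting the two labels assumes that "A vs B" and "B vs A" should count as the same combination.

```r
library(ggplot2)

# Hypothetical helper: build one order-independent label per row,
# so that (HKU1-CoV, 229E-CoV) and (229E-CoV, HKU1-CoV) get the same value.
make_pair_label <- function(a, b) {
  a <- as.character(a)
  b <- as.character(b)
  first  <- ifelse(a <= b, a, b)
  second <- ifelse(a <= b, b, a)
  paste(first, second, sep = " vs ")
}

# Small stand-in for df_S, reusing rows from the sample data.
df_S <- data.frame(
  dN        = c(0.0202, 0.5332, 0.5234),
  t         = c(0.4971, 2.4730, 2.4587),
  Label_ID1 = c("HKU1-CoV", "HKU1-CoV", "HKU1-CoV"),
  Label_ID2 = c("HKU1-CoV", "229E-CoV", "229E-CoV"),
  Group     = c("Intra", "Inter", "Inter")
)

# Assumed new column holding the combined label for each row.
df_S$Pair <- make_pair_label(df_S$Label_ID1, df_S$Label_ID2)

# Same plot as before, but colored by the label pair instead of Group.
p <- ggplot(df_S, aes(x = dN, y = t, color = Pair)) +
  geom_point() +
  scale_x_continuous(trans = "log10") +
  scale_y_continuous(trans = "log10") +
  labs(x = "dN rate", y = "t", color = "Label pair")
p
```

With 24 labels the number of pairs could become too large for a readable color legend, so splitting the plot into panels with something like `facet_wrap(~ Label_ID1)` and coloring by Label_ID2 might be easier to read.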
Any help or ideas are welcome.