I'm trying to change the color of my data in a scatter plot. I have 5 columns (only 3 with values for my plot) and about 264,000 rows. This is my dataframe:
You have a scatter plot, right? This means dS on the X-axis and dN on the Y-axis. You need both columns for one point.
How do you want to color the points differently?
Hi again!
My problem with this code is the following: I have 727 IDs, and the values come from a pairwise comparison of 727 * 727 IDs.
For this reason, there are ~264,000 rows. The dN column holds the values for one comparison, the second column holds the values for the others...
The dataframe is in TSV format. Maybe I could define the column names as dN, dS and t, and then use color = cols_names?
ggplot(DF,aes(dN,dS, color = cols_names)) + geom_point(size = 5) + theme_minimal()
Based on what I see in your data, this will probably generate many different colour levels, possibly as many as there are lines in your data frame (unless you bin the numbers into ranges).
You need to find a variable that gives you just a few classes (let's say 2-6).
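
As a rough sketch of the binning idea, something like the following could work. It assumes the columns are named dN, dS and t as suggested earlier in the thread, that t is numeric and non-negative, and it uses simulated values in place of the real table. The break points and the helper column name t_class are arbitrary choices for illustration, not something taken from your data.

```r
library(ggplot2)

# Simulated stand-in for the real table (assumption: numeric columns dN, dS, t).
set.seed(1)
DF <- data.frame(
  dN = runif(1000, 0, 0.5),
  dS = runif(1000, 0, 2),
  t  = runif(1000, 0, 3)
)

# Bin the continuous column t into four classes.
# The break points are arbitrary; pick ones that mean something for your data.
DF$t_class <- cut(
  DF$t,
  breaks = c(0, 0.5, 1, 2, Inf),
  labels = c("0-0.5", "0.5-1", "1-2", ">2"),
  include.lowest = TRUE
)

# dS on the X-axis, dN on the Y-axis, one colour per class.
p <- ggplot(DF, aes(x = dS, y = dN, color = t_class)) +
  geom_point(size = 1, alpha = 0.5) +
  theme_minimal()
print(p)
```

With ~264,000 rows, small and semi-transparent points (size = 1, alpha = 0.5) are usually easier to read than size = 5, but that is a matter of taste.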
You need to ask yourself what you want to show. What kind of information do you want to add with colour? What is missing so far in the data representation?
Or do you just want to make it a bit more "beautiful"?