Large data set, line graph, multiple lines, ggplot: need help!

I'm new to R and programming in general, and I am struggling to create a line graph with multiple lines, each representing a large group of respondents. I work in development economics and am using a large data set with information on 18,000+ individuals in South Africa.

I want to plot income on the x axis and years of education achieved on the y axis. Education is a discrete variable that only takes values from 0 to 18 years. Income is continuous and ranges from 652 (ZAR per month) to 20,000+. I want to do this for 3 groups of 2,000 to 4,000 people each, and thus have 3 lines.

I have tried several graphs and haven't gotten anywhere near what I need. Here is the code I have so far, but it is fairly useless, as it creates a graph nothing like what I want (and yes, I know my data set name and variable names are huge, but that is for a reason). Do I need to recode the income data and put it in buckets? What am I doing wrong here? Below I have linked to a graph in a paper that I basically want to re-create, but with income instead of age.

```r
e <- ggplot(data = Cross_w5_e_h_i_g_a_RStudio_25Sept23, aes(x = w5_hhinc_perm_CSGPool_CSM, y = w5_eduyrs_CSGPool_CSM_T1A)) +
  geom_line() +
  xlim(652, 20000) +
  ylim(0, 18)
```

Very similar to the graph I want - except this one has age instead of income, and has only 2 lines:

See page 27, graph on top right of page (I tried to cut and paste this in, and it didn't work)
https://opensaldru.uct.ac.za/bitstream/handle/11090/689/2018_125_Saldruwp.pdf?sequence=3

Would someone provide some advice on how to solve this?!

Btw, I tried to make a reprex, but I kept getting error messages. Thanks.

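I don't think you want a line plot. geom_line() connects every observation in order of x, so with thousands of scattered points I expect the plot to be unreadable. Try using geom_smooth() instead, which draws a smoothed trend through the points.
To get three lines, you need a column that labels each row as belonging to one of the groups, mapped to an aesthetic such as colour. If you post the output of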

dput(head(Cross_w5_e_h_i_g_a_RStudio_25Sept23,20))

someone could give detailed advice about that.
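
In the meantime, here is a minimal sketch of the geom_smooth() approach on simulated data, since the real data set hasn't been posted. Everything in it is an assumption for illustration: the data frame `sim`, the columns `income`, `edu_years` and `group`, and the relationship between income and education are all made up. It also uses coord_cartesian() rather than xlim()/ylim(), because the latter drop rows outside the limits before the smoother is fitted, while coord_cartesian() only zooms the view.

```r
library(ggplot2)

set.seed(42)  # reproducible simulated data

# Simulated stand-in for the real data; all column names are hypothetical.
n <- 3000
sim <- data.frame(
  income = runif(n, min = 652, max = 20000),  # ZAR per month
  group  = factor(sample(c("Group A", "Group B", "Group C"), n, replace = TRUE))
)

# Made-up relationship: education rises with log income, plus a group
# offset and noise, then is rounded and clamped to the 0-18 range.
offset <- c("Group A" = 0, "Group B" = 1, "Group C" = 2)[as.character(sim$group)]
raw <- 2.5 * log(sim$income) - 14 + offset + rnorm(n, sd = 2.5)
sim$edu_years <- pmin(pmax(round(raw), 0), 18)

# Mapping `group` to colour gives one smoothed line per group.
p <- ggplot(sim, aes(x = income, y = edu_years, colour = group)) +
  geom_smooth(method = "loess", formula = y ~ x, se = TRUE) +
  coord_cartesian(xlim = c(652, 20000), ylim = c(0, 18)) +
  labs(x = "Income (ZAR per month)", y = "Years of education", colour = "Group")

print(p)
```

With your own data you would replace `sim` and the three column names with your data frame, your income and education variables, and whichever column identifies the three groups. If loess turns out to be slow with several thousand rows per group, method = "gam" with formula = y ~ s(x) is an alternative worth trying (it requires the mgcv package).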
