Having trouble creating a GGPlot with multiple columns

I am working on a project where I am trying to use ggplot to plot ages versus degrees and income based on age. I am looking to make this a scatter plot. I have tried a number of ways to melt, combine, etc. this data into a form that ggplot can plot, but I get errors no matter what I do. Here is a sample of my data frame:
age_group hs_grad some_college assoc_degree bach_degree bach_plus
1 ..18 to 24 years 32787 32096 34840 52133 54383
2 ..25 to 34 years 40778 45312 47222 68012 74160
3 ....25 to 29 years 37822 42108 43307 62829 65875
4 ....30 to 34 years 44173 48648 51043 73380 81492
5 ..35 to 44 years 49650 59910 60773 90132 103691
6 ....35 to 39 years 49522 56266 58729 89104 101225

I have tried the following as an example:
data15<-data.frame(x = data14$age_group, y= c(data14$hs_grad, data14$some_college, data14$assoc_degree, data14$bach_degree, data14$bach_plus),
group = c(rep("hs_grad", nrow(data)),
rep("some_college", nrow(data)),
rep("assoc_degree", nrow(data)),
rep("bach_degree", nrow(data)),
rep("bach_plus", nrow(data))))

However, I keep getting errors like the following:
Error in rep("hs_grad", nrow(data)) : invalid 'times' argument
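The error message is consistent with `nrow(data)` being evaluated on something other than the data frame. If no object named `data` has been defined in the session (an assumption, since the full script is not shown), R falls back to the built-in function `data()`, for which `nrow()` returns `NULL`. A minimal sketch of that behaviour, using a small made-up stand-in for `data14`:

```r
# With no user-defined object called `data`, the name resolves to the
# built-in function utils::data(), not to a data frame.
is.function(data)

# A function has no dimensions, so nrow() returns NULL.
n_bad <- nrow(data)
is.null(n_bad)

# rep() cannot repeat a value NULL times; capture the error message it raises.
msg <- tryCatch(rep("hs_grad", n_bad), error = function(e) conditionMessage(e))
msg

# Hypothetical two-row stand-in for data14, only for illustration.
data14 <- data.frame(age_group = c("18 to 24 years", "25 to 34 years"),
                     hs_grad = c(32787, 40778))

# Counting rows of the actual data frame gives a valid 'times' argument.
ok <- rep("hs_grad", nrow(data14))
ok
```

If that is what happened here, replacing `nrow(data)` with `nrow(data14)` in each `rep()` call should let the `data.frame()` call run.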

I am not sure what values you want on the x and y axes. In any case, I would reshape the data like this:

library(tidyr)
Df <- read.csv("~/R/Play/Dummy.csv", sep = " ")
Df
#>        age_group hs_grad some_college assoc_degree bach_degree bach_plus
#> 1 18_to_24_years   32787        32096        34840       52133     54383
#> 2 25_to_34_years   40778        45312        47222       68012     74160
#> 3 25_to_29_years   37822        42108        43307       62829     65875
#> 4 30_to_34_years   44173        48648        51043       73380     81492
#> 5 35_to_44_years   49650        59910        60773       90132    103691
#> 6 35_to_39_years   49522        56266        58729       89104    101225
DFlong <- pivot_longer(Df, hs_grad:bach_plus, names_to = "Educ")
DFlong
#> # A tibble: 30 x 3
#>    age_group      Educ         value
#>    <chr>          <chr>        <int>
#>  1 18_to_24_years hs_grad      32787
#>  2 18_to_24_years some_college 32096
#>  3 18_to_24_years assoc_degree 34840
#>  4 18_to_24_years bach_degree  52133
#>  5 18_to_24_years bach_plus    54383
#>  6 25_to_34_years hs_grad      40778
#>  7 25_to_34_years some_college 45312
#>  8 25_to_34_years assoc_degree 47222
#>  9 25_to_34_years bach_degree  68012
#> 10 25_to_34_years bach_plus    74160
#> # ... with 20 more rows

Created on 2020-11-22 by the reprex package (v0.3.0)
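Once the data is in long form, a scatter plot could look roughly like the sketch below. The choice of axes is an assumption (age group on x, the reshaped `value` column on y, one colour per education level), as is the "Income" axis label. The data frame is typed in directly here so the example does not depend on the CSV file.

```r
library(tidyr)
library(ggplot2)

# Same six rows as the sample above, entered by hand instead of read from disk.
Df <- data.frame(
  age_group = c("18_to_24_years", "25_to_34_years", "25_to_29_years",
                "30_to_34_years", "35_to_44_years", "35_to_39_years"),
  hs_grad      = c(32787, 40778, 37822, 44173, 49650, 49522),
  some_college = c(32096, 45312, 42108, 48648, 59910, 56266),
  assoc_degree = c(34840, 47222, 43307, 51043, 60773, 58729),
  bach_degree  = c(52133, 68012, 62829, 73380, 90132, 89104),
  bach_plus    = c(54383, 74160, 65875, 81492, 103691, 101225)
)

# One row per age group and education level.
DFlong <- pivot_longer(Df, hs_grad:bach_plus, names_to = "Educ")

# Assumed mapping: age group on x, value on y, colour by education level.
p <- ggplot(DFlong, aes(x = age_group, y = value, colour = Educ)) +
  geom_point() +
  labs(x = "Age group", y = "Income", colour = "Education") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
p
```

Swapping `age_group` and `Educ` in `aes()` would put education on the x axis and colour the points by age group instead.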

That worked great. Thank you!
