# Gender unemployment gap?

Hi there,

I'm trying to calculate & plot the U.S. gender unemployment gap (male unemployment rate - female unemployment rate) by state & roughly decades.

I've tried something like this but to no avail:

``````  Diff <- update_0908 %>%
filter(year %in% (1976:2017) & sex %in% (1:2)) %>%
group_by(sex) %>%
summarise(diff(unemp_rate))
``````

My DF:

Actually, that is not a data frame; it is a picture of one. Could you please turn this into a minimal REPRoducible EXample (reprex)? A reprex makes it much easier for others to understand your issue and figure out how to help.

If you've never heard of a reprex before, you might want to start by reading this FAQ:

As @andrescs said, a reproducible example would be helpful. Also, please let us know if you want to calculate the gender unemployment gap for each individual state and each individual year, or if you want to average over states and/or groups of years (e.g., decades).

For now, I would also mention that in your code you should not group by `sex` when calculating the difference. You want both sexes to be included within a group, so that you can get the difference in unemployment rate within the group. The time to group by `sex` is when calculating any means that need to be calculated by `sex`; you then remove the grouping by `sex` when calculating the difference in means.

Here's an example using the built-in `mtcars` data frame, where we can assume that `cyl` is equivalent to, say, `state` in your data, and `am` is equivalent to `sex` in your data.

``````library(tidyverse)

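# Mean mpg for each combination of cyl and am
mean_by_cyl_am = mtcars %>%
  group_by(cyl, am) %>%
  summarise(mpg = mean(mpg))

# Within each cyl, subtract the am = 0 mean from the am = 1 mean
mean_by_cyl_am %>%
  group_by(cyl) %>%
  summarise(diff_mpg_by_am = diff(mpg))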
``````
``````    cyl diff_mpg_by_am
<dbl>          <dbl>
1     4          5.18
2     6          1.44
3     8          0.350
``````

Thanks for the help. The plan is to average over states roughly by decades.

``````tibble::tribble(
~state, ~sex, ~year, ~population, ~employed, ~unemployed, ~unemp_rate,
"Alabama",    1,  1976,        1232,       853,          56,         6.1,
"Alaska",    1,  1976,         113,        84,           8,         8.5,
"Arizona",    1,  1976,         755,       513,          53,         9.3,
"Arkansas",    1,  1976,         688,       480,          30,         5.9,
"California",    1,  1976,        7369,      5212,         499,         8.7,
"Colorado",    1,  1976,         862,       671,          35,           5,
"Connecticut",    1,  1976,        1086,       780,          76,         8.8,
"Delaware",    1,  1976,         198,       140,          13,         8.3,
"District of Columbia",    1,  1976,         231,       151,          18,          11,
"Florida",    1,  1976,        2911,      1820,         183,         9.1,
"Georgia",    1,  1976,        1605,      1186,          85,         6.7,
"Hawaii",    1,  1976,         275,       195,          25,        11.4,
"Idaho",    1,  1976,         276,       210,          13,         5.6,
"Illinois",    1,  1976,        3805,      2839,         194,         6.3,
"Indiana",    1,  1976,        1815,      1379,          80,         5.6,
"Iowa",    1,  1976,         976,       759,          29,         3.6,
"Kansas",    1,  1976,         811,       634,          24,         3.7,
"Kentucky",    1,  1976,        1139,       839,          41,         4.7,
"Louisiana",    1,  1976,        1227,       859,          54,         5.9,
"Maine",    1,  1976,         358,       257,          23,         8.2,
"Maryland",    1,  1976,        1395,      1054,          66,         5.9,
"Massachusetts",    1,  1976,        2021,      1472,         143,         8.8,
"Michigan",    1,  1976,        3104,      2221,         200,         8.2,
"Minnesota",    1,  1976,        1394,      1078,          57,         4.9,
"Mississippi",    1,  1976,         740,       510,          27,           5,
"Missouri",    1,  1976,        1632,      1162,          67,         5.4,
"Montana",    1,  1976,         264,       199,          11,         5.2,
"Nebraska",    1,  1976,         533,       417,          13,         3.1,
"Nevada",    1,  1976,         214,       168,          14,         8.1,
"New Hampshire",    1,  1976,         281,       213,          14,         6.2,
"New Jersey",    1,  1976,        2506,      1769,         191,         9.7,
"New Mexico",    1,  1976,         372,       253,          24,         8.6,
"New York",    1,  1976,        6168,      4174,         458,         9.9,
"North Carolina",    1,  1976,        1810,      1379,          67,         4.7,
"North Dakota",    1,  1976,         216,       168,           5,         2.8,
"Ohio",    1,  1976,        3637,      2662,         208,         7.2,
"Oklahoma",    1,  1976,         926,       665,          36,         5.1,
"Oregon",    1,  1976,         831,       581,          59,         9.3,
"Pennsylvania",    1,  1976,        4132,      2865,         246,         7.9,
"Rhode Island",    1,  1976,         318,       227,          22,           9,
"South Carolina",    1,  1976,         915,       690,          37,           5,
"South Dakota",    1,  1976,         228,       177,           6,           3,
"Tennessee",    1,  1976,        1415,      1014,          57,         5.3,
"Texas",    1,  1976,        4154,      3171,         152,         4.6,
"Utah",    1,  1976,         389,       299,          17,         5.2,
"Vermont",    1,  1976,         158,       118,          10,         8.1,
"Virginia",    1,  1976,        1660,      1272,          73,         5.4,
"Washington",    1,  1976,        1222,       866,          68,         7.3,
"West Virginia",    1,  1976,         634,       416,          30,         6.8,
"Wisconsin",    1,  1976,        1607,      1236,          70,         5.3,
"Wyoming",    1,  1976,         136,       108,           4,         3.8,
"Alabama",    2,  1976,        1361,       527,          45,         7.9,
"Alaska",    2,  1976,         117,        61,           5,         7.3,
"Arizona",    2,  1976,         828,       341,          40,        10.5,
"Arkansas",    2,  1976,         822,       327,          32,         8.8,
"California",    2,  1976,        8156,      3607,         389,         9.8,
"Colorado",    2,  1976,         956,       474,          37,         7.1,
"Connecticut",    2,  1976,        1203,       556,          64,        10.3,
"Delaware",    2,  1976,         216,        96,          10,         9.8,
"District of Columbia",    2,  1976,         282,       152,          12,         7.1,
"Florida",    2,  1976,        3365,      1342,         132,         8.9,
"Georgia",    2,  1976,        1848,       841,          94,        10.1,
"Hawaii",    2,  1976,         303,       166,          14,         7.9,
"Idaho",    2,  1976,         300,       135,           8,         5.9,
"Illinois",    2,  1976,        4271,      1905,         141,         6.8,
"Indiana",    2,  1976,        1968,       901,          65,         6.8,
"Iowa",    2,  1976,        1099,       523,          24,         4.5,
"Kansas",    2,  1976,         858,       407,          21,           5,
"Kentucky",    2,  1976,        1291,       528,          40,           7,
"Louisiana",    2,  1976,        1412,       526,          46,         8.1,
"Maine",    2,  1976,         402,       173,          19,         9.9,
"Maryland",    2,  1976,        1568,       718,          61,           8,
"Massachusetts",    2,  1976,        2245,      1028,         119,        10.5,
"Michigan",    2,  1976,        3360,      1401,         175,        11.1,
"Minnesota",    2,  1976,        1441,       677,          54,         7.4,
"Mississippi",    2,  1976,         861,       371,          35,         8.6,
"Missouri",    2,  1976,        1846,       831,          68,         7.4,
"Montana",    2,  1976,         268,       111,           9,         7.6,
"Nebraska",    2,  1976,         576,       278,          10,         3.7,
"Nevada",    2,  1976,         215,       110,          13,        10.4,
"New Hampshire",    2,  1976,         307,       149,          11,         6.7,
"New Jersey",    2,  1976,        2857,      1193,         153,        11.5,
"New Mexico",    2,  1976,         419,       170,          19,          10,
"New York",    2,  1976,        7139,      2767,         338,        10.8,
"North Carolina",    2,  1976,        2044,      1019,          91,         8.1,
"North Dakota",    2,  1976,         230,       101,           5,         4.9,
"Ohio",    2,  1976,        4061,      1698,         162,         8.7,
"Oklahoma",    2,  1976,        1058,       428,          30,         6.4,
"Oregon",    2,  1976,         878,       387,          42,         9.8,
"Pennsylvania",    2,  1976,        4660,      1849,         159,           8,
"Rhode Island",    2,  1976,         362,       168,          12,         6.9,
"South Carolina",    2,  1976,        1025,       478,          51,         9.6,
"South Dakota",    2,  1976,         254,       123,           6,           4,
"Tennessee",    2,  1976,        1632,       703,          53,           7,
"Texas",    2,  1976,        4534,      2046,         167,         7.5,
"Utah",    2,  1976,         423,       184,          13,         6.4,
"Vermont",    2,  1976,         183,        80,           8,         9.5,
"Virginia",    2,  1976,        1862,       898,          65,         6.7,
"Washington",    2,  1976,        1357,       584,          69,        10.6,
"West Virginia",    2,  1976,         698,       213,          20,         8.8,
"Wisconsin",    2,  1976,        1700,       817,          52,           6,
"Wyoming",    2,  1976,         138,        64,           3,         4.6
)
#> # A tibble: 102 x 7
#>    state                sex  year population employed unemployed unemp_rate
#>    <chr>              <dbl> <dbl>      <dbl>    <dbl>      <dbl>      <dbl>
#>  1 Alabama                1  1976       1232      853         56        6.1
#>  2 Alaska                 1  1976        113       84          8        8.5
#>  3 Arizona                1  1976        755      513         53        9.3
#>  4 Arkansas               1  1976        688      480         30        5.9
#>  5 California             1  1976       7369     5212        499        8.7
#>  6 Colorado               1  1976        862      671         35        5
#>  7 Connecticut            1  1976       1086      780         76        8.8
#>  8 Delaware               1  1976        198      140         13        8.3
#>  9 District of Colum…     1  1976        231      151         18       11
#> 10 Florida                1  1976       2911     1820        183        9.1
#> # … with 92 more rows
``````

Created on 2019-09-08 by the reprex package (v0.2.1)

Here are two methods. I called your tibble DF and defined the decade simply by rounding the year.

``````Stats <- DF %>%
  mutate(Decade = round(year - 1900, digits = -1),
         sex = ifelse(sex == 1, "Female", "Male")) %>%
  group_by(Decade, state, sex) %>%
  summarize(Avg = mean(unemp_rate)) %>%
  spread(key = sex, value = Avg) %>%
  mutate(Diff = Male - Female)

#############

Stats2 <- DF %>%
  mutate(Decade = round(year - 1900, digits = -1),
         sex = ifelse(sex == 1, "Female", "Male")) %>%
  group_by(Decade, state, sex) %>%
  summarize(Avg = mean(unemp_rate)) %>%
  group_by(Decade, state) %>%
  summarize(Diff = diff(Avg))
``````
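
One thing to keep in mind with the rounding approach: `round(year, digits = -1)` assigns each year to the nearest multiple of ten, so 1976 ends up in the 1980 bin and not in the 1970s. If calendar decades are what you want, integer division is one possible alternative. A minimal sketch in base R, where the `years` vector is only illustrative and not taken from the posted data:

```r
# Illustrative years only; not from the original data set
years <- c(1976, 1979, 1981, 1984, 1986, 1989)

# Nearest multiple of ten: 1976 through 1984 all map to 1980
rounded <- round(years, digits = -1)

# Calendar decade: integer-divide by ten, then scale back up
floored <- years %/% 10 * 10

data.frame(year = years, rounded = rounded, floored = floored)
```

Which of the two is appropriate depends on what "roughly decades" should mean for the analysis; swapping `round(year, digits = -1)` for `year %/% 10 * 10` inside `mutate()` would be the only change needed.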

So if I wanted to do say, 1976-1989, I could write:

``````Stats2 <- DF %>% mutate(Decade = round(1976 - 1989, digits = -1),
sex = ifelse(sex == 1, "Female", "Male")) %>%
group_by(Decade, state, sex) %>%
summarize(Avg = mean(unemp_rate)) %>%
group_by(Decade, state) %>% summarize(Diff = diff(Avg))

``````

If you want to process only the years 1976 - 1989 out of a larger data set, you have to filter for those years.

``````Stats2 <- DF %>%
  filter(year >= 1976, year <= 1989) %>%
  mutate(Decade = round(year, digits = -1),
         sex = ifelse(sex == 1, "Female", "Male")) %>%
  group_by(Decade, state, sex) %>%
  summarize(Avg = mean(unemp_rate)) %>%
  group_by(Decade, state) %>%
  summarize(Diff = diff(Avg))
``````

The `year - 1900` in my original code was only intended to make `Decade` a two-digit number. Since that is confusing, I dropped it in the code above.

OK, got it.
Also, since `sex = 1` is male and `sex = 2` is female, should the call read `ifelse(sex == 1, "Male", "Female")`?

Yes, I should have mentioned that I just guessed the sex coding based on alphabetical order.
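
With the coding confirmed as 1 = male and 2 = female, the first method might look like the sketch below. It uses four rows copied from the posted data (Alabama and Alaska, 1976) under the hypothetical name `DF_small`, and adds an `ungroup()` at the end purely for tidiness:

```r
library(dplyr)
library(tidyr)

# Four rows copied from the posted data (sex: 1 = male, 2 = female)
DF_small <- tibble::tribble(
  ~state,    ~sex, ~year, ~unemp_rate,
  "Alabama",    1,  1976,         6.1,
  "Alaska",     1,  1976,         8.5,
  "Alabama",    2,  1976,         7.9,
  "Alaska",     2,  1976,         7.3
)

gap <- DF_small %>%
  mutate(Decade = round(year, digits = -1),
         sex = ifelse(sex == 1, "Male", "Female")) %>%  # corrected labels
  group_by(Decade, state, sex) %>%
  summarize(Avg = mean(unemp_rate)) %>%
  spread(key = sex, value = Avg) %>%
  mutate(Diff = Male - Female) %>%  # male rate minus female rate
  ungroup()

gap
```

Here `Diff` is negative where the female unemployment rate is higher than the male rate, which matches the definition of the gap in the original question.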

I do have 1 more question.
I'm not entirely understanding this line: `sex = ifelse(sex == 1, "Female", "Male")) %>%`

Why not just filter by sex?

You can skip the renaming; in the second method there is really no reason to change the `sex` column. In the first method, the values of the `sex` column become column names via `spread()`, and column names should not be purely numeric.
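
If the renaming is skipped in the second method, the sign of the result is worth checking. `diff()` subtracts the first value from the second, and the summarized rows are ordered by `sex`, so with 1 = male and 2 = female the raw result is the female rate minus the male rate. A minimal sketch, again on four rows copied from the posted data under the hypothetical name `DF_small`, that negates the result to get male minus female:

```r
library(dplyr)

# Four rows copied from the posted data (sex: 1 = male, 2 = female)
DF_small <- tibble::tribble(
  ~state,    ~sex, ~year, ~unemp_rate,
  "Alabama",    1,  1976,         6.1,
  "Alaska",     1,  1976,         8.5,
  "Alabama",    2,  1976,         7.9,
  "Alaska",     2,  1976,         7.3
)

gap2 <- DF_small %>%
  mutate(Decade = round(year, digits = -1)) %>%
  group_by(Decade, state, sex) %>%
  summarize(Avg = mean(unemp_rate)) %>%  # rows come out ordered by sex: 1, then 2
  group_by(Decade, state) %>%
  summarize(Diff = -diff(Avg)) %>%       # diff() is female - male, so negate
  ungroup()

gap2
```

This relies on every state and decade having exactly one row per sex after the first `summarize()`; if one sex is missing for some group, `diff()` returns a zero-length result and that group would need separate handling.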