Combining values within a variable

Hi all, I am still pretty new to R and am looking for some help with combining values within a variable. I am using the ELS 2002 dataset that is publicly available on the NCES website. The variable I would like to adjust is the BYRACE variable that has values as follows:

1 = Amer. Indian/Alaska Native;
2 = Asian/Pac. Islander;
3 = Black or African American;
4 = Hispanic, Race Specified;
5 = Hispanic, Race Not Specified;
6 = More than one Race;
7 = White

I would like to combine the two values for Hispanics (4 and 5) to be a single value. I am unsure about how to do this. Any help would be greatly appreciated!!

Hi :wave: and welcome to RStudio Community!

It would be helpful if you could post a sample of the data you're working with. Personally, I'm not familiar with ELS 2002 or NCES. If you can show small examples of your data it would be easier to help.

Assuming that you have loaded the ELS data set into R as a data frame named df, you can do something like this:

library(dplyr)

## the original distribution of BYRACE
count(df, BYRACE)
#>  BYRACE     n
#>1     -8   305
#>2     -4   648
#>3      1   130
#>4      2  1460
#>5      3  2020
#>6      4   996
#>7      5  1221
#>8      6   735
#>9      7  8682

## create a new variable with regrouped values
df <- df %>% 
  mutate(BYRACE_re = if_else(BYRACE %in% c(4, 5), 4, BYRACE))

## the distribution of the new variable
count(df, BYRACE_re)
#>  BYRACE_re     n
#>1        -8   305
#>2        -4   648
#>3         1   130
#>4         2  1460
#>5         3  2020
#>6         4  2217
#>7         6   735
#>8         7  8682
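
If it helps, here is a small self-contained sketch of the same recode that you can run without the ELS file. The `toy` data frame and its values are made up for illustration. The factor step is optional, and it assumes the negative codes (-8 and -4) are missing-data codes that you want treated as NA, which is not confirmed anywhere in this thread, so check the ELS codebook first.

```r
library(dplyr)

# Toy stand-in for the ELS data; these values are invented for illustration
toy <- tibble(BYRACE = c(-8, -4, 1, 2, 3, 4, 5, 6, 7, 5, 4))

toy <- toy %>%
  mutate(
    # collapse codes 4 and 5 into a single code 4, leave everything else as-is
    BYRACE_re = if_else(BYRACE %in% c(4, 5), 4, BYRACE),
    # optional: a labelled factor; codes not listed in `levels`
    # (here -8 and -4, ASSUMED to be missing-data codes) become NA
    BYRACE_f = factor(
      BYRACE_re,
      levels = c(1, 2, 3, 4, 6, 7),
      labels = c("Amer. Indian/Alaska Native", "Asian/Pac. Islander",
                 "Black or African American", "Hispanic",
                 "More than one Race", "White")
    )
  )

# cross-check old codes against new ones: 4 and 5 should both map to 4
count(toy, BYRACE, BYRACE_re)
count(toy, BYRACE_f)
```

The cross-tabulation of BYRACE against BYRACE_re is a quick way to confirm that only codes 4 and 5 changed. If your BYRACE column is stored as something other than plain numbers (for example a labelled column from an SPSS or Stata import), you may need to convert it with as.numeric() first.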
