Replace numerical codes with strings from codebook

There are several ways to do what is, in effect, a join (you want to match up the misconduct code numbers with their equivalents in the codebook, and get the string value).

I'm just showing two below (one base, and one with dplyr), but I'll post a link to a StackOverflow thread with lots of options at the bottom.

To replace the original variable, you could assign over the misconduct variable (or whatever the equivalent is in your dataset), but I've turned it into a new variable for illustrative purposes in the reprex below.

incidents <- tibble::tribble(
   ~person, ~misconduct,
  "Howard",          0L,
   "Robin",          0L,
    "Fred",          2L,
    "Gary",          2L,
    "John",          3L,
      "JD",          4L,
   "Benjy",          6L
  )

codebook <- tibble::tribble(
              ~code_num, ~code_string,
                     0L,      "pride",
                     1L,       "envy",
                     2L,   "gluttony",
                     3L,       "lust",
                     4L,      "anger",
                     5L,      "greed",
                     6L,      "sloth"
              )

# using base R `match()` function:
# match() returns, for each misconduct code, the row position of that code in the codebook
incidents$code_string <- codebook$code_string[match(incidents$misconduct, codebook$code_num)]

incidents
#> # A tibble: 7 x 3
#>   person misconduct code_string
#>   <chr>       <int> <chr>      
#> 1 Howard          0 pride      
#> 2 Robin           0 pride      
#> 3 Fred            2 gluttony   
#> 4 Gary            2 gluttony   
#> 5 John            3 lust       
#> 6 JD              4 anger      
#> 7 Benjy           6 sloth

# remove the added variable so I can show the second method
incidents$code_string <- NULL

# with dplyr: left_join() keeps every row of incidents and adds the matching code_string
suppressPackageStartupMessages(library(dplyr))
incidents %>%
  left_join(codebook, by = c("misconduct" = "code_num"))
#> # A tibble: 7 x 3
#>   person misconduct code_string
#>   <chr>       <int> <chr>      
#> 1 Howard          0 pride      
#> 2 Robin           0 pride      
#> 3 Fred            2 gluttony   
#> 4 Gary            2 gluttony   
#> 5 John            3 lust       
#> 6 JD              4 anger      
#> 7 Benjy           6 sloth

Created on 2019-12-05 by the reprex package (v0.3.0.9001)
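
If you do want to overwrite the original variable rather than add a new one, a minimal sketch could look like the following. It uses small base R data frames that mirror the reprex above (cut down to four people), so the data here is illustrative only, and it assumes every code you care about appears in the codebook.

```r
# Toy data mirroring the reprex above (plain data frames, base R only)
incidents <- data.frame(
  person = c("Howard", "Fred", "John", "Benjy"),
  misconduct = c(0L, 2L, 3L, 6L),
  stringsAsFactors = FALSE
)

codebook <- data.frame(
  code_num = 0:6,
  code_string = c("pride", "envy", "gluttony", "lust", "anger", "greed", "sloth"),
  stringsAsFactors = FALSE
)

# Assign over the numeric codes with their labels.
# Any code that is missing from the codebook would become NA.
incidents$misconduct <- codebook$code_string[
  match(incidents$misconduct, codebook$code_num)
]

incidents
```

Note that this changes the column from integer to character, so it's worth doing on a copy if you might still need the numeric codes.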

As for reading in the data, if it's in an Excel spreadsheet, you might want to look at the readxl package.
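
Here's a rough sketch of what that might look like, assuming readxl is installed. Since I don't have your file, it reads a sample workbook that ships with readxl; the file name "incidents.xlsx" in the comment is just a hypothetical stand-in for your own spreadsheet.

```r
library(readxl)

# readxl_example() returns the path to a sample workbook bundled with readxl.
# For your own data, replace this with the path to your file,
# e.g. path <- "incidents.xlsx" (hypothetical file name)
path <- readxl_example("datasets.xlsx")

# See which sheets the workbook contains
excel_sheets(path)

# Read one sheet into a tibble; `sheet` can be a name or a position
dat <- read_excel(path, sheet = "mtcars")

head(dat)
```

Once the sheet is read in, it's an ordinary tibble, so either of the lookup approaches above should work on it as-is.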
