One data structure that is easy to neglect in R is the hash, a lookup table. In the example, with only a handful of unique names (names is a good variable name to avoid, because it is also the name of an R primitive), it's not difficult to line up the names with some arbitrary numeric.
But when dealing with more than a handful, doing this semi-by-hand becomes infeasible.
To illustrate a more scalable approach:
suppressPackageStartupMessages(library(dplyr))
suppressPackageStartupMessages(library(hash))
subj_id <- c(9095, 3906, 1175, 1567, 2692, 6287)
name <- structure(list(name = c("able", "baker", "charlie", "dany", "elaine",
"fay")), row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"
))
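# The unique names serve as the hash keys
name.keys <- name %>% distinct(name) %>% pull(name)
# Pair each key with its subject ID (keys and values are matched by position)
h <- hash(keys = name.keys, values = subj_id)
# Look up every name in the hash; unname() drops the key names from the result
name %>% mutate(subject_id = unname(values(h, keys = name))) %>% select(-name)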
#> # A tibble: 6 x 1
#> subject_id
#> <dbl>
#> 1 9095
#> 2 3906
#> 3 1175
#> 4 1567
#> 5 2692
#> 6 6287
Created on 2020-02-24 by the reprex package (v0.3.0)
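
For comparison, the same lookup-table idea can be sketched in base R with a named vector instead of the hash package. This is only a minimal illustration reusing the data from the reprex; the names lookup, keys and wanted are hypothetical and were not part of the original example.

```r
# Same subject IDs and names as in the reprex above.
subj_id <- c(9095, 3906, 1175, 1567, 2692, 6287)
keys <- c("able", "baker", "charlie", "dany", "elaine", "fay")

# A named vector works as a small lookup table:
# the names are the keys, the elements are the values.
lookup <- setNames(subj_id, keys)

# Hypothetical query: names in arbitrary order, with a repeat.
wanted <- c("fay", "able", "fay", "dany")

# Index by name, then drop the names to keep a plain numeric vector.
subject_id <- unname(lookup[wanted])
subject_id
#> [1] 6287 9095 6287 1567
```

A named vector is probably enough when the keys are unique strings and the table is built once; a hash (or an environment) may be the better fit when entries are added or removed as you go.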