Creation of conditional Random IDs for students

Hi,
I have a data frame with student names, and I want to create a unique ID for each student under one condition: the ID should consist of 3 capital letters (A, B, C, etc.) followed by 3 digits. So each ID will have 6 characters, for example FGY675, as shown in the data frame "data_id" below.
Is it possible to create this?


library(tidyverse)
data<-tibble::tribble(
  ~student_name, ~teacher,
         "Ravi",   "Jaya",
       "Gandhi",   "Jaya",
        "Nehru",   "Jaya",
         "Giri",   "Jaya",
        "Nivin",   "Jaya",
       "Nithin",  "Shiva",
       "Mithun",  "Shiva",
      "Vinayak",  "Shiva",
      "Girirsh",  "Shiva",
       "Akshay",  "Shiva"
  )

data_id<-tibble::tribble(
  ~student_name, ~teacher, ~random_id,
         "Ravi",   "Jaya",   "BGH142",
       "Gandhi",   "Jaya",   "GHT675",
        "Nehru",   "Jaya",   "JUS098",
         "Giri",   "Jaya",   "NJS767",
        "Nivin",   "Jaya",   "POI999",
       "Nithin",  "Shiva",   "MIU567",
       "Mithun",  "Shiva",   "NUY144",
      "Vinayak",  "Shiva",   "MRE235",
      "Girirsh",  "Shiva",   "MMS258",
       "Akshay",  "Shiva",   "MAE257"
  )
Created on 2022-08-10 by the reprex package (v2.0.1)

Here is one way of doing it: build all combinations of three letters and three digits, paste each combination into a single string, and then randomly sample from that list without replacement.

set.seed(123)

library(tidyverse, warn.conflicts = FALSE)

grid <- crossing(
  a = LETTERS,
  b = LETTERS,
  c = LETTERS,
  d = 0:9,
  e = 0:9,
  f = 0:9
) %>%
  mutate(id = str_c(a, b, c, d, e, f, sep = ""))

data <- tibble::tribble(
  ~student_name, ~teacher,
  "Ravi",   "Jaya",
  "Gandhi",   "Jaya",
  "Nehru",   "Jaya",
  "Giri",   "Jaya",
  "Nivin",   "Jaya",
  "Nithin",  "Shiva",
  "Mithun",  "Shiva",
  "Vinayak",  "Shiva",
  "Girirsh",  "Shiva",
  "Akshay",  "Shiva"
)

# draw one ID per row; replace = FALSE means no ID is picked twice
mutate(data, id = sample(
  grid$id,
  size = nrow(data),
  replace = FALSE
))
#> # A tibble: 10 × 3
#>    student_name teacher id    
#>    <chr>        <chr>   <chr> 
#>  1 Ravi         Jaya    RIX277
#>  2 Gandhi       Jaya    SVC969
#>  3 Nehru        Jaya    XPQ957
#>  4 Giri         Jaya    ISP674
#>  5 Nivin        Jaya    YSW436
#>  6 Nithin       Shiva   APM857
#>  7 Mithun       Shiva   NHC029
#>  8 Vinayak      Shiva   ULI691
#>  9 Girirsh      Shiva   HNH584
#> 10 Akshay       Shiva   ECK694

Created on 2022-08-10 by the reprex package (v2.0.1)
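
One caveat on the approach above: the full grid has 26^3 × 10^3 = 17,576,000 rows, which is a lot to build just to pick ten IDs. Below is a minimal sketch of an alternative, assuming base R only. It draws distinct integers with sample.int() and decodes each one into three letters and three digits. make_ids() is a hypothetical helper name of my own, not a function from any package.

```r
# make_ids() is a hypothetical helper, not part of any package.
# It returns n unique IDs of the form AAA000 without building the full grid.
make_ids <- function(n) {
  total <- 26^3 * 10^3                  # number of possible IDs (17,576,000)
  idx <- sample.int(total, n) - 1       # n distinct integers in 0..(total - 1)
  digits <- idx %% 1000                 # numeric part, 0..999
  lets <- idx %/% 1000                  # 0..17575, encodes the three letters
  paste0(
    LETTERS[lets %/% 676 + 1],          # first letter  (676 = 26^2)
    LETTERS[(lets %/% 26) %% 26 + 1],   # second letter
    LETTERS[lets %% 26 + 1],            # third letter
    sprintf("%03d", as.integer(digits)) # zero-padded three digits
  )
}

set.seed(123)
make_ids(10)
```

Because every integer maps to exactly one ID and sample.int() draws without replacement by default, the IDs cannot repeat. With the data frame above it could be used as mutate(data, id = make_ids(nrow(data))).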

Wow! This is too good. But may I know three things:

  1. Could any of the IDs be duplicates?
  2. What exactly does the function crossing do?
  3. And what does the sample function used inside mutate do?

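To answer your questions out of order: sample() draws elements at random from a vector. So if you give it the numbers one to ten, it returns "size" of them in random order (all ten, shuffled, if you leave size out).

You can set "replace" to TRUE or FALSE. If replace is FALSE, each element of the vector can be picked at most once. Every value in grid$id appears exactly once, so the sampled IDs will never contain a duplicate.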

See examples below:

# with replacement (get duplicates)
sample(1:10, size = 10, replace = TRUE)
#>  [1]  1  8  2  2  4  6  6 10  1  2

# without replacement (no duplicates)
sample(1:10, size = 10, replace = FALSE)
#>  [1]  2  3 10  1  9  8  5  4  6  7
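
One subtlety worth seeing in code: replace = FALSE only stops the same element from being picked twice. If the input vector itself contains repeated values, the result can still contain duplicates. Here is a small sketch in base R using made-up vectors (the values are purely illustrative), with anyDuplicated() as a quick check:

```r
# Input with a repeated value: taking all three elements must include both "a"s
x <- c("a", "a", "b")
s <- sample(x, size = 3, replace = FALSE)
anyDuplicated(s) > 0
#> [1] TRUE

# Input with unique values (illustrative IDs): no duplicates are possible
ids <- sample(c("ABC123", "DEF456", "GHI789"), size = 2, replace = FALSE)
anyDuplicated(ids)
#> [1] 0
```

Running anyDuplicated() on the id column of your result is an easy way to confirm that every student got a distinct ID.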

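crossing() returns a data frame (a tibble) with one row for every combination of the values in the vectors you give it. It also removes duplicate combinations and sorts the result, which is why "cat" comes before "dog" in the output. Consider this example: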

tidyr::crossing(
  animal = c("dog", "cat"),
  colour = c("brown", "black", "golden"),
  attitude = c("friendly", "aloof", "angry", "independent")
)
#> # A tibble: 24 × 3
#>    animal colour attitude   
#>    <chr>  <chr>  <chr>      
#>  1 cat    black  aloof      
#>  2 cat    black  angry      
#>  3 cat    black  friendly   
#>  4 cat    black  independent
#>  5 cat    brown  aloof      
#>  6 cat    brown  angry      
#>  7 cat    brown  friendly   
#>  8 cat    brown  independent
#>  9 cat    golden aloof      
#> 10 cat    golden angry      
#> # … with 14 more rows

So what my code does is find all combinations of three letter columns (each A to Z) and three digit columns (each 0 to 9), paste the six columns together into one vector of IDs, and then sample from that vector randomly without replacement.
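
The same pipeline can be followed by eye on a scaled-down version. This is only a sketch, assuming the tidyr, dplyr and stringr packages are installed. It uses two letters and two digits instead of the full alphabet, and "small" is just an illustrative name:

```r
library(tidyr)
library(dplyr, warn.conflicts = FALSE)
library(stringr)

# Every combination of one letter and one digit: 2 x 2 = 4 rows
small <- crossing(a = c("A", "B"), d = 0:1) %>%
  mutate(id = str_c(a, d))

small$id
#> [1] "A0" "A1" "B0" "B1"

# Pick three of the four IDs at random; none can appear twice
set.seed(123)
picked <- sample(small$id, size = 3, replace = FALSE)
anyDuplicated(picked)
#> [1] 0
```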

Thanks for the explanation. It was a great help.
