I'm attempting to create a random selection by group based on a table and having trouble even describing what I want to do, which makes finding help difficult. If anyone can sort this out, I would be very appreciative.
I have two grouping variables, action and region, and would like every row to get a random number from 1:33. However, each number from 1:33 must appear exactly 4 times within each region.
    library(tidyverse)

    df <- tibble(
      action = rep(c("A", "B", "C", "D", "E", "F"), each = 110),
      region = rep(rep(c("North", "South", "East", "West", "Central"), each = 22), 6)
    )
For each action + region combination (22 rows), I want random numbers drawn from 1:33 without repeats. But then, when the data is grouped by region and random number, I want every group to have exactly 4 rows.
    set.seed(33)

    df %>%
      group_by(action, region) %>%
      mutate(ran_num = sample(1:33, 22, replace = FALSE)) %>%
      group_by(ran_num) %>%
      count(region, ran_num) %>%
      group_by(n) %>%
      count(sort = TRUE)
That gets me pretty close, but the region + number groups range from size 1 to 4 instead of all being size 4. Is there any way to constrain the draw so that every number occurs equally often within each region? Or perhaps to build the tibble a different way and solve it from another angle?
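One possible angle, offered as a sketch rather than a definitive answer: each region has 6 x 22 = 132 rows, which is exactly 33 x 4, so the numbers can be laid out per region instead of sampled per action. The helper below (`balanced_draw` is a hypothetical name, not an existing function) shuffles 1:33 once, repeats that order 4 times, cuts the result into consecutive blocks of 22 (one per action), and then shuffles within each block. Any 22 consecutive entries of a sequence that repeats every 33 are distinct, so no action + region block repeats a number. This assumes the rows in each region are ordered by action, and it does not sample uniformly from all valid layouts.

```r
library(dplyr)

set.seed(33)

df <- tibble(
  action = rep(c("A", "B", "C", "D", "E", "F"), each = 110),
  region = rep(rep(c("North", "South", "East", "West", "Central"), each = 22), 6)
)

# Hypothetical helper: returns n_rows numbers in which every value of `pool`
# appears equally often and no value repeats inside a block of `block` rows.
balanced_draw <- function(n_rows, pool = 1:33, block = 22) {
  stopifnot(
    n_rows %% length(pool) == 0,  # each number must fit a whole number of times
    n_rows %% block == 0,         # rows must split into whole blocks
    block <= length(pool)         # otherwise a block would have to repeat a number
  )
  # One random ordering of the pool, repeated to fill the region
  cyc <- rep(sample(pool), times = n_rows / length(pool))
  # Cut into consecutive blocks (one per action) and shuffle inside each block
  blocks <- split(cyc, rep(seq_len(n_rows / block), each = block))
  unlist(lapply(blocks, sample), use.names = FALSE)
}

out <- df %>%
  arrange(region, action) %>%   # blocks of 22 rows line up with actions
  group_by(region) %>%
  mutate(ran_num = balanced_draw(n())) %>%
  ungroup()

# Every region + number combination occurs exactly 4 times
stopifnot(all(count(out, region, ran_num)$n == 4))
# No number repeats within an action + region block
stopifnot(all(count(out, action, region, ran_num)$n == 1))
```

Because the same shuffled order is reused across the 6 blocks of a region, neighbouring actions share predictable overlaps in which numbers they receive. If that structure matters, a different construction would be needed.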