I would like to sort a vector by its values (1,2,2,3,3,3,3,4,4,5) and split it into three groups. There are ties in the vector, so I need to handle ties by averaging their ranks. Hence, the groups do not need to be of equal size. I guess that I could use rank(x, ties.method='average') and cut(x), but I am not sure how to put the functions together. Can anyone help me out? Thanks.
You were on the right track. With ties.method = 'average', tied values still share the same rank, so the ranks contain duplicates. Using ties.method = 'first' instead gives every element a distinct rank, which cut() can then split into groups. I added a few numbers at the beginning of your vector to make sure this worked.
library(tidyverse)
df <- tibble(x = c(5, 3, 1, 2, 2, 3, 3, 3, 3, 4, 4, 5))
n_groups <- 3
# Included just to see the difference between the two tie methods
rank(df$x, ties.method = 'average')
rank(df$x, ties.method = 'first')
df %>%
  mutate(grouping = cut(rank(x, ties.method = 'first', na.last = 'keep'),
                        breaks = n_groups,  # number of equal-width rank intervals
                        labels = FALSE))    # return integer group codes
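If the goal is to keep tied values in the same group, as the question describes, a base-R sketch along these lines may be worth comparing. It uses the original vector from the question and assumes that equal-width cuts on the ranks are acceptable. The names grp_avg and grp_first are illustrative only.

```r
# Original vector from the question
x <- c(1, 2, 2, 3, 3, 3, 3, 4, 4, 5)
n_groups <- 3

# Average ranks: tied values share one rank, so they fall into the same bin
grp_avg <- cut(rank(x, ties.method = "average"),
               breaks = n_groups, labels = FALSE)

# First-come ranks: every element gets a distinct rank, so ties can be split
grp_first <- cut(rank(x, ties.method = "first"),
                 breaks = n_groups, labels = FALSE)

print(grp_avg)    # 1 1 1 2 2 2 2 3 3 3
print(grp_first)  # 1 1 1 1 2 2 2 3 3 3
```

On this vector, 'average' keeps all four 3s in group 2, while 'first' moves the first 3 into group 1. Which behavior is preferable depends on whether equal values must stay together.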