Sort and split a vector into n groups

Hi R experts,

I would like to sort a vector by its values (1,2,2,3,3,3,3,4,4,5) and split it into three groups. There are ties in the vector, so I need to handle them by averaging their ranks. Hence, the groups do not need to be of equal size. I guess that I could use rank(x, ties.method='average') and cut(x), but I am not sure how to put the functions together. Can anyone help me out? Thanks.

Veda

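You were on track. With ties.method = 'average', tied values share the same rank, so the ranks still contain duplicates. With ties.method = 'first', every element gets a unique rank (ties are broken by position in the vector), so cut() can split the ranks into groups. I threw in a few extra numbers at the beginning of your vector to make sure the sorting worked out.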

library(tidyverse)

df <- tibble( x = c(5,3,1,2,2,3,3,3,3,4,4,5))
n_groups = 3

#included just to see the difference
rank(df$x, ties.method = 'average')
rank(df$x, ties.method = 'first')

df %>%
  mutate(grouping = cut(rank(x, ties.method = 'first', na.last = 'keep'),
                        breaks = n_groups,
                        labels = FALSE))
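
One thing to be aware of: because 'first' breaks ties by position, equal values can end up in different groups (here the value 3 straddles a group boundary). If tied values should stay together, as the question seems to ask, a minimal base R sketch using 'average' ranks could look like the following. The names grp_avg and grp_first are only illustrative, and this is one possible approach rather than the only way to do it.

```r
# Same example vector as above
x <- c(5, 3, 1, 2, 2, 3, 3, 3, 3, 4, 4, 5)
n_groups <- 3

# 'average': tied values share one rank, so they always fall in the same bin
r_avg <- rank(x, ties.method = "average", na.last = "keep")
grp_avg <- cut(r_avg, breaks = n_groups, labels = FALSE)

# 'first': every element gets a unique rank, so ties can be split across bins
r_first <- rank(x, ties.method = "first", na.last = "keep")
grp_first <- cut(r_first, breaks = n_groups, labels = FALSE)

# Cross-tabulate values against groups to compare the two approaches
table(x, grp_avg)
table(x, grp_first)
```

With 'average' the group sizes are generally unequal, since cut() divides the range of the ranks into equal-width intervals and every tied value lands in the same interval.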
