How can we add a rank to each group of an arranged data frame?
That is, how can we add another column called rank that
gives the rank of each row within its group?
e.g.
Species class value rank
1 setosa Petal.Width 0.1 1
2 setosa Petal.Width 0.1 2
3 setosa Petal.Width 0.1 3
However, this rank should not run across all 600 rows.
It should start and end within each group.
library(dplyr)  # provides group_by(), arrange() and mutate()
library(tidyr)  # provides pivot_longer()
my_df <- iris %>% pivot_longer(names_to = "class", cols = c(1:4))
op <- my_df %>% group_by(Species, class) %>%
  arrange(value)
> my_df %>% group_by(Species, class) %>%
+   arrange(value)
# A tibble: 600 x 3
# Groups: Species, class [12]
Species class value
<fct> <chr> <dbl>
1 setosa Petal.Width 0.1
2 setosa Petal.Width 0.1
3 setosa Petal.Width 0.1
4 setosa Petal.Width 0.1
5 setosa Petal.Width 0.1
6 setosa Petal.Width 0.2
7 setosa Petal.Width 0.2
8 setosa Petal.Width 0.2
9 setosa Petal.Width 0.2
10 setosa Petal.Width 0.2
# … with 590 more rows
valeri
October 10, 2019, 8:55am
2
Have you tried this?
op <- my_df %>% group_by(Species, class) %>%
  mutate(rank = rank(value)) %>%
  arrange(value)
rank() has quite a few options for how to deal with ties.
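To illustrate the tie-handling options mentioned above, here is a minimal sketch on a small made-up vector (the vector `x` and the result names are illustrative only and are not taken from the thread's data). It uses base R's `rank()` with a few of its `ties.method` values.

```r
# Illustrative vector with ties (not from the iris data)
x <- c(0.1, 0.1, 0.2, 0.2, 0.2, 0.3)

r_average <- rank(x)                         # default: tied values share their mean rank
r_first   <- rank(x, ties.method = "first")  # ties broken by order of appearance
r_min     <- rank(x, ties.method = "min")    # tied values all get the lowest rank
r_max     <- rank(x, ties.method = "max")    # tied values all get the highest rank

print(r_average)  # 1.5 1.5 4.0 4.0 4.0 6.0
print(r_first)    # 1 2 3 4 5 6
print(r_min)      # 1 1 3 3 3 6
print(r_max)      # 2 2 5 5 5 6
```

Only the default "average" method can produce fractional ranks such as 1.5; the other methods shown here return whole numbers.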
Thanks for the quick help.
I tried your approach, but the rank is integer, which is undesired.
> op$rank %>% unique()
[1] 3.0 20.0 38.0 45.0 49.0 50.0 1.0 4.0 2.0 9.0 3.5 13.0 8.0 22.0 18.0 32.0 31.0 40.5 2.5 41.0 47.0 46.5 5.0 11.0 49.5 19.0 24.5 30.5 35.0 46.0 11.5 15.0 6.5 9.5 15.5
[36] 20.5 5.5 38.5 27.5 10.5 44.0 35.5 40.0 18.5 24.0 31.5 4.5 36.0 6.0 48.0 39.0 7.0 42.5 45.5 10.0 14.0 21.5 33.0 7.5 42.0 47.5 32.5 17.5 19.5 43.0 28.5 48.5 23.0 25.5 8.5
[71] 12.5 16.5 26.5 43.5 34.5 37.0
valeri
October 10, 2019, 9:00am
4
What would you like it to be?
Basically, by rank I mean the row number.
valeri
October 10, 2019, 9:29am
6
So, you mean like this?
op <- my_df %>% group_by(Species, class) %>%
mutate(rank = rank(value, ties.method = "first")) %>%
arrange(value)
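Since the goal was described as a row number within each group, another way to sketch it is with dplyr's `row_number()` after sorting inside the groups. This is an alternative under the assumption that a plain 1..n counter per group is what is wanted; the name `ranked` is made up for this example, and it is not the approach posted in the thread.

```r
library(dplyr)
library(tidyr)

ranked <- iris %>%
  pivot_longer(cols = 1:4, names_to = "class") %>%
  group_by(Species, class) %>%
  arrange(value, .by_group = TRUE) %>%  # sort by value inside each group
  mutate(rank = row_number()) %>%       # counter restarts at 1 in every group
  ungroup()

head(ranked)
```

With 50 rows in each Species/class combination, the rank column runs from 1 to 50 in every group rather than from 1 to 600 overall.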
Can we retain this format but exclude integer values?
Meaning, repetition is OK, but no integers in the rank column.
valeri
October 10, 2019, 10:50am
8
Hi @AbhishekHP,
Again, I am not quite sure what you mean. By default, rank uses ties.method = 'average', which results in ranks like those you see in my first response (which are not integers, by the way). With ties.method = 'first' you get integer ranks from 1 up to the number of values in each group. So I am not sure which one you prefer. You can also check the documentation for rank for other methods.
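For comparison, here is a small sketch of how these methods behave inside groups, using a made-up tibble (`toy`, `compared`, and the column names are hypothetical). It also shows dplyr's `min_rank()` and `dense_rank()`, which were not mentioned in the thread but might fit if the aim is for tied values to share the same whole-number rank; that reading of the question is an assumption.

```r
library(dplyr)

# Made-up data: group "a" contains a tie, group "b" does not
toy <- tibble(
  group = c("a", "a", "a", "b", "b"),
  value = c(0.2, 0.1, 0.1, 0.5, 0.4)
)

compared <- toy %>%
  group_by(group) %>%
  mutate(
    rank_average = rank(value),                         # ties share the mean rank (can be fractional)
    rank_first   = rank(value, ties.method = "first"),  # unique ranks 1..n per group
    rank_min     = min_rank(value),                     # ties share the lowest rank, gaps after ties
    rank_dense   = dense_rank(value)                    # ties share a rank, no gaps
  ) %>%
  ungroup()

print(compared)
```

In group "a" the two 0.1 values get 1.5 and 1.5 under "average", 1 and 2 under "first", and 1 and 1 under both `min_rank()` and `dense_rank()`.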
Thanks for the response.
I was wondering if we can somehow get rid of the averaging only if they appear to be integer.
However, let me proceed with ties.method = "first".