Determining difference between ranked observations

Given the following tibble, I'd like to mutate a new column indicating the additional count necessary to tie the next-highest in rank.

v <- tribble(
  ~rank, ~name, ~count,
  1, "Mary", 100,
  2, "Fred", 96,
  3, "Sue", 90, 
  3, "Michelle", 90,
  4, "Tom", 72
)

I've tried dplyr's lag function ( v %>% mutate(toTie = lag(count) - count)). This works, but not when there are ties, as any tied observation after the first is compared to an observation with the same value. For example, after the mutate I have this:

   rank name     count toTie
  <dbl> <chr>    <dbl> <dbl>
1     1 Mary       100    NA
2     2 Fred        96     4
3     3 Sue         90     6
4     3 Michelle    90     0
5     4 Tom         72    18

This output correctly says Sue, who is ranked third, needs 6 to tie Fred, who is second. But because it compares Michelle to Sue (and not Fred), it says Michelle needs none to tie Sue. This is true, but not the intent. Michelle, like Sue, needs 6 to tie second-place Fred.

Any thoughts on a better approach would be most appreciated.

-- Robert

Here is one approach:

library(dplyr, warn.conflicts = FALSE)
library(tibble)
v <- tribble(
  ~rank, ~name, ~count,
  1, "Mary", 100,
  2, "Fred", 96,
  3, "Sue", 90, 
  3, "Michelle", 90,
  4, "Tom", 72
)
RANKS <- v %>% select(RankToJoin = rank, CountToJoin = count) %>%
  unique() %>% 
  mutate(RankToJoin = RankToJoin + 1)
v2 <- left_join(v, RANKS, by = c("rank" = "RankToJoin")) %>% 
  mutate(toTie = CountToJoin - count)
v2
#> # A tibble: 5 x 5
#>    rank name     count CountToJoin toTie
#>   <dbl> <chr>    <dbl>       <dbl> <dbl>
#> 1     1 Mary       100          NA    NA
#> 2     2 Fred        96         100     4
#> 3     3 Sue         90          96     6
#> 4     3 Michelle    90          96     6
#> 5     4 Tom         72          90    18

Created on 2020-11-22 by the reprex package (v0.3.0)

1 Like

Thank you for the reply.

With some help, I found another approach using mutate and fill from tidyr:

v %>%
     mutate(toTie = lag(count) - count, 
            toTie = na_if(toTie, 0)) %>% 
     fill(toTie)

-- Robert

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.