# Pick second and third best values

Hi. I want to find the second and third high results in a dataframe. Explication:

In this data frame:

``````max_vot <- data.frame(
P1 = c(310L, 349L, 2831L, 99L, 161L, 1353L, 196L, 113L, 23L),
P2 = c(193L, 111L, 3126L, 56L, 155L, 1261L, 92L, 125L, 47L),
P3 = c(122L, 147L, 3566L, 54L, 70L, 1007L, 123L, 88L, 22L),
P4 = c(30L, 42L, 645L, 18L, 10L, 247L, 43L, 15L, 6L),
P5 = c(47L, 45L, 574L, 41L, 17L, 378L, 39L, 20L, 5L)
)
``````

using

``````max_vote <- colnames(max_vot)[max.col(max_vot, ties.method = "first")]
``````

I can get the column that have the more high value row:

``````[1] "P1" "P1" "P3" "P1" "P1" "P1" "P1" "P2" "P2"
``````

Now I want to get the second most high value column in each row, in this example will be: P2, P3, P2, P2, P2, P2, P3, P2, P1, P1

And then the third best value.

But thies.method only allow â€śrandomâ€ť, â€śfirstâ€ť, â€ślastâ€ť as arguments. Any idea on how to deal with this?

Hi @jynusmac ,

I'm using the tidyverse for your problem

``````max_vot <- data.frame(
P1 = c(310L, 349L, 2831L, 99L, 161L, 1353L, 196L, 113L, 23L),
P2 = c(193L, 111L, 3126L, 56L, 155L, 1261L, 92L, 125L, 47L),
P3 = c(122L, 147L, 3566L, 54L, 70L, 1007L, 123L, 88L, 22L),
P4 = c(30L, 42L, 645L, 18L, 10L, 247L, 43L, 15L, 6L),
P5 = c(47L, 45L, 574L, 41L, 17L, 378L, 39L, 20L, 5L)
)
``````

Create a table that shows 2nd and 3rd highest values, I don't know the background of your data but used the assumption that Pi are "judges" that give "values" for specific votes (vote_id). If the background of your data is different and I misunderstood it, feel free to rename the objects to make the code more explicit:

``````max_vot_table <-  max_vot %>%
#assigne vote_ID (row number) to each row, useful for when we'll pivot to a longer format below
mutate(vote_ID = 1:nrow(.)) %>%
#pivot to long format, we obtain 3 columns: row ID (or vote ID), judge ID (P1, ... P5), value of the vote
pivot_longer(-vote_ID, names_to = "judge", values_to = "value") %>%
#sort by vote_ID (the rows of your source dataframe) and value of the vote in descending order
arrange(vote_ID, desc(value)) %>%
#group by vote_ID to perform our summary on each of these groups
group_by(vote_ID) %>%
#select 2nd or 3rd position from each group
summarise(
second_pos = judge[2],
third_pos = judge[3]
)
``````

The results are in each column of `max_vot_table` . If you need to export them separately as a vector:

``````max_vot_table %>% pull(second_pos)
max_vot_table %>% pull(third_pos)
``````

Let me know if you'd rather have a function for this.

There are probably more code-savvy ways to solve this, so maybe worth waiting to see if someone suggests something a bit crispier

1 Like

I don't claim this is better than @xvalda's solution, it is just different. You can change the mutate() to summarize() to get only the column A.

``````library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#>     filter, lag
#> The following objects are masked from 'package:base':
#>
#>     intersect, setdiff, setequal, union

max_vot <- data.frame(
P1 = c(310L, 349L, 2831L, 99L, 161L, 1353L, 196L, 113L, 23L),
P2 = c(193L, 111L, 3126L, 56L, 155L, 1261L, 92L, 125L, 47L),
P3 = c(122L, 147L, 3566L, 54L, 70L, 1007L, 123L, 88L, 22L),
P4 = c(30L, 42L, 645L, 18L, 10L, 247L, 43L, 15L, 6L),
P5 = c(47L, 45L, 574L, 41L, 17L, 378L, 39L, 20L, 5L)
)
MyFunc <- function(R,Pos){
tmp <- sort(R,decreasing = TRUE)
colnames(max_vot)[which(R==tmp[Pos])]
}
max_vot |> rowwise() |>
mutate(A = MyFunc(c_across(P1:P5),Pos = 2))
#> # A tibble: 9 x 6
#> # Rowwise:
#>      P1    P2    P3    P4    P5 A
#>   <int> <int> <int> <int> <int> <chr>
#> 1   310   193   122    30    47 P2
#> 2   349   111   147    42    45 P3
#> 3  2831  3126  3566   645   574 P2
#> 4    99    56    54    18    41 P2
#> 5   161   155    70    10    17 P2
#> 6  1353  1261  1007   247   378 P2
#> 7   196    92   123    43    39 P3
#> 8   113   125    88    15    20 P1
#> 9    23    47    22     6     5 P1
``````

Created on 2021-11-29 by the reprex package (v2.0.1)

@FJCC It's definitely shorter code, and self-explanatory.

Sorry for delay in reply, the two solutions work perfectly. Thanks you both for helping me.

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.