How to use case_when() to mutate a new column for a range of matches

Problem: I want to mutate a new column (top_100k) that takes the value 1 when rank falls in 1:100000 and 0 otherwise. However, I am not sure how to extend case_when() to do this, or whether it is even the right tool.

It would result in something like the below.

rank             top_100k
1                    1
4                    1
5                    1
100001               0

This is my attempt so far:

range <- 1:100000

df <- mutate(df, top_100k = case_when(rank >= range[1:100000] ~ 1))

Any help would be appreciated!

I think you can use

df <- mutate(df, top_100k = case_when(rank <= 100000 ~ 1,
                                      TRUE ~ 0))

Since there are only two cases, you could also conveniently use ifelse() or even

df <- mutate(df, top_100k = rank <= 100000)

to get a TRUE/FALSE result.
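
For reference, here is a minimal sketch comparing these options side by side. The data frame below is made up from the example table in the question, and the column names (top_cw, top_ifelse, top_lgl, top_int) are only illustrative. It assumes dplyr is installed.

```r
library(dplyr)

# Hypothetical data, based on the example table in the question
df <- data.frame(rank = c(1, 4, 5, 100001))

df <- df |>
  mutate(
    # case_when() with a catch-all branch for everything else
    top_cw = case_when(rank <= 100000 ~ 1, TRUE ~ 0),
    # base R ifelse(), fine when there are only two cases
    top_ifelse = ifelse(rank <= 100000, 1, 0),
    # a plain comparison gives TRUE/FALSE
    top_lgl = rank <= 100000,
    # as.integer() turns TRUE/FALSE into 1/0 if numbers are needed
    top_int = as.integer(rank <= 100000)
  )

df
```

If ranks below 1 can occur and should be excluded, between(rank, 1, 100000) or rank %in% 1:100000 would check the lower bound as well.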

Thank you very much! This is perfect. With the ", TRUE ~0" bit, would you know how to explain the TRUE argument? I would expect that this might be FALSE ~ 0 if we are allocating 0s to all values that do not match.

I would suggest using the new .default argument from dplyr 1.1.0 rather than TRUE ~, as it is easier to understand.

df |>
  mutate(
    top_100k = case_when(
      rank <= 100000 ~ 1,
      .default = 0
    )
  )

But @FJCC's other suggestion of just using rank <= 100000 sounds best for your use case.
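
On the question about TRUE ~ 0: as far as I understand it, case_when() checks the conditions in order and, for each row, uses the first one that evaluates to TRUE. A bare TRUE is a condition that is always true, so it catches every row not matched earlier. FALSE ~ 0 would never match, so those rows would be left as NA. A small sketch with made-up values (the ranks vector is an assumption for illustration, and .default needs dplyr 1.1.0 or later):

```r
library(dplyr)

ranks <- c(1, 4, 5, 100001)  # hypothetical values

# TRUE is always true, so it acts as the fallback branch
with_true <- case_when(ranks <= 100000 ~ 1, TRUE ~ 0)

# FALSE never matches, so unmatched rows stay NA
with_false <- case_when(ranks <= 100000 ~ 1, FALSE ~ 0)

# .default expresses the same fallback as TRUE ~ 0, more explicitly
with_default <- case_when(ranks <= 100000 ~ 1, .default = 0)
```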
