How to find the first observation of a column that matches a condition

i have a data frame:

df = tibble(a=c(7,6,10,12,12), b=c(3,5,8,8,7), c=c(4,4,12,15,20), week=c(1,2,3,4,5))

# A tibble: 5 x 4
      a     b     c  week
  <dbl> <dbl> <dbl> <dbl>
1     7     3     4     1
2     6     5     4     2
3    10     8    12     3
4    12     8    15     4
5    12     7    20     5

and i want for every column a, b and c the week in which the observation is equal to or exceeds 10.
I.e. for column a it would be week 3, for column b it would be week NA, for column c it would be week 3 as well.

A desired ouotcome could look like this:

tibble(abc=c("a", NA, "b"), value=c(10, NA, 12), week=c(3, NA, 3))

# A tibble: 3 x 3
  abc   value  week
  <chr> <dbl> <dbl>
1 a        10     3
2 b       NA    NA
3 c        12     3

Your problem would be much easier to tackle if you "tidied" the data frame first.

Also, since you want to retain all groups in your output, I'd suggest making the variable containing a, b and c a factor in order to preserve levels that do not have values greater than or equal to 10.

library(dplyr, warn.conflicts = FALSE)
library(tidyr)

df <- tibble(a = c(7, 6, 10, 12, 12), 
             b = c(3, 5, 8, 8, 7), 
             c = c(4, 4, 12, 15, 20), 
             week = c(1, 2, 3, 4, 5))

pivoted_df <- df %>% 
  pivot_longer(cols = -week, names_to = "abc") %>% 
  mutate(abc = as.factor(abc)) %>% 
  filter(value >= 10) %>% 
  group_by(abc) %>% 
  summarise(value = first(value), .groups = "drop")

left_join(expand_grid(abc = levels(pivoted_df$abc)), pivoted_df, by = "abc")
#> # A tibble: 3 x 2
#>   abc   value
#>   <chr> <dbl>
#> 1 a        10
#> 2 b        NA
#> 3 c        12

Created on 2020-08-20 by the reprex package (v0.3.0)

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.