From your question, I'm not sure if year of diagnosis is relevant or not to counting the number of treatments. In case it does not matter, here is an approach that groups by TXT and FacilityID and then counts the number of observations in each group. If that count is greater than or equal to 5, over_5 will be TRUE, otherwise FALSE. Just a note, I created an n_procedures variable for extra clarity, but you could just use row_number() >= 5 in the if_else() if you want to skip that step.
library(tidyverse)
set.seed(500)
Data1 <- data.frame(
TXT = sample(1:4,100, replace = TRUE),
YEAR_OF_DIAGNOSIS = sample(2004:2015,100, replace = TRUE),
FacilityID = sample(letters[1:5], 100, replace = TRUE)
)
Data1 %>%
group_by(TXT, FacilityID) %>%
mutate(
n_procedures = n(),
over_5 = if_else(n_procedures >= 5, TRUE, FALSE)
)
#> # A tibble: 100 x 5
#> # Groups: TXT, FacilityID [20]
#> TXT YEAR_OF_DIAGNOSIS FacilityID n_procedures over_5
#> <int> <int> <fct> <int> <lgl>
#> 1 3 2004 d 4 FALSE
#> 2 3 2012 e 5 TRUE
#> 3 3 2010 e 5 TRUE
#> 4 1 2013 c 5 TRUE
#> 5 2 2012 b 5 TRUE
#> 6 1 2013 c 5 TRUE
#> 7 3 2014 a 5 TRUE
#> 8 4 2014 e 6 TRUE
#> 9 3 2005 c 3 FALSE
#> 10 2 2010 b 5 TRUE
#> # … with 90 more rows
Created on 2019-06-17 by the reprex package (v0.3.0)