Each row in Main_table is a claim for a patient (Patient_ID) and shows whether that claim's Drug_Group is Metastatic or Non_Metastatic. My goal is, for each patient, to delete all Metastatic claims dated before that patient's first Non_Metastatic claim. However, if a patient's first claim is already Non_Metastatic, nothing should be deleted: the Metastatic claims that come after it must be kept.
library(tidyverse)
Main_table <- tibble::tribble(
~Patient_ID, ~Date, ~Drug_Group,
1L, "17/11/2016", "Metastatic",
1L, "18/11/2016", "Metastatic",
1L, "19/11/2016", "Metastatic",
1L, "20/11/2016", "Non_Metastatic",
1L, "21/11/2016", "Non_Metastatic",
2L, "19/01/2017", "Non_Metastatic",
2L, "20/01/2017", "Metastatic",
2L, "21/01/2017", "Non_Metastatic",
2L, "22/01/2017", "Metastatic",
2L, "23/01/2017", "Metastatic"
)
expected_output <- tibble::tribble(
~Patient_ID, ~Date, ~Drug_Group,
1L, "20/11/2016", "Non_Metastatic",
1L, "21/11/2016", "Non_Metastatic",
2L, "19/01/2017", "Non_Metastatic",
2L, "20/01/2017", "Metastatic",
2L, "21/01/2017", "Non_Metastatic",
2L, "22/01/2017", "Metastatic",
2L, "23/01/2017", "Metastatic"
)
I can do this for each patient separately. However, I have 100,000 patients, so I need a solution that handles all patients at once instead of one at a time.
Any help would be really appreciated!
Thank you
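For reference, here is a minimal sketch of how the rule described above might be written as one grouped dplyr operation rather than a per-patient loop. The helper name drop_early_metastatic and the temporary claim_date column are hypothetical (not part of the original data), and the sketch rests on a few assumptions: dates are dd/mm/yyyy strings, Drug_Group takes only the two values shown, a patient has at most one claim per date, and a patient with no Non_Metastatic claim at all loses every row (the post does not say what should happen in that case).

```r
library(dplyr)

# Hypothetical helper: for each patient, keep only the claims from the
# first Non_Metastatic claim onwards (earlier rows are all Metastatic).
drop_early_metastatic <- function(claims) {
  claims %>%
    # Dates are dd/mm/yyyy strings, so parse them before sorting.
    mutate(claim_date = as.Date(Date, format = "%d/%m/%Y")) %>%
    arrange(Patient_ID, claim_date) %>%
    group_by(Patient_ID) %>%
    # cumany() becomes TRUE at the first Non_Metastatic row and stays TRUE.
    filter(cumany(Drug_Group == "Non_Metastatic")) %>%
    ungroup() %>%
    # Drop the temporary column so the output has the original columns.
    select(-claim_date)
}

# Small demo using a few rows taken from Main_table.
demo_claims <- tibble::tribble(
  ~Patient_ID, ~Date, ~Drug_Group,
  1L, "17/11/2016", "Metastatic",
  1L, "20/11/2016", "Non_Metastatic",
  1L, "21/11/2016", "Non_Metastatic",
  2L, "19/01/2017", "Non_Metastatic",
  2L, "20/01/2017", "Metastatic"
)

demo_result <- drop_early_metastatic(demo_claims)
print(demo_result)
```

Calling drop_early_metastatic(Main_table) would apply the same filter to the full table. If patients with no Non_Metastatic claim should be kept instead, the filter condition would need an extra clause for that case.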