Business context:
Each patient has admission and discharged date from the hospital. However, the day from admission and discharge data could be stored into multiple rows. (e.g. patient John). Also, each patient has a unique Index Date.
I want to get all the records of patients whose discharged and Index date are same. Then, I want to see all the previous or later records for those patients if that records has continuity.
Continuity means find the first time he got admission and the last he got discharged, considering the days overlap with Index date only. Take example John, the index and discharge date is same on Aug 2011. Then I can see it admission date is June 11. I will see previous record to see if June 11 matches the discharged date, if it does, include that row. I will see next record from the index date to see if the discharged date Aug 2011 matches, the admission date, if it does then include that row.
Below is my sample data
library(tidyverse)
library(lubridate)
sample_date <- tibble::tribble(
~patient, ~admission, ~discharged, ~Index_date,
"John", "01/04/2011", "01/06/2011", "01/08/2011",
"John", "01/06/2011", "01/08/2011", "01/08/2011",
"John", "01/08/2011", "01/10/2011", "01/08/2011",
"kevin", "01/10/2012", "01/12/2012", "01/12/2012",
"Sam", "01/08/2012", "01/09/2012", "01/01/2013",
"Sam", "01/10/2012", "01/01/2013", "01/01/2013",
"Sam", "01/01/2013", "01/03/2013", "01/01/2013",
"Matt", "01/05/2012", "01/06/2012", "01/06/2012",
"Matt", "01/08/2012", "01/09/2012", "01/06/2012",
"Matt", "01/09/2012", "01/11/2012", "01/06/2012",
"Sunny", "01/01/2012", "01/05/2012", "01/08/2012",
"Sunny", "01/05/2012", "01/08/2012", "01/08/2012",
"Sunny", "01/05/2012", "01/08/2012", "01/08/2012",
"Sunny", "01/08/2012", "01/09/2012", "01/08/2012"
) %>%
mutate(admission = dmy(admission), discharged = dmy(discharged), Index_date = dmy(Index_date))
Below is my expected result
expected_result <- tibble::tribble(
~patient, ~admission, ~discharged, ~Index_date,
"John", "01/04/2011", "01/06/2011", "01/08/2011",
"John", "01/06/2011", "01/08/2011", "01/08/2011",
"John", "01/08/2011", "01/10/2011", "01/08/2011",
"kevin", "01/10/2012", "01/12/2012", "01/12/2012",
"Sam", "01/10/2012", "01/01/2013", "01/01/2013",
"Sam", "01/01/2013", "01/03/2013", "01/01/2013",
"Matt", "01/05/2012", "01/06/2012", "01/06/2012",
"Sunny", "01/01/2012", "01/05/2012", "01/08/2012",
"Sunny", "01/05/2012", "01/08/2012", "01/08/2012",
"Sunny", "01/05/2012", "01/08/2012", "01/08/2012",
"Sunny", "01/08/2012", "01/09/2012", "01/08/2012"
)%>%
mutate(admission = dmy(admission), discharged = dmy(discharged), Index_date = dmy(Index_date))
I have 1 million records like this. I'd be grateful to you for any help on solving this problem.