Select period around a specific date

Hello,

I have a dataframe like the one below and, for each id, I would like to select the dates 5, 10, or 15 days before and after the peak date. How can I do that?

[Screenshot 2022-03-23 224029]

I have tried to use a for loop, but it only works for one id.

for (i in unique(dataframe$id)){
sub <- dataframe$id[dataframe$id %in% i, ]
sub <- subset(dataframe, date >= peak_date - 2 & date <= peak_date + 2)
}

I have also tried the split function, but could not work out how to process every individual in the resulting list (in reality I have about 60 individuals).

splitData <- split(dataframe, dataframe$id)

Thank you!

By selecting, do you mean to filter existing rows from the dataframe, or create new dates based on the peak date column?

If it is the first case, depending on your dataframe, this might work:


library(dplyr)
your_df %>% 
  group_by(id) %>% 
  filter(lag(date, 5) == peak_date |
         lag(date, 10) == peak_date |
         lag(date, 15) == peak_date |
         lead(date, 5) == peak_date |
         lead(date, 10) == peak_date |
         lead(date, 15) == peak_date) %>% 
  ungroup()

Thank you for your answer.
I would actually like to select all the rows that fall between a given number of days before and after the peak date. For example, with a lag and a lead of 2 days, I would like these dates to be selected:
[Screenshot 2022-03-24 091056]

I think that the filter call above will only select the date 2 days before and the date 2 days after, but not the dates between them, right?

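library(dplyr)

# Number of days to keep on each side of the peak date
window <- 15

your_df %>% 
  group_by(id) %>% 
  # Keep rows whose date is within `window` days of that row's peak_date
  # (assumes date and peak_date are both of class Date)
  filter(date >= peak_date - window,
         date <= peak_date + window) %>% 
  ungroup()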
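
To keep every row that falls within a given number of days of the peak date, rather than only the rows exactly that many days away, one option is to compare the two date columns directly. The sketch below does this in base R on made-up data. The data frame `toy`, the helper `select_window()`, and its `window` argument are hypothetical names used only for illustration, and it assumes that `date` and `peak_date` are both of class `Date` and that `peak_date` is repeated on every row of an id.

```r
# Toy data (invented): two ids, one row per day, peak date repeated per id
toy <- data.frame(
  id = rep(c("A", "B"), each = 7),
  date = c(seq(as.Date("2022-03-01"), by = "day", length.out = 7),
           seq(as.Date("2022-04-10"), by = "day", length.out = 7)),
  peak_date = rep(as.Date(c("2022-03-04", "2022-04-12")), each = 7)
)

# Hypothetical helper: keep rows whose date is at most `window` days
# before or after the peak_date on the same row
select_window <- function(df, window) {
  keep <- abs(as.numeric(df$date - df$peak_date)) <= window
  df[keep, , drop = FALSE]
}

# All rows from 2 days before to 2 days after each id's peak date
selected <- select_window(toy, window = 2)
```

Because the comparison is done row by row against each row's own `peak_date`, no loop over ids or call to `split` should be needed. Changing `window` to 5, 10, or 15 gives the wider periods.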
