Hi! I want to build a time-series data frame. I have several observations, each with a start date and an end date, and I want to expand each one into one row per week. For instance:
I have:
id1 01/01/2000 31/12/2000
I want to get:
id1 2000-01
id1 2000-02
id1 2000-03
...
id1 2000-52
That looks like a great start, thanks a lot! Do you know how I would be able to apply that to all my observations? I have more than 10,000 observations, all of them with different start and end dates.
To give a good answer, it would help to see a reproducible example of your data.
I'd suggest looking at the lubridate package for getting the year_week format from date data.
# If your data frame is df and all 10,000 observations have dates in the format 2000-01-01,
# this code should work (as.Date() parses text dates and leaves Date columns unchanged):
df$year_week <- format(as.Date(df$date), "%Y-%U")
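If the dates are stored as text in another layout, such as day/month/year, they need to be parsed with an explicit format before `format()` can extract the week. A minimal sketch, assuming a hypothetical character column `date` with made-up values:

```r
# Hypothetical example data: dates stored as day/month/year strings
df <- data.frame(date = c("26/12/2022", "02/01/2023"))

# Parse the strings into Date objects, stating the input layout explicitly
df$date <- as.Date(df$date, format = "%d/%m/%Y")

# %U is the week of the year (00-53), with Sunday as the first day of the week
df$year_week <- format(df$date, "%Y-%U")
df$year_week
```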
My problem is that different observations have different values for seq.
A minimal reproducible example would be this:
grid <- c(1, 2, 3, 4)
name <- c('A', 'A', 'B', 'B')
start_date <- c('26/12/2022','26/12/2022','23/01/2023','23/01/2023')
end_date <- c('22/01/2023', '22/01/2023', '05/02/2023', '05/02/2023')
df <- data.frame(grid, name, start_date, end_date)
df
  grid name start_date   end_date
1    1    A 26/12/2022 22/01/2023
2    2    A 26/12/2022 22/01/2023
3    3    B 23/01/2023 05/02/2023
4    4    B 23/01/2023 05/02/2023
I want to transform this into:
   grid name year_week
1     1    A   2022-52
2     1    A   2023-01
3     1    A   2023-02
4     1    A   2023-03
5     2    A   2022-52
6     2    A   2023-01
7     2    A   2023-02
8     2    A   2023-03
9     3    B   2023-04
10    3    B   2023-05
11    4    B   2023-04
12    4    B   2023-05
Thanks, @M_AcostaCH, but this doesn't solve my problem.
I want to have one observation per grid per name per week. I think I should combine your first suggestion and expand the observations based on the sequence derived from the start and end dates. Perhaps looping it would be the solution? I am not sure... I have looked into many similar questions and still can't find the proper solution.
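One possible way to combine the two ideas, sketched in base R: build a weekly date sequence for each row, then repeat `grid` and `name` once per week in that row's sequence. This assumes weeks are counted in 7-day steps from each start date and labelled with `%U` (Sunday-based week numbers). The names `start`, `end`, `weeks`, `n`, and `result` are only illustrative.

```r
df <- data.frame(
  grid = c(1, 2, 3, 4),
  name = c("A", "A", "B", "B"),
  start_date = c("26/12/2022", "26/12/2022", "23/01/2023", "23/01/2023"),
  end_date = c("22/01/2023", "22/01/2023", "05/02/2023", "05/02/2023")
)

# Parse the day/month/year strings into Date objects
start <- as.Date(df$start_date, format = "%d/%m/%Y")
end <- as.Date(df$end_date, format = "%d/%m/%Y")

# One weekly sequence of dates per row; each row can have a different length
weeks <- lapply(seq_len(nrow(df)), function(i) {
  seq(start[i], end[i], by = "week")
})

# Number of weeks generated for each original row
n <- lengths(weeks)

# Repeat the identifying columns to match, then label every date by year-week
result <- data.frame(
  grid = rep(df$grid, times = n),
  name = rep(df$name, times = n),
  year_week = format(do.call(c, weeks), "%Y-%U")
)
result
```

Because the sequence advances in 7-day steps from the start date, a final partial week is only included if a step lands on or before the end date. If ISO 8601 week numbering is needed instead, `format(x, "%G-%V")` may be an alternative, though the labels around the turn of the year can then differ from the `%U` ones.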