Help with filling data in previous years and months

Hello,

I would like to fill in data for previous few months with the values of earliest year in current data. For this example, my data starts form 2018 and I would like to have same values of all months from 2018 to previous years - 2016 & 2017. Essentially 2016, 2017 & 2018 will all have same values.

Thanks for your help in advance!

library(tidyverse)
library(lubridate)

vol <- data.frame(
  Date = c("2018 Jan","2018 Jan","2018 Jan","2018 Jan",
           "2018 Feb","2018 Feb","2018 Feb","2018 Feb",
           "2018 Mar","2018 Mar","2018 Mar","2018 Mar",
           "2018 Apr","2018 Apr","2018 Apr","2018 Apr",
           "2018 May","2018 May","2018 May","2018 May",
           "2018 Jun","2018 Jun","2018 Jun","2018 Jun",
           "2018 Jul","2018 Jul","2018 Jul","2018 Jul",
           "2018 Aug","2018 Aug","2018 Aug","2018 Aug",
           "2018 Sep","2018 Sep","2018 Sep","2018 Sep",
           "2018 Oct","2018 Oct","2018 Oct","2018 Oct",
           "2018 Nov","2018 Nov","2018 Nov","2018 Nov",
           "2018 Dec","2018 Dec","2018 Dec","2018 Dec",
           "2019 Jan","2019 Jan","2019 Jan","2019 Jan",
           "2019 Feb","2019 Feb","2019 Feb","2019 Feb",
           "2019 Mar","2019 Mar","2019 Mar","2019 Mar",
           "2019 Apr","2019 Apr","2019 Apr","2019 Apr",
           "2019 May","2019 May","2019 May","2019 May",
           "2019 Jun","2019 Jun","2019 Jun","2019 Jun",
           "2019 Jul","2019 Jul","2019 Jul","2019 Jul",
           "2019 Aug","2019 Aug","2019 Aug","2019 Aug",
           "2019 Sep","2019 Sep","2019 Sep","2019 Sep",
           "2019 Oct","2019 Oct","2019 Oct","2019 Oct",
           "2019 Nov","2019 Nov","2019 Nov","2019 Nov",
           "2019 Dec","2019 Dec","2019 Dec","2019 Dec",
           "2020 Jan","2020 Jan","2020 Jan","2020 Jan",
           "2020 Feb","2020 Feb","2020 Feb","2020 Feb",
           "2020 Mar","2020 Mar","2020 Mar","2020 Mar",
           "2020 Apr","2020 Apr","2020 Apr","2020 Apr",
           "2020 May","2020 May","2020 May","2020 May",
           "2020 Jun","2020 Jun","2020 Jun","2020 Jun",
           "2020 Jul","2020 Jul","2020 Jul","2020 Jul",
           "2020 Aug","2020 Aug","2020 Aug","2020 Aug",
           "2020 Sep","2020 Sep","2020 Sep","2020 Sep",
           "2020 Oct","2020 Oct","2020 Oct","2020 Oct",
           "2020 Nov","2020 Nov","2020 Nov","2020 Nov",
           "2020 Dec", "2020 Dec","2020 Dec", "2020 Dec",
           "2021 Jan","2021 Jan","2021 Jan","2021 Jan",
           "2021 Feb","2021 Feb","2021 Feb","2021 Feb",
           "2021 Mar","2021 Mar","2021 Mar","2021 Mar",
           "2021 Apr","2021 Apr","2021 Apr","2021 Apr",
           "2021 May","2021 May","2021 May","2021 May",
           "2021 Jun","2021 Jun","2021 Jun","2021 Jun",
           "2021 Jul","2021 Jul","2021 Jul","2021 Jul",
           "2021 Aug","2021 Aug","2021 Aug","2021 Aug",
           "2021 Sep","2021 Sep","2021 Sep","2021 Sep",
           "2021 Oct","2021 Oct","2021 Oct","2021 Oct",
           "2021 Nov","2021 Nov","2021 Nov","2021 Nov",
           "2021 Dec","2021 Dec", "2021 Dec","2021 Dec"),
  
  Country = c("CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US",
              "CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US",
              "CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US",
              "CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US"),
  
  Type = c("A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B",
           "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B",
           "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B",
           "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B"
  ),
  
  Sales = c(100,110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210,100,110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210,100,110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210,100,110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210,
            220, 230,240, 250, 260, 270, 280, 290, 300, 310, 320, 330,220, 230,240, 250, 260, 270, 280, 290, 300, 310, 320, 330,20, 230,240, 250, 260, 270, 280, 290, 300, 310, 320, 330,220, 230,240, 250, 260, 270, 280, 290, 300, 310, 320, 330,
            340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450,340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450,340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450,340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450,
            460, 470, 480, 490, 500, 510, 520, 530, 540, 550, 560, 570, 460, 470, 480, 490, 500, 510, 520, 530, 540, 550, 560, 570, 60, 470, 480, 490, 500, 510, 520, 530, 540, 550, 560, 570, 460, 470, 480, 490, 500, 510, 520, 530, 540, 550, 560, 570)
)

vol$Date <- ym(vol$Date)

Well it's not elegant and not automated, but at least for your particular example you could always "manually" duplicate 2018 only changing the year:

template <- vol |>
  filter(year(Date) == 2018)

bind_rows(template |>
            mutate(Date = Date - dyears(2)),
          template |>
            mutate(Date = Date - dyears(1)),
          vol)

set up

copy_year_as <- function(year,as) {
  filter(vol,
         year==year(Date)) %>% mutate(
           Date = lubridate::`year<-`(Date,as))}

multi_copy <- function(base,to){
  eg <-expand.grid(x=base,y=to)
  map2_dfr(.x = eg$x,
           .y = eg$y,
          ~{copy_year_as(.x,.y)})}

use

multi_copy(2018,2016:2017)
1 Like

Thanks @nirgrahamuk!
It work great!

Where can I find resource where I can learn using functions the way you do. I have also not used expand.grid & map2_dfr before.

Thanks again!

Thanks @AlexisW!
Yes, automating really helps with large data.
Thanks again for your help!

I learned R from swirl package, also R4DS book (online) , Hadley's Advanced R book.
I work with R every day in my job, and I spend a lot of time on this forum practicing my skills on problems that forum users raise. map2_dfr is a useful map variant from the purrr package. I like the functional approach to iteration generally so use a lot of purrr in my code where something beyond base R's vectorisation is needed

1 Like

I have learnt a lot from Hadley's book, but will definitely have to go through swirl package and will also look into the R4DS book. I believe I can improve my knowledge in functions.

Thanks again @nirgrahamuk! It was very helpful!

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.