Aggregating Time Series tsibbles on non-calendar years

Hi,

I am exploring the use of the tsibble packages, and I have a question.

My data is currently uses an index of yearquarter. I want to aggregate it up to year, but I want to shift the year, such that a given year goes from Q3 of one year through Q2 of the following year. Is there a way to do that?

To better explain:

library(tsibble)
library(dplyr, warn.conflicts = FALSE)
library(lubridate, warn.conflicts = FALSE)

If I want to take a data set, and aggregate it on the year-quarter level, I can do the following.

q_data <- tsibbledata::gafa_stock %>% 
  index_by(yq0 = yearquarter(Date)) %>%
  summarize(close = last(Close))

head(q_data, 10)
#> # A tsibble: 10 x 2 [1Q]
#>        yq0 close
#>      <qtr> <dbl>
#>  1 2014 Q1  554.
#>  2 2014 Q2  572.
#>  3 2014 Q3  574.
#>  4 2014 Q4  524.
#>  5 2015 Q1  545.
#>  6 2015 Q2  521.
#>  7 2015 Q3  608.
#>  8 2015 Q4  759.
#>  9 2016 Q1  745.
#> 10 2016 Q2  692.

If I want to aggregate it on the calendar year level, I can extend the data frame above as follows:

q_data %>%
  index_by(y0 = year(yq0)) %>%
  summarize(close = last(close))
#> # A tsibble: 5 x 2 [1Y]
#>      y0 close
#>   <dbl> <dbl>
#> 1  2014  524.
#> 2  2015  759.
#> 3  2016  772.
#> 4  2017 1046.
#> 5  2018 1036.

But how do I do so if I want non-calendar years?

Created on 2019-09-12 by the reprex package (v0.3.0)

Thanks!

1 Like

Lubridate functions are mostly built for calendar year. For this case, you need a custom function that does different binning. An Aussie version of financial_year() is defined for this example:

library(tsibble)
library(dplyr, warn.conflicts = FALSE)
library(lubridate, warn.conflicts = FALSE)

q_data <- tsibbledata::gafa_stock %>% 
  index_by(yq0 = yearquarter(Date)) %>%
  summarize(close = last(Close))

financial_year <- function(date) {
  year <- year(date)
  ifelse(quarter(date) <= 2, year, year + 1)
}

q_data %>% 
  index_by(y0 = ~ financial_year(.))
#> # A tsibble: 20 x 3 [1Q]
#> # Groups:    @ y0 [6]
#>        yq0 close    y0
#>      <qtr> <dbl> <dbl>
#>  1 2014 Q1  554.  2014
#>  2 2014 Q2  572.  2014
#>  3 2014 Q3  574.  2015
#>  4 2014 Q4  524.  2015
#>  5 2015 Q1  545.  2015
#>  6 2015 Q2  521.  2015
#>  7 2015 Q3  608.  2016
#>  8 2015 Q4  759.  2016
#>  9 2016 Q1  745.  2016
#> 10 2016 Q2  692.  2016
#> 11 2016 Q3  777.  2017
#> 12 2016 Q4  772.  2017
#> 13 2017 Q1  830.  2017
#> 14 2017 Q2  909.  2017
#> 15 2017 Q3  959.  2018
#> 16 2017 Q4 1046.  2018
#> 17 2018 Q1 1032.  2018
#> 18 2018 Q2 1116.  2018
#> 19 2018 Q3 1193.  2019
#> 20 2018 Q4 1036.  2019
q_data %>% 
  index_by(y0 = ~ financial_year(.)) %>% 
  summarize(close = last(close))
#> # A tsibble: 6 x 2 [1Y]
#>      y0 close
#>   <dbl> <dbl>
#> 1  2014  572.
#> 2  2015  521.
#> 3  2016  692.
#> 4  2017  909.
#> 5  2018 1116.
#> 6  2019 1036.

I'll add this example to the index_by() help page. Thanks.

Created on 2019-09-13 by the reprex package (v0.3.0)

5 Likes