Recursive calcs on timeseries


#1

Hi. I have been struggling to put my issue into words (English not my strongest point). I tried on SO, but with no luck yet, and I'm starting to think it is because my description makes no sense. Is recursive the correct description for my problem?
Do I need to re-post the code?

My problem is I have a time series in tibble format (actually nested, but that's not the issue here). There is one row per day. I need to do a calculation for each day, but I need the result of the previous day's calculation to do the current day's calculation.
Close(n) = Close(n-1) + Close(n-1)*ratio
I can do this with a for loop, but it seems to slow.


#2

Hi Martin,

Could you please turn this into a self-contained reprex (short for minimal reproducible example)? I see you have some example code in the StackOverflow question, but it will still help us help you if we can be sure we're all working with/looking at the same stuff.

Right now the best way to install reprex is:

# install.packages("devtools")
devtools::install_github("tidyverse/reprex")

If you've never heard of a reprex before, you might want to start by reading the tidyverse.org help page. The reprex dos and don'ts are also useful.

For pointers specific to the community site, check out the reprex FAQ, linked to below.


#3

As @mara said it is best in include a reprex.

But in any case here is a reprex example you might be able to use to avoid a for loop and which might be faster:

suppressPackageStartupMessages(library(tidyverse))
# there a couple of way to avoid a for loop
# which *might* be faster

# lead and lag can be used to do what you want to do
data <- 1:10

# this doesn't require the tidyverse
sapply(1:length(data), function(index) { data[index] + data[index + 1]})
#>  [1]  3  5  7  9 11 13 15 17 19 NA

# lead "shifts" the vector to the left
lead(data)
#>  [1]  2  3  4  5  6  7  8  9 10 NA

# here is an example where the value at n
# is added to the value of n + 1 
data + lead(data)
#>  [1]  3  5  7  9 11 13 15 17 19 NA

# you can also use purrr to do something similar
map2_dbl(data, lead(data), ~ .x + .y)
#>  [1]  3  5  7  9 11 13 15 17 19 NA

Created on 2018-03-19 by the reprex package (v0.2.0).


#4
library(tidyverse)
library(tibbletime)
#> 
#> Attaching package: 'tibbletime'
#> The following object is masked from 'package:stats':
#> 
#>     filter
library(lubridate)
#> 
#> Attaching package: 'lubridate'
#> The following object is masked from 'package:base':
#> 
#>     date

sample <- list(OpenUnits = c(7500000, 7500000, 7500000, 7500000, 7500000, 
                             3300000, 3300000, 3300000, 3300000, 3300000), ClosingUnits = c(7500000, 
                                                                                            7500000, 7500000, 7500000, 3300000, 3300000, 3300000, 3300000, 
                                                                                            3300000, 3300000), AccrualDate = 16892:16901, AiaAdjustAmt = c(1844.70359677349, 
                                                                                                                                                           1845.18465061665, 1845.66582990696, 1846.14713467713, 812.516568582349, 
                                                                                                                                                           812.728453146696, 812.940392965385, 813.152388052826, 813.364438423431, 
                                                                                                                                                           813.576544091616), SellUnits = c(NA, NA, NA, NA, 4200000, NA, 
                                                                                                                                                                                            NA, NA, NA, NA))
sample <- sample %>%
  as_tibble() %>% 
  mutate(
    AccrualDate = lubridate::as_date(AccrualDate),
    SellUnits = if_else(is.na(SellUnits), 0, SellUnits)
  ) %>% 
  as_tbl_time(index = AccrualDate)

sample <- sample %>% 
  mutate(
    RealInterest = 0,
    OpenInterest = cumsum(AiaAdjustAmt) - cumsum(RealInterest) - AiaAdjustAmt - RealInterest,
    RealInterest = OpenInterest*SellUnits/OpenUnits 
  )

Incorrect Answer


# A time tibble: 10 x 7
# Index: AccrualDate
   OpenUnits ClosingUnits AccrualDate AiaAdjustAmt SellUnits s24j_real s24j_open
       <dbl>        <dbl> <date>             <dbl>     <dbl>     <dbl>     <dbl>
 1  7500000.     7500000. 2016-04-01         1845.        0.        0.        0.
 2  7500000.     7500000. 2016-04-02         1845.        0.        0.     1845.
 3  7500000.     7500000. 2016-04-03         1846.        0.        0.     3690.
 4  7500000.     7500000. 2016-04-04         1846.        0.        0.     5536.
 5  7500000.     3300000. 2016-04-05          813.  4200000.     4134.     7382.
 6  3300000.     3300000. 2016-04-06          813.        0.        0.     8194.
 7  3300000.     3300000. 2016-04-07          813.        0.        0.     9007.
 8  3300000.     3300000. 2016-04-08          813.        0.        0.     9820.
 9  3300000.     3300000. 2016-04-09          813.        0.        0.    10633.
10  3300000.     3300000. 2016-04-10          814.        0.        0.    11446.

Correct Answer

# A time tibble: 10 x 7
# Index: AccrualDate
   OpenUnits ClosingUnits AccrualDate AiaAdjustAmt SellUnits s24j_real s24j_open
       <dbl>        <dbl> <date>             <dbl>     <dbl>     <dbl>     <dbl>
 1  7500000.     7500000. 2016-04-01         1845.        0.        0.        0.
 2  7500000.     7500000. 2016-04-02         1845.        0.        0.     1845.
 3  7500000.     7500000. 2016-04-03         1846.        0.        0.     3690.
 4  7500000.     7500000. 2016-04-04         1846.        0.        0.     5536.
 5  7500000.     3300000. 2016-04-05          813.  4200000.     4134.     7382.
 6  3300000.     3300000. 2016-04-06          813.        0.        0.     4060.
 7  3300000.     3300000. 2016-04-07          813.        0.        0.     4873.
 8  3300000.     3300000. 2016-04-08          813.        0.        0.     5686.
 9  3300000.     3300000. 2016-04-09          813.        0.        0.     6499.
10  3300000.     3300000. 2016-04-10          814.        0.        0.     7313.

Which I got with:

sample2 <- sample %>% 
  mutate(
    sell_ratio = if_else(!is.na(SellUnits), SellUnits/OpenUnits, 0),
    s24j_open = 0,
    s24j_close = 0,
    s24j_real = 0     
  )

open <- 0
close <- 0  

for (i in seq_along(sample2$AccrualDate)) {

  open <- close
  sellratio <- sample2[i, ]$sell_ratio
  int <- sample2[i, ]$AiaAdjustAmt
  real <- sellratio*open

  close <- open - real + int

  sample2[i, ]$s24j_open <- open
  sample2[i, ]$s24j_real <- real
  sample2[i, ]$s24j_close <- close
}

sample2 %>% 
  select(
    OpenUnits, ClosingUnits, AccrualDate, AiaAdjustAmt, SellUnits, s24j_real, s24j_open
  )

#5

Hi Mara. Did I do the reprex correctly?


#6

Close enough. :smile:


#7

A solution have been posted on SO. Do I need to close this one?


#8

Yes, and please link or quote the solution here so others know what it was! Thanks.