Calculate Running Balance

dplyr
purrr

#1

I’m trying to calculate a running balance from a set of transactions on each day. Each day’s transactions are applied to the end of the previous day’s balance, those changes are accumulated and used to create the next day’s balance.

I tried using accumulate to no avail, it seems as though it cannot handle a dynamic function.

While (I believe) map does use lazy eval, it’s not being handed a list and therefore the lag value is NA and the default is being used at each occasion.

This is something that’s extremely easy in excel, and I’m assuming I’m missing something that should make it equivalently easy in R.

Thanks in advance for any insight!

Reprex below. For completeness, expected outcome for the gb_df$bal are:

image

library(tidyverse)
#> Loading tidyverse: ggplot2
#> Loading tidyverse: tibble
#> Loading tidyverse: tidyr
#> Loading tidyverse: readr
#> Loading tidyverse: purrr
#> Loading tidyverse: dplyr
#> Conflicts with tidy packages ----------------------------------------------
#> filter(): dplyr, stats
#> lag():    dplyr, stats

df <-
  structure(
    list(
      date = structure(
        c(
          17087,
          17087,
          17087,
          17087,
          17087,
          17087,
          17088,
          17089,
          17089,
          17089,
          17089,
          17089,
          17089,
          17089,
          17089,
          17090
        ),
        class = "Date"
      ),
      txn = c(2, 3, 4, 5, 6,
              9, 11, 14, 15, 16, 17, 18, 20, 23, 24, 25),
      pct.chg = c(
        0.00442937610000094,
        -0.0651782800000006,-0.0316169216666666,-0.0385623980000008,
        -0.137562988086957,-0.0100988556249993,
        0.00840887583333427,
        0.0204354439999992,
        0.00775937730000043,
        0.177345172250001,
        0.00706567181818141,
        -0.173924408,
        0.143829305814815,-0.0467061726206903,-0.0218126823448283,
        0.0155682484814812
      )
    ),
    class = c("tbl_df", "tbl", "data.frame"),
    row.names = c(NA,-16L),
    .Names = c("date", "txn", "pct.chg")
  )


gb_df <-
  df %>%
  group_by(date) %>%
  summarize(net.chg = sum(pct.chg)) %>%
  mutate(bal = 0, #create placeholder for balance
         net.chg = 1 + net.chg) #allow net.chg

start_bal <- 100

#try with static chg
end_bal_static <-
  accumulate(gb_df$bal, ~ .x * 1.1, .init = start_bal)[-1]

#try with dynamic chg
end_bal_dynamic <-
  accumulate(gb_df$bal, ~ .x * gb_df$net.chg[.y], .init = start_bal)

#attempt to map
map_end_bal_dynamic <-
  map2(gb_df$bal,
       gb_df$net.chg,
       ~ accumulate(., ~ .x * .y , .init = start_bal))

#predefine function
get_bal <- function(bal, chg, dflt) {
  x <- lag(bal, n = 1, default = dflt) * chg
  return(x)
}

#attempt to map function
map_predefined <- map2(gb_df$bal,
                       gb_df$net.chg,
                       ~ get_bal(.x, .y, start_bal))

#this is inserting the default every row
map_predefined
#> [[1]]
#> [1] 72.14099
#> 
#> [[2]]
#> [1] 100.8409
#> 
#> [[3]]
#> [1] 111.3992
#> 
#> [[4]]
#> [1] 101.5568

#2

I think you are looking for something like:

gb_df %>% mutate(bal = accumulate(net.chg, `*`, .init = start_bal)[-1])

## A tibble: 4 x 3
#        date   net.chg      bal
#      <date>     <dbl>    <dbl>
#1 2016-10-13 0.7214099 72.14099
#2 2016-10-14 1.0084089 72.74762
#3 2016-10-15 1.1139917 81.04024
#4 2016-10-16 1.0155682 82.30190

You had the right idea to just use accumulate, but you got the right form of the function after you moved it to map (using `*` or ~ .x * .y for the accumulate function are equivalent). As you surmised, map isn’t quite the right tool in this case.


#3

That is exactly what I was needing thinking of! Thanks, Nick! It’d be nice if accumulate offered an option to drop the initial value when it’s used


#4

Does a combination of arrange and cumprod work here too?

gb_df %>% 
  arrange(date) %>% 
  mutate(bal = start_val * cumprod(net.chg))
#> # A tibble: 4 x 3
#>         date   net.chg      bal
#>       <date>     <dbl>    <dbl>
#> 1 2016-10-13 0.7214099 72.14099
#> 2 2016-10-14 1.0084089 72.74762
#> 3 2016-10-15 1.1139917 81.04024
#> 4 2016-10-16 1.0155682 82.30190

#5

Hey Danny, sorry for the delay, it appears this works as well!