Reset variable value based on condition

Hi All,

I have a data table which consists of transaction date, time, account number, amount.

All I need is a cumulative count and amount for each account. The catch is it has to reset cumulative value as soon as it reaches 30 minutes, after this it has to restart from 1.

For loop is working fine, but takes lot of time.

Any other suggestions please..

You can perform time aggregation with tsible and dplyr packages.

If you need more specific help, please provide a proper REPRoducible EXample (reprex) illustrating your issue.

Does this work for you?

library(data.table)

#---- fake data ----
times <- seq.POSIXt(
 from = as.POSIXct("2021-03-12 08:00:00"),
 to =   as.POSIXct("2021-03-12 12:00:00"),
 by = "sec"
)

accounts <- seq(1111, 9999, by = 1000)

amounts <- seq(1, 5000, by = 1)

dt <- data.table(
 time = as.ITime(sample(times, 1500, replace = TRUE)),
 account = sample(accounts, 1500, replace = TRUE),
 amount = sample(amounts, 1500, replace = TRUE)
)

#---- counts and sums ----

# order by account and time
setkeyv(dt, c("account", "time"))

# new time by each half hour
dt[, half_hour := as.ITime(round(as.double(time)/(30*60))*(30*60))]

# count and sum by half hour
dt[,`:=`(n = seq_len(.N), cumulative = cumsum(amount)) , by = .(account, half_hour)]

Thanks Jeremy for your response, I am looking something like attached

I will check with my data on your solution and revert..thans again for your time

Hi Jeremy, this seems to be perfect solution to me..

however i notice half_hour variable getting reset after 15 mins instead of 30 minutes. In the above example reset to start at record number 17 instead of 10.

thanks
Saravana

Thanks for catching that. I think this is working now. I added the lubridate package to manage the 30-minute intervals properly with the floor_date() function.

```{r}

library(data.table)
library(lubridate)


#---- fake data ----

times <- seq.POSIXt(
 from = as.POSIXct("2021-03-12 08:00:00"),
 to =   as.POSIXct("2021-03-12 12:00:00"),
 by = "sec"
)

accounts <- seq(1111, 9999, by = 1000)

amounts <- seq(1, 5000, by = 1)

dt <- data.table(
 datetime = sample(times, 1500, replace = TRUE),
 account = sample(accounts, 1500, replace = TRUE),
 amount = sample(amounts, 1500, replace = TRUE)
)

# order by account and datetime
setkeyv(dt, c("account", "datetime"))

# separate into date and time columns
dt[, `:=`(date = as.IDate(datetime), time = as.ITime(datetime))]
dt[, datetime := NULL]


#---- counts and sums ----

# new time by each half hour
dt[, half_hour := floor_date(as.POSIXct(date) + time, "30 minutes")]

# count and sum by half hour
dt[,`:=`(n = seq_len(.N), cumulative = cumsum(amount)) , by = .(account, half_hour)]

Thanks Jeremy, it's working perfectly fine for my requirement..

Hi Jeremy

Thanks for your help, somehow I find it the same logic now working for 1 hour. Is there any other logic that can be incorporated.

Like take the first occurance of acct number and date time, add 1 hour to it until it reaches plus one hour and then reset again.

1 Like

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.