Duration of the previous month's value under a certain condition

I have a problem that I want to ask for your advice regarding the calculation of the number of months that the previous month's value persisted under a certain condition.

I am trying to compute "the duration of the previous month's Value in consecutive months" only when there are at least 3 months' of observations in the past.

An example data looks like:

structure(list(Group = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2), Month = c(2, 
4, 5, 6, 7, 10, 11, 12, 13, 14, 5), Value = c(0.1, 0.1, 0.1, 
0.1, 0.2, 0.1, 0.1, 0.1, 0.2, 0.2, 0.3)), class = "data.frame", row.names = c(NA, 
-11L), codepage = 65001L)

To clarify, the table below ("Stability" column) is what I am seeking:

╔═══════╦═══════╦═══════╦═══════════╦════════════════════════════════════════════════════════════╗
║ Group ║ Month ║ Value ║ Stability ║ Note                                                       ║
╠═══════╬═══════╬═══════╬═══════════╬════════════════════════════════════════════════════════════╣
║   1   ║   2   ║  0.1  ║     na    ║ Because there is no previous month's Value                 ║
╠═══════╬═══════╬═══════╬═══════════╬════════════════════════════════════════════════════════════╣
║   1   ║   4   ║  0.1  ║     na    ║ Because there is no previous month's Value                 ║
╠═══════╬═══════╬═══════╬═══════════╬════════════════════════════════════════════════════════════╣
║   1   ║   5   ║  0.1  ║     na    ║ We cannot know what happened for Month 3                   ║
║       ║       ║       ║           ║ (Or there is no data for 3 consecutive months in the past) ║
╠═══════╬═══════╬═══════╬═══════════╬════════════════════════════════════════════════════════════╣
║   1   ║   6   ║  0.1  ║     na    ║ There is no 3 consecutive months' of data in the past      ║
╠═══════╬═══════╬═══════╬═══════════╬════════════════════════════════════════════════════════════╣
║   1   ║   7   ║  0.2  ║     3     ║ 0.1 (Month 6's Value) was stable for 3 months in the past, ║
║       ║       ║       ║           ║ from Month 4 to 6                                          ║
╠═══════╬═══════╬═══════╬═══════════╬════════════════════════════════════════════════════════════╣
║   1   ║   10  ║  0.1  ║     na    ║ Because there is no previous month's Value                 ║
╠═══════╬═══════╬═══════╬═══════════╬════════════════════════════════════════════════════════════╣
║   1   ║   11  ║  0.1  ║     na    ║ There is no 3 consecutive months' of data in the past      ║
╠═══════╬═══════╬═══════╬═══════════╬════════════════════════════════════════════════════════════╣
║   1   ║   12  ║  0.1  ║     na    ║ There is no 3 consecutive months' of data in the past      ║
╠═══════╬═══════╬═══════╬═══════════╬════════════════════════════════════════════════════════════╣
║   1   ║   13  ║  0.2  ║     3     ║ 0.1 (Month 12's Value) was stable for 3 months in the past ║
╠═══════╬═══════╬═══════╬═══════════╬════════════════════════════════════════════════════════════╣
║   1   ║   14  ║  0.2  ║     1     ║ 0.2 (Month 13's Value) was stable for 1 month (Month 13)   ║
╠═══════╬═══════╬═══════╬═══════════╬════════════════════════════════════════════════════════════╣
║   2   ║   5   ║  0.3  ║     na    ║ Because there is no previous month's Value for Group 2...  ║
╚═══════╩═══════╩═══════╩═══════════╩════════════════════════════════════════════════════════════╝
1 Like

To simplify, subset dat into the two Group entries; what will work for one will work for the other. That step is omitted below, because the value of Group does not change until the final row; if values alternate, the problem arises.

The condition test can be performed with dplyr::lag on Month with three tests:

Month - lag(Month,3) == 1 &
Month - lag(Month,2) == 1 &
Month - lag(Month,1) == 1

If all three conditions are met, Stability = lag(Value,1); else, NA

It will prove convenient to use a function for the test.

suppressPackageStartupMessages({library(dplyr)})
dat <- structure(list(Group = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2), Month = c(
  2,
  4, 5, 6, 7, 10, 11, 12, 13, 14, 5
), Value = c(
  0.1, 0.1, 0.1,
  0.1, 0.2, 0.1, 0.1, 0.1, 0.2, 0.2, 0.3
)), class = "data.frame", row.names = c(
  NA,
  -11L
), codepage = 65001L)

do_test <- function(x) {
  x - lag(x,3) == 3 &
  x - lag(x,2) == 2 &
  x - lag(x,1) == 1
}

dat %>% mutate(Stability = ifelse(do_test(Month) == TRUE,lag(Value,1),NA))
#>    Group Month Value Stability
#> 1      1     2   0.1        NA
#> 2      1     4   0.1        NA
#> 3      1     5   0.1        NA
#> 4      1     6   0.1        NA
#> 5      1     7   0.2       0.1
#> 6      1    10   0.1        NA
#> 7      1    11   0.1        NA
#> 8      1    12   0.1        NA
#> 9      1    13   0.2       0.1
#> 10     1    14   0.2       0.2
#> 11     2     5   0.3        NA

Created on 2020-09-23 by the reprex package (v0.3.0.9001)

A similar lag approach will serve to detect 5 consecutive months.

Thank you, @technocrat, for the 'Stability' column created, is there a way to change Stability = lag(Value, 1) such that the output is an integer? For instance, for row 5 (or Month 7 for Group 1), the Stability needs to be '3' considering that Month 6's Value was 0.1 for 3 months in the past!

Could it perhaps work by using something like rle function?

I misconstrued the question. For Stability to reflect an integer value of 3 or more consecutive entries different code will be needed. RLE can be used, possibly, by creating a lag(x,1) test of a difference of one, first. I'll look at this later today.

1 Like

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.