I have been working on a retention analysis of for a subscription business. A stakeholder asked me if I could do some analysis to model what improved retention could look like under various 't-shirt sized' scenarios e.g. 'moderately improved' , 'substantially improved', 'no change'.

After analysis of monthly cohorts using ggplot and non linear regression, I believe I have a model that fits our data well. Here's a chart showing cohorts oscillating around the modeled retention rate for each billing cycle (x-axis).

:

The function I used here was:

```
exponential_decay <- function(i, a, lambda, billing_cycle) i + a * exp(-lambda * billing_cycle)
```

Suppose using `nls()`

I got the following model params:

i=0.25

a=0.65

lambda=0.55

standard_error of lambda=0.02

My original goal:

model what improved retention could look like under various 't-shirt sized' scenarios

I focused on the parameter lambda and shifted it by standard errors. The closer lambda is to 0, the greater the survival rate. So I looked at the same modeled curve by t-shirt size:

- moderately improved = lambda - ( 1 * 0.02) = 0.53
- substantially improved = lambda - (2 * 0.02) = 0.51

More than 2 standard errors would be outside of 95% confidence.

I have not modeled retention in this way before. Is what I'm doing 'right'? Is this a sound approach to modeling improved retention?

Aside, can I call my function 'exponential decay'? From searching online and reviewing text books, the 'regular' exponential decay model would not include a constant `i`

nor a coefficient `a`

. But including these params helped my data fit better. What would I call this functional form?