I think the data is insufficient to do more than generate very vague predictions of possible s-curves.

A lot more data would be required to pick a curve of best fit.

What I'm about to present is probably flawed, and I'd appreciate help with corrections. The data is very, very early in any possible s-curve, i.e. there's no sight yet of the inflexion point, so a huge range of possible s-curves can be fit. The fact that only the early portion of an s-curve has been observed can be expressed by dividing the sample values by some factor representing the assumed final size: scaled that way, the values fall in (0, 1), so a logistic curve (a binomial GLM with a logit link) can be fitted to them. We can then try many such factors, fit an s-curve for each, and see which ones fit the early data points well.

```
library(tidyverse)
library(purrr)

# Observed cumulative counts, one value per time step.
tb <- tibble::tribble(
  ~value,
  2, 5, 18, 28, 43, 61, 95, 139, 245, 388,
  593, 978, 1501, 2336, 2922, 3513, 4747, 5823, 6566, 7161,
  8042, 9000, 10075, 11364, 12729, 13938, 14991, 16169
) %>%
  mutate(rownum = row_number())
# For a candidate scale factor, rescale the observed values so that the
# assumed final size is param_to_opt * max(value), fit a logistic curve to
# the rescaled (0, 1) values, and return the RMSE of the model's residuals.
# glm() warns about non-integer "successes" here but still fits; note that
# model$residuals are the glm working residuals, and the log() weights give
# the first observation a weight of zero.
opt_func <- function(param_to_opt) {
  tb2 <- mutate(tb, value2 = value / (param_to_opt * max(value)))
  model <- glm(value2 ~ rownum,
               family = binomial(link = "logit"),
               weights = log(rownum),
               data = tb2)
  sqrt(mean(model$residuals^2))
}
# Candidate scale factors: assumed final sizes from 10,000x up to
# 10,000,000x the largest observed value.
ser <- 1:1000 * 10000
rmse <- map_dbl(ser, opt_func)

min_index <- which(rmse == min(rmse))
ser[[min_index]]
plot(x = ser, y = rmse)

# 0.570192 is (approximately) the minimum RMSE from this run; probe bands
# of different widths around it to get a manageable number of curves.
rmse[rmse > 0.570192 - 0.000001 & rmse < 0.570192 + 0.000001]
# only 2 values
rmse[rmse > 0.570192 - 0.000002 & rmse < 0.570192 + 0.000002]
# 477 values
rmse[rmse > 0.570192 - 0.0000015 & rmse < 0.570192 + 0.0000015]
# 43 values, so plot the 43 'most likely' s-curves of the 1000 trialled
parms_to_plot <- which(rmse > 0.570192 - 0.0000015 & rmse < 0.570192 + 0.0000015)
extrude_length <- 150

# Refit the model for a chosen scale factor, extend the time axis by
# extrude_length steps, and return the predictions rescaled back to counts.
new_prediction <- function(chosen_parm) {
  tb2 <- mutate(tb,
                rownum = row_number(),
                value2 = value / (chosen_parm * max(value)))
  model <- glm(value2 ~ rownum,
               family = binomial(link = "logit"),
               weights = log(rownum),
               data = tb2)
  mtr <- max(tb2$rownum) + 1
  tb2_extended <- union_all(tb2,
                            data.frame(value = rep(NA_real_, extrude_length),
                                       rownum = mtr:(mtr + extrude_length - 1),
                                       value2 = rep(NA_real_, extrude_length)))
  fitted.results <- predict(model, newdata = tb2_extended, type = "response")
  tb3 <- cbind(tb2_extended, fitted.results)
  tb4 <- mutate(tb3,
                rescaled_prediction = fitted.results * chosen_parm * max(value, na.rm = TRUE))
  select(tb4, rescaled_prediction)
}
# One column of extended predictions per candidate curve; map_dfc() repairs
# the duplicated column names (rescaled_prediction...1, ...2, and so on).
extruded_plots <- map_dfc(ser[parms_to_plot], new_prediction)

# Pad the observed data with empty rows covering the forecast horizon.
mtr <- max(tb$rownum) + 1
tb2_extended <- union_all(tb,
                          data.frame(value = rep(NA_real_, extrude_length),
                                     rownum = mtr:(mtr + extrude_length - 1)))
data_to_plot <- bind_cols(tb2_extended, extruded_plots)
library(plotly)

# Observed data as a solid blue line.
p <- plot_ly(data = data_to_plot,
             type = "scatter",
             mode = "lines",
             x = ~rownum,
             y = ~value,
             line = list(color = "#0000FFFF"))

# Each candidate curve as a translucent dashed red line; add_next_trace()
# uses <<- to accumulate the traces onto p.
additional_line_names <- setdiff(names(data_to_plot), c("value", "rownum"))
add_next_trace <- function(name) {
  p <<- p %>% add_trace(y = as.formula(paste0("~", name)),
                        line = list(color = "#FF000044", dash = "dash"))
}
walk(additional_line_names, add_next_trace)

# Zoomed-in and zoomed-out views of the same figure.
p_close <- p %>% layout(xaxis = list(range = c(1, 40)),
                        yaxis = list(range = c(0, 60000)))
p_full <- p
```

The plot() call in the script shows the root mean square error (RMSE) against each candidate scale factor (figure not reproduced here). The curve is extremely flat around its minimum of roughly 0.570192: nearly half of the 1000 candidates fall within ±0.000002 of it.

I then look for values close to that minimum, settle on bounds that capture the 43 smallest RMSE values, and plot the corresponding curves.
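Incidentally, a less fragile way to select those 43 curves would be to take the smallest RMSE values directly, rather than probing hand-tuned bounds around the minimum. A small sketch, which should pick an equivalent set here:

```
# Take the 43 smallest-RMSE candidates directly, avoiding the
# hard-coded 0.570192 +/- 1.5e-6 bounds.
parms_to_plot <- order(rmse)[1:43]
```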

There is a zoomed-in view:

`p_close`

and a zoomed-out view:

`p_full`

As can be seen there, the implied final sizes range from about 2% of the Earth's population to over 100%, which is worrying :-/
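To put numbers on that claim: the implied final size of each plotted curve is its scale factor times the largest observed value, so the population shares can be checked directly. A quick sketch (the ~7.8 billion world population figure is my assumption, not something in the data):

```
# Implied asymptote of each plotted curve, as a share of world population.
# The 7.8e9 population figure is an assumption used for illustration.
asymptotes <- ser[parms_to_plot] * max(tb$value)
range(asymptotes) / 7.8e9
```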

I wouldn't trust these results to give any insight... but it was interesting to think about abstractly.
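As an aside, the asymptote could also be estimated directly by fitting a three-parameter logistic with nls() and the self-starting SSlogis() model, rather than grid-searching a scale factor. A minimal sketch (my addition, not part of the approach above):

```
# Fit asymptote (Asym), inflexion point (xmid) and scale (scal) directly.
# On pre-inflexion data like this, nls() typically fails to converge or
# returns Asym with an enormous standard error -- the same indeterminacy
# the grid search above makes visible.
fit <- try(nls(value ~ SSlogis(rownum, Asym, xmid, scal), data = tb))
if (!inherits(fit, "try-error")) summary(fit)
```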