Hi, RStudio Community!
When bootstrapping is used to estimate parameters, is the assessment dataset ever actually used?
library(rsample)
data(wa_churn)
resample1 <- bootstraps(wa_churn, times = 3)
resample1$splits

[[1]]
<Analysis/Assess/Total>
<7043/2590/7043>

[[2]]
<Analysis/Assess/Total>
<7043/2614/7043>

[[3]]
<Analysis/Assess/Total>
<7043/2603/7043>
My understanding is that each analysis split contains as many indices as the full dataset has rows (n = 7043), sampled from it with replacement. The assessment split contains the indices of the rows that were never drawn (in bootstrap 1, there are 2590 of those). But don't most (or all?) uses of the bootstrap involve calculating the statistic on the bootstrapped (analysis) dataset? When is the assessment dataset relevant in a bootstrapping framework?
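To check that I have the mechanics right, here is a rough base-R sketch of what I think each split is doing. This is my own illustration and not rsample's actual implementation; the seed and the names `analysis_idx` and `assessment_idx` are made up for the example.

```r
# Hand-rolled version of a single bootstrap split (illustration only,
# not rsample's code). The seed is arbitrary.
set.seed(123)
n <- 7043  # number of rows in wa_churn, per the output above

# "Analysis" indices: n draws from 1..n, with replacement
analysis_idx <- sample.int(n, size = n, replace = TRUE)

# "Assessment" indices: rows that were never drawn (out-of-bag rows)
assessment_idx <- setdiff(seq_len(n), analysis_idx)

length(analysis_idx)        # always n
length(assessment_idx)      # varies from resample to resample
length(assessment_idx) / n  # should land near exp(-1), about 0.368
```

If this is right, it would explain why the assessment counts above (2590, 2614, 2603) all sit at roughly 37% of 7043.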
PS: I'd also love an example of a summary statistic that requires apparent = TRUE, as that option is also new to me. Thanks!