boot() statistic = function problem

Hey,

I'm trying to get a bootstrap CI for moderated mediation indexes. I'm afraid I don't have the function right.

``` r
boot_statistic <- function(dataset, coefficients) {
  d = data[coefficients,]
  model = Model_EN2
  fit_boot <- lavaan::sem(model, d)
  return(summary(fit_boot)$PE[6:9])
  #alternatively $PE[22:6] which would be the exact place of one Index value
}

boot(data = data_EN2, statistic = boot_statistic, R = 10)
```

R = 10 is just to keep the run time short.
This script prints summary(fit_boot) 10 times (at least with different results), but it returns the complete summary rather than only the values, which means I can't go on with boot.ci().

Any suggestions?

Just to be sure, you do know that in R, 6:9 creates a vector c(6,7,8,9)? If you want to select row 22 and column 6 you would write $PE[22,6].
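To make the indexing difference concrete, here is a minimal sketch on a made-up table (the data frame `pe` and its values are hypothetical stand-ins, not lavaan output). With a single index, `[` on a data frame selects columns; with two indices it selects rows and columns.

```r
# Toy stand-in for a parameter table; the values are made up for illustration.
pe <- data.frame(
  lhs = c("y", "y", "m"),
  rhs = c("x", "m", "x"),
  est = c(0.10, 0.70, 0.50),
  se  = c(0.05, 0.08, 0.06)
)

cols_only <- pe[3:4]         # one index: columns 3 and 4, still a data.frame
one_cell  <- pe[2, 3]        # row 2, column 3: a single number
est_rows  <- pe[2:3, "est"]  # rows 2 and 3 of column "est": a numeric vector

print(cols_only)
print(one_cell)
print(est_rows)
```

So something like `PE[6:9]` would return columns 6 to 9 as a data frame, not four numbers.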

Anyway, it's hard to understand what the problem is without data to reproduce it. Could you provide a reprex? The summary() function always prints a lot, but it still returns a PE data frame that can be subsetted appropriately.

Hey Alexis,

$PE[6:9] gives me the values I want the bootstrap CI for. Lines six to nine, which are the estimates, standard errors, z values and p values.

I made a path-model with lavaan and calculated the moderated mediation index.
``` r
Model_EN2 <- '
TPz.MA ~ a1*SOPz.MA + a2*SPPz.MA
ENz.MA ~ c1*SOPz.MA + c2*SPPz.MA + b*TPz.MA + d1*SOPz.MAxTPz.MA + d2*SPPz.MAxTPz.MA

SOPz.MA ~~ SPPz.MA
SOPz.MA ~~ SOPz.MAxTPz.MA
SPPz.MA ~~ SPPz.MAxTPz.MA
SOPz.MAxTPz.MA ~~ SPPz.MAxTPz.MA
TPz.MA ~~ SOPz.MAxTpz.MA
Tpz.MA ~~ SPPz.MAxTPz.MA

indirectSOP := a1*b
indirectSPP := a2*b
totalSOP := c1 + (a1*b)
totalSPP := c2 + (a2*b)
IndexModMedSOP := a1*d1
IndexModMedSPP := a2*d2
'
```

I changed the function a little. But it still gives me the complete summaries.
``` r
boot_statistic <- function(dataset, model) {
  d = dataset
  model = Model_EN2
  fit_boot <- lavaan::sem(model, d)
  sum_fit <- summary(fit_boot)
  returnValue(sum_fit$PE[6:6])
}
```

$PE[6:6] gives me only the estimates. At least as long as I don't use it in the function ... When I try the function it gives me the summary plus the estimates in an extra row.

I will try the reprex thing. Give me a moment to read how to do it :wink:

It's mostly that I need to try it on some data to see what the intermediate results look like. Here it is with the example data from the package:

library(lavaan)

Model_EN2 <- '
  # measurement model
    ind60 =~ x1 + x2 + x3
    dem60 =~ y1 + y2 + y3 + y4
    dem65 =~ y5 + y6 + y7 + y8
  # regressions
    dem60 ~ ind60
    dem65 ~ ind60 + dem60
  # residual correlations
    y1 ~~ y5
    y2 ~~ y4 + y6
    y3 ~~ y7
    y4 ~~ y8
    y6 ~~ y8
'

boot_statistic <- function(dataset, coefficients) {
  d = dataset[coefficients,]
  model = Model_EN2
  fit_boot <- lavaan::sem(model, d)
  {sink(nullfile()); smry <- summary(fit_boot);sink()} # recover summary without printing
  return(smry$PE[6:9,"pvalue"])
}

boot::boot(data = PoliticalDemocracy, statistic = boot_statistic, R = 10)

But that only works when a column of the data.frame PE is selected explicitly; selecting rows alone is not enough.
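As a possible alternative to silencing summary() with sink(), lavaan exports parameterEstimates(), which returns the parameter table as a data frame without printing. The sketch below is only an illustration under assumptions: the simulated data `sim_data`, the model `med_model` and the label `ab` are made up for the example and are not the model from this thread. Defined parameters (`:=`) are picked out by their operator rather than by row number.

```r
library(lavaan)
library(boot)

# Simulated data for a simple mediation X -> M -> Y (hypothetical example).
set.seed(1234)
n <- 200
X <- rnorm(n)
M <- 0.5 * X + rnorm(n)
Y <- 0.7 * M + rnorm(n)
sim_data <- data.frame(X = X, M = M, Y = Y)

med_model <- '
  M ~ a * X
  Y ~ b * M + c * X
  ab := a * b
'

boot_statistic <- function(dataset, indices) {
  d <- dataset[indices, ]            # resampled rows for this replicate
  fit <- sem(med_model, data = d)
  pe <- parameterEstimates(fit)      # data frame, nothing is printed
  pe$est[pe$op == ":="]              # estimates of the defined parameters only
}

res <- boot(data = sim_data, statistic = boot_statistic, R = 20)
print(res)
```

R = 20 is only there to keep the run short; a real percentile interval needs far more replicates.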

That function worked.
I want to completely understand it.

The ModMedIndex is in rows 22 and 23. To get the estimates instead of the p-values, would it be:

return(smry$PE[22:23, "est"])?

What do you say about the second function?

I got the data[coefficients, ] from another tutorial, but I must confess I don't understand what it does.
The second function has only arguments I know.

But with the second function I don't get bias and standard errors...

Which one is the first and which one the second function?

I'm not familiar with lavaan, so I can't really help you on that aspect. What I do see is that with the example data summary()$PE is a data.frame, so to take values out you need to specify both the rows and the columns.

Also, boot can handle a function that returns a single statistic for each replicate, or a vector (where each element is a different statistic to estimate), but not a data.frame (what would that mean?).
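To show what the second argument of the statistic function is for, here is a small sketch on the built-in mtcars data (the function name `mean_and_median` is made up for the example). boot() calls the function as statistic(data, indices), where the second argument is a vector of row numbers sampled with replacement, and each element of the returned vector becomes one of t1*, t2*, ...

```r
library(boot)

mean_and_median <- function(dataset, indices) {
  d <- dataset[indices, ]         # the resampled rows for this replicate
  c(mean(d$mpg), median(d$mpg))   # plain numeric vector, one element per statistic
}

set.seed(1)
res <- boot(data = mtcars, statistic = mean_and_median, R = 200)
print(res)

dim(res$t)   # one row per replicate, one column per statistic
```

That is why `dataset[coefficients, ]` matters: it is the only place where the resampling actually reaches the model.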

As to your smry$PE[22:23,"est"] it seems to work with the example data:

> boot_statistic <- function(dataset, coefficients) {
+   d = dataset[coefficients,]
+   model = Model_EN2
+   fit_boot <- lavaan::sem(model, d)
+   {sink(nullfile()); smry <- summary(fit_boot);sink()} # recover summary without printing
+   return(smry$PE[22:23,"est"])
+ }
> (res <- boot::boot(data = PoliticalDemocracy, statistic = boot_statistic, R = 10))

ORDINARY NONPARAMETRIC BOOTSTRAP


Call:
boot::boot(data = PoliticalDemocracy, statistic = boot_statistic, 
    R = 10)


Bootstrap Statistics :
     original       bias    std. error
t1* 0.1198065  0.003608058  0.06622563
t2* 0.4667026 -0.003862914  0.05968241
Warning messages:
1: In lav_object_post_check(object) :
  lavaan WARNING: some estimated lv variances are negative
2: In lav_object_post_check(object) :
  lavaan WARNING: some estimated lv variances are negative

> boot::boot.ci(res, type="perc", index=1)
BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 10 bootstrap replicates

CALL : 
boot::boot.ci(boot.out = res, type = "perc", index = 1)

Intervals : 
Level     Percentile     
95%   ( 0.0418,  0.3229 )  
Calculations and Intervals on Original Scale
Warning : Percentile Intervals used Extreme Quantiles
Some percentile intervals may be unstable
Warning message:
In norm.inter(t, alpha) : extreme order statistics used as endpoints
> boot::boot.ci(res, type="perc", index=2)
BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 10 bootstrap replicates

CALL : 
boot::boot.ci(boot.out = res, type = "perc", index = 2)

Intervals : 
Level     Percentile     
95%   ( 0.3352,  0.5205 )  
Calculations and Intervals on Original Scale
Warning : Percentile Intervals used Extreme Quantiles
Some percentile intervals may be unstable
Warning message:
In norm.inter(t, alpha) : extreme order statistics used as endpoints

This one is the second function I was referring to. But when I use it in the boot() function I only get the original value and a zero for bias and std. error...

But the change you made in that first function is working perfectly. Thank you so much!!!

That shouldn't work if PE is a data frame. You should have to specify both row and columns, something like PE[6,4]. But you should get an error message like "incorrect subsetting" or "undefined column". That could be returnValue() that prevents it, I'm not familiar with it.

Bias and std.err at 0 could mean that the function is always returning the same value regardless of the bootstrap replicate.
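A quick way to see this effect, sketched on mtcars with two made-up functions: one ignores the index vector that boot() passes in, the other uses it.

```r
library(boot)

ignores_indices <- function(dataset, indices) {
  mean(dataset$mpg)            # `indices` is never used: always the full data
}

uses_indices <- function(dataset, indices) {
  mean(dataset$mpg[indices])   # statistic computed on the resampled rows
}

set.seed(1)
res_const <- boot(mtcars, ignores_indices, R = 50)
res_ok    <- boot(mtcars, uses_indices, R = 50)

n_distinct_const <- length(unique(as.vector(res_const$t)))  # every replicate identical
sd_ok <- sd(res_ok$t)                                       # replicates vary

print(n_distinct_const)
print(sd_ok)
```

When every replicate is identical, the printed bias and std. error have nothing to measure, and boot.ci() cannot build an interval from them.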

Great! You can mark the question as solved by accepting a post as solution.

Sorry, I pasted the wrong one.

``` r
boot_statistic <- function(dataset, model) {
  d = dataset
  model = Model_EN2
  fit_boot <- lavaan::sem(model, d)
  {sink(nullfile()); smry <- summary(fit_boot);sink()}
  return(smry$PE[22:23,"est"])
}
```

That's the one I meant.
