Error extracting p values from coefficients matrix "subscript out of bounds"

I'm running power simulations for ANCOVA using the guidance provided here: https://egap.org/content/power-analysis-simulations-r

Using Rstudio Version 1.1.456, I'm experiencing the following error ("Error in summary(fit.sim)$coefficients[2, 4] : subscript out of bounds") when the number of sims > 100. I'm hoping someone can help me with this error as I'd like to use sims <- 1000 (or greater) for the power simulations. Thanks in advance.

rm(list=ls())
possible.ns <- seq(from=10, to=250, by=5)
powers <- rep(NA, length(possible.ns))
powers.cov <- rep(NA, length(possible.ns)) # Need a second empty vector
alpha <- 0.05
sims <- 1000

Outer loop to vary the number of subjects

for (j in 1:length(possible.ns)){
N <- possible.ns[j] # Pick the jth value for N

significant.experiments <- rep(NA, sims) # Empty object to count significant experiments
significant.experiments.cov <- rep(NA, sims) # Need a second empty vector here too

Inner loop to conduct experiments "sims" times over for each N

for (i in 1:sims){
age <- sample(x=18:40, size=N, replace=TRUE) # Generate "age" covariate
baseline <- sample(x=1:15, size=N, replace=TRUE) # Generate "baseline" covariate
effectofage <- 0.5 # Hypothesize the "effect" of age on dv
effectofbaseline <- 0.8 # Hypothesize the "effect" of baseline on dv

## Hypothesize Control Outcome as a function of age, baseline, and error
Y0 <- effectofage*age + effectofbaseline*baseline + rnorm(n=N, mean=6, sd=1.5)

## This is all the same ##
tau <- 2                                       # Hypothesize treatment effect
Y1 <- Y0 + tau                                 # treatment potential outcome
Z.sim <- rbinom(n=N, size=1, prob=.5)          # Do a random assignment
Y.sim <- Y1*Z.sim + Y0*(1-Z.sim)               # Reveal outcomes according to assignment
fit.sim <- lm(Y.sim ~ Z.sim)                   # Do analysis (Simple regression)

## This is the novel analysis -- including three covariates to increase precision ##
fit.sim.cov <- lm(Y.sim ~ Z.sim + age + baseline)

## extract p-values and calculate significance ##
p.value <- summary(fit.sim)$coefficients[2,4]
p.value.cov <- summary(fit.sim.cov)$coefficients[2,4]
significant.experiments[i] <- (p.value <= alpha)
significant.experiments.cov[i] <- (p.value.cov <= alpha)

}

powers[j] <- mean(significant.experiments)
powers.cov[j] <- mean(significant.experiments.cov)
}

plot(possible.ns, powers, ylab="Power", xlab="Sample Size", col="black", type="l", ylim=c(0,1))
abline(h=.90, col="black", lty = "dashed")
points(possible.ns, powers.cov, type="l", ylab="Power", xlab="Sample Size", col="grey")

Daniel

your issue was called by the assumption that there would always be two rows worth of coefficients/ p-values to pull from. This might not be the case given certain inputs that you feed to the lm.

## extract p-values and calculate significance ##
    coeffs<- coef(summary(fit.sim))
    if (dim(coeffs)[[1]]==1) # if only one row of pvalues then NA
      p.value <- NA
    else 
    p.value <- coeffs[2,4]

Awesome, many thanks!

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.