bensel
November 16, 2022, 12:40am
1
I hope I'm not barking up the wrong tree posting here. I'm trying to do a LR test of two different regressions (with vs w/o interaction terms) where the dependent variable is continuous and bounded by [0,1]. When the model is a beta regression (using betareg), the R output includes the chi squared quantile and p-value But, when I do a fractional logit regression (glm with a quasibinomial link function) I don't get those terms in the output. Am I doing something wrong or is this an inherent feature of fractional logit models?
Reproducible code below. Thank you!
test = data.frame(y = rbeta(1000,1,2), x1 = rnorm(1000), x2 = rnorm(1000), x3 = rnorm(1000))
betar1 = betareg::betareg(y ~ x1 + x2 + x3, data = test)
betar2 = betareg::betareg(y ~ x1 + x2 + x1:x2 + x3, data = test)
fracr1 = glm(y ~ x1 + x2 + x3, family = quasibinomial("logit"), data = test)
fracr2 = glm(y ~ x1 + x2 + x1:x2 + x3, family = quasibinomial("logit"), data = test)
lmtest::lrtest(betar1,betar2)
lmtest::lrtest(fracr1,fracr2)
test = data.frame(y = rbeta(1000,1,2), x1 = rnorm(1000), x2 = rnorm(1000), x3 = rnorm(1000))
(betar1 = betareg::betareg(y ~ x1 + x2 + x3, data = test))
#>
#> Call:
#> betareg::betareg(formula = y ~ x1 + x2 + x3, data = test)
#>
#> Coefficients (mean model with logit link):
#> (Intercept) x1 x2 x3
#> -0.69631 0.02527 0.01616 -0.01437
#>
#> Phi coefficients (precision model with identity link):
#> (phi)
#> 3.012
(betar2 = betareg::betareg(y ~ x1 + x2 + x1:x2 + x3, data = test))
#>
#> Call:
#> betareg::betareg(formula = y ~ x1 + x2 + x1:x2 + x3, data = test)
#>
#> Coefficients (mean model with logit link):
#> (Intercept) x1 x2 x3 x1:x2
#> -0.69528 0.02731 0.01725 -0.01461 -0.03864
#>
#> Phi coefficients (precision model with identity link):
#> (phi)
#> 3.017
(fracr1 = glm(y ~ x1 + x2 + x3, family = quasibinomial("logit"), data = test))
#>
#> Call: glm(formula = y ~ x1 + x2 + x3, family = quasibinomial("logit"),
#> data = test)
#>
#> Coefficients:
#> (Intercept) x1 x2 x3
#> -0.692084 0.006585 0.022004 -0.018612
#>
#> Degrees of Freedom: 999 Total (i.e. Null); 996 Residual
#> Null Deviance: 271
#> Residual Deviance: 270.8 AIC: NA
(fracr2 = glm(y ~ x1 + x2 + x1:x2 + x3, family = quasibinomial("logit"), data = test))
#>
#> Call: glm(formula = y ~ x1 + x2 + x1:x2 + x3, family = quasibinomial("logit"),
#> data = test)
#>
#> Coefficients:
#> (Intercept) x1 x2 x3 x1:x2
#> -0.69121 0.00929 0.02304 -0.01932 -0.03525
#>
#> Degrees of Freedom: 999 Total (i.e. Null); 995 Residual
#> Null Deviance: 271
#> Residual Deviance: 270.5 AIC: NA
lmtest::lrtest(betar1,betar2)
#> Likelihood ratio test
#>
#> Model 1: y ~ x1 + x2 + x3
#> Model 2: y ~ x1 + x2 + x1:x2 + x3
#> #Df LogLik Df Chisq Pr(>Chisq)
#> 1 5 195.74
#> 2 6 196.48 1 1.4646 0.2262
lmtest::lrtest(fracr1,fracr2)
#> Likelihood ratio test
#>
#> Model 1: y ~ x1 + x2 + x3
#> Model 2: y ~ x1 + x2 + x1:x2 + x3
#> #Df LogLik Df Chisq Pr(>Chisq)
#> 1 4
#> 2 5 1
summary(betar1)
#>
#> Call:
#> betareg::betareg(formula = y ~ x1 + x2 + x3, data = test)
#>
#> Standardized weighted residuals 2:
#> Min 1Q Median 3Q Max
#> -5.3207 -0.5882 0.0920 0.6768 3.4378
#>
#> Coefficients (mean model with logit link):
#> Estimate Std. Error z value Pr(>|z|)
#> (Intercept) -0.69631 0.03274 -21.265 <2e-16 ***
#> x1 0.02527 0.03124 0.809 0.419
#> x2 0.01616 0.03239 0.499 0.618
#> x3 -0.01437 0.03161 -0.455 0.649
#>
#> Phi coefficients (precision model with identity link):
#> Estimate Std. Error z value Pr(>|z|)
#> (phi) 3.0124 0.1215 24.79 <2e-16 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Type of estimator: ML (maximum likelihood)
#> Log-likelihood: 195.7 on 5 Df
#> Pseudo R-squared: 0.001222
#> Number of iterations: 19 (BFGS) + 1 (Fisher scoring)
summary(betar2)
#>
#> Call:
#> betareg::betareg(formula = y ~ x1 + x2 + x1:x2 + x3, data = test)
#>
#> Standardized weighted residuals 2:
#> Min 1Q Median 3Q Max
#> -5.1421 -0.5935 0.0881 0.6713 3.4410
#>
#> Coefficients (mean model with logit link):
#> Estimate Std. Error z value Pr(>|z|)
#> (Intercept) -0.69528 0.03274 -21.238 <2e-16 ***
#> x1 0.02731 0.03131 0.872 0.383
#> x2 0.01725 0.03238 0.533 0.594
#> x3 -0.01461 0.03160 -0.462 0.644
#> x1:x2 -0.03864 0.03213 -1.203 0.229
#>
#> Phi coefficients (precision model with identity link):
#> Estimate Std. Error z value Pr(>|z|)
#> (phi) 3.0169 0.1217 24.78 <2e-16 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Type of estimator: ML (maximum likelihood)
#> Log-likelihood: 196.5 on 6 Df
#> Pseudo R-squared: 0.002756
#> Number of iterations: 14 (BFGS) + 2 (Fisher scoring)
summary(fracr1)
#>
#> Call:
#> glm(formula = y ~ x1 + x2 + x3, family = quasibinomial("logit"),
#> data = test)
#>
#> Deviance Residuals:
#> Min 1Q Median 3Q Max
#> -0.91244 -0.46429 -0.07879 0.35388 1.42890
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) -0.692084 0.033463 -20.682 <2e-16 ***
#> x1 0.006585 0.033329 0.198 0.843
#> x2 0.022004 0.034566 0.637 0.525
#> x3 -0.018612 0.033743 -0.552 0.581
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> (Dispersion parameter for quasibinomial family taken to be 0.2481028)
#>
#> Null deviance: 270.95 on 999 degrees of freedom
#> Residual deviance: 270.76 on 996 degrees of freedom
#> AIC: NA
#>
#> Number of Fisher Scoring iterations: 3
summary(fracr2)
#>
#> Call:
#> glm(formula = y ~ x1 + x2 + x1:x2 + x3, family = quasibinomial("logit"),
#> data = test)
#>
#> Deviance Residuals:
#> Min 1Q Median 3Q Max
#> -0.92040 -0.46247 -0.08196 0.34940 1.42460
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) -0.69121 0.03348 -20.648 <2e-16 ***
#> x1 0.00929 0.03345 0.278 0.781
#> x2 0.02304 0.03457 0.667 0.505
#> x3 -0.01932 0.03376 -0.572 0.567
#> x1:x2 -0.03525 0.03430 -1.028 0.304
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> (Dispersion parameter for quasibinomial family taken to be 0.2481442)
#>
#> Null deviance: 270.95 on 999 degrees of freedom
#> Residual deviance: 270.50 on 995 degrees of freedom
#> AIC: NA
#>
#> Number of Fisher Scoring iterations: 3
summary(lmtest::lrtest(betar1,betar2))
#> #Df LogLik Df Chisq Pr(>Chisq)
#> Min. :5.00 Min. :195.7 Min. :1 Min. :1.465 Min. :0.2262
#> 1st Qu.:5.25 1st Qu.:195.9 1st Qu.:1 1st Qu.:1.465 1st Qu.:0.2262
#> Median :5.50 Median :196.1 Median :1 Median :1.465 Median :0.2262
#> Mean :5.50 Mean :196.1 Mean :1 Mean :1.465 Mean :0.2262
#> 3rd Qu.:5.75 3rd Qu.:196.3 3rd Qu.:1 3rd Qu.:1.465 3rd Qu.:0.2262
#> Max. :6.00 Max. :196.5 Max. :1 Max. :1.465 Max. :0.2262
#> NA's :1 NA's :1 NA's :1
summary(lmtest::lrtest(fracr1,fracr2))
#> #Df LogLik Df Chisq Pr(>Chisq)
#> Min. :4.00 Min. : NA Min. :1 Min. : NA Min. : NA
#> 1st Qu.:4.25 1st Qu.: NA 1st Qu.:1 1st Qu.: NA 1st Qu.: NA
#> Median :4.50 Median : NA Median :1 Median : NA Median : NA
#> Mean :4.50 Mean :NaN Mean :1 Mean :NaN Mean :NaN
#> 3rd Qu.:4.75 3rd Qu.: NA 3rd Qu.:1 3rd Qu.: NA 3rd Qu.: NA
#> Max. :5.00 Max. : NA Max. :1 Max. : NA Max. : NA
#> NA's :2 NA's :1 NA's :2 NA's :2
Created on 2022-11-16 by the reprex package (v2.0.1)
bensel
November 17, 2022, 6:09pm
3
Thank you for pulling out the details on the results. But why isn't a log likelihood being produced for the fractional logistic regressions? I'm not clear on that.
Sorry, I misunderstood the question. Inexplicably, the print method for glm
doesn't return log likelihood. Maybe part of an elaborate hazing ritual
logLik(model)
1 Like
bensel
November 17, 2022, 9:51pm
5
EDIT: glm does give non-NA log likelihoods for binomial family regressions, but not for quasibinomial.
logLik for glm just gives NA. I wonder if the glm function itself does not produce a log likelihood in the first place.
The R language sometimes feels like an elaborate hazing ritual, but I still love it.
glm()
does yield to logLik()
for binomial models, such as
chdfit <- glm(CHD ~ AGE, data = chdage, family = binomial(link ="logit"))
logLik(chdfit)
## 'log Lik.' -53.67655 (df=2)
Of course getting confidence intervals on that is a whole other bucket of fun. My example is taken from here .
I feel as you do. I liken it to being an ant trying to chomp my way through the Amazon.
1 Like
bensel
November 18, 2022, 12:19am
7
I think I found the answer. Fractional logistic regression is fit using quasi-maximum likelihood. While you get a quasi-likelihood out of this, for whatever reason it is suppressed in glm (See this vignette ). There might be work-arounds (the linked vignette is about that).
1 Like
Dr. Bolker is a benefactor to all mankind for that cogent vignette. Good find. Thanks.
1 Like
system
Closed
December 30, 2022, 4:39am
9
This topic was automatically closed 42 days after the last reply. New replies are no longer allowed. If you have a query related to it or one of the replies, start a new topic and refer back with a link.