Plotting probit regression with ggplot2

ggplot2

#1

I want to plot probit regression model with ggplot2.

I have been able to plot logit model with ggplot2 but unable to do for probit regression.

Here is reproducible example for logit model:

library(tidyverse)
#> -- Attaching packages ------------------- tidyverse 1.2.1 --
#> v ggplot2 2.2.1     v purrr   0.2.4
#> v tibble  1.4.2     v dplyr   0.7.4
#> v tidyr   0.8.0     v stringr 1.3.0
#> v readr   1.1.1     v forcats 0.3.0
#> -- Conflicts ---------------------- tidyverse_conflicts() --
#> x dplyr::filter() masks stats::filter()
#> x dplyr::lag()    masks stats::lag()

logr_vm <- glm(vs ~ mpg, data=mtcars, family=binomial(link="logit"))

#summary(logr_vm)

ggplot(mtcars, aes(x=mpg, y=vs)) + geom_point() + 
  stat_smooth(method="glm", method.args=list(family="binomial"), se=TRUE)

How to do similar thing for probit model?
In addition I expect the graph to show the value for 0.5 along x-axis (similar graph as shown below).

Any help on this?


#2

This SO thread might be helpful:


#3

As discussed in the answer linked by Mara, you can set the link function to "probit" to get a probit regression using stat_smooth. Below I've also added a helper function to get the mpg values at various vs probabilities:

# Function to get mpg value for a given vs probability
probX = function(p, model) {
  data.frame(prob=p, 
             xval = (qnorm(p) - coef(model)[1])/coef(model)[2])
}

d = probX(c(0.25,0.5,0.75), logr_vm)

ggplot(mtcars, aes(x=mpg, y=vs)) + 
  geom_point() + 
  stat_smooth(method="glm", method.args=list(family=binomial(link="probit"))) +
  geom_segment(data=d, aes(x=xval, xend=xval, y=0, yend=prob), colour="red") +
  geom_segment(data=d, aes(x=rng[1], xend=xval, y=prob, yend=prob), colour="red") +
  geom_text(data=d, aes(label=round(xval, 1), x=xval, y=-0.03), size=3, colour="red") +
  theme_bw()

Rplot150

You can also calculate the fitted values and confidence intervals outside of ggplot, which will give you more flexibility if you create models with multiple covariates. For example:

# Regression model
logr_vm <- glm(vs ~ mpg, data=mtcars, family=binomial(link="probit"))

# Get predictions on link scale
pred = data.frame(mpg=seq(min(mtcars$mpg), max(mtcars$mpg), length=100))
pred = cbind(pred,
                        predict(logr_vm, newdata=pred, type="link", se.fit=TRUE)[c("fit", "se.fit")])

# Calculate fit and confidence interval on probability scale
pred$upr = logr_vm$family$linkinv(pred$fit + 1.96*pred$se.fit)  # Equivalent to pnorm(pred$fit + 1.96*pred$se.fit) for the probit link
pred$lwr = logr_vm$family$linkinv(pred$fit - 1.96*pred$se.fit)
pred$fit = logr_vm$family$linkinv(pred$fit)

# Function to get mpg value for a given vs probability
probX = function(p, model) {
  data.frame(prob=p, 
             xval = (qnorm(p) - coef(model)[1])/coef(model)[2])
}

# Set plot x-range
rng = range(mtcars$mpg) + diff(range(mtcars$mpg))*c(-0.02,0.02)

ggplot(probX(c(0.25,0.5,0.75), logr_vm)) + 
  geom_ribbon(data=pred, aes(mpg, ymin=lwr, ymax=upr), fill="grey90") +
  geom_line(data=pred, aes(mpg, fit), size=1, colour="blue") +
  geom_point(data=mtcars, aes(mpg, vs)) + 
  geom_segment(aes(x=xval, xend=xval, y=0, yend=prob), colour="red", linetype="11", size=0.3) +
  geom_segment(aes(x=rng[1], xend=xval, y=prob, yend=prob), colour="red", linetype="11", size=0.3) +
  geom_text(aes(label=round(xval, 1), x=xval, y=-0.03), size=3, colour="red") +
  scale_x_continuous(limits=rng, expand=c(0,0)) +
  theme_bw()

Rplot152


#4

Additionally to what both @mara and joels added about the SO thread and the helper function; there is a package ecotox that allows the user to easily calculate LCs and LTs using a probit or logit model for percentages from 1-99 with correct confidence limits following D.J. Finney's 1971 book on probit and logit models. Check it out!