Linear probability model, logit and probit

I am pretty new to R, and this probably wouldn't be a huge issue for someone who is familiar with it.
I am interested in plotting LPM, logit, and probit models for a data set, but I can't figure out how to make my dependent variable a probability value.

I am trying to plot the probability that someone is part of the labor force, using age, gender, education, and access to a personally paid internet service as predictors.

This is a logistic regression problem, because the outcome is binary. Let the variables be coded as follows:

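  1. Y: labor force participation, 1/0
  2. age: continuous
  3. female: gender is female, 1/0
  4. male: gender is male, 1/0 (redundant given female, so only one of the two enters the model)
  5. level_0: education level 0, 1/0
    ... (etc. for each category)
  6. access: personally paid internet access, 1/0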


Then

    fit <- glm(Y ~ age + female + level_x + access, family = binomial)

yields a fitted model from which the log-likelihood can be extracted with logLik(fit). See this post
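
As a rough illustration of the fit above, here is a minimal sketch on simulated data. The data frame my_data, its column names, the education level labels, and the coefficients used to generate Y are all assumptions made up for this example, not the poster's data. It also stores gender and education as factors rather than hand-coded 1/0 columns, since glm() builds the dummies itself and drops one reference level per factor.

```r
# Simulated data: every name and number below is an illustrative assumption
set.seed(1)
n <- 500
my_data <- data.frame(
  age       = runif(n, 18, 65),
  gender    = factor(sample(c("male", "female"), n, replace = TRUE)),
  education = factor(sample(c("level_0", "level_1", "level_2"), n, replace = TRUE)),
  access    = rbinom(n, 1, 0.6)
)

# Made-up participation process, only so the example has an outcome to fit
eta <- -1 + 0.03 * my_data$age + 0.5 * my_data$access -
  0.4 * (my_data$gender == "female")
my_data$Y <- rbinom(n, 1, plogis(eta))

# glm() expands each factor into 0/1 dummies and omits one reference level
fit <- glm(Y ~ age + gender + education + access,
           data = my_data, family = binomial)

summary(fit)  # coefficients are on the log-odds scale
logLik(fit)   # log-likelihood of the fitted model
```

With real data you would skip the simulation step and pass your own data frame to the data argument.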

If I understand your question, you could do the following:

  1. Linear probability model. In this case your outcome variable (Y in the examples below) should be coded as numeric, where not being in the labor force is coded as zero and being in the labor force is coded as 1. Then you run a standard linear regression. In R, that would be (using generic column names):

    m1 = lm(Y ~ x1 + x2 + x3 + x4, data=my_data)

    The predictions from this model are the "probability" of being in the labor force (i.e., the "probability" of Y=1), but note that because a linear regression does not constrain its fitted values, this type of model can produce predicted probabilities less than 0 or greater than 1. To get the model predictions, call predict(m1).

  2. Logistic regression. In this case your outcome can be coded as numeric 0/1 or as a factor with two levels. To run a logistic regression, you would do the following:

    m2 = glm(Y ~ x1 + x2 + x3 + x4, data=my_data, family=binomial)

    In this case, the model is linear in the log-odds of being in the labor force, so predict(m2) on its own returns log-odds. To get the predicted probability of being in the labor force, you would do:

    predict(m2, type="response")

  3. Probit regression. You can run a probit regression in a similar way as a logistic regression, but in this case you need to specify the probit (cumulative normal) link function:

    m3 = glm(Y ~ x1 + x2 + x3 + x4, data=my_data, family=binomial(link="probit"))

    To get the predicted probability of being in the labor force:

    predict(m3, type="response")
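
To tie the three models back to the original plotting question, here is a minimal sketch that fits all of them and draws the predicted probability against one predictor. The data are simulated, and treating x1 as age and x2 to x4 as 0/1 indicators is an assumption made for illustration only; the grid values for x2, x3, and x4 are likewise arbitrary choices.

```r
# Simulated stand-in data: names, roles, and coefficients are assumptions
set.seed(42)
n <- 1000
my_data <- data.frame(
  x1 = runif(n, 18, 65),   # stand-in for age
  x2 = rbinom(n, 1, 0.5),  # stand-in for female
  x3 = rbinom(n, 1, 0.4),  # stand-in for one education dummy
  x4 = rbinom(n, 1, 0.6)   # stand-in for internet access
)
my_data$Y <- rbinom(n, 1, plogis(-2 + 0.06 * my_data$x1 - 0.5 * my_data$x2 +
                                   0.7 * my_data$x3 + 0.4 * my_data$x4))

m1 <- lm(Y ~ x1 + x2 + x3 + x4, data = my_data)
m2 <- glm(Y ~ x1 + x2 + x3 + x4, data = my_data, family = binomial)
m3 <- glm(Y ~ x1 + x2 + x3 + x4, data = my_data,
          family = binomial(link = "probit"))

# Prediction grid: vary x1 and hold the other predictors at fixed values
grid <- data.frame(x1 = seq(18, 65, length.out = 100), x2 = 1, x3 = 0, x4 = 1)
grid$lpm    <- predict(m1, newdata = grid)                     # may leave [0, 1]
grid$logit  <- predict(m2, newdata = grid, type = "response")  # probabilities
grid$probit <- predict(m3, newdata = grid, type = "response")  # probabilities

# One line per model, predicted P(Y = 1) against x1
plot(grid$x1, grid$lpm, type = "l", lty = 1, ylim = c(0, 1),
     xlab = "x1 (age)", ylab = "Predicted P(Y = 1)")
lines(grid$x1, grid$logit, lty = 2)
lines(grid$x1, grid$probit, lty = 3)
legend("topleft", legend = c("LPM", "Logit", "Probit"), lty = 1:3)
```

Predicting on a grid rather than on the original rows gives one smooth curve per model, which makes it easier to see where the linear probability model departs from the other two.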
