polr - choice of method and validating the model

Charlie_CS · November 17, 2022, 10:21am

Hi R community,

I am doing ordered logistic regression in R and I am using the polr() function from the MASS package.

First, I am in doubt about whether or not I should specify method. Do you know which method R uses by default when running the polr function? I can't find this information.

Secondly, what is the easiest way to validate an ordered logistic regression? I have only done it for linear regression, where you can simply plot the lm() model and get 4 plots that show whether or not the model assumptions are fulfilled.

Thank you in advance.

Best,
Charlie

technocrat · November 17, 2022, 9:50pm

The MASS::polr function signature is

polr(formula, data, weights, start, ..., subset, na.action,
contrasts = NULL, Hess = FALSE, model = TRUE,
method = c("logistic", "probit", "loglog", "cloglog", "cauchit"))

Read that as five available methods, with the first one listed, "logistic", taken as the default.
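
To see which method polr() actually uses when none is specified, here is a minimal sketch. It assumes the housing data that ships with MASS (the same data used in the examples of help(polr)) and relies on the usual R convention that a vector of choices in a signature is resolved by match.arg(), which picks the first element when the argument is left out. The object names are only illustrative.

```r
library(MASS)

# The default value in the signature is a vector of the allowed choices.
choices <- eval(formals(polr)$method)
print(choices)

# Fit without specifying method; housing ships with MASS.
fit_default <- polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)

# The fitted object records the method that was actually used.
print(fit_default$method)

# A different link has to be requested explicitly; it is not chosen from the data.
fit_probit <- polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing,
                   method = "probit")
print(fit_probit$method)
```

So the method is fixed by the call, not selected from the data: leaving it out gives the first choice, and any other link has to be named.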

Go on to the Arguments section

method: logistic or probit or (complementary) log-log or cauchit (corresponding to a Cauchy latent variable)

Moving on to details

The vcov method uses the approximate Hessian: for reliable results the model matrix should be sensibly scaled with all columns having range the order of one
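
As a rough illustration of that advice, the sketch below standardizes a predictor before fitting and asks polr() to keep the Hessian. The data are simulated, and the variable names (income, rating, income_z) are hypothetical, chosen only for this example.

```r
library(MASS)

set.seed(1)
n <- 500

# Hypothetical predictor on a large scale (range far from "order of one").
income <- rnorm(n, mean = 50000, sd = 15000)

# Simulate an ordered response from a latent logistic variable.
latent <- (income - 50000) / 15000 + rlogis(n)
rating <- cut(latent, breaks = c(-Inf, -1, 1, Inf),
              labels = c("low", "mid", "high"), ordered_result = TRUE)

# Rescale the predictor so the model matrix column has range of order one.
d <- data.frame(rating = rating, income_z = as.numeric(scale(income)))

# Hess = TRUE stores the Hessian so vcov() and summary() can use it directly.
fit <- polr(rating ~ income_z, data = d, Hess = TRUE)

# Covariance matrix of the slope and the two cut points, and standard errors.
V <- vcov(fit)
print(V)
print(sqrt(diag(V)))
```

With Hess = FALSE (the default), vcov() and summary() refit the model to obtain the Hessian, so setting Hess = TRUE up front avoids that extra step.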

and other nuggets all within help(polr).

Finally, a couple of textbooks are provided for further reference.

This answer is not an RTFM, but a reminder that closely reading the help pages can be a big help when trying to navigate. As a beginner I had an aversion to reading them. I thought that help() needed its own help page. I was abashed to find out that it does: help(help). It was only when I began to think of R, as it presents to the user, primarily as a functional language that I realized the help should be read as presenting the arguments for f(x) = y and the return values. Since then, I've had a much easier time.

I run through the basics of assessing glm results here.

Charlie_CS · November 18, 2022, 8:30am

Thank you a lot for the comments.

I still find it hard to understand which method is chosen. The signature lists all 5 methods as the default value. Does R pick the method itself based on the data put into the model?

For validating the model: in the link you sent, lm() is used for the model, and thus you can simply plot(model) and get the 4 plots: 1. Residuals vs. fitted, 2. Normal Q-Q, 3. Scale-location (standardized residuals), and 4. Residuals vs. leverage. Any idea how to get that with polr()?

Thanks again.

technocrat · November 19, 2022, 2:37am

The use of lm() on the model was to illustrate that the diagnostic plots do not work with glm() models using family = binomial.
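
For a polr() fit, one commonly used informal check looks at the proportional odds assumption instead of residual plots. The sketch below is only one possible approach, not a replacement for the lm() plots: it assumes the housing data from MASS and fits a separate binary logistic regression at each cut point of the response, then puts the slopes next to the polr() slopes. If the assumption is reasonable, the columns should be broadly similar.

```r
library(MASS)

# Ordered logistic fit on the housing data that ships with MASS.
fit <- polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing,
            Hess = TRUE)

# Binary logistic fits at each cut point of the ordered response
# (Sat has levels Low < Medium < High).
cut_low <- glm(I(Sat > "Low") ~ Infl + Type + Cont, weights = Freq,
               family = binomial, data = housing)
cut_med <- glm(I(Sat > "Medium") ~ Infl + Type + Cont, weights = Freq,
               family = binomial, data = housing)

# Side-by-side slopes; the glm intercepts are dropped because polr()
# reports its cut points separately in fit$zeta.
slopes <- cbind(polr          = coef(fit),
                above_low     = coef(cut_low)[-1],
                above_medium  = coef(cut_med)[-1])
print(round(slopes, 2))
```

Large differences between the two binary columns for a given predictor would suggest that a single slope across cut points is too restrictive for that predictor.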

Statistics is hard, and there is no magic wand to unlock the right answer. R provides an amazing set of tools, but those tools aren't enough without learning how to use them.

Charlie_CS · November 21, 2022, 9:07am

I do not see how this answer applies to me.

I have never asked for nor looked for a magic wand, so please don't imply that. I have asked 2 questions, and if you can't answer or don't want to take the time to answer, please just let the question slide.
