Making predictions with ordinal regression

I am using an ordinal regression model to calculate the probability of the outcome, a binary prediction (pred), and Predicted and Observed as factor variables. Below are a sample of the data set and the script. I received the error message (below) "pred must be size 20 or 1, not 180." The data set has 20 observations, so I believe that pred = if_else(prob >= 0.944, 1, 0) is adding 160 observations. Maybe they are missing values. In the pred calculation (line 23), I inserted na.rm = TRUE or na.action = na.exclude and received the error message "Problem while computing pred. Caused by error in if_else(): unused argument" for na.rm = TRUE and for na.action = na.exclude. Could someone help me out? I would appreciate it. Thanks.

Creating the dataset

column1<-c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20)
column2<-c("A", "B", "B+", "A-", "A", "B", "C-", "C", "B", "A", "D-", "B", "A-", "A", "D+", "C", "A", "C+", "B", "A")
column3<-c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1)
column4<-c(0.0001, 0.0001, 0.0008, 0.0002, 0.002, 0.0013, 0.0004, 0.0039, 0.0044, 0.001, 0.0061, 0.0009, 0.0035, 0.0003, 0.0026, 0.0044, 0.0039, 0.0021, 0.0001, 0.0003)
column5<-c(0.9375, 0.9375, 0.7571, 0.7708, 0.9, 0.9, 0.4167, 0.8125, 0.8542, 0.9375, 0.8438, 0.7944, 0.9107, 0.9031, 0.7647, 0.5833, 0.7321, 0.9085, 0.9397, 0.9643)
column6<-c(0.95, 0.7879, 0.9333, 1.0000, 0.9906, 0.9833, 0.8333, 0.85, 1.0000, 0.9688, 0.975, 1.0000, 0.98, 0.9407, 0.9951, 0.905, 1.0000, 0.9549, 0.7879, 0.9)
column7<-c(0.9492, 0.7843, 0.8644, 1.0000, 0.9576, 0.9492, 1.0000, 0.6949, 0.7458, 1.0000, 1.0000, 1.0000, 0.9797, 0.819, 1.0000, 0.9034, 1.0000, 0.7966, 0.7843, 0.7966)
column8<-c(0.0205, 0.0585, 0.0896, 0.0322, 0.0326, 0.0531, 0.0027, 0.0342, 0.076, 0.0072, 0.0011, 0.1226, 0.0156, 0.0404, 0.0108, 0.0456, 0.0143, 0.0132, 0.1161, 0.0045)

testing<-data.frame(Student=column1, Outcome1=column2, Outcome2=column3, Var1=column4, Var2=column5, Var3=column6, Var4=column7, Var5=column8)

Fitting the model to the data

library(MASS)
Model<- polr(as.factor(Outcome1) ~ Var1 + Var2 + Var3 + Var4 + Var5, data = testing, Hess = TRUE)
summary(Model)

Calculating the probability of the outcome, pred, predicted, and observed

library(tidyverse)
testing2 <- testing %>%
  mutate(prob = predict(Model, type = "prob"),
         pred = if_else(prob >= .944, 1, 0),
         Predicted = as.factor(pred),
         Observed = as.factor(Outcome2)) %>% collect()
View(testing2)

Error in mutate():
! Problem while computing pred = if_else(prob >= 0.944, 1, 0).
x pred must be size 20 or 1, not 180.

Trying to remove missing values

testing2 <- testing %>%
  mutate(prob = predict(Model, type = "prob"),
         pred = if_else(prob >= .944, 1, 0, na.action = na.exclude),
         Predicted = as.factor(pred),
         Observed = as.factor(Outcome2)) %>% collect()

Error in mutate():
! Problem while computing pred = if_else(prob >= 0.944, 1, 0, na.action = na.exclude()).
Caused by error in if_else():
! unused argument (na.action = na.exclude())


The "predict()" call returns a matrix with 20 rows (one for each student) and 9 columns (one for each level of Outcome1: A, A-, B, B+, etc.), so its total size is 180. Try the following and examine the output in "prob":

prob <- predict(Model, testing, type = "prob")

You (probably) need to select which of the columns you want to process. Alternatively, if you want to process all of them, you would need other tidyverse selection rules. But first try to get the code working for a single column.
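As a minimal sketch of that first step, the example below inspects the probability matrix and pulls out a single column. It uses the housing example from the polr() help page as a stand-in, so the model "fit", the column "High", and the 0.5 cutoff are illustrative assumptions, not the thread's Model, grades, or 0.944 threshold. Note that polr's predict method documents the type as "probs"; the "prob" used above appears to work through partial matching.

```r
library(MASS)  # provides polr() and the built-in housing data

# Stand-in model: the ordinal example from ?polr, not the thread's data
fit <- polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)

# type = "probs" returns a matrix with one column per outcome level
prob <- predict(fit, housing, type = "probs")
dim(prob)        # rows = observations, columns = levels of Sat
colnames(prob)   # the level names, usable for selecting a column

# Pick ONE column so there is exactly one value per row
prob_high <- prob[, "High"]

# Hypothetical cutoff of 0.5, for illustration only
pred_high <- ifelse(prob_high >= 0.5, 1, 0)

# One value per observation, so it fits in a data frame column
length(pred_high) == nrow(housing)
```

With the thread's data, the same idea would be something like prob[, "A"], which has 20 values instead of 180 and can therefore be used inside mutate().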

Stephen

Hello Stephen, thank you very much for your feedback and explanations. I followed your suggestion, but I still got the same error.

testing2 <- testing %>%
  mutate(prob = predict(Model, testing, type = "prob"),
         pred = if_else(prob >= .944, 1, 0),
         Predicted = as.factor(pred),
         Observed = as.factor(Outcome2)) %>% collect()

Error: Problem with mutate() column pred.
i pred = if_else(prob >= 0.944, 1, 0).
i pred must be size 20 or 1, not 180.

I do not know any tidyverse select rule to be included in the syntax to calculate "prob." I searched online, but unfortunately I did not find a select rule specifically to be used with the function "predict." If you know any documentation, please let me know. Thanks.

The call to predict() with type = "prob" returns the probability of each class (A, A-, etc.). You probably want the actual predicted class, in which case you want something like:

testing <- bind_cols(testing,pred=predict(Model, testing))
testing$iscorrect <- testing$Outcome1 == testing$pred

You should now have a column "pred" with the predicted grade and a final column "iscorrect" that is TRUE when the predicted grade is the same as the observed grade (Outcome1), and FALSE otherwise. Have a look at "testing" after the above change and see if that gives you what you want.
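If you also want the probability that goes with each predicted class, one possible approach is sketched below. It again uses the housing example from the polr() help page as a stand-in, so "fit", "prob_pred", "confident", and the 0.5 cutoff are hypothetical names and values rather than anything from the original script.

```r
library(MASS)   # polr() and the housing data
library(dplyr)  # mutate(), if_else()

# Stand-in model from ?polr; the thread's Model and testing would take its place
fit <- polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)

prob       <- predict(fit, housing, type = "probs")  # matrix: one column per level
pred_class <- predict(fit, housing)                  # factor: most probable level per row

# Probability of the predicted level, picked with (row, column) index pairs
prob_pred <- prob[cbind(seq_len(nrow(prob)), as.integer(pred_class))]

result <- housing %>%
  mutate(
    pred      = pred_class,
    prob_pred = prob_pred,
    # hypothetical 0.5 cutoff, for illustration only
    confident = if_else(prob_pred >= 0.5, 1, 0),
    # compare as character because Sat is an ordered factor and pred is not
    iscorrect = as.character(Sat) == as.character(pred)
  )

head(result)
```

Every new column here is a plain vector with one value per row, which is what mutate() expects.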

Stephen

Stephen, yes, it did help. I already calculated the probability of pred. Thank you so much for your time and help.
