Help with Titanic Dataset: Prediction Values are Wrong

I have a Titanic dataset and want to predict, for two passengers, whether they will survive.

These are the first two passengers in my dataset:

However, my predicted values differ from a reference answer key.
My predicted values for Mr. Owen and Mrs. John are 0.5908706 and 0.3686942, respectively.
The reference key gives 0.08557347 (Mr. Owen) and 0.9001295 (Mrs. John).

Where has my code below gone wrong?

titanic<-read.csv("titanic.csv")
table(titanic$Sex)
titanic$Sex<-ifelse(titanic$Sex %in% "male",1,2)
str(titanic)

glmfit<-glm(Survived ~ Pclass + Sex + Age + Siblings.Spouses.Aboard + Parents.Children.Aboard + Fare, data=titanic, family = 'binomial')
summary(glmfit)

glmfit1<-glm(Survived ~ Pclass + Sex + Age + Siblings.Spouses.Aboard, data=titanic, family = 'binomial')
summary(glmfit1)

#Passenger 1 - Mr Owen
newdata = data.frame(Pclass=3, Sex=2, Age=22, Siblings.Spouses.Aboard=1)
predict(glmfit1, newdata, type="response")

#Passenger 2 - Mrs John
newdata2 = data.frame(Pclass=1, Sex=1, Age=38, Siblings.Spouses.Aboard=1)
predict(glmfit1, newdata2, type="response")
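
One thing worth checking in the code above: the recoding step maps "male" to 1 and everything else to 2, and the dput sample lists Mr. Owen Harris Braund with Sex = 1 and Mrs. John Bradley Cumings with Sex = 2, yet the two newdata frames pass Sex=2 for Mr. Owen and Sex=1 for Mrs. John. That swap may account for the discrepancy. Below is a minimal sketch that reuses the same encoding for the new rows so the codes cannot drift apart. The names encode_sex, newdata_owen and newdata_john are hypothetical helpers introduced for illustration, and the predict calls assume glmfit1 has already been fitted as above.

```r
# Hypothetical helper: same rule as the ifelse() recoding above
# (male -> 1, anything else -> 2)
encode_sex <- function(sex) ifelse(sex %in% "male", 1, 2)

# Passenger 1 - Mr. Owen Harris Braund (male)
newdata_owen <- data.frame(Pclass = 3, Sex = encode_sex("male"),
                           Age = 22, Siblings.Spouses.Aboard = 1)

# Passenger 2 - Mrs. John Bradley Cumings (female)
newdata_john <- data.frame(Pclass = 1, Sex = encode_sex("female"),
                           Age = 38, Siblings.Spouses.Aboard = 1)

# Only predict if the model from the code above exists in this session
if (exists("glmfit1")) {
  print(predict(glmfit1, newdata_owen, type = "response"))
  print(predict(glmfit1, newdata_john, type = "response"))
}
```

Building the new rows through the same function as the training data keeps the two encodings in step; whether this alone reproduces the reference key is not verified here.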

dput() output for a 20-row sample of the data (Sex already recoded):

structure(list(Survived = c(0L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 1L, 
1L, 1L, 1L, 0L, 0L, 0L, 1L, 0L, 1L, 0L, 1L), Pclass = c(3L, 1L, 
3L, 1L, 3L, 3L, 1L, 3L, 3L, 2L, 3L, 1L, 3L, 3L, 3L, 2L, 3L, 2L, 
3L, 3L), Name = c("Mr. Owen Harris Braund", "Mrs. John Bradley (Florence Briggs Thayer) Cumings", 
"Miss. Laina Heikkinen", "Mrs. Jacques Heath (Lily May Peel) Futrelle", 
"Mr. William Henry Allen", "Mr. James Moran", "Mr. Timothy J McCarthy", 
"Master. Gosta Leonard Palsson", "Mrs. Oscar W (Elisabeth Vilhelmina Berg) Johnson", 
"Mrs. Nicholas (Adele Achem) Nasser", "Miss. Marguerite Rut Sandstrom", 
"Miss. Elizabeth Bonnell", "Mr. William Henry Saundercock", "Mr. Anders Johan Andersson", 
"Miss. Hulda Amanda Adolfina Vestrom", "Mrs. (Mary D Kingcome) Hewlett", 
"Master. Eugene Rice", "Mr. Charles Eugene Williams", "Mrs. Julius (Emelia Maria Vandemoortele) Vander Planke", 
"Mrs. Fatima Masselmani"), Sex = c(1, 2, 2, 2, 1, 1, 1, 1, 2, 
2, 2, 2, 1, 1, 2, 2, 1, 1, 2, 2), Age = c(22, 38, 26, 35, 35, 
27, 54, 2, 27, 14, 4, 58, 20, 39, 14, 55, 2, 23, 31, 22), Siblings.Spouses.Aboard = c(1L, 
1L, 0L, 1L, 0L, 0L, 0L, 3L, 0L, 1L, 1L, 0L, 0L, 1L, 0L, 0L, 4L, 
0L, 1L, 0L), Parents.Children.Aboard = c(0L, 0L, 0L, 0L, 0L, 
0L, 0L, 1L, 2L, 0L, 1L, 0L, 0L, 5L, 0L, 0L, 1L, 0L, 0L, 0L), 
    Fare = c(7.25, 71.2833, 7.925, 53.1, 8.05, 8.4583, 51.8625, 
    21.075, 11.1333, 30.0708, 16.7, 26.55, 8.05, 31.275, 7.8542, 
    16, 29.125, 13, 18, 7.225)), row.names = c(NA, 20L), class = "data.frame")

Google Sheet link to the full dataset:
