Bet on tennis matches

Hi,
I'm working on tennis betting. I have a model that predicts whether a given player wins or loses a match, with about 90% accuracy. Here it is:

Load packages:

library(caret)
library(h2o)
library(dplyr)

Part 1: Data

The atp_data.csv dataset can be generated by running main.py from the edouardthom/ATPBetting repository on GitHub (ATPBetting/main.py at master), using the following .xls files:

Load data:

df <- read.csv("atp_data.csv")

colnames(df)
"ATP" "Location" "Tournament" "Date" "Series" "Court" "Surface" "Round" "Best.of" "Winner"
"Loser" "WRank" "LRank" "Wsets" "Lsets" "Comment" "PSW" "PSL" "B365W" "B365L"
"elo_winner" "elo_loser" "proba_elo"

Part 2: Modeling

# Keep only the matches that Hajek J. played
df1 <- df %>%
  filter(Winner == "Hajek J." | Loser == "Hajek J.")

# target: 1 if Hajek J. won the match, 2 if he lost
df1$target <- factor(ifelse(df1$Winner == "Hajek J.", 1, 2), levels = c(1, 2))

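h2o.init()
n <- nrow(df1)
n_train <- round(n * 0.7)
training <- df1[seq_len(n_train), ]
# Start after the last training row so the two sets do not share a match
testing <- df1[(n_train + 1):n, ]
train <- as.h2o(training)
y <- "target"
x <- setdiff(names(train), y)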

aml <- h2o.automl(x = x, y = y,
                  training_frame = train,
                  max_runtime_secs = 120)
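
One thing worth checking in the setup above: x is every column except target, so it includes Winner, Loser, Wsets, Lsets and Comment. Those values are only known once the match has been played, which may account for part of the 90% figure. A minimal sketch of a narrower predictor set follows. The choice of which columns count as post-match is an assumption, and the name x_pre is made up for illustration. The remaining W/L columns (WRank, PSW, B365W, elo_winner and so on) are still arranged by outcome, so this alone may not remove all of the leakage.

```r
# Column names as listed by colnames(df) above, plus the derived target.
all_cols <- c("ATP", "Location", "Tournament", "Date", "Series", "Court",
              "Surface", "Round", "Best.of", "Winner", "Loser", "WRank",
              "LRank", "Wsets", "Lsets", "Comment", "PSW", "PSL", "B365W",
              "B365L", "elo_winner", "elo_loser", "proba_elo", "target")

# Assumption: these columns are only available after the match,
# or state the result directly.
post_match <- c("Winner", "Loser", "Wsets", "Lsets", "Comment")

y <- "target"
# Hypothetical replacement for x: everything except the target
# and the post-match columns.
x_pre <- setdiff(all_cols, c(y, post_match))
print(x_pre)
```

Passing x_pre instead of x to h2o.automl would show how much accuracy is left without those columns.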


model <- aml@leader
model
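p2 <- h2o.predict(model, newdata = as.h2o(testing))
df3 <- as.data.frame(p2)
# Align factor levels so confusionMatrix works even if only one class is predicted
confusionMatrix(factor(df3$predict, levels = levels(testing$target)), testing$target)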

But I don't understand where to find the data needed to make predictions. Upcoming matches are listed here: http://livescore.tennis-data.co.uk/ but where can I find the values of the atp_data.csv variables that the model needs in order to predict a given player's next match? Thanks
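
Part of the difficulty is that atp_data.csv stores each match as winner versus loser, and for a match that has not been played nobody knows which side is which. One possible way around this, sketched below under assumptions, is to re-express the history from the player's point of view so that an upcoming match can be described with the same columns. The helper to_player_view, the column names p_rank, o_rank, p_odds and o_odds, and all the values in the toy data are made up for illustration. The pre-match values (current rankings, bookmaker odds) would still have to come from whatever source is available before the match.

```r
# Toy history in the winner/loser layout of atp_data.csv (values are made up).
hist <- data.frame(
  Winner = c("Hajek J.", "Player A", "Hajek J."),
  Loser  = c("Player B", "Hajek J.", "Player C"),
  WRank  = c(90, 40, 95),
  LRank  = c(120, 92, 150),
  B365W  = c(1.60, 1.30, 1.45),
  B365L  = c(2.30, 3.40, 2.70),
  stringsAsFactors = FALSE
)

# Hypothetical helper: rewrite each match from one player's point of view,
# so the predictors no longer depend on who won.
to_player_view <- function(d, player) {
  won <- d$Winner == player
  data.frame(
    opponent = ifelse(won, d$Loser, d$Winner),
    p_rank   = ifelse(won, d$WRank, d$LRank),  # the player's own rank
    o_rank   = ifelse(won, d$LRank, d$WRank),  # the opponent's rank
    p_odds   = ifelse(won, d$B365W, d$B365L),  # odds on the player
    o_odds   = ifelse(won, d$B365L, d$B365W),  # odds on the opponent
    target   = factor(ifelse(won, "win", "loss"), levels = c("win", "loss")),
    stringsAsFactors = FALSE
  )
}

train_view <- to_player_view(hist, "Hajek J.")

# An upcoming match uses the same predictor columns but has no target yet.
upcoming <- data.frame(opponent = "Player D", p_rank = 93, o_rank = 70,
                       p_odds = 2.10, o_odds = 1.70,
                       stringsAsFactors = FALSE)

print(train_view)
print(upcoming)
```

A model trained on train_view could then be given upcoming as new data, since every predictor in it is something that can be known before the match starts.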
