Help with StatsBomb free football data

Hi all, total noob to R so apols if I don't provide the right info first time, but I'm hoping someone can help me work out what I'm doing wrong when trying to pull some of their free data following the instructions here: https://statsbomb.com/wp-content/uploads/2021/11/Working-with-R.pdf

It looks like I have managed to install all the packages they recommend (tidyverse, devtools, ggplot2, StatsBombR) as they are all showing with ticks next to them in my packages window, but I'm running into error messages when I'm trying to follow these instructions:

FreeCompetitions() - This shows you all the competitions that are available as free data

If you want to store the output of this (or any other functions) so you can pull it up at any time, instead of just having it in the R console, you can run something like the following:

Comp <- FreeCompetitions(). Then, anytime you run Comp (or whatever word you choose to store it under, you can go with anything), you will see the output of FreeCompetitions().

Matches <- FreeMatches(Comp) - This shows the available matches within the competitions chosen

StatsBombData <- StatsBombFreeEvents(MatchesDF = Matches, Parallel = T) - This pulls all the event data for the matches that are chosen.

When I run the FreeCompetitions() part it seems to pull a load of data into the console, but for the rest of it I'm getting this:

Comp <- FreeCompetitions()
[1] "Whilst we are keen to share data and facilitate research, we also urge you to be responsible with the data. Please credit StatsBomb as your data source when using the data and visit our website to obtain our logos for public use."
Matches <- FreeMatches(Comp)
[1] "Whilst we are keen to share data and facilitate research, we also urge you to be responsible with the data. Please credit StatsBomb as your data source when using the data and visit Media Pack | StatsBomb to obtain our logos for public use."
StatsBombData <- StatsBombFreeEvents(MatchesDF = Matches, Parallel = T)
[1] "Whilst we are keen to share data and facilitate research, we also urge you to be responsible with the data. Please credit StatsBomb as your data source when using the data and visit our website to obtain our logos for public use."
Error in if (MatchesDF == "ALL") { : the condition has length > 1

Any advice would be much appreciated!

thanks

What happens if you execute the examples in the pdf one by one?

You did not include the filter in this example, so it's possible you were subsequently attempting to pull too much data:

Comp <- FreeCompetitions() %>%
  filter(competition_id == 37 & season_name == "2020/2021")

(I haven't tested this myself).

Yes actually you're right I didn't include the filter - but I've tried it with that now and it still comes up with the same error message when I try and run this bit:

StatsBombData <- StatsBombFreeEvents(MatchesDF = Matches, Parallel = T)
Error in if (MatchesDF == "ALL") { : the condition has length > 1

I'm not sure what the MatchesDF part of the code would be intended to do?

I've noticed there is a slightly updated version of the pdf:
Working with R (statsbomb.com)

Either way you should probably contact the author (on the last page) or else contact them via their github page for help.

Thanks for spotting that, I'll check through and see if there's anything different that would explain it and get in touch with them if not. Cheers

Hi - I am experiencing the same issue and wondering if a resolution was ever found.

I think I understand the error correctly. The condition in the if statement checks whether Matches, a data frame, is equal to the string "ALL". That comparison returns a logical matrix of TRUEs and FALSEs with the same dimensions as Matches, but if requires a single logical value, i.e. a condition of length 1. The length of that matrix is rows times columns, which in this case is > 1.
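The behaviour described above can be reproduced without StatsBombR. Here is a minimal sketch in base R, where `toy_matches` is a made-up stand-in for Matches and `wants_all` is a hypothetical helper, not anything from the package. For context, from R 4.2.0 onwards `if` raises an error for a condition longer than one, where earlier versions only gave a warning, which may be why the same call worked for some people and not others.

```r
# Hypothetical stand-in for the Matches data frame
toy_matches <- data.frame(match_id = c(1L, 2L),
                          home_team = c("A", "B"),
                          stringsAsFactors = FALSE)

# Comparing a data frame with a string gives one logical value per cell
cmp <- toy_matches == "ALL"
is.matrix(cmp)
dim(cmp)
length(cmp)   # rows * columns, not 1

# if() needs a single TRUE/FALSE; record whatever R signals for this condition
outcome <- tryCatch({
  if (toy_matches == "ALL") "all" else "subset"
}, error = function(e) "error", warning = function(w) "warning")
outcome

# One way such a check could be written so a data frame passes through safely
# (an illustration only, not the package's actual code)
wants_all <- function(x) is.character(x) && length(x) == 1 && x == "ALL"
wants_all(toy_matches)
wants_all("ALL")
```

Checking the type before comparing means the data frame is never compared against the string at all, so the condition always has length 1.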

My only thought is that I somehow have an outdated version of the function even though I installed the package today and have tried updating the package, since others have used the command successfully. However, the version shown in my Packages tab is 0.1.0, the same version shown in the screenshot from the official StatsBomb guide that martin linked above.

Did you ever contact or hear back from the author?

Hi yes I did actually, had been intending to post but hadn't got round to it yet. This was the response and it sorted it out for me (so basically you were right)

"There is a function called free_allevents that I've added that replaces that one, so update your package and use that instead. It works the exact same way."
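Going by that reply, the updated workflow might look something like the sketch below. It assumes an up-to-date StatsBombR is installed and that free_allevents takes the same MatchesDF and Parallel arguments as the old function, as the quote suggests. `pull_free_events` is a hypothetical wrapper name, and the default filter values are the ones from the example earlier in the thread. It has not been tested here.

```r
# Sketch only: defines a helper but downloads nothing until it is called.
# Assumes an updated StatsBombR that provides free_allevents().
pull_free_events <- function(comp_id = 37, season = "2020/2021") {
  library(StatsBombR)  # attach so the package's own dependencies are available

  # List the free competitions, then keep one competition and season
  comp <- FreeCompetitions()
  comp <- comp[comp$competition_id == comp_id & comp$season_name == season, ]

  # Matches available within that selection
  matches <- FreeMatches(comp)

  # Replacement for StatsBombFreeEvents(), per the author's reply
  free_allevents(MatchesDF = matches, Parallel = TRUE)
}
```

Running `StatsBombData <- pull_free_events()` would then play the same role as the StatsBombFreeEvents() line from the guide.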

Thanks for the reply! I'll update my package.

I also found another way to do it - when you just type StatsBombFreeEvents (no function call parentheses) into the console you can see the source code of the function. It's a structure of if statements checking whether MatchesDF == "ALL" and whether Parallel is TRUE; I just extracted and used the code section that applies when a subset of matches is provided and Parallel is TRUE, so the condition is never checked and the error isn't thrown. That section is:

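    library(StatsBombR)   # get.matchFree() comes from here
    library(doParallel)   # also attaches foreach and parallel
    library(dplyr)        # bind_rows()

    MatchesDF <- Matches  # the extracted code expects this name

    # start one worker per core and register it as the foreach backend
    cl <- makeCluster(detectCores())
    registerDoParallel(cl)
    # fetch each match's events on a worker and row-bind the results
    events.df <- foreach(i = seq_len(nrow(MatchesDF)), .combine = bind_rows,
        .multicombine = TRUE, .errorhandling = "remove",
        .export = c("get.matchFree"), .packages = c("httr",
          "jsonlite", "dplyr")) %dopar% {
        get.matchFree(MatchesDF[i, ])
    }
    stopCluster(cl)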
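For anyone unfamiliar with the trick mentioned above: entering a function's name without parentheses prints its definition instead of calling it. Here is a small sketch using a made-up function (`toy_events` is hypothetical and not part of StatsBombR):

```r
# Hypothetical function, used only to show how source inspection works
toy_events <- function(MatchesDF = "ALL", Parallel = FALSE) {
  if (is.character(MatchesDF)) "all matches" else "subset of matches"
}

toy_events                   # no parentheses: prints the definition
src <- deparse(toy_events)   # the same definition as a character vector
names(formals(toy_events))   # the argument names
```

The same approach works for functions from an installed package, which is how the section above was found.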

Interesting! Thanks for sharing :+1:
