Unable to scrape rstudio::conf agenda?

Hi all,

So I thought I would teach myself a little bit about web scraping today and chose the rstudio::conf agenda as my target. But it seems that the company hosting the event database (cvent) has architected their application so as to make scraping impossible, or at least very difficult without tricks I do not know.

Does anyone know if these data are available elsewhere, or know a trick for scraping cvent content?

Sort of ironic that the data for a data science conference are not readily available! LOL

Thanks

Rob

We could crowdsource it into a Google Sheet. If we had 3-5 people it wouldn't take too long.

Yes. All that was easy to get was the session type, time, and title. It does seem designed to make scraping difficult.
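For anyone following along, here is a minimal sketch of pulling those three fields out of static HTML with Python's standard-library parser. The markup, class names, and session values below are hypothetical placeholders, since the real cvent page structure isn't shown in this thread, and this only works for content present in the page source (not content rendered later by JavaScript).

```python
from html.parser import HTMLParser

# Hypothetical markup and placeholder values: NOT the real cvent page structure.
SAMPLE_HTML = """
<div class="session">
  <span class="type">Keynote</span>
  <span class="time">9:00 AM</span>
  <span class="title">Placeholder title A</span>
</div>
<div class="session">
  <span class="type">Talk</span>
  <span class="time">10:30 AM</span>
  <span class="title">Placeholder title B</span>
</div>
"""

# Assumed class names for the three fields that were easy to get.
FIELDS = {"type", "time", "title"}


class SessionParser(HTMLParser):
    """Collects one dict per <div class="session"> block."""

    def __init__(self):
        super().__init__()
        self.sessions = []
        self._current = None  # dict for the session being read, if any
        self._field = None    # name of the field span we are inside, if any

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "div" and cls == "session":
            self._current = {}
        elif tag == "span" and cls in FIELDS and self._current is not None:
            self._field = cls

    def handle_data(self, data):
        # Only keep text that sits inside one of the field spans.
        if self._field is not None:
            previous = self._current.get(self._field, "")
            self._current[self._field] = previous + data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None
        elif tag == "div" and self._current is not None:
            self.sessions.append(self._current)
            self._current = None


def parse_sessions(html):
    parser = SessionParser()
    parser.feed(html)
    return parser.sessions


sessions = parse_sessions(SAMPLE_HTML)
for s in sessions:
    print(s["type"], s["time"], s["title"], sep=" | ")
```

The class names would need to be swapped for whatever the actual page uses.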

I copied the raw data into a Google Sheet. I'm planning on making it pretty this weekend, but here's a link to the raw data in case it helps anyone.

Conference Agenda Data Raw

I scraped this data a couple of days ago.
The code is available in the R folder of this repo: https://github.com/mitchelloharawild/rstudioconf20-helper
I have also made a shiny app (function over form) for searching and adding sessions to a Google Calendar: https://shiny.mitchelloharawild.com/rstudioconf20

I wanted to be able to see which sessions are happening concurrently. I'll finish up day 2 over the weekend. I will also try to adjust the table width, but hopefully this is useful.

Agenda By Session
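
If anyone wants to find concurrent sessions programmatically rather than by eye, a rough sketch in Python follows. It assumes each session has a start and end time; the session names and times are made-up placeholders, not the real agenda, and `parse_time` and `concurrent_pairs` are just illustrative helper names. Two sessions count as concurrent when each one starts before the other ends.

```python
from datetime import datetime
from itertools import combinations


def parse_time(text):
    """Parse a 24-hour 'HH:MM' string (assumed format) into a datetime."""
    return datetime.strptime(text, "%H:%M")


def concurrent_pairs(sessions):
    """Return title pairs for sessions whose time ranges overlap."""
    pairs = []
    for a, b in combinations(sessions, 2):
        # Overlap test: each session starts before the other one ends.
        if a["start"] < b["end"] and b["start"] < a["end"]:
            pairs.append((a["title"], b["title"]))
    return pairs


# Placeholder sessions for illustration only, not the real agenda.
sessions = [
    {"title": "Session A", "start": parse_time("09:00"), "end": parse_time("10:00")},
    {"title": "Session B", "start": parse_time("09:30"), "end": parse_time("10:30")},
    {"title": "Session C", "start": parse_time("10:00"), "end": parse_time("11:00")},
]

print(concurrent_pairs(sessions))
```

Sessions that merely touch (one ends exactly when the next begins) are not treated as concurrent here, which seems like the right call for planning a schedule.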