Scraping Reviews in R

I am working on a sentiment analysis project. My aim is to scrape all the reviews of a particular movie from the Rotten Tomatoes website. I have tried to scrape the page, but instead of the reviews I want, I get output that looks like illegal characters. Any suggestions will be highly appreciated.

I am using this code:

library(rvest)  # provides read_html()

dune_movie <- read_html("https://www.rottentomatoes.com/m/dune_2021/reviews")
dune_movie

Output I am getting:

{html_document}
<html lang="en" dir="ltr" xmlns:fb="http://www.facebook.com/2008/fbml" xmlns:og="http://opengraphprotocol.org/schema/">
[1] <head prefix="og: http://ogp.me/ns# flixstertomatoes: http://ogp.me/ns/ap ...
[2] <body class="body no-touch">\n\n        \n\n        <div id="emptyPlaceho ...

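What you are seeing is not illegal characters. Printing the result of read_html() shows a summary of the parsed page (an html_document with its <head> and <body> nodes), not the reviews themselves. You can use the xml2::xml_structure() function to explore it further, then select the nodes that hold the review text and extract their contents. You can also use RSelenium to help with scraping the website, which is useful if the reviews are loaded by JavaScript after the initial page load.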
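
As a rough illustration of exploring the structure and then extracting text, here is a minimal sketch with rvest. It runs on a small inline HTML snippet rather than the live site, and the class names (review-row, review-text) are made up for the example; the real Rotten Tomatoes markup will differ, so you would need to find the right selector by inspecting the page.

```r
library(rvest)

# Hypothetical markup standing in for a reviews page. The class names
# here are assumptions for illustration, not the real site's markup.
page <- minimal_html('
  <div class="review-row"><p class="review-text">A stunning adaptation.</p></div>
  <div class="review-row"><p class="review-text">Slow, but visually rich.</p></div>
')

# Print the node tree to see where the review text lives
xml2::xml_structure(page)

# Select the review nodes with a CSS selector, then extract their text
review_nodes <- html_elements(page, "p.review-text")
reviews <- html_text2(review_nodes)

reviews
```

The same two steps, html_elements() followed by html_text2(), should apply to the object returned by read_html() once you know which selector matches the review text. If the selector returns nothing on the live page, that is a hint the reviews are not in the initial HTML and a browser-driven tool such as RSelenium may be needed.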
