@Maybellyne, you are absolutely right. There should be 63 rows; I have corrected the code and it works now.
Questions 1 and 2
The goal is to scrape 3 web pages, so we need a way to generate the 3 URLs. That is exactly what this line does (it is also the line I corrected):
page_urls <- c("https://wordpress.org/support/plugin/easyrecipe/reviews/", paste0("https://wordpress.org/support/plugin/easyrecipe/reviews/page/", 2:3))
It is essentially a call to c(). Inside it, paste0() generates the URLs for page 2 and page 3 (hence the 2:3). If you wanted to write the vector by hand, this is how you would do it:
c("https://wordpress.org/support/plugin/easyrecipe/reviews/", "https://wordpress.org/support/plugin/easyrecipe/reviews/page/2", "https://wordpress.org/support/plugin/easyrecipe/reviews/page/3")
which, as you would agree, becomes very tedious if you have more than 3 web pages to scrape.
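As a quick check that the two forms really are equivalent, here is a minimal sketch. The variable names base_url, generated and manual are only illustrative and do not appear in the script below:

```r
# Illustrative name (not in the original script): the shared URL prefix
base_url <- "https://wordpress.org/support/plugin/easyrecipe/reviews/"

# paste0() is vectorised: the fixed strings are recycled over 2:3,
# producing one URL per page number
generated <- c(base_url, paste0(base_url, "page/", 2:3))

# The same vector written out by hand
manual <- c(
  "https://wordpress.org/support/plugin/easyrecipe/reviews/",
  "https://wordpress.org/support/plugin/easyrecipe/reviews/page/2",
  "https://wordpress.org/support/plugin/easyrecipe/reviews/page/3"
)

identical(generated, manual)
#> [1] TRUE
```

To cover more pages you would only need to widen the range (for example 2:10), provided those pages exist on the site.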
Question 3
You are right. This is code that I used for personal testing, and I forgot to remove it from the final script before pasting it here. You can disregard it completely.
Here is the updated code along with the final result. Once again, do not hesitate to ask if you have more questions:
# Load needed libraries ----
library(rvest)
library(dplyr)
library(stringr)
library(purrr)
# Build functions for scraping reviews ----
# > Function for scraping topic names and their urls
scrape_topic_url <- function(page_url){
  page_html <- read_html(page_url)
  topic_names <- page_html %>%
    html_nodes(css = ".bbp-topic-permalink") %>%
    html_text() %>%
    str_squish()
  topic_urls <- page_html %>%
    html_nodes(css = ".bbp-topic-permalink") %>%
    html_attr(name = "href")
  tibble(topic = topic_names, topic_url = topic_urls)
}
# > Function for scraping the comments/reviews
scrape_topic_thread <- function(topic_url){
  topic_html <- read_html(topic_url)
  topic_html %>%
    html_nodes(css = ".bbp-topic-content") %>%
    html_text() %>%
    str_squish()
}
# Perform the scraping task ----
page_urls <- c("https://wordpress.org/support/plugin/easyrecipe/reviews/", paste0("https://wordpress.org/support/plugin/easyrecipe/reviews/page/", 2:3))
master <- map_dfr(page_urls, scrape_topic_url) %>%
  mutate(content = map_chr(topic_url, scrape_topic_thread))
master
# A tibble: 63 x 3
   topic                          topic_url                                          content
   <chr>                          <chr>                                              <chr>
 1 Terrible                       https://wordpress.org/support/topic/terrible-161/  "If I could give zero stars, I would. They have failed to answer any support questions I h~
 2 Not working with WP 5.03       https://wordpress.org/support/topic/not-working-w~ "I’m very disappointed that Easy Recipe hasn’t updated to be compatible with WP 5.03. Limi~
 3 easyrecipe plugin              https://wordpress.org/support/topic/easyrecipe-pl~ "Have been using for years. Love this plugin"
 4 Don’t do it! ZERO support for~ https://wordpress.org/support/topic/dont-do-it-ze~ "Bought slEazyRecipe Plus for the extra features, updates, and support. The plugin update ~
 5 Worst Ever Don’t Buy Premium   https://wordpress.org/support/topic/worst-ever-do~ "I’ve sent several repeated emails to get the updated version after buying the premium. Th~
 6 Does not work                  https://wordpress.org/support/topic/does-not-work~ "I am already using Recipe taxonomy on my website and plugin does not work =("
 7 Stay away! Don’t buy the PRO ~ https://wordpress.org/support/topic/stay-away-don~ "There is no support and the author is using this free plugin to scam people into buying t~
 8 After Using this for 3 Yrs Be~ https://wordpress.org/support/topic/after-using-t~ "I have been using the paid version of this plugin for three years – my original developer~
 9 The worst support ever         https://wordpress.org/support/topic/the-worst-sup~ "Support tickets are not being red. Bought the plugin twice! Licence key is not shown in e~
10 No Support and No Longer Upda~ https://wordpress.org/support/topic/no-support-an~ "I’m not sure what’s happened because this plugin used to be great. When you submitted a s~
# ... with 53 more rows
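If the map_dfr() / map_chr() combination is unfamiliar, the offline sketch below may help. It mimics the structure of the script with two hypothetical stub functions (fake_scrape_page and fake_scrape_thread are made up for illustration and do not touch the network), so you can see how the rows and the content column are built:

```r
library(dplyr)
library(purrr)

# Hypothetical stand-in for scrape_topic_url(): two fake topics per page
fake_scrape_page <- function(page_id) {
  tibble(
    topic = paste0("topic-", page_id, "-", 1:2),
    topic_url = paste0("page", page_id, "/topic", 1:2)
  )
}

# Hypothetical stand-in for scrape_topic_thread(): one string per URL
fake_scrape_thread <- function(topic_url) {
  paste("content of", topic_url)
}

# map_dfr() calls the function once per page and row-binds the tibbles;
# map_chr() then calls the second function once per row of topic_url
demo <- map_dfr(1:3, fake_scrape_page) %>%
  mutate(content = map_chr(topic_url, fake_scrape_thread))

nrow(demo)
#> [1] 6
```

Note that map_chr() expects each call to return exactly one string, which is why the thread function has to yield a single value per topic URL.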