I am trying to scrape search results from Google Scholar with rvest, but I'm running into two issues:
- It only retrieves the first 10 results, and I'm not sure how to move to the next page without copying and pasting the URL for each page.
- I was able to extract the elements I need from the web page, but I am trying to build a table/data frame with 3 columns and as many rows as there are results.
This is the code I have so far:
library(rvest)
# Read the first results page (start=0)
page <- read_html("https://scholar.google.com/scholar?start=0&q=Eriophyidae&hl=en&as_sdt=0,6")
# Total number of search results (text of the results banner)
results <- page %>% html_nodes(".gs_ab_mdw") %>% html_text()
# Title, authors/year, and description, returned as one character vector in page order
name_ref_descr <- page %>% html_nodes(".gs_rt,.gs_a,.gs_rs") %>% html_text()
# Create an empty table with 3 columns (not yet filled with the scraped text)
mite_google_db <- data.frame(matrix(ncol = 3, nrow = 20))
x <- c("title", "authors_year", "description")
colnames(mite_google_db) <- x
# Write the scraped character vector (not the table) to .csv
write.csv(name_ref_descr, "C:\\Users\\Monkey\\Documents\\R\\First_10.csv")
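For reference, here is a rough sketch of the result I am after, not a tested solution. It assumes that each search result sits inside a `.gs_ri` container (an assumption about Scholar's current markup) and that the `start` parameter in the URL advances by 10 per page, as the URL above suggests. The function names `scholar_urls`, `parse_scholar_page`, and `scrape_scholar` are hypothetical helpers made up for illustration.

```r
# Build one URL per results page; `start` is assumed to advance by 10 per page.
scholar_urls <- function(query, n_pages) {
  starts <- seq(0, by = 10, length.out = n_pages)
  paste0("https://scholar.google.com/scholar?start=", starts,
         "&q=", utils::URLencode(query, reserved = TRUE),
         "&hl=en&as_sdt=0,6")
}

# Parse one page into a 3-column data frame, one row per result.
# ASSUMPTION: every result is wrapped in a ".gs_ri" container.
# html_node() returns exactly one match (or a missing value) per container,
# so the three columns stay aligned even if a result has no description.
parse_scholar_page <- function(page) {
  items <- rvest::html_nodes(page, ".gs_ri")
  data.frame(
    title        = rvest::html_text(rvest::html_node(items, ".gs_rt"), trim = TRUE),
    authors_year = rvest::html_text(rvest::html_node(items, ".gs_a"), trim = TRUE),
    description  = rvest::html_text(rvest::html_node(items, ".gs_rs"), trim = TRUE),
    stringsAsFactors = FALSE
  )
}

# Fetch each page with a pause between requests, then stack the rows.
scrape_scholar <- function(query, n_pages, pause = 5) {
  pages <- lapply(scholar_urls(query, n_pages), function(u) {
    Sys.sleep(pause)
    parse_scholar_page(rvest::read_html(u))
  })
  do.call(rbind, pages)
}

# Example usage (performs network requests, so it is left commented out):
# mite_google_db <- scrape_scholar("Eriophyidae", n_pages = 3)
# write.csv(mite_google_db, "First_30.csv", row.names = FALSE)
```

The pause between requests is a guess at polite behaviour; Scholar may still rate-limit or block automated requests, in which case `read_html()` would fail or return a page with no results.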