I've noticed there aren't many questions here about Rcrawler, which I think is a great tool for scraping websites. However, I'm having trouble getting it to scrape multiple websites: it currently handles only 3 in a single call. If anyone has experience with this issue, please let me know. Thanks.
I've tried putting all the URLs in a list/vector, but it still doesn't work. Here is my scraping code, which gets the title, description, and keywords of each website:
Rcrawler(Website = c("http://www.amazon.com", "www.yahoo.com", "www.wsj.com"),
         no_cores = 3, no_conn = 3, MaxDepth = 0,
         ExtractXpathPat = c('/html/head/title',
                             '//meta[@name="description"]/@content',
                             '//meta[@name="keywords"]/@content'),
         PatternsName = c("Title", "Description", "Keywords"),
         saveOnDisk = FALSE)
If I pass more than 3 websites, I get this error:
Error in Rcrawler(Website = c("http://www.amazon.com", "www.yahoo.com", :
object 'getNewM' not found
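One workaround that may be worth trying is to call Rcrawler once per site instead of passing a vector. The sketch below is untested against live sites and rests on two assumptions: that `Website` is meant to take a single root URL, and that Rcrawler leaves its extracted content in a global variable named `DATA`. The helpers `add_scheme`, `crawl_one`, and `crawl_each` are hypothetical names made up for this illustration; they are not part of Rcrawler.

```r
# Sketch: one Rcrawler() call per site, collecting the results in a list.
# Assumption: Rcrawler() stores extracted content in the global variable DATA,
# so it is copied after each call before the next call overwrites it.

xpaths <- c('/html/head/title',
            '//meta[@name="description"]/@content',
            '//meta[@name="keywords"]/@content')
fields <- c("Title", "Description", "Keywords")

# Hypothetical helper: prepend "http://" to URLs that have no scheme.
add_scheme <- function(urls) {
  ifelse(grepl("^https?://", urls), urls, paste0("http://", urls))
}

# Hypothetical helper: crawl a single site and return whatever was extracted.
crawl_one <- function(url) {
  Rcrawler::Rcrawler(Website = url, no_cores = 1, no_conn = 1, MaxDepth = 0,
                     ExtractXpathPat = xpaths, PatternsName = fields,
                     saveOnDisk = FALSE)
  get("DATA", envir = globalenv())  # assumed location of the results
}

# Hypothetical helper: crawl each site in turn.
# crawl_fn can be swapped out, e.g. for a stub when checking the loop logic.
crawl_each <- function(urls, crawl_fn = crawl_one) {
  urls <- add_scheme(urls)
  results <- lapply(urls, crawl_fn)
  names(results) <- urls
  results
}
```

With the package installed, usage would look something like `results <- crawl_each(c("http://www.amazon.com", "www.yahoo.com", "www.wsj.com", "www.nytimes.com"))`, giving one list element per site, named by URL. The sites are crawled one after another, so this trades the parallelism of a single call for a loop that does not depend on how many URLs are passed.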