For loop: my for loop doesn't work as expected (it does not iterate over all my keywords)

Hi everyone,
Here is my problem:
I would like to scrape Google PAA (People Also Ask) results for several keywords with R.
I have a problem with my R loop:
it does not process all of my keywords.
Could you please help me?
Thank you in advance.

Here is my code:

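library(httr)      # GET(), user_agent()
library(XML)       # htmlParse(), xpathSApply(), xmlValue()
library(magrittr)  # %>%

mykeyword <- c("canape","ps4","macbook")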

my_user_agent <- "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:59.0) Gecko/20100101 Firefox/59.0"

# Launch
url_to_check <- paste0("https://www.google.com/search?q=",mykeyword[1],"&ie=utf-8&oe=utf-8&client=firefox-b")
PAA <- GET(url_to_check, user_agent(my_user_agent)) %>%
htmlParse(encoding = "UTF-8") %>%
xpathSApply('//div[/*]/g-accordion-expander/div/div', xmlValue)

for (i in 1: length(mykeyword)) {
url_to_check <- paste0("https://www.google.com/search?q=",mykeyword[i],"&ie=utf-8&oe=utf-8&client=firefox-b")
PAA_2 <- GET(url_to_check, user_agent(my_user_agent)) %>%
htmlParse(encoding = "UTF-8") %>%
xpathSApply('//div[/*]/g-accordion-expander/div/div', xmlValue)
df = rbind(PAA,PAA_2)
}

I think you want:

PAA = rbind(PAA,PAA_2)

Also, you are querying mykeyword[1] twice: once before the loop and again on its first iteration.
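
To see why the original loop never collects every keyword, here is a small offline sketch. fake_scrape is a hypothetical stand-in for the GET/htmlParse/xpathSApply pipeline (it just invents two questions per keyword), so nothing here contacts Google. The sketch starts from an empty accumulator, which also avoids fetching the first keyword twice.

```r
# Hypothetical stand-in for the scraping pipeline (an assumption, not real PAA data):
# returns two made-up questions per keyword.
fake_scrape <- function(keyword) {
  c(paste("what is", keyword), paste("how much is", keyword))
}

mykeyword <- c("canape", "ps4", "macbook")

# Original pattern: df is rebuilt from PAA and PAA_2 on every pass,
# so the results of earlier iterations are thrown away.
PAA <- fake_scrape(mykeyword[1])
for (i in seq_along(mykeyword)) {
  PAA_2 <- fake_scrape(mykeyword[i])
  df <- rbind(PAA, PAA_2)
}
nrow(df)   # 2: only the first keyword and the last one

# Accumulating pattern: bind each result onto the object that is kept.
acc <- NULL
for (i in seq_along(mykeyword)) {
  acc <- rbind(acc, fake_scrape(mykeyword[i]))
}
nrow(acc)  # 3: one row per keyword
```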

This is not very good form, though, because PAA is copied and grows on every iteration. It's better to create an empty list, populate it in your loop, and bind it once afterwards, which avoids that slowdown. Like this:

PAA <- vector("list", length(mykeyword))  # one slot per keyword
for (i in seq_along(mykeyword)) {
  url_to_check <- paste0("https://www.google.com/search?q=", mykeyword[i], "&ie=utf-8&oe=utf-8&client=firefox-b")
  PAA[[i]] <- GET(url_to_check, user_agent(my_user_agent)) %>%
    htmlParse(encoding = "UTF-8") %>%
    xpathSApply('//div[/*]/g-accordion-expander/div/div', xmlValue)
}
PAA <- dplyr::bind_rows(PAA)  # bind once, after the loop
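
If you want to try the preallocate-then-bind pattern without hitting Google, here is a minimal offline sketch. It assumes a hypothetical fake_scrape function in place of the real request, and it swaps in base R's do.call(rbind, ...) for the final bind instead of dplyr::bind_rows, so it only needs base R.

```r
# Hypothetical stand-in for the scraping pipeline (an assumption, not real PAA data).
fake_scrape <- function(keyword) {
  c(paste("what is", keyword), paste("how much is", keyword))
}

mykeyword <- c("canape", "ps4", "macbook")

# Preallocate one list slot per keyword, then fill the slots in the loop.
results <- vector("list", length(mykeyword))
for (i in seq_along(mykeyword)) {
  results[[i]] <- fake_scrape(mykeyword[i])
}

# Bind once, after the loop: one row per keyword.
PAA <- as.data.frame(do.call(rbind, results))
dim(PAA)    # 3 rows, 2 columns
names(PAA)  # default names "V1" "V2"
```

Note that with real results of different lengths, rbind recycles the shorter vectors and warns, so you may need to pad them to a common length first.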

Thank you so much, it works :wink:

You're welcome. And thank you for saying thank you!

Last question: how can I change the column names? By default they are V1, V2, V3.

Thank you

The easiest way is:

names(PAA) <- c("name1", "name2", ...)
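
As a concrete illustration, here is a small sketch on a made-up three-column data frame. The values and the new column names (question_1 and so on) are placeholders I picked for the example, not anything from the real scrape. You need to supply one name per column.

```r
# Toy data frame with the default-style names (placeholder values).
PAA <- data.frame(V1 = c("a", "b"), V2 = c("c", "d"), V3 = c("e", "f"))

# Rename every column at once: one name per column, in order.
names(PAA) <- c("question_1", "question_2", "question_3")

# Rename a single column by position, leaving the others untouched.
names(PAA)[2] <- "second_question"

names(PAA)
```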