Is RSelenium the best way to scrape this page? Selenium message: invalid selector

**Hi community**

I want to search on this page and download the Passport data for many PI numbers.

I'm trying to automate the form because there are 15,000 PI numbers (dataPI), and I need the Passport data for all of them.

I tried RSelenium, but when I click the Passport selector I get this error:

Selenium message:invalid selector: Compound class names not permitted
  (Session info: chrome=107.0.5304.107)

dataPI <- structure(
  list(NUMPI = c("PI 092872", "PI 136683", "PI 136692", "PI 136718", "PI 136720", "P 136730")),
  row.names = c(NA, -6L),
  class = c("tbl_df", "tbl", "data.frame")
)
 
# Libraries ---------------------------------------------------------------
 library(RSelenium)
 library(rvest)
 library(rJava)
 library(dplyr)
 library(purrr)
 
 # Initial steps -----------------------------------------------------------
 
 # ALWAYS DO THIS FIRST [DO NOT FORGET] --------------------------------------
 # Run in CMD (Command Prompt); keep the CMD window open during the whole process
 #cd C:\Users\sarai\Documents\Miguel\
 #java -jar selenium-server-standalone-3.2.0.jar -port 4444
 
 # Establish connection ----------------------------------------------------
 remDr <- remoteDriver(remoteServerAddr = "localhost", port = 4444, browserName = "chrome")
 remDr$open()
 Sys.sleep(2)
 
 # Navigate ----------------------------------------------------------------
 remDr$navigate(url = "https://npgsweb.ars-grin.gov/gringlobal/search")

 ### Type the PI number ----------------------------------------------
 search_box <- remDr$findElement(using = 'xpath',  '//*[@id="MainContent_txtSearch"]')
 search_box$clickElement()
 search_box$sendKeysToElement(list("PI 092872", key = "enter"))
 
 # Click the [PI number link] ------------------------------------------------
 Sys.sleep(1)
 link <- remDr$findElement(using = 'xpath', "//*[@id='passporttab']") # this line raises the error
 link$clickElement()
# Here I also tried other selectors, such as `css`, but I get the same error.

*I'm not sure whether this can be done with rvest, which might make it easier.

Thanks

Most large web-facing sites deliver user-requested results through a paged JavaScript interface that assumes a human clicking "next" or an equivalent intervention. While it is possible, at least in theory, to fully automate this, before embarking on that attempt check the source website to see whether the data is available through other channels, such as FTP or an API.
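
If you do stay with RSelenium, the error message itself is worth decoding. Selenium reports "Compound class names not permitted" when a class-name locator is given a value that contains spaces, that is, several classes at once. A common workaround is a CSS selector that joins the classes with dots. The sketch below only illustrates that idea: `class_to_css()` is a hypothetical helper, and the class names `nav-link` and `passport` are made up for the example, not taken from the GRIN page.

```r
# Hypothetical helper: turn a space-separated class attribute into a
# CSS selector, e.g. "nav-link passport" -> ".nav-link.passport".
class_to_css <- function(class_attr) {
  classes <- strsplit(trimws(class_attr), "[[:space:]]+")[[1]]
  paste0(".", classes, collapse = "")
}

selector <- class_to_css("nav-link passport")  # made-up class names
print(selector)

# With an open session (remDr from the question), the lookup would then be:
# link <- remDr$findElement(using = "css selector", value = selector)
# link$clickElement()
```

Inspect the element in the browser's developer tools to find the real class or id before trying this on the live page.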


How can I check whether this site has an API, and how would I use it from R?

Any ideas or links, @technocrat?

This is my first time working with APIs.

I don't know of an R package that searches a website to discover whether it has an API; that has to be done manually. Here's a large collection
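
If a usable endpoint does turn up, the 15,000 PI numbers could be processed in a loop that survives individual failures. The sketch below uses base R only and runs offline. Everything specific in it is an assumption: `base_url`, the `q` query parameter, and the helpers `build_url()`, `fetch_one()`, and `fetch_all()` are placeholders, not the real GRIN interface.

```r
# Assumption: placeholder endpoint and query parameter, NOT the real GRIN API.
base_url <- "https://example.org/api/search"

# Build one request URL per PI number; spaces and reserved characters
# are percent-encoded.
build_url <- function(pi_number) {
  paste0(base_url, "?q=", utils::URLencode(pi_number, reserved = TRUE))
}

# Placeholder for the real download step (for example a request made
# with httr or rvest). Here it only returns the URL so the sketch runs offline.
fetch_one <- function(pi_number) {
  build_url(pi_number)
}

# Run the fetch function for every PI number; a failure yields NA
# instead of stopping the whole run.
fetch_all <- function(pi_numbers, fetch = fetch_one, pause = 0) {
  results <- lapply(pi_numbers, function(p) {
    Sys.sleep(pause)  # optional pause between requests
    tryCatch(fetch(p), error = function(e) NA_character_)
  })
  names(results) <- pi_numbers
  results
}

pi_numbers <- c("PI 092872", "PI 136683")
urls <- fetch_all(pi_numbers)
print(urls)
```

With the real data, `fetch_all(dataPI$NUMPI, pause = 1)` would walk the whole column; the entries that come back as `NA` can be retried later.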

