Hi,
How can I extract data into R every day from a secured website that requires a username and password?
Any help is really appreciated.
Thank you
Can you provide more details, please?
Hi Ferran,
I'm working on a project where I have to export data from a password-protected website. I remember that we used to pull data (available in Excel) from another password-protected website using an R program (there was only one report to be pulled).
For this project, there are 13 reports that need to be pulled from 2 webpages on the same website (Microsoft Access).
I was just wondering whether these reports could be pulled in R.
Also, could this be automated, since it's an everyday task?
Thank you
It sounds like you need RSelenium for this, or you could try rvest.
See this link for RSelenium:
Sorry, I have never done that before. However, it must be possible to do so. The post from @M_AcostaCH seems to be useful for that!
Good luck
An alternative to RSelenium is chromote. I found it much easier to set up than Selenium (all you need is Chrome installed).
Thank you adamsma. Are there any tutorials on chromote?
I'm not aware of any specific tutorials, but the README is pretty extensive. If you search for "Chrome DevTools Protocol" tutorials, there may be some out there for other languages. Generally, the translation from JS to R is using $ instead of . to access the object methods.
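To illustrate that translation, here is a minimal sketch. The JavaScript lines in the comments show how a Chrome DevTools Protocol call is typically written in a JS client; `get_title` is a hypothetical helper name, and `session` is assumed to be a `chromote::ChromoteSession` that already has a page loaded.

```r
# In a JavaScript DevTools Protocol client the call would look like:
#   Runtime.evaluate({expression: "document.title"})
#
# In chromote the same call uses `$` instead of `.`, and the fields of the
# JS parameter object become named R arguments.
get_title <- function(session) {
  # Runtime$evaluate returns a list; the evaluated value is in result$value
  res <- session$Runtime$evaluate(expression = "document.title")
  res$result$value
}
```

With Chrome installed, something like `session <- chromote::ChromoteSession$new()` followed by a navigation and `get_title(session)` should return the page title as a string.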
When I've processed login pages before, my general approach has been to use the session's Runtime$evaluate method to just set the value of the id/password text boxes with JavaScript. Something along the lines of:
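# `session` is a chromote::ChromoteSession already showing the login page
# Fill in the username field from the USER_ID environment variable
session$Runtime$evaluate(
  glue::glue('document.querySelector("#input_username").value = "{Sys.getenv("USER_ID")}"')
)
# Fill in the password field from the USER_PASSWORD environment variable
session$Runtime$evaluate(
  glue::glue('document.querySelector("#input_password").value = "{Sys.getenv("USER_PASSWORD")}"')
)
# Click the login button to submit the form
session$Runtime$evaluate(
  'document.querySelector("#login_button").click()'
)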
with my id and password stored in environment variables. The DOM element ids may be different for your specific login page.
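One thing to watch for with this approach: if the id or password contains a double quote or a backslash, pasting it straight into the JavaScript string will break the expression. Below is a rough sketch of one way to guard against that, wrapped into a reusable login function. `js_string`, `js_set_value` and `login` are hypothetical helper names (not part of chromote), the selectors are the same placeholders as above, and the load-waiting pattern follows what I understand the chromote README to recommend, so treat it as an assumption to check against your installed version.

```r
# Hypothetical helper: turn an R string into a double-quoted JavaScript
# string literal. Only backslashes and double quotes are escaped here.
js_string <- function(x) {
  x <- gsub("\\", "\\\\", x, fixed = TRUE)  # escape backslashes first
  x <- gsub('"', '\\"', x, fixed = TRUE)    # then escape double quotes
  paste0('"', x, '"')
}

# Hypothetical helper: build the JavaScript that sets an input's value.
js_set_value <- function(selector, value) {
  sprintf(
    "document.querySelector(%s).value = %s",
    js_string(selector), js_string(value)
  )
}

# Sketch of a login routine. `session` is assumed to be a
# chromote::ChromoteSession; the element ids are placeholders.
login <- function(session, login_url) {
  # Start listening for the load event before navigating, then wait for it
  loaded <- session$Page$loadEventFired(wait_ = FALSE)
  session$Page$navigate(login_url, wait_ = FALSE)
  session$wait_for(loaded)

  # Fill in the credentials from environment variables and submit
  session$Runtime$evaluate(js_set_value("#input_username", Sys.getenv("USER_ID")))
  session$Runtime$evaluate(js_set_value("#input_password", Sys.getenv("USER_PASSWORD")))
  session$Runtime$evaluate('document.querySelector("#login_button").click()')
  invisible(session)
}
```

Usage would be along the lines of `session <- chromote::ChromoteSession$new()` and then `login(session, "https://example.com/login")`, with the URL replaced by your actual login page. Keeping the login in a function makes it easier to call from a script that is scheduled to run every day.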