Best practice for scheduling (scraping) scripts

What is considered best practice for scheduling R scraping scripts, or R scripts in general?
Once I have defined my scraping function, which scrapes www.exampledomain.de for certain information, I want to run that script every hour. How would I approach this?

Any advice is appreciated!

There are ways to do this directly in R, but when I want my R scripts to run regularly I either use GitHub Actions to run them in the cloud (configured with YAML) or set up a crontab on my local machine (a Mac).

There are also R packages for setting up cron jobs, such as {cronR}, but I have not used it myself. Do you want this to run on your local machine? For example, I have automated it so that my computer wakes up at 2:55 AM every day, and then cron runs my script to update any out-of-date R packages.
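
For the hourly case from the question, a crontab entry could look roughly like the one built below. This is only a sketch: the script and log paths are hypothetical placeholders, and it assumes `Rscript` is on the PATH that cron uses. The small shell script just prints the line so you can check it before installing it.

```shell
#!/bin/sh
# Hypothetical paths (assumptions); adjust to wherever your script and log live.
SCRIPT="/home/user/scrape.R"
LOG="/home/user/scrape.log"

# Crontab fields: minute hour day-of-month month day-of-week command.
# "0 * * * *" means minute 0 of every hour, i.e. once per hour.
# ">> $LOG 2>&1" appends both stdout and stderr of the R script to the log.
CRON_LINE="0 * * * * Rscript $SCRIPT >> $LOG 2>&1"

# Print the entry so it can be reviewed before it is installed.
echo "$CRON_LINE"
```

To install it, run `crontab -e` and paste the printed line; `crontab -l` lists the current entries. Cron usually runs with a minimal environment, so it may be necessary to replace `Rscript` with its full path.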

Ideally, this should run in an R-based Docker container on my company's Linux server.

If the container is up and running, then I think the options above still apply. {cronR} is R-specific, while a plain crontab is independent of R.
