Hello again,
Please, I have written a simple web-scraping application that I want to automate with GitHub Actions. I believe I have set everything up correctly, including my YML file, but after committing, my workflow run count is still 0. Please find my YML file below:
name: Aljazeera_headlines_scraper

# Controls when the action will run.
on:
  schedule:
    # Every 10 minutes, every day (the fifth cron field is the day of week; * means all days)
    - cron: '*/10 * * * *'

jobs:
  autoscrape:
    # The type of runner that the job will run on
    runs-on: windows-latest

    # Load repo and install R
    steps:
      - uses: actions/checkout@master
      - uses: r-lib/actions/setup-r@master

      # Install the packages the script needs
      - name: Install packages
        run: |
          R -e 'install.packages("rvest")'
          R -e 'install.packages("tidyverse")'

      # Run R script
      - name: Scrape
        run: Rscript Aljazeera_scraper.R

      # Add new files in data folder, commit along with other modified files, push
      - name: Commit files
        run: |
          git config --local user.name "actions-user"
          git config --local user.email "actions@github.com"
          git add data/*
          git commit -am "GH ACTION Headlines $(date)"
          git push origin main
        env:
          REPO_KEY: ${{ secrets.GITHUB_TOKEN }}
          username: github-actions
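In case the file's location matters: as far as I understand, GitHub only picks up workflow files stored under .github/workflows/ on the repository's default branch, and scheduled workflows in particular only run from the default branch. So the layout I am aiming for is roughly the following (the workflow file name here is just a placeholder; I believe the name itself is arbitrary):

.github/
  workflows/
    scraper.yml           <- the YML file above (placeholder name)
Aljazeera_scraper.R       <- the script below
data/                     <- folder the workflow commits from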
The script I want to automate is written below:
library(rvest)
library(tidyverse)

# Read the Al Jazeera front page and pull out the headline links
aljurl <- read_html("https://www.aljazeera.com/")
headlinks <- aljurl %>%
  html_nodes(".u-clickable-card__link") %>%
  html_attr("href")

links <- data.frame(
  date = Sys.Date(),
  headline_links = headlinks
)

# write.csv() ignores append = TRUE, so use write.table() to append to the
# CSV in the data folder (the folder the workflow's git add expects),
# writing the header row only on the first run
dir.create("data", showWarnings = FALSE)
outfile <- "data/Headlinks.csv"
write.table(links, file = outfile, sep = ",", row.names = FALSE,
            col.names = !file.exists(outfile), append = file.exists(outfile))
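When I test the script locally before committing, I run it from the repository root (so the data folder path resolves) and then inspect the last few rows of the output, like this:

source("Aljazeera_scraper.R")
tail(read.csv("data/Headlinks.csv"))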
I want it to scrape headline links from the news website every 10 minutes of every day and store them in a CSV file in the data folder. However, I see no indication that the workflow is running, and when I check the Actions log, I see zero activity. Please, is there something I am not getting right? This is my first attempt at automating an R script. Thanks for your support as always.
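P.S. One thing I considered for debugging, in case it is relevant: my understanding is that adding a workflow_dispatch trigger alongside the schedule would let me start the workflow manually from the Actions tab, which would at least confirm the rest of the file runs. Something like:

on:
  schedule:
    - cron: '*/10 * * * *'
  workflow_dispatch:

Would that be a sensible way to test it?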