Thanks for the advice! I tried to use taskscheduleR with my R web-scraping script, which creates a CSV file with the updated web-scraped data, but I am having some issues. I want the script to run every six hours, but the CSV file never appears. Here is the R script:
library(Rcrawler)
library(rvest)
library(dplyr)
library(stringr)
library(tidyr)
library(ggmap)
setwd("C:/Users/Eli/Desktop/ap")
ap <- read_html("https://www.apnews.com/apf-intlnews")
api <- "YOUR_API_KEY"
register_google(key = api)
headline <- data.frame(
  ap %>%
    html_nodes("h1") %>%
    html_text()
)
headline <- data.frame(headline[-1, ])
paragraph <- data.frame(
  ap %>%
    html_nodes(".content") %>%
    html_text()
)
link <- ap %>%
  html_nodes("a") %>%
  html_attr("href")
link <- na.omit(unique(link[nchar(link) > 20 & str_detect(substring(link, 1, 1), "/")]))
link <- str_c("https://www.apnews.com", link)
link2 <- str_c("<a href='", link)
link2 <- str_c(link2, "'>AP News</a>")
article <- c()
for (i in link) {
  article[i] <- read_html(i) %>% html_nodes(".Article") %>% html_text()
}
data <- cbind(headline, paragraph, link, link2, article)
colnames(data) <- c("headline", "paragraph", "link", "link2", "article")
data <- data %>% separate(paragraph, c("location", "text"), sep = " \\(AP\\) \\— ")
data <- mutate_geocode(data, location)
write.csv(data, file = "newsdata.csv")
Here is the script I used for taskscheduleR. I originally tried the RStudio addin, to no avail, so I tried writing the task as an R script instead.
library(taskscheduleR)
setwd("C:/Users/Eli/Desktop/ap")
taskscheduler_create(
  taskname = "news_script",
  rscript = "C:/Users/Eli/Desktop/ap/news.R",
  schedule = "HOURLY",
  starttime = "12:00",
  modifier = 6
)
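In case it helps diagnose things, this is how I have been trying to check whether the task actually got registered and what it logged. As far as I can tell, taskscheduleR writes a .log file next to the scheduled script (the exact log path below is my assumption based on where news.R lives):

```r
library(taskscheduleR)

# List all tasks the Windows Task Scheduler knows about
# and look for the one I created
tasks <- taskscheduler_ls()
subset(tasks, TaskName == "news_script")

# taskscheduleR captures the output of each run in a .log file
# next to the script, so errors in news.R should show up here
readLines("C:/Users/Eli/Desktop/ap/news.log")
```

When I look at the Status and Next Run Time columns I can at least tell whether the task ran and failed, or never fired at all.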
Is my formatting off, or am I not running the task correctly?
Thanks for your help,
Eli