Running Code in Shiny Periodically

Hi all,

Right now, I am working on a Shiny app that involves web scraping. I am currently scraping headlines / articles / links from a news website - my goal is that the app will update and show news headlines / articles.

This process is computationally expensive and takes a long time. I was wondering whether there is a way to run part of the code within the Shiny app at a specified time interval, as opposed to every time the app opens. What I am thinking is to have the data 'update' every six hours, so that the app doesn't take a long time to load whenever someone opens it. Is this possible? I know this is kind of a strange question.

My first reaction is to

  • separate the computationally expensive work from the app and run that on a scheduler.
  • have the app check periodically to see if it needs to refresh its data.

For details, see these other posts:



https://shiny.rstudio.com/reference/shiny/1.0.2/reactiveFileReader.html
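
As a rough sketch of the second bullet, the app can watch a file that the scheduled job rewrites. This assumes the job writes a CSV to a known path; the file name newsdata.csv and the one-minute check interval are illustrative assumptions. reactiveFileReader() checks the file's modification time on that interval and re-reads it only when it has changed.

```r
library(shiny)

# Hypothetical path: wherever the scheduled scraping job writes its output
data_path <- "newsdata.csv"

ui <- fluidPage(
  tableOutput("news")
)

server <- function(input, output, session) {
  # Check the file's modification time every minute; re-read only on change
  news <- reactiveFileReader(
    intervalMillis = 60 * 1000,
    session = session,
    filePath = data_path,
    readFunc = read.csv,
    stringsAsFactors = FALSE
  )

  output$news <- renderTable({
    head(news(), 20)
  })
}

# Build the app object; start it with runApp(app)
app <- shinyApp(ui, server)
```

With this layout the app never scrapes anything itself, so opening it only costs one read of the CSV.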


I like the idea of a data object as a reactive that is invalidated six hours later. This assumes you have a compute_data() function available.

server <- function(input, output) {

  app_data <- reactive({
    # Invalidate this reactive 6 hours after it runs
    invalidateLater(1000 * 60 * 60 * 6)

    compute_data()
  })

  # ... use the app_data object like a normal reactive value
}

The above approach will run the computation once for every user who visits the app. If you'd rather consolidate the computation in a single place, set up a plumber API that your Shiny application queries at run time. The plumber API would then be the single source of truth, and the data would be computed once every six hours rather than length(users) times.
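
A minimal sketch of that idea, under some assumptions: make_cached() and get_news() are hypothetical helper names, compute_data() here is a placeholder for the real scraping step, and the optional plumber part assumes plumber 1.0 or later (for pr() and pr_get()). The cache recomputes only when the stored value is at least six hours old, so every caller shares one result.

```r
# Wrap an expensive function so it is recomputed at most once per max_age_secs
make_cached <- function(compute, max_age_secs = 6 * 60 * 60) {
  value <- NULL
  computed_at <- NULL

  function(now = Sys.time()) {
    stale <- is.null(computed_at) ||
      as.numeric(difftime(now, computed_at, units = "secs")) >= max_age_secs
    if (stale) {
      value <<- compute()
      computed_at <<- now
    }
    value
  }
}

# Stand-in for the real scraping step (hypothetical placeholder)
compute_data <- function() {
  data.frame(headline = "example headline", fetched_at = format(Sys.time()))
}

get_news <- make_cached(compute_data)

# Optional: expose the cached result over HTTP (assumes plumber >= 1.0)
if (requireNamespace("plumber", quietly = TRUE)) {
  api <- plumber::pr()
  api <- plumber::pr_get(api, "/news", function() get_news())
  # plumber::pr_run(api, port = 8000)  # uncomment to start the API
}
```

The Shiny app could then fetch the data with something like jsonlite::fromJSON("http://localhost:8000/news"), where the host and port are whatever the API is actually started on.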

Thanks for the advice! I tried to use taskscheduleR with my R web-scraping script, which creates a CSV file with the updated scraped data, but I am having some issues. I want the script to run every six hours, but the CSV file is not being created. Here is the R script:

library(Rcrawler)
library(rvest)
library(dplyr)
library(stringr)
library(tidyr)
library(ggmap)

setwd("C:/Users/Eli/Desktop/ap")

ap <- read_html("https://www.apnews.com/apf-intlnews")

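# Read the Google API key from an environment variable instead of hardcoding it
# (the variable name GOOGLE_MAPS_API_KEY is an assumption; set it on the machine that runs the task)
api <- Sys.getenv("GOOGLE_MAPS_API_KEY")
register_google(key = api)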

headline <- 
  data.frame(
  ap %>%
  html_nodes("h1") %>%
  html_text()
  )

headline <- data.frame(headline[-1,])

paragraph <- data.frame(
  ap %>%
  html_nodes(".content") %>%
  html_text()
)

link <- 
  ap %>%
    html_nodes("a") %>%
    html_attr("href")

link <- na.omit(unique(link[nchar(link) > 20 & str_detect(substring(link, 1, 1), "/")]))
link <- str_c("https://www.apnews.com", link)

link2 <- str_c("<a href='", link)
link2 <- str_c(link2, "'>AP News</a>")

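# Collect one article body per link; collapse multiple matches into one string
# and store NA when a page has no ".Article" node, so the lengths stay aligned
article <- character(length(link))

for (i in seq_along(link)) {
  body <- read_html(link[i]) %>% html_nodes(".Article") %>% html_text()
  article[i] <- if (length(body) > 0) paste(body, collapse = "\n") else NA_character_
}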

data <- cbind(headline, paragraph, link, link2, article)

colnames(data) <- c("headline", "paragraph", "link", "link2", "article")

data <- data %>% separate(paragraph, c("location", "text"),  sep = " \\(AP\\) \\— ")

data <- mutate_geocode(data, location)

write.csv(data, file = "newsdata.csv")

Here is the script I used for taskscheduleR. I originally tried the RStudio add-in, to no avail, so I tried creating the task from an R script.

library(taskscheduleR)

setwd("C:/Users/Eli/Desktop/ap")

taskscheduler_create(taskname = "news_script", rscript = "C:/Users/Eli/Desktop/ap/news.R", 
                     schedule = "HOURLY", starttime = "12:00", modifier = 6)

Is my formatting off, or am I not running the task correctly?

Thanks for your help,
Eli
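
When a scheduled script produces no output, the error is often hidden because nothing is watching the console. One way to investigate, sketched here with hypothetical names (run_logged, news_task.log), is to wrap the script body so that any error is written to a log file with a timestamp. The task below is only a stand-in for the scraping code.

```r
# Run a function and append a timestamped status line to a log file.
# Returns TRUE on success, FALSE if the function raised an error.
run_logged <- function(task, log_file = "news_task.log") {
  stamp <- function() format(Sys.time(), "%Y-%m-%d %H:%M:%S")
  tryCatch(
    {
      task()
      cat(stamp(), "OK\n", file = log_file, append = TRUE)
      TRUE
    },
    error = function(e) {
      cat(stamp(), "ERROR:", conditionMessage(e), "\n",
          file = log_file, append = TRUE)
      FALSE
    }
  )
}

# Example: a stand-in task that writes a small CSV (placeholder for the scraper)
out_file <- file.path(tempdir(), "newsdata.csv")
log_path <- file.path(tempdir(), "news_task.log")
ok <- run_logged(
  function() write.csv(data.frame(headline = "example"), out_file, row.names = FALSE),
  log_file = log_path
)
```

In the real script, the body of news.R would go inside the function passed to run_logged(), with an absolute path for the log file. Running the script once by hand with Rscript is also a quick check that it works outside RStudio.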

This seems like a very cool approach. Right now I am having some issues with taskscheduleR but I will try this if I get the task to run correctly. Thanks for the advice!
