How to run an observer chain _exactly_ once on application startup

I have a simple snippet of a (Shiny) reactive process below, and I've already come up with a solution that makes use of observers' priority value. I'm curious whether others can see an obvious solution to the problem that doesn't require explicit control of priority. Also worth mentioning that I'm aware of reactivePoll(), but (i) I don't think reactivePoll() actually solves the underlying issue here and (ii) the snippet below is a very simplified version (i.e. a reprex) of a scenario in a much larger program.

When the Shiny application starts, one of the first things it does is connect to an external data source and begin polling for data updates (via obs_poll). This polling is relatively cheap (it 'announces' a change to be retrieved from separate network data sources, e.g. a subset of sensors). Downstream processing of these other data sources can be expensive, so we only want to run it when there are updates, and at the start of the application an update is always guaranteed to be available. The poller sets the reactive value rv to some new value, which (depending on the execution order) may invalidate obs_rv. In the snippet I've made obs_rv an observeEvent to explicitly test the ignoreInit parameter, but a simple observe is a degenerate case (mentioned below).

Here's the snippet:

rv <- reactiveVal(0)  ## sentinel value guaranteed to be increasing for all future updates

obs_poll <- observe({
  print("exec: obs_poll")
  invalidateLater(5 * 1000)
  ## some external datastore polling logic here.
  if (some_condition_from_the_polling_logic) {
    rv(some_new_value_gt_rv)  ## new value > rv(); i.e. strictly increasing (but not necessarily by 1)
  }
  ## and possibly set a bunch of other reactiveVal sentinels here (i.e. rv2, rv3, rv4, ...).
}, priority = obs_poll_priority)

obs_rv <- observeEvent({
  print("exec: obs_rv.eventExpr")
  rv()  ## take rdep
}, {
  print("exec: obs_rv.handlerExpr")
  ## an expensive computation here.
}, ignoreInit = obs_rv_ignore_init, priority = obs_rv_priority)

I've left the priority values as parameters along with ignoreInit for obs_rv. (Since rv() is guaranteed to never return NULL, ignoreNULL is irrelevant and setting ignoreInit = FALSE thus turns this observeEvent into a regular observe with some embedded isolate() semantics.)
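As a minimal sketch of that equivalence, the plain observe() form takes the dependency on rv() and wraps the handler body in isolate(). The helper name count_handler_runs and the runs counter are hypothetical and exist only so the behaviour can be checked at the console with shiny:::flushReact(), outside a running app.

```r
library(shiny)

# Hypothetical helper (not part of the app): counts how often the handler body runs.
count_handler_runs <- function() {
  rv <- reactiveVal(0)
  runs <- 0L

  # Plain observe() counterpart of observeEvent(rv(), {...}, ignoreInit = FALSE)
  obs <- observe({
    rv()                      # take the reactive dependency on rv
    isolate({                 # handler body takes no further dependencies
      runs <<- runs + 1L
    })
  })

  shiny:::flushReact()        # initial flush: the handler body runs
  rv(1)                       # a new sentinel value invalidates obs
  shiny:::flushReact()        # the handler body runs again
  obs$destroy()
  runs
}

count_handler_runs()
```

Like observeEvent with ignoreInit = FALSE, this form always runs its handler body on the first flush, whatever value rv() holds at that point.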

Here's the problem: when no priority is set (i.e. they're all implicitly set to 0), invalidated observers are not guaranteed to execute in any specific order during the reactive flush. This can lead to obs_rv running zero, one, or two times during the first reactive flush on application startup:

  • 1 time is ideal
  • 2 times is wasteful, since obs_rv's handlerExpr is expensive
  • 0 times is broken, as now obs_rv will need to wait for new data to update, but we want it to first sync to existing data on application start.

There are four scenarios I tested to 'simulate' the arbitrary ordering of equal-priority observers during the initial reactive flush (tested explicitly with shiny:::flushReact()):

  1. obs_rv_priority > obs_poll_priority & obs_rv_ignore_init = FALSE:
    1. obs_rv's eventExpr runs; sees rv() == 0; takes dependency on rv;
    2. since ignoreInit = FALSE, obs_rv's handlerExpr runs (first time);
    3. obs_poll runs & sees new remote values; set rv(x), invalidating obs_rv;
    4. obs_rv's eventExpr runs; sees rv() == x, takes dependency on rv;
    5. obs_rv's handlerExpr runs (second time);
    6. done with flush.
      on next poll, there are no data updates; obs_rv doesn't run.
      net result: 2 executions of handlerExpr.
  2. obs_rv_priority > obs_poll_priority & obs_rv_ignore_init = TRUE:
    1. obs_rv's eventExpr runs; sees rv() == 0; takes dependency on rv;
    2. since ignoreInit = TRUE, obs_rv's handlerExpr does not run;
    3. obs_poll runs & sees new remote values; set rv(x), invalidating obs_rv;
    4. obs_rv's eventExpr runs; sees rv() == x, takes dependency on rv;
    5. obs_rv's handlerExpr runs (first time);
    6. done with flush.
      on next poll, there are no data updates; obs_rv doesn't run.
      net result: 1 execution of handlerExpr.
  3. obs_poll_priority > obs_rv_priority & obs_rv_ignore_init = FALSE:
    1. obs_poll runs & sees new remote values; set rv(x); (no dependency to invalidate yet);
    2. obs_rv's eventExpr runs; sees rv() == x; takes dependency on rv;
    3. since ignoreInit = FALSE, obs_rv's handlerExpr runs (first time).
    4. done with flush.
      on next poll, there are no data updates; obs_rv doesn't run.
      net result: 1 execution of handlerExpr.
  4. obs_poll_priority > obs_rv_priority & obs_rv_ignore_init = TRUE:
    1. obs_poll runs & sees new remote values; set rv(x); (no dependency to invalidate yet);
    2. obs_rv's eventExpr runs; sees rv() == x; takes dependency on rv;
    3. since ignoreInit = TRUE, obs_rv's handlerExpr does not run.
    4. done with flush.
      on next poll, there are no data updates; obs_rv doesn't run.
      net result: 0 executions of handlerExpr.
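
The four scenarios above can be reproduced at the console with a sketch along these lines. The helper run_scenario, the polled flag (a stand-in for "the remote source has an update on the first poll only"), and the handler_runs counter are all hypothetical names introduced for illustration. The invalidateLater() call is left out because only the first flush matters here.

```r
library(shiny)

# Hypothetical helper: builds the two observers with the given settings, runs
# one reactive flush, and returns how many times handlerExpr executed.
run_scenario <- function(obs_poll_priority, obs_rv_priority, obs_rv_ignore_init) {
  rv <- reactiveVal(0)
  handler_runs <- 0L
  polled <- FALSE             # stand-in for the external polling condition

  obs_poll <- observe({
    if (!polled) {            # an update is available on the first poll only
      polled <<- TRUE
      rv(1)                   # new value > previous value of rv
    }
  }, priority = obs_poll_priority)

  obs_rv <- observeEvent(rv(), {
    handler_runs <<- handler_runs + 1L   # stand-in for the expensive computation
  }, ignoreInit = obs_rv_ignore_init, priority = obs_rv_priority)

  shiny:::flushReact()        # the initial flush; loops until nothing is pending
  obs_poll$destroy()
  obs_rv$destroy()
  handler_runs
}

# Higher priority runs first, so the priorities pick the execution order.
results <- c(
  scenario_1 = run_scenario(obs_poll_priority = 0, obs_rv_priority = 1, obs_rv_ignore_init = FALSE),
  scenario_2 = run_scenario(obs_poll_priority = 0, obs_rv_priority = 1, obs_rv_ignore_init = TRUE),
  scenario_3 = run_scenario(obs_poll_priority = 1, obs_rv_priority = 0, obs_rv_ignore_init = FALSE),
  scenario_4 = run_scenario(obs_poll_priority = 1, obs_rv_priority = 0, obs_rv_ignore_init = TRUE)
)
print(results)
```

The counts should match the net results listed for each scenario (2, 1, 1 and 0 executions of handlerExpr).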

Scenario 1 (two executions) isn't great. Scenario 4 is broken (since the application never properly initializes to the available data). Scenarios 2 & 3 both work, but require carefully setting/managing the ignoreInit and priority values. That is fine, but feels somehow non-idiomatic.

How have others handled this type of scenario? Worth mentioning that Scenario 1, while not ideal, works fine: all future data updates lead to only a single execution of handlerExpr. Only Scenario 4 is 'broken', and simply setting ignoreInit = FALSE prevents that failure, but I'm curious whether there are other nice ways to prevent Scenario 1 as well that don't involve priority.
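
One possible direction, sketched here under a loud assumption: if the sentinel is allowed to start as NULL (meaning "nothing polled yet") instead of 0, then observeEvent's default ignoreNULL = TRUE skips the handler until the poller has written a real value, and the result no longer depends on which observer runs first. This changes the premise that rv() never returns NULL, so it may not fit the larger program. The helper first_flush_runs and the runs counter are hypothetical.

```r
library(shiny)

# Hypothetical helper: returns how often the handler ran during the first flush.
first_flush_runs <- function(obs_poll_priority = 0, obs_rv_priority = 0) {
  rv <- reactiveVal(NULL)     # assumption: NULL means "no data polled yet"
  runs <- 0L

  obs_poll <- observe({
    # stand-in for the polling logic; an update is always available at startup
    rv(1)
  }, priority = obs_poll_priority)

  obs_rv <- observeEvent(rv(), {
    runs <<- runs + 1L        # stand-in for the expensive computation
  }, ignoreNULL = TRUE, ignoreInit = FALSE, priority = obs_rv_priority)

  shiny:::flushReact()
  obs_poll$destroy()
  obs_rv$destroy()
  runs
}

# Either execution order gives a single run of the handler.
first_flush_runs(obs_poll_priority = 0, obs_rv_priority = 1)
first_flush_runs(obs_poll_priority = 1, obs_rv_priority = 0)
```

If obs_rv runs first, it sees NULL, takes the dependency and skips the handler; the poller then sets the value and the handler runs once. If the poller runs first, obs_rv sees the real value on its initial run and the handler also runs once.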
