Script runs inside of RStudio, but not via a BAT file. Breaks on date conversion

Hi everyone,

I have an R script that pulls covid data from a Public Health Department. One step is to convert the date, but while the script works fine inside of R Studio, it fails when we call/run it via a batch file.

This is on a Windows Server (2019) with R 3.5.2.

This is the relevant bit of the script:

library(boxr)
library(lubridate)
library(curl)
library(jsonlite)
library(tidyverse)

# Set up CURL link to APH data
travis.counts <- curl('https://services.arcgis.com/0L95CJ0VTaxqcmED/arcgis/rest/services/Austin_Travis_County_COVID19_Daily_Counts_(Public_View)/FeatureServer/0/query?f=json&where=Record_Status_1%3C%3E%27Staging%27&returnGeometry=false&spatialRel=esriSpatialRelIntersects&outFields=*&orderByFields=Last_Update%20asc&resultOffset=0&resultRecordCount=32000&resultType=standard&cacheHint=true')

# Extract json and convert to dataframe
travis.counts.json <- readLines(travis.counts, warn = FALSE)
travis.counts.parsed <- fromJSON(travis.counts.json, simplifyDataFrame = TRUE)
travis.counts.nesteddf <- as.data.frame(travis.counts.parsed[5])
travis.counts.df <- as.data.frame(flatten(travis.counts.nesteddf))

# Remove "attributes." from column names, and add indicator for Daily Counts as the source table
names(travis.counts.df) <- gsub(x = names(travis.counts.df), pattern = "attributes.", replacement = "DC.")

# Convert 'Last_Update' from Epoch time (milliseconds) to DateTime.
travis.counts.df$DC.Last_Update <- as.POSIXct((travis.counts.df$DC.Last_Update/1000), origin="1970-01-01")

This is where it fails (when run from a batch file).

I have also tried:

travis.counts.df <- travis.counts.df %>% 
  mutate(DC.Last_Update = as_datetime(DC.Last_Update/1000, origin="1970-01-01"))

(Among like 10 other similar variations.)

Like I said, this runs fine within RStudio.

Though the batch file I get an error like this:

Error in as_datetime(DC.Last_Update/1000, origin = "1970-01-01") :
object 'DC.Last_Update' not found
Calls: %>% ... mutate -> mutate.tbl_df -> mutate_impl -> as_datetime
Execution halted

Again, the column is actually there. It runs fine from RStudio.

Any help is greatly appreciated. We run this data pull twice a day, so while it's great that we're able to run it manually, getting it automated would be a big help.

Thanks again!

Are you sure the code runs locally? I just ran the code provided and it does not work. There are no column names with "attributes." so the column "DC.Last_Update" does not exist. However, the column Last_Update exists.

I apologize. I didn't quite copy/paste the right thing. (I was trying to clean it up, and I tried to change everything I could think of at some point to try to fix this.

SORRY.

The code below does run within RStudio. (I just copied into a clean instance and tried successfully.) Here is the error from the cmd console:

"Error in as_datetime(DC.Last_Update/1000, origin = "1970-01-01") :
object 'DC.Last_Update' not found
Calls: %>% ... mutate -> mutate.tbl_df -> mutate_impl -> as_datetime
Execution halted"

library(boxr)
library(lubridate)
library(curl)
library(jsonlite)
library(tidyverse)

### Pull APH Daily Counts data from ARCGIS and put into dataframe #####

# Set up CURL link to APH data
travis.counts <- curl('https://services.arcgis.com/0L95CJ0VTaxqcmED/arcgis/rest/services/Austin_Travis_County_COVID19_Daily_Counts_(Public_View)/FeatureServer/0/query?f=json&where=Record_Status_1%3C%3E%27Staging%27&returnGeometry=false&spatialRel=esriSpatialRelIntersects&outFields=*&orderByFields=Last_Update%20asc&resultOffset=0&resultRecordCount=32000&resultType=standard&cacheHint=true')

# Extract json and convert to datafram
travis.counts.json <- readLines(travis.counts, warn = FALSE)
travis.counts.parsed <- fromJSON(travis.counts.json, simplifyDataFrame = TRUE)
travis.counts.nesteddf <- as.data.frame(travis.counts.parsed[5])
travis.counts.df <- as.data.frame(flatten(travis.counts.nesteddf))
travis.counts.tibble <- as_tibble(travis.counts.df)

# Remove "attributes." from column names, and add indicator for Daily Counts as the source table

names(travis.counts.tibble) <- gsub(x = names(travis.counts.tibble), pattern = "attributes", replacement = "DC")

# Convert 'Last_Update' from Epoch time (milliseconds) to DateTime.
travis.counts.tibble <- travis.counts.tibble %>% 
  mutate(DC.Last_Update = as_datetime(DC.Last_Update/1000, origin="1970-01-01"))

SORRY again that the original code wasn't correct.

Luke

The code still doesn't work for me locally. There are no column names in travis.count.df and travis.counts.tibble which have the prefix 'attributes.'. Only the travis.counts.nesteddf object has columns with that prefix.

This is a namespace issue as the function flatten is in both the jsonlite package and the purrr package. My machine defaults to the purrr::flatten() and drops the 'attributes.' prefix and perhaps this is what is occurring on your server. If I use the jsonlite::flatten(), then the 'attributes.' prefix is maintained. Therefore, I suggest you specify in your code to use that function if you would like to keep the prefix.

travis.counts.df <- as.data.frame(jsonlite::flatten(travis.counts.nesteddf))
1 Like

toryn_stat,

THANK YOU!

This is the fix and the script now works.

I still don't really have my head around the whole json flattening and what's really happening. I'd actually prefer to not have the attributes. (I'll try to change the script to use purrr.) This was one of those things where we needed the data ASAP and getting it working is priority #1. I definitely need to learn more about what's going on here, but can't get out of firefighting mode.

It now runs automatically, which will free up very valuable time. I cannot thank you enough!

1 Like

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.