Managing a scheduled R script

dplyr
rstudio

#1

I scheduled an R script that takes data from MS SQL Server and performs a regression. The script takes only new data.
For example, if 100 observations covering 01.01.2017 to 03.03.2017 were loaded today, the script runs the regression on that data.

If 100 observations covering 04.03.2017 to 04.06.2017 are loaded tomorrow, the script will work only with those observations, not with the whole range 01.01.2017 to 04.06.2017.

I asked on this forum how to make R work only with data that has a fresh date, and got this useful answer,
where we keep a log of the last date and take only the data newer than it.
If anyone is interested, here is the link

library(RODBC)  # provides sqlQuery(); dbHandle below is assumed to be an open RODBC connection

# READ DATE FROM LOG FILE
log_dt <- readLines("/path/to/SQL_MaxDate.txt", warn=FALSE)

# QUERY WITH WHERE CLAUSE
sql <- paste0("SELECT Dt, CustomerName, ItemRelation, SaleCount, 
                      DocumentNum, DocumentYear, IsPromo
               FROM dbo.mytable WHERE Dt > '", log_dt, "'")

df <- sqlQuery(dbHandle, sql)

# RETRIEVE MAX DATE VALUE
max_DT <- as.character(max(df$Dt))

# ... regression code goes here; it is not important for this question

# WRITE DATE TO LOG FILE
cat(max_DT, file="/path/to/SQL_MaxDate.txt")

The question: my scheduler runs once per day, but data is not loaded into the SQL database every day. It may be loaded, for example, only once every 3 days.
Can R check for this?
If R determines that there is no new data (no fresh date), the script should not run.
If R "sees" that there is new data, it should run the script.
For example, suppose the last date the script processed was 12.05.2018. On 13.05.2018 the script is started by the schedule, but nothing was loaded into SQL on that date. R should "see" that there is no new date, keep the same last date, and not run. R must do this check every time it is started: is there a new date or not?
Is this possible?


#2

I'm not sure I understand the issue you have. I don't know what your "Scheduler" is, and "but it happens that the data in the SQL database does not load every day, it can loaded for example 1 time of 3 days and so on" is ambiguous.

But it seems like you would benefit from a conditional if statement that checks whether the condition you describe is met, and runs the rest of your script only if it is.

Google for R control structures to learn how to set this up.
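
To make the suggestion concrete, here is a minimal sketch of such a check. It is only an illustration under assumptions: the helper names new_rows_since() and run_if_new() and the demo data are made up for this example, and the data frame stands in for the result of the SQL query so the code can be run without a database.

```r
# Hypothetical helper: keep only rows strictly newer than the logged date.
# In the real script this filtering is already done by the SQL WHERE clause.
new_rows_since <- function(df, log_dt) {
  df[as.Date(df$Dt) > as.Date(log_dt), , drop = FALSE]
}

# Hypothetical helper: run the job only when there is something new.
# Returns the date to write to the log, or NULL when the run was skipped.
run_if_new <- function(df, log_dt, job) {
  fresh <- new_rows_since(df, log_dt)
  if (nrow(fresh) == 0) {
    message("No data newer than ", log_dt, "; skipping this run.")
    return(NULL)
  }
  job(fresh)                              # e.g. the regression
  as.character(max(as.Date(fresh$Dt)))    # new value for the log file
}

# Stand-in data for illustration only (not from the original post)
demo <- data.frame(Dt = c("2018-05-11", "2018-05-12"),
                   SaleCount = c(3, 5),
                   stringsAsFactors = FALSE)

skipped <- run_if_new(demo, "2018-05-12", job = function(d) NULL)  # nothing new
updated <- run_if_new(demo, "2018-05-11", job = function(d) NULL)  # one new row
```

In the scheduled script itself, the same idea could be a single test placed right after the sqlQuery() call, for example if (nrow(df) == 0) quit(save = "no"), assuming sqlQuery() returns a zero-row data frame when nothing matches. Placing it before the max date is computed also keeps the log file from being overwritten on days with no new data.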