Suppose in my project directory there is a folder called "data". The folder contains three Excel files:
df.xlsx
df_2012017.xlsx
df_2012018.xlsx
df.xlsx looks like this:
| date |
sales |
week |
| 2012-04-07 |
10 |
2012014 |
| 2012-04-14 |
30 |
2012015 |
| 2012-04-21 |
60 |
2012016 |
| 2012-04-28 |
40 |
2012017 |
df_2012017.xlsx looks like this:
| date |
sales |
week |
| 2012-04-28 |
40 |
2012017 |
df_2012018.xlsx looks like this:
| date |
sales |
week |
| 2012-05-05 |
100 |
2012018 |
Given the above files, I want to do the following:
-
Read the df.xlsx as df.
-
Identify the latest week from the variable week and find out if there is any file in the "data" folder with the name df_*.xlsx where * is greater than the last week. For example, when we read the file df.xlsx, we notice that the latest week is 2012017. So, there is only one file that satisfies the condition is df_2012018.xlsx
-
After the identification, if there is any, we will read the file and bind the rows with df
-
Then write the new data frame as df.xlsx
In other words, I am trying to write a script that will update the excel file if there is any new data.
library(tidyverse)
library(lubridate)
library(writexl)
# Prepare toy data ----
df <- tibble(date = seq(ymd('2012-04-07'),ymd('2012-04-28'), by = '1 week')) %>%
mutate(sales = c(10, 30, 60, 40),
week = str_c(isoyear(date), "0", isoweek(date)))
## Existing data
df_2012017 <- tibble(date = ymd('2012-04-28'),
sales = 40,
week = '2012017')
## New data
df_2012018 <- tibble(date = ymd('2012-05-05'),
sales = 100,
week = '2012018')
## Write
write_xlsx(df, "data/df.xlsx")
write_xlsx(df_2012017, "data/df_2012017.xlsx")
write_xlsx(df_2012018, "data/df_2012018.xlsx")