Suppose I have the following files:
library(tidyverse)
# Toy data
## toy file 1
write_lines(c("510020221015123456.00000",
"510020221016456456.00000",
"510020221017678456.00000"), "abc_w20220111.txt")
## toy file 2
write_lines(c("510020221115123456.00000",
"510020221116456456.00000",
"510020221117678456.00000"), "abc_w20220112.txt")
Each line in a file contains a date in characters 5 to 12. For example,
read_lines("abc_w20220111.txt") |>
as_tibble() |>
mutate(date = str_sub(value, 5, 12))
value | date |
---|---|
510020221015123456.00000 | 20221015 |
510020221016456456.00000 | 20221016 |
510020221017678456.00000 | 20221017 |
I want to drop every line whose date appears in a given vector. For example, `dates_not_wanted <- c(20221015, 20221117)`.
I then want to write the text files out again, keeping the same name but adding the suffix `_updated`. The code below does what I want, but it repeats the same pipeline for each file and will not scale. How can I achieve the result more programmatically? Imagine I have 1,000 files to update.
dates_not_wanted <- c(20221015, 20221117)
read_lines("abc_w20220111.txt") |>
as_tibble() |>
mutate(date = str_sub(value, 5, 12)) |>
filter(!date %in% dates_not_wanted) |>
select(-date) |>
pull() |>
write_lines("abc_w20220111_updated.txt")
read_lines("abc_w20220112.txt") |>
as_tibble() |>
mutate(date = str_sub(value, 5, 12)) |>
filter(!date %in% dates_not_wanted) |>
select(-date) |>
pull() |>
write_lines("abc_w20220112_updated.txt")
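One direction I can imagine, as a rough sketch rather than a definitive solution, is to wrap the per-file steps in a function and apply it to every matching file. The helper name `drop_unwanted_dates`, the `abc_w` file-name pattern, and the choice of `purrr::walk()` are assumptions made for illustration only.

```r
library(tidyverse)

dates_not_wanted <- c(20221015, 20221117)

# Hypothetical helper: drop unwanted lines from one file and
# write the result to "<original name>_updated.txt".
drop_unwanted_dates <- function(path, dates) {
  lines <- read_lines(path)
  # Characters 5 to 12 of each line hold the date (yyyymmdd).
  keep <- !str_sub(lines, 5, 12) %in% as.character(dates)
  out_path <- str_replace(path, "\\.txt$", "_updated.txt")
  write_lines(lines[keep], out_path)
  invisible(out_path)
}

# Assumed naming scheme: abc_w followed by eight digits, in the working directory.
# Files already ending in "_updated.txt" do not match this pattern.
files <- list.files(pattern = "^abc_w[0-9]{8}\\.txt$")

# walk() is used because only the side effect (writing files) matters.
walk(files, \(f) drop_unwanted_dates(f, dates_not_wanted))
```

With the two toy files above, this is meant to write `abc_w20220111_updated.txt` and `abc_w20220112_updated.txt`, the same outputs as the repeated pipelines.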