There are four steps in that:
- read the Excel input
- find the word
- generate the content of file2
- save file2
1. Read xlsx
This is made easy with the package readxl
.
file1_contents <- readxl::read_excel("fil1.xlsx")
file1_contents
# A tibble: 7 x 5
# column1 column2 column3 column4 column5
# <chr> <chr> <chr> <chr> <chr>
# 1 AAAA1 BBBB2 CCCCC1 DDDD1 EEEEE1
# 2 AAAA2 BBBB3 CCCCC2 DDDD2 EEEEE2
# 3 AAAA3 something word1 DDDD3 EEEEE3
# 4 AAAA4 BBBB5 CCCCC4 DDDD4 EEEEE4
# 5 AAAA5 BBBB6 CCCCC5 DDDD5 EEEEE5
# 6 AAAA6 BBBB7 CCCCC6 DDDD6 EEEEE6
# 7 AAAA7 BBBB8 CCCCC7 DDDD7 EEEEE7
2. Extract value
That is the harder part. We will use the packages stringr
and dplyr
, part of the tidyverse:
library(tidyverse)
For a given vector, we can look for a specific word using str_detect()
:
str_detect(file1_contents$column3, "word")
# [1] FALSE FALSE TRUE FALSE FALSE FALSE FALSE
(note the TRUE
in 3rd position)
Now we can use filter()
to keep only the row of interest:
file1_contents %>%
filter(str_detect(column3, "word"))
# A tibble: 1 x 5
# column1 column2 column3 column4 column5
# <chr> <chr> <chr> <chr> <chr>
# 1 AAAA3 something word1 DDDD3 EEEEE3
So all we need to do is pull
out the content of column2 and save it:
out_word <- file1_contents %>%
filter(str_detect(column3, "word")) %>%
pull(column2)
out_word
# [1] "something"
We want to make sure we have a single value, so let's create an error if several rows do match:
stopifnot(length(out_word) == 1)
So if for any reason the number of rows is not 1, we get an error and know we have to investigate (that's better than the script executing without problem but the result being horribly wrong).
3. Generate the output
Here, several methods can work, depending on what your problem really looks like. To do exactly what you describe in the question, we can simply use paste()
:
paste("name:",out_word,"
city:
place:
type:
color:")
# [1] "name: something \ncity: \nplace: \ntype: \ncolor:"
(the "\n" indicate a newline, it will be saved to file with actual newlines)
But if the problem is more complex, it can make sense to have a template text file with a placeholder, and only replace that:
file_2 <- read_lines("file2_template.txt")
file_2
# [1] "name: %PLACEHOLDER%" "city:" "place:"
# [4] "type:" "color:"
I had saved the file previously, with a big %PLACEHOLDER%
where I want to insert the string. Note that this was read line-by-line, so I have a vector with one element for each line. Now we just need to replace the placeholder:
file_2_edited <- str_replace(file_2, "%PLACEHOLDER%", out_word)
file_2_edited
# [1] "name: something" "city: " "place: " "type: "
# [5] "color:"
There can be other ways to generate this, especially if you also want to fill in the other fields using R. For example:
exple_df <- tibble(name = out_word,
city= "my_city",
place = "my_place",
type = "my_type",
color = "my_color")
yaml::as.yaml(exple_df)
# [1] "name: something\ncity: my_city\nplace: my_place\ntype: my_type\ncolor: my_color\n"
4. Save to file
Once you have the content of file2, it's trivial to write it with write_lines
(or yaml::write_yaml
).