How can I fill a range of rows in one column with an item from a different column?

Dear all,

Given the data frame below, I wish to accomplish the following:

Every time a new sequence starting with “d1” is found in the column “letter”, then the corresponding rows in the “target” column are filled with an item from the column “word”.

The instertion of items from “word” into "target" should follow a sequential order.

Here is a data sample:

library(tidyverse)
df <- tibble::tribble(
        ~letter, ~target,   ~word,
           "d1",      NA,   "pat",
           "d2",      NA,   "tap",
            "1",      NA,   "cap",
            "2",      NA, "house",
            "3",      NA, "patty",
           "d1",      NA,   "cap",
           "d2",      NA,   "tap",
            "1",      NA, "patty",
            "2",      NA, "house",
            "3",      NA,   "pat",
           "d1",      NA,   "pat",
           "d2",      NA,   "tap",
            "1",      NA, "house",
            "2",      NA, "house",
            "3",      NA,   "cap",
           "d1",      NA, "patty",
           "d2",      NA, "patty",
            "1",      NA,   "pat",
            "2",      NA,   "tap",
            "3",      NA,   "pat",
           "d1",      NA, "house",
           "d2",      NA, "patty",
            "1",      NA,   "cap",
            "2",      NA,   "tap",
            "3",      NA,   "cap"
        )
head(df)
#> # A tibble: 6 x 3
#>   letter target word 
#>   <chr>  <lgl>  <chr>
#> 1 d1     NA     pat  
#> 2 d2     NA     tap  
#> 3 1      NA     cap  
#> 4 2      NA     house
#> 5 3      NA     patty
#> 6 d1     NA     cap

Created on 2023-06-02 with reprex v2.0.2

The following is an ideal(ized) rendering of the final result.

I would very much appreciate learning from you about the best way to approach this issue.

Thank you!
Dan

Below is one way to achieve the desired output. First, the vector of words to choose from is isolated as word_list. Next, the data set is transformed to create a column to check for rows with "d1". Then, a column is added to count the "d1" groups. The count of the "d1" group then corresponds with the position of the word in the word_list. Finally, the columns of interest are selected.

# list of words to choose from
word_list = df$word

out = df |>
  mutate(d1_check = ifelse(letter == 'd1', 1, 0),
         d1_count = cumsum(d1_check),
         target = word_list[d1_count]
         ) |>
  select(letter, target, word)

out
#> # A tibble: 25 × 3
#>    letter target word 
#>    <chr>  <chr>  <chr>
#>  1 d1     pat    pat  
#>  2 d2     pat    tap  
#>  3 1      pat    cap  
#>  4 2      pat    house
#>  5 3      pat    patty
#>  6 d1     tap    cap  
#>  7 d2     tap    tap  
#>  8 1      tap    patty
#>  9 2      tap    house
#> 10 3      tap    pat  
#> # ℹ 15 more rows

Created on 2023-06-02 with reprex v2.0.2

Hi Scotty,
I tried your code and it solved the problem!
Thank you for taking the time to explain the process so well.

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.