Partially duplicate rows and get elements from variables's names

a b c x v100 v200 p
1 2 3 900 not_needed1 not_needed2 500
4 5 6 500 not_needed3 not_needed4 900

Considering x as id, I want to do the following:

  • Partially duplicate each row:a, b, and c get duplicated.

  • Modify x: the first element of x will become 900100, the 100 coming from the variable v100. Similarly, the second element of x will become 900200, the 200 coming from the variable v200(Do the same thing for other x==500).

  • Every second element of p gets the value 0

  • Drop v100 and v200

Here is the desired output:

a b c x p
1 2 3 900100 500
1 2 3 900200 0
4 5 6 500100 900
4 5 6 500200 0
library(tidyverse)
# toy data
df <- structure(list(a = c(1, 4), b = c(2, 5), c = c(3, 6),
               x = c(900, 500),
               v100 = c("not_needed1", "not_needed3"),
               v200 = c("not_needed2", "not_needed4"), 
               p = c(500, 900)), 
               class = c("tbl_df", "tbl", "data.frame"),
               row.names = c(NA, -2L))

Try this:

df %>% 
  pivot_longer(starts_with("v")) %>% 
  group_by(x) %>% 
  mutate(
    x = str_c(x, str_extract(name, "[0-9]+")),
    p = if_else(seq(n()) == 1, p, 0)
  ) %>% 
  dplyr::select(a, b, c, x, p)

It will be a problem if the x values are not unique.

1 Like

@woodward Many thanks!

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.