append 2 columns with new column name using original

Hi there - I'm looking for a method to append 2 columns of a data frame under a new name and also retain some information from original column headings in a new factor column. For example, starting with the following data frame

df <- data.frame(colA = c("a","a","b","b"),
                 id_rep1_num = c(1,2,3,4),
                 id_rep2_num = c(5,6,7,8))

And then combine the 2 columns into single column under a new name and retaining the "rep1" and "rep2" in a new column (ie a brand new column title "rep" that retains the original "1" and "2", so the desired data frame would be

desired <- data.frame(colA = c("a","a","b","b","a","a","b","b"),
                      rep = c(1,1,1,1,2,2,2,2),
                      new_name = c(1,2,3,4,5,6,7,8))

Hi,

Welcome to the RStudio community!

You can do this using some handy functions from the dplyr and tidyr packages from the Tidyverse.

library(tidyverse)

df <- data.frame(colA = c("a","a","b","b"),
                 id_rep1_num = c(1,2,3,4),
                 id_rep2_num = c(5,6,7,8))

#Transform into long format
desired = df %>% 
  pivot_longer(-colA, names_to = "rep", values_to = "new_name")

desired
#> # A tibble: 8 × 3
#>   colA  rep         new_name
#>   <chr> <chr>          <dbl>
#> 1 a     id_rep1_num        1
#> 2 a     id_rep2_num        5
#> 3 a     id_rep1_num        2
#> 4 a     id_rep2_num        6
#> 5 b     id_rep1_num        3
#> 6 b     id_rep2_num        7
#> 7 b     id_rep1_num        4
#> 8 b     id_rep2_num        8

#Change formatting / order
desired = desired %>% 
  mutate(rep = str_extract(rep, "\\d+") %>% as.integer()) %>% 
  arrange(new_name)

desired
#> # A tibble: 8 × 3
#>   colA    rep new_name
#>   <chr> <int>    <dbl>
#> 1 a         1        1
#> 2 a         1        2
#> 3 b         1        3
#> 4 b         1        4
#> 5 a         2        5
#> 6 a         2        6
#> 7 b         2        7
#> 8 b         2        8

Created on 2022-08-05 by the reprex package (v2.0.1)

To learn more about the Tidyverse packages and functions, you can to their website

Hope this helps,
PJ

Awesome thank you! I was looking at the pivots but was unable to figure out how to combine elements.
Can you please clarify: the "-colA" retains the info for the column that is being moved? If I had many columns to retain (whic I do irl), can I list them? Ie pivot_longer(-colA, -colB, names_to...) or pivot_longer(-c(colA, colB), names_to...)

Hello,

The -colA means: make all columns apart from colA in a longer format. So in this case id_rep1_num and id_rep2_num will be made longer, where all the values are combined in one column and all the original column names are put in one column. The values in the ignored columns (colA) are repeated for each of the new rows created.

Hope this helps,
PJ

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.