Hello,
I am trying to concatenate (paste) several columns and rows.
Here is an example:
My data:
What I need:
it's a simple task, but I didn't find a way to make it work in this case.
Can someone help, please?
Hello,
I am trying to concatenate (paste) several columns and rows.
Here is an example:
My data:
What I need:
it's a simple task, but I didn't find a way to make it work in this case.
Can someone help, please?
Close enough?
library(tidyr)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
DF <- data.frame(ID=rep(1:3,each=3),
V1=paste0("A",1:9),
V2=paste0("B",1:9),
V3=paste0("C",1:9),
V4=paste0("D",1:9))
DF
#> ID V1 V2 V3 V4
#> 1 1 A1 B1 C1 D1
#> 2 1 A2 B2 C2 D2
#> 3 1 A3 B3 C3 D3
#> 4 2 A4 B4 C4 D4
#> 5 2 A5 B5 C5 D5
#> 6 2 A6 B6 C6 D6
#> 7 3 A7 B7 C7 D7
#> 8 3 A8 B8 C8 D8
#> 9 3 A9 B9 C9 D9
DFlng <- pivot_longer(DF,V1:V4)
DFall <- DFlng %>% group_by(ID) %>%
summarize(Alltext=paste(value,collapse=","))
#> `summarise()` ungrouping output (override with `.groups` argument)
DF <- inner_join(DF,DFall,by="ID")
DF
#> ID V1 V2 V3 V4 Alltext
#> 1 1 A1 B1 C1 D1 A1,B1,C1,D1,A2,B2,C2,D2,A3,B3,C3,D3
#> 2 1 A2 B2 C2 D2 A1,B1,C1,D1,A2,B2,C2,D2,A3,B3,C3,D3
#> 3 1 A3 B3 C3 D3 A1,B1,C1,D1,A2,B2,C2,D2,A3,B3,C3,D3
#> 4 2 A4 B4 C4 D4 A4,B4,C4,D4,A5,B5,C5,D5,A6,B6,C6,D6
#> 5 2 A5 B5 C5 D5 A4,B4,C4,D4,A5,B5,C5,D5,A6,B6,C6,D6
#> 6 2 A6 B6 C6 D6 A4,B4,C4,D4,A5,B5,C5,D5,A6,B6,C6,D6
#> 7 3 A7 B7 C7 D7 A7,B7,C7,D7,A8,B8,C8,D8,A9,B9,C9,D9
#> 8 3 A8 B8 C8 D8 A7,B7,C7,D7,A8,B8,C8,D8,A9,B9,C9,D9
#> 9 3 A9 B9 C9 D9 A7,B7,C7,D7,A8,B8,C8,D8,A9,B9,C9,D9
Created on 2021-02-25 by the reprex package (v0.3.0)
This is a slightly different approach, with an additional step to remove duplicated values. You can, of course, combine elements of both approaches to get the best of both solutions.
library(tidyverse)
df <- tibble(id = rep(1:3, each = 3),
var1 = str_c("A", 1:9),
var2 = str_c("B", 1:9),
var3 = str_c("C", 1:9),
var4 = str_c("D", 1:9),
)
df %>%
## first concatenate variables per row
rowwise() %>%
mutate(new_col = str_c(c(var1, var2, var3, var4), collapse = ", ")) %>%
## then combine/collapse all values per id
group_by(id) %>%
mutate(new_col = str_c(new_col, collapse = ","),
## remove duplicated values
new_col = if_else(row_number() > 1 & new_col == lag(new_col), "-", new_col),
) %>%
ungroup()
#> # A tibble: 9 x 6
#> id var1 var2 var3 var4 new_col
#> <int> <chr> <chr> <chr> <chr> <chr>
#> 1 1 A1 B1 C1 D1 A1, B1, C1, D1,A2, B2, C2, D2,A3, B3, C3, D3
#> 2 1 A2 B2 C2 D2 -
#> 3 1 A3 B3 C3 D3 -
#> 4 2 A4 B4 C4 D4 A4, B4, C4, D4,A5, B5, C5, D5,A6, B6, C6, D6
#> 5 2 A5 B5 C5 D5 -
#> 6 2 A6 B6 C6 D6 -
#> 7 3 A7 B7 C7 D7 A7, B7, C7, D7,A8, B8, C8, D8,A9, B9, C9, D9
#> 8 3 A8 B8 C8 D8 -
#> 9 3 A9 B9 C9 D9 -
Created on 2021-02-26 by the reprex package (v1.0.0)
Both solutions worked perfectly! Thank you!!
This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.
If you have a query related to it or one of the replies, start a new topic and refer back with a link.