I have my df1, where A and B column has same uniqe entry. C and D column may have different entries as shown. I like to trasform it to df2, where C column is expanded to C1,C2,C3 and same as D column.
I try to use pivot_wider to make it wider, but seems it does not work well in this case. Any better way to do that using dplyr tools? Thanks in advance.
library(tibble)
df1<-tribble(
~A,~B,~C,~D,
4,1,2,7,
4,1,3,9,
4,1,4,10
)
df1
#> # A tibble: 3 x 4
#> A B C D
#> <dbl> <dbl> <dbl> <dbl>
#> 1 4 1 2 7
#> 2 4 1 3 9
#> 3 4 1 4 10
# Transform df1 to df2
df2<-tribble(
~A,~B,~C1,~C2,~C3,~D1,~D2,~D3,
4,1,2,3,4,7,9,10
)
df2
#> # A tibble: 1 x 8
#> A B C1 C2 C3 D1 D2 D3
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 4 1 2 3 4 7 9 10
With a little work before pivot_wider() actually works here pretty nicely
library(tibble)
library(dplyr, warn.conflicts = FALSE)
library(tidyr)
df1 <- tribble(
~A,~B,~C,~D,
4,1,2,7,
4,1,3,9,
4,1,4,10
)
# it seems that the columns `A` and `B` are supposed to uniquely
# identify an observation
# but the following doesn't work as we want because everything goes into
# a single column `C` resp. `D`
df1 %>%
pivot_wider(
id_cols = c(A, B),
values_from = c(C, D),
names_sep = ""
)
#> Warning: Values are not uniquely identified; output will contain list-cols.
#> * Use `values_fn = list` to suppress this warning.
#> * Use `values_fn = length` to identify where the duplicates arise
#> * Use `values_fn = {summary_fun}` to summarise duplicates
#> Warning: Values are not uniquely identified; output will contain list-cols.
#> * Use `values_fn = list` to suppress this warning.
#> * Use `values_fn = length` to identify where the duplicates arise
#> * Use `values_fn = {summary_fun}` to summarise duplicates
#> # A tibble: 1 x 4
#> A B C D
#> <dbl> <dbl> <list> <list>
#> 1 4 1 <dbl [3]> <dbl [3]>
df1 %>%
# so we need to add a column that helps us to create the names `C1`, `C2`, ...
group_by(A, B) %>%
mutate(id = 1:n()) %>%
ungroup() %>%
pivot_wider(
id_cols = c(A, B),
names_from = id,
values_from = c(C, D),
names_sep = ""
)
#> # A tibble: 1 x 8
#> A B C1 C2 C3 D1 D2 D3
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 4 1 2 3 4 7 9 10
Thank you for the solution. It works perfectly what I want.
I noticed the following code also gives same result (the reason may be A and B column have unique elements).
df1 %>%
# so we need to add a column that helps us to create the names `C1`, `C2`, ...
#group_by(A, B) %>%
mutate(id = 1:n()) %>%
#ungroup() %>%
pivot_wider(
id_cols = c(A, B),
names_from = id,
values_from = c(C, D),
names_sep = ""
)
#> # A tibble: 1 x 8
#> A B C1 C2 C3 D1 D2 D3
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 4 1 2 3 4 7 9 10