Can you tell me code to transform this data frame to other data frame?
Please see my picture for more detail.
best regard,
Hi Sophanna, welcome!
I think this can be achived using a combination of the dplyr
and tidyr
packages but, we don't really have enough info to help you out. could you please turn this into a self-contained REPRoducible EXample (reprex)? A reprex makes it much easier for others to understand your issue and figure out how to help.
If you've never heard of a reprex before, you might want to start by reading this FAQ:
In original DATA FRAME, there are duplicate product code(in the Obs) with different SALE (in the Variable).
In this sense, I want to matrix PRODUCT_CODE to the unique value ( In OBS)and each SALE of the duplicate product code to another different SALE column .
That isn't really more helpful, I think you want to do something similar to this, but we can't actually give better help, if you don't provide a reproducible example.
library(dplyr)
library(tidyr)
df <- tibble::tribble(
~Species, ~Petal.Length,
"setosa", 1.4,
"setosa", 1.4,
"setosa", 1.3,
"setosa", 1.5,
"setosa", 1.4,
"versicolor", 4.7,
"versicolor", 4.5,
"versicolor", 4.9,
"versicolor", 4,
"versicolor", 4.6,
"virginica", 6,
"virginica", 5.1,
"virginica", 5.9,
"virginica", 5.6,
"virginica", 5.8
)
df
#> # A tibble: 15 x 2
#> Species Petal.Length
#> <chr> <dbl>
#> 1 setosa 1.4
#> 2 setosa 1.4
#> 3 setosa 1.3
#> 4 setosa 1.5
#> 5 setosa 1.4
#> 6 versicolor 4.7
#> 7 versicolor 4.5
#> 8 versicolor 4.9
#> 9 versicolor 4
#> 10 versicolor 4.6
#> 11 virginica 6
#> 12 virginica 5.1
#> 13 virginica 5.9
#> 14 virginica 5.6
#> 15 virginica 5.8
df %>%
group_by(Species) %>%
mutate(n = row_number(), cols = paste0('petal_length_', n)) %>%
select(-n) %>%
spread(cols, Petal.Length)
#> # A tibble: 3 x 6
#> # Groups: Species [3]
#> Species petal_length_1 petal_length_2 petal_length_3 petal_length_4
#> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 setosa 1.4 1.4 1.3 1.5
#> 2 versic… 4.7 4.5 4.9 4
#> 3 virgin… 6 5.1 5.9 5.6
#> # … with 1 more variable: petal_length_5 <dbl>
Created on 2019-02-08 by the reprex package (v0.2.1)
I think that what you need is the dcast function. look at the dcast/ melt functions, one does the translation from the right form of column per type to one column with the column name (melt, Rey useful for charting) , and the other (what I think you need, dcast) does the translation at the other direction I’m not near my computer to give you a concrete example but I think that if you look at the help page it will be clear whether it’s indeed what you need.
You will want to use tidyr::spread
library(tidyverse)
set.seed(1)
df1 <- tibble(PRODUCT_CODE = c("a","a","b","b","c","c","d","d","d"),
SALE = sample(1:5,9,T))
df1
# # A tibble: 9 x 2
# PRODUCT_CODE SALE
# <chr> <int>
# 1 a 2
# 2 a 2
# 3 b 3
# 4 b 5
# 5 c 2
# 6 c 5
# 7 d 5
# 8 d 4
# 9 d 4
df1 %>%
group_by(PRODUCT_CODE) %>%
mutate(new_col = paste0("SALE_",row_number())) %>%
spread(new_col,SALE) %>%
ungroup()
# # A tibble: 4 x 4
# PRODUCT_CODE SALE_1 SALE_2 SALE_3
# * <chr> <int> <int> <int>
# 1 a 2 2 NA
# 2 b 3 5 NA
# 3 c 2 5 NA
# 4 d 5 4 4
This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.
If you have a query related to it or one of the replies, start a new topic and refer back with a link.