Filter unique and

Can you tell me code to transform this data frame to other data frame?
Please see my picture for more detail.
best regard,

Hi Sophanna, welcome!

I think this can be achived using a combination of the dplyr and tidyr packages but, we don't really have enough info to help you out. could you please turn this into a self-contained REPRoducible EXample (reprex)? A reprex makes it much easier for others to understand your issue and figure out how to help.

If you've never heard of a reprex before, you might want to start by reading this FAQ:

In original DATA FRAME, there are duplicate product code(in the Obs) with different SALE (in the Variable).
In this sense, I want to matrix PRODUCT_CODE to the unique value ( In OBS)and each SALE of the duplicate product code to another different SALE column .

That isn't really more helpful, I think you want to do something similar to this, but we can't actually give better help, if you don't provide a reproducible example.

library(dplyr)
library(tidyr)

df <- tibble::tribble(
    ~Species, ~Petal.Length,
    "setosa",           1.4,
    "setosa",           1.4,
    "setosa",           1.3,
    "setosa",           1.5,
    "setosa",           1.4,
    "versicolor",           4.7,
    "versicolor",           4.5,
    "versicolor",           4.9,
    "versicolor",             4,
    "versicolor",           4.6,
    "virginica",             6,
    "virginica",           5.1,
    "virginica",           5.9,
    "virginica",           5.6,
    "virginica",           5.8
)

df
#> # A tibble: 15 x 2
#>    Species    Petal.Length
#>    <chr>             <dbl>
#>  1 setosa              1.4
#>  2 setosa              1.4
#>  3 setosa              1.3
#>  4 setosa              1.5
#>  5 setosa              1.4
#>  6 versicolor          4.7
#>  7 versicolor          4.5
#>  8 versicolor          4.9
#>  9 versicolor          4  
#> 10 versicolor          4.6
#> 11 virginica           6  
#> 12 virginica           5.1
#> 13 virginica           5.9
#> 14 virginica           5.6
#> 15 virginica           5.8

df %>% 
    group_by(Species) %>%
    mutate(n = row_number(), cols = paste0('petal_length_', n)) %>% 
    select(-n) %>% 
    spread(cols, Petal.Length)
#> # A tibble: 3 x 6
#> # Groups:   Species [3]
#>   Species petal_length_1 petal_length_2 petal_length_3 petal_length_4
#>   <chr>            <dbl>          <dbl>          <dbl>          <dbl>
#> 1 setosa             1.4            1.4            1.3            1.5
#> 2 versic…            4.7            4.5            4.9            4  
#> 3 virgin…            6              5.1            5.9            5.6
#> # … with 1 more variable: petal_length_5 <dbl>

Created on 2019-02-08 by the reprex package (v0.2.1)

2 Likes

I think that what you need is the dcast function. look at the dcast/ melt functions, one does the translation from the right form of column per type to one column with the column name (melt, Rey useful for charting) , and the other (what I think you need, dcast) does the translation at the other direction I’m not near my computer to give you a concrete example but I think that if you look at the help page it will be clear whether it’s indeed what you need.

1 Like

You will want to use tidyr::spread

library(tidyverse)
set.seed(1)
df1 <- tibble(PRODUCT_CODE = c("a","a","b","b","c","c","d","d","d"),
              SALE = sample(1:5,9,T))
df1
# # A tibble: 9 x 2
#   PRODUCT_CODE  SALE
#          <chr> <int>
# 1            a     2
# 2            a     2
# 3            b     3
# 4            b     5
# 5            c     2
# 6            c     5
# 7            d     5
# 8            d     4
# 9            d     4

df1 %>% 
  group_by(PRODUCT_CODE) %>% 
  mutate(new_col = paste0("SALE_",row_number())) %>%
  spread(new_col,SALE) %>%
  ungroup()
# # A tibble: 4 x 4
#   PRODUCT_CODE SALE_1 SALE_2 SALE_3
# *        <chr>  <int>  <int>  <int>
# 1            a      2      2     NA
# 2            b      3      5     NA
# 3            c      2      5     NA
# 4            d      5      4      4
1 Like

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.