How to transform data and wrangle it into fine detail?

Hi Data Wranglers,

I have used gather and spread before, but here each column name combines two pieces of information (a country and a code), so I found it tricky.

Let's consider a data frame.

df <- data.frame( "Year" = c(2001:2010,2001:2010), "Car" = c(rep("BMW",10),rep("Benz",10)),
                  "IND 1" = c(1:10),"IND 2" = c(1:10),"IND 3" = c(1:10),"IND 4" = c(1:10),"IND 5" = c(1:10),"IND 6" = c(1:10),
                  "CHN 1" = c(1:10),"CHN 2" = c(1:10),"CHN 3" = c(1:10),"CHN 4" = c(1:10),"CHN 5" = c(1:10),"CHN 6" = c(1:10),
                  "ITL 1" = c(1:10),"ITL 2" = c(1:10),"ITL 3" = c(1:10),"ITL 4" = c(1:10),"ITL 5" = c(1:10),"ITL 6" = c(1:10))

Is there a way to transform the data into a hierarchy?

Year > Car > Country > Code
Eg:
Columns would be:
Year, Car, Country, Code, Value
1st row would be:
2001, BMW, IND, 1, 1

Thanks in advance.
Happy weekend :slight_smile:

I am not sure I understand the request. Is this what you mean?

df <- data.frame( "Year" = c(2001:2010,2001:2010), "Car" = c(rep("BMW",10),rep("Benz",10)),
                  "IND 1" = c(1:10),"IND 2" = c(1:10),"IND 3" = c(1:10),"IND 4" = c(1:10),"IND 5" = c(1:10),"IND 6" = c(1:10),
                  "CHN 1" = c(1:10),"CHN 2" = c(1:10),"CHN 3" = c(1:10),"CHN 4" = c(1:10),"CHN 5" = c(1:10),"CHN 6" = c(1:10),
                  "ITL 1" = c(1:10),"ITL 2" = c(1:10),"ITL 3" = c(1:10),"ITL 4" = c(1:10),"ITL 5" = c(1:10),"ITL 6" = c(1:10))
library(tidyr)
colnames(df)
#>  [1] "Year"  "Car"   "IND.1" "IND.2" "IND.3" "IND.4" "IND.5" "IND.6"
#>  [9] "CHN.1" "CHN.2" "CHN.3" "CHN.4" "CHN.5" "CHN.6" "ITL.1" "ITL.2"
#> [17] "ITL.3" "ITL.4" "ITL.5" "ITL.6"
df2 <- df %>% gather(key = RawCode, value = Value, -Year, -Car)
df2 <- df2 %>% separate(RawCode, into = c("country", "code"))
head(df2)
#>   Year Car country code Value
#> 1 2001 BMW     IND    1     1
#> 2 2002 BMW     IND    1     2
#> 3 2003 BMW     IND    1     3
#> 4 2004 BMW     IND    1     4
#> 5 2005 BMW     IND    1     5
#> 6 2006 BMW     IND    1     6

Created on 2019-03-22 by the reprex package (v0.2.1)
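For readers on a newer tidyr, the same reshaping can probably be done in one step. The following is a minimal sketch, assuming tidyr 1.0.0 or later (where `pivot_longer()` is available); it rebuilds a data frame of the same shape as the one above and splits the column names on the dot that `data.frame()` inserts in place of the space. The output column names `Country`, `Code`, and `Value` are taken from the question, and the object name `long` is just illustrative.

```r
library(tidyr)

# Rebuild data with the same shape as the example: 20 rows, 18 value columns.
# The names are written with "." because data.frame() would turn "IND 1"
# into "IND.1" anyway (check.names = TRUE by default).
df <- data.frame(Year = rep(2001:2010, 2),
                 Car  = rep(c("BMW", "Benz"), each = 10))
for (nm in paste(rep(c("IND", "CHN", "ITL"), each = 6), 1:6, sep = ".")) {
  df[[nm]] <- rep(1:10, 2)
}

# Gather every column except Year and Car, splitting each name at the dot
# into a Country part and a Code part.
long <- pivot_longer(df,
                     cols      = -c(Year, Car),
                     names_to  = c("Country", "Code"),
                     names_sep = "\\.",
                     values_to = "Value")

# The Code part comes out of the column names as text; convert it to integer.
long$Code <- as.integer(long$Code)

head(long)
```

Note that `pivot_longer()` walks the data row by row, so the row order differs from the `gather()` result above, although the set of rows is the same.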

That is a common use case in Python's pandas package, but I don't think R has popular tools for hierarchical (multi-level) indexes. My guess is that you could simulate one level of hierarchy with nested data frames.
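As a rough illustration of the nested data frame idea, here is a minimal sketch assuming tidyr 1.0.0 or later (for the `nest(data = c(...))` syntax). The small `long` data frame is made up for the example and only mimics the Year, Car, Country, Code, Value layout from the question.

```r
library(tidyr)

# Hypothetical long-format data in the Year / Car / Country / Code / Value layout.
long <- expand.grid(Year    = 2001:2002,
                    Car     = c("BMW", "Benz"),
                    Country = c("IND", "CHN"),
                    Code    = 1:2,
                    stringsAsFactors = FALSE)
long$Value <- seq_len(nrow(long))

# Keep Year, Car and Country as the outer level and tuck the Code/Value
# pairs into a list-column of small data frames, one per outer combination.
nested <- nest(long, data = c(Code, Value))

nested            # one row per Year / Car / Country combination
nested$data[[1]]  # the Code/Value rows belonging to the first combination
```

This gives only one level of nesting; `unnest(nested, data)` should return the flat long format.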

Thanks for the quick help, @FJCC.

You understood my request perfectly. Kudos, champ :slight_smile:
