Restructuring dataframe using Rstudio

Hello!
I need to reshape my data a little bit and I'm struggling to get an idea what would be the best way to go about it. I've made an example dataset to illustrate what I want to achieve:

This is the data how it is:

Employee Drink Day Quantity
Fred Tea Monday 1
Fred Tea Tuesday 3
Fred Tea Wednesday 2
Fred Coffee Monday 6
Marsha Coffee Monday 3
Marsha Coffee Wednesday 1
Marsha Tea Tuesday 2
Marsha Tea Thursday 5
Marsha Coffee Friday 2
Bob Tea Monday 3
Bob Tea Thursday 7

This is what it needs to become:

Employee Drink Monday Tuesday Wednesday Thursday Friday
Fred Tea 1 3 2 NA NA
Fred Coffee 6 NA NA NA NA
Marsha Tea NA 2 NA 5 NA
Marsha Coffee 3 NA 1 NA 2
Bob Tea 3 NA NA 7 NA
Bob Coffee NA NA NA NA NA

I've attempted to work with a multitude of loops, but it's not really doing what I want and I feel like I might be overcomplicating things, so any ideas on how to achieve this are very welcome!

https://tidyr.tidyverse.org/articles/pivot.html#wider

1 Like

Thanks for this! However, when I try

pivot_wider(data, names_from = Day, values_from = Quantity)

I get error/vctrs_error_names_must_be_unique
Is this because for example for Fred and Monday, there are two different quantities (one for coffee and one for tea)? And how do I account for that?

> pivot_wider(data, names_from = Day, values_from = Quantity)
# A tibble: 5 x 7
  Employee Drink  Monday Tuesday Wednesday Thursday Friday
  <fct>    <fct>   <int>   <int>     <int>    <int>  <int>
1 Fred     Tea         1       3         2       NA     NA
2 Fred     Coffee      6      NA        NA       NA     NA
3 Marsha   Coffee      3      NA         1       NA      2
4 Marsha   Tea        NA       2        NA        5     NA
5 Bob      Tea         3      NA        NA        7     NA

it works, perhaps you have an earlier version of tidyr?
what does this give you ?

packageVersion("tidyr")
1 Like

Actually, it works with this data now, thanks! I just can't get it to work with my actual dataset, the example may have been a bit oversimplified...
I'm trying to run pivot_wider(Pollutant_Releases_Old, names_from = ReportingYear, values_from = TotalQuantity) where the aspired result is to have one row per pollutant per facility

FacilityID NationalID FacilityName StreetName BuildingNumber PostalCode City CountryName Lat Long CompetentAuthorityName MainIAActivityCode MainIASectorName MainIAActivityName MainIASubActivityName MainActivityIndicator ReportingYear PollutantName PollutantGroupName TotalQuantity ReleaseMediumName
117743 2008000360 Acque Industriali Srl - Impianto di Pontedera Via Hangar 28/30 56025 PONTEDERA Italy 43.663333 10.615278 ISPRA 5.(c) Waste and waste water management Installations for the disposal of non-hazardous waste TRUE 2008 Nickel and compounds (as Ni) Heavy metals 94.7 Water
117743 2008000360 Acque Industriali Srl - Impianto di Pontedera Via Hangar 28/30 56025 PONTEDERA Italy 43.663333 10.615278 ISPRA 5.(c) Waste and waste water management Installations for the disposal of non-hazardous waste TRUE 2008 Zinc and compounds (as Zn) Heavy metals 200 Water
117743 2008000360 Acque Industriali Srl - Impianto di Pontedera Via Hangar 28/30 56025 PONTEDERA Italy 43.663333 10.615278 ISPRA 5.(c) Waste and waste water management Installations for the disposal of non-hazardous waste TRUE 2009 Nickel and compounds (as Ni) Heavy metals 64 Water
117743 2008000360 Acque Industriali Srl - Impianto di Pontedera Via Hangar 28/30 56025 PONTEDERA Italy 43.663333 10.615278 ISPRA 5.(c) Waste and waste water management Installations for the disposal of non-hazardous waste TRUE 2009 Zinc and compounds (as Zn) Heavy metals 320 Water
117743 2008000360 Acque Industriali Srl - Impianto di Pontedera Via Hangar 28/30 56025 PONTEDERA Italy 43.663333 10.615278 ISPRA 5.(c) Waste and waste water management Installations for the disposal of non-hazardous waste TRUE 2010 Nickel and compounds (as Ni) Heavy metals 193 Water
117743 2008000360 Acque Industriali Srl - Impianto di Pontedera Via Hangar 28/30 56025 PONTEDERA Italy 43.663333 10.615278 ISPRA 5.(c) Waste and waste water management Installations for the disposal of non-hazardous waste TRUE 2010 Zinc and compounds (as Zn) Heavy metals 172 Water
117743 2008000360 Acque Industriali Srl - Impianto di Pontedera Via Hangar 28/30 56025 PONTEDERA Italy 43.663333 10.615278 ISPRA 5.(c) Waste and waste water management Installations for the disposal of non-hazardous waste TRUE 2011 Copper and compounds (as Cu) Heavy metals 132 Water
117743 2008000360 Acque Industriali Srl - Impianto di Pontedera Via Hangar 28/30 56025 PONTEDERA Italy 43.663333 10.615278 ISPRA 5.(c) Waste and waste water management Installations for the disposal of non-hazardous waste TRUE 2011 Nickel and compounds (as Ni) Heavy metals 593 Water
117743 2008000360 Acque Industriali Srl - Impianto di Pontedera Via Hangar 28/30 56025 PONTEDERA Italy 43.663333 10.615278 ISPRA 5.(c) Waste and waste water management Installations for the disposal of non-hazardous waste TRUE 2011 Zinc and compounds (as Zn) Heavy metals 208 Water
117743 2008000360 Acque Industriali Srl - Impianto di Pontedera Via Hangar 28/30 56025 PONTEDERA Italy 43.663333 10.615278 ISPRA 5.(c) Waste and waste water management Installations for the disposal of non-hazardous waste TRUE 2011 Total organic carbon (TOC) (as total C or COD/3) Other organic substances 178000 Water
117743 2008000360 Acque Industriali Srl - Impianto di Pontedera Via Hangar 28/30 56025 PONTEDERA Italy 43.663333 10.615278 ISPRA 5.(c) Waste and waste water management Installations for the disposal of non-hazardous waste TRUE 2012 Chromium and compounds (as Cr) Heavy metals 106 Water
117743 2008000360 Acque Industriali Srl - Impianto di Pontedera Via Hangar 28/30 56025 PONTEDERA Italy 43.663333 10.615278 ISPRA 5.(c) Waste and waste water management Installations for the disposal of non-hazardous waste TRUE 2012 Nickel and compounds (as Ni) Heavy metals 177 Water
117743 2008000360 Acque Industriali Srl - Impianto di Pontedera Via Hangar 28/30 56025 PONTEDERA Italy 43.663333 10.615278 ISPRA 5.(c) Waste and waste water management Installations for the disposal of non-hazardous waste TRUE 2012 Total organic carbon (TOC) (as total C or COD/3) Other organic substances 104000 Water
117743 2008000360 Acque Industriali Srl - Impianto di Pontedera Via Hangar 28/30 56025 PONTEDERA Italy 43.663333 10.615278 ISPRA 5.(c) Waste and waste water management Installations for the disposal of non-hazardous waste TRUE 2013 Nickel and compounds (as Ni) Heavy metals 47 Water
117743 2008000360 Acque Industriali Srl - Impianto di Pontedera Via Hangar 28/30 56025 PONTEDERA Italy 43.663333 10.615278 ISPRA 5.(c) Waste and waste water management Installations for the disposal of non-hazardous waste TRUE 2013 Arsenic and compounds (as As) Heavy metals 14.2 Water
117743 2008000360 Acque Industriali Srl - Impianto di Pontedera Via Hangar 28/30 56025 PONTEDERA Italy 43.663333 10.615278 ISPRA 5.(c) Waste and waste water management Installations for the disposal of non-hazardous waste TRUE 2013 Total organic carbon (TOC) (as total C or COD/3) Other organic substances 110000 Water
117743 2008000360 Acque Industriali Srl - Impianto di Pontedera Via Hangar 28/30 56025 PONTEDERA Italy 43.663333 10.615278 ISPRA 5.(c) Waste and waste water management Installations for the disposal of non-hazardous waste TRUE 2014 Nickel and compounds (as Ni) Heavy metals 43.9 Water
117743 2008000360 Acque Industriali Srl - Impianto di Pontedera Via Hangar 28/30 56025 PONTEDERA Italy 43.663333 10.615278 ISPRA 5.(c) Waste and waste water management Installations for the disposal of non-hazardous waste TRUE 2014 Mercury and compounds (as Hg) Heavy metals 21.5 Water
117743 2008000360 Acque Industriali Srl - Impianto di Pontedera Via Hangar 28/30 56025 PONTEDERA Italy 43.663333 10.615278 ISPRA 5.(c) Waste and waste water management Installations for the disposal of non-hazardous waste TRUE 2014 Arsenic and compounds (as As) Heavy metals 7.83 Water

Leanne, you simply need to sufficiently aggregate your long data before attempting to widen it.
I would show you but your way of sharing your data is inefficient and difficult to work with.
use dput() function on your dataframe to present it in a shareable way.
if you need to shorten your dataframe before sharing it use the head() function first to pick n number of rows from it.

1 Like

So sorry, hope this is better.

structure(list(FacilityID = c(117743, 117743, 117743, 117743, 
117743, 117743, 117743, 117743, 117743, 117743, 117743, 117743, 
117743, 117743, 117743, 117743, 117743, 117743, 117743), NationalID = c("2008000360", 
"2008000360", "2008000360", "2008000360", "2008000360", "2008000360", 
"2008000360", "2008000360", "2008000360", "2008000360", "2008000360", 
"2008000360", "2008000360", "2008000360", "2008000360", "2008000360", 
"2008000360", "2008000360", "2008000360"), FacilityName = c("Acque Industriali Srl - Impianto di Pontedera", 
"Acque Industriali Srl - Impianto di Pontedera", "Acque Industriali Srl - Impianto di Pontedera", 
"Acque Industriali Srl - Impianto di Pontedera", "Acque Industriali Srl - Impianto di Pontedera", 
"Acque Industriali Srl - Impianto di Pontedera", "Acque Industriali Srl - Impianto di Pontedera", 
"Acque Industriali Srl - Impianto di Pontedera", "Acque Industriali Srl - Impianto di Pontedera", 
"Acque Industriali Srl - Impianto di Pontedera", "Acque Industriali Srl - Impianto di Pontedera", 
"Acque Industriali Srl - Impianto di Pontedera", "Acque Industriali Srl - Impianto di Pontedera", 
"Acque Industriali Srl - Impianto di Pontedera", "Acque Industriali Srl - Impianto di Pontedera", 
"Acque Industriali Srl - Impianto di Pontedera", "Acque Industriali Srl - Impianto di Pontedera", 
"Acque Industriali Srl - Impianto di Pontedera", "Acque Industriali Srl - Impianto di Pontedera"
), StreetName = c("Via Hangar", "Via Hangar", "Via Hangar", "Via Hangar", 
"Via Hangar", "Via Hangar", "Via Hangar", "Via Hangar", "Via Hangar", 
"Via Hangar", "Via Hangar", "Via Hangar", "Via Hangar", "Via Hangar", 
"Via Hangar", "Via Hangar", "Via Hangar", "Via Hangar", "Via Hangar"
), BuildingNumber = c("28/30", "28/30", "28/30", "28/30", "28/30", 
"28/30", "28/30", "28/30", "28/30", "28/30", "28/30", "28/30", 
"28/30", "28/30", "28/30", "28/30", "28/30", "28/30", "28/30"
), PostalCode = c("56025", "56025", "56025", "56025", "56025", 
"56025", "56025", "56025", "56025", "56025", "56025", "56025", 
"56025", "56025", "56025", "56025", "56025", "56025", "56025"
), City = c("PONTEDERA", "PONTEDERA", "PONTEDERA", "PONTEDERA", 
"PONTEDERA", "PONTEDERA", "PONTEDERA", "PONTEDERA", "PONTEDERA", 
"PONTEDERA", "PONTEDERA", "PONTEDERA", "PONTEDERA", "PONTEDERA", 
"PONTEDERA", "PONTEDERA", "PONTEDERA", "PONTEDERA", "PONTEDERA"
), CountryName = c("Italy", "Italy", "Italy", "Italy", "Italy", 
"Italy", "Italy", "Italy", "Italy", "Italy", "Italy", "Italy", 
"Italy", "Italy", "Italy", "Italy", "Italy", "Italy", "Italy"
), Lat = c(43.663333, 43.663333, 43.663333, 43.663333, 43.663333, 
43.663333, 43.663333, 43.663333, 43.663333, 43.663333, 43.663333, 
43.663333, 43.663333, 43.663333, 43.663333, 43.663333, 43.663333, 
43.663333, 43.663333), Long = c(10.615278, 10.615278, 10.615278, 
10.615278, 10.615278, 10.615278, 10.615278, 10.615278, 10.615278, 
10.615278, 10.615278, 10.615278, 10.615278, 10.615278, 10.615278, 
10.615278, 10.615278, 10.615278, 10.615278), CompetentAuthorityName = c("ISPRA", 
"ISPRA", "ISPRA", "ISPRA", "ISPRA", "ISPRA", "ISPRA", "ISPRA", 
"ISPRA", "ISPRA", "ISPRA", "ISPRA", "ISPRA", "ISPRA", "ISPRA", 
"ISPRA", "ISPRA", "ISPRA", "ISPRA"), MainIAActivityCode = c("5.(c)", 
"5.(c)", "5.(c)", "5.(c)", "5.(c)", "5.(c)", "5.(c)", "5.(c)", 
"5.(c)", "5.(c)", "5.(c)", "5.(c)", "5.(c)", "5.(c)", "5.(c)", 
"5.(c)", "5.(c)", "5.(c)", "5.(c)"), MainIASectorName = c("Waste and waste water management", 
"Waste and waste water management", "Waste and waste water management", 
"Waste and waste water management", "Waste and waste water management", 
"Waste and waste water management", "Waste and waste water management", 
"Waste and waste water management", "Waste and waste water management", 
"Waste and waste water management", "Waste and waste water management", 
"Waste and waste water management", "Waste and waste water management", 
"Waste and waste water management", "Waste and waste water management", 
"Waste and waste water management", "Waste and waste water management", 
"Waste and waste water management", "Waste and waste water management"
), MainIAActivityName = c("Installations for the disposal of non-hazardous waste", 
"Installations for the disposal of non-hazardous waste", "Installations for the disposal of non-hazardous waste", 
"Installations for the disposal of non-hazardous waste", "Installations for the disposal of non-hazardous waste", 
"Installations for the disposal of non-hazardous waste", "Installations for the disposal of non-hazardous waste", 
"Installations for the disposal of non-hazardous waste", "Installations for the disposal of non-hazardous waste", 
"Installations for the disposal of non-hazardous waste", "Installations for the disposal of non-hazardous waste", 
"Installations for the disposal of non-hazardous waste", "Installations for the disposal of non-hazardous waste", 
"Installations for the disposal of non-hazardous waste", "Installations for the disposal of non-hazardous waste", 
"Installations for the disposal of non-hazardous waste", "Installations for the disposal of non-hazardous waste", 
"Installations for the disposal of non-hazardous waste", "Installations for the disposal of non-hazardous waste"
), MainIASubActivityName = c(NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), MainActivityIndicator = c(TRUE, 
TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, 
TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE), `2007` = c(NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA), `2008` = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA), `2009` = c(NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), `2010` = c(NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA), `2011` = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA), `2012` = c(NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), `2013` = c(NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA), `2014` = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA), `2015` = c(NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), `2016` = c(NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA), `2017` = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA), ReportingYear = c(2008, 2008, 
2009, 2009, 2010, 2010, 2011, 2011, 2011, 2011, 2012, 2012, 2012, 
2013, 2013, 2013, 2014, 2014, 2014), PollutantName = c("Nickel and compounds (as Ni)", 
"Zinc and compounds (as Zn)", "Nickel and compounds (as Ni)", 
"Zinc and compounds (as Zn)", "Nickel and compounds (as Ni)", 
"Zinc and compounds (as Zn)", "Copper and compounds (as Cu)", 
"Nickel and compounds (as Ni)", "Zinc and compounds (as Zn)", 
"Total organic carbon (TOC) (as total C or COD/3)", "Chromium and compounds (as Cr)", 
"Nickel and compounds (as Ni)", "Total organic carbon (TOC) (as total C or COD/3)", 
"Nickel and compounds (as Ni)", "Arsenic and compounds (as As)", 
"Total organic carbon (TOC) (as total C or COD/3)", "Nickel and compounds (as Ni)", 
"Mercury and compounds (as Hg)", "Arsenic and compounds (as As)"
), PollutantGroupName = c("Heavy metals", "Heavy metals", "Heavy metals", 
"Heavy metals", "Heavy metals", "Heavy metals", "Heavy metals", 
"Heavy metals", "Heavy metals", "Other organic substances", "Heavy metals", 
"Heavy metals", "Other organic substances", "Heavy metals", "Heavy metals", 
"Other organic substances", "Heavy metals", "Heavy metals", "Heavy metals"
), TotalQuantity = c(94.7, 200, 64, 320, 193, 172, 132, 593, 
208, 178000, 106, 177, 104000, 47, 14.2, 110000, 43.9, 21.5, 
7.83), ReleaseMediumName = c("Water", "Water", "Water", "Water", 
"Water", "Water", "Water", "Water", "Water", "Water", "Water", 
"Water", "Water", "Water", "Water", "Water", "Water", "Water", 
"Water")), row.names = c(NA, -19L), class = c("tbl_df", "tbl", 
"data.frame"))
Pollutant_Releases_Old %>% 
  select(ReportingYear,TotalQuantity,FacilityID,PollutantName) %>% 
  pivot_wider(names_from = ReportingYear, values_from = TotalQuantity) 
# # A tibble: 7 x 9
# FacilityID PollutantName                                    `2008` `2009` `2010` `2011` `2012`   `2013` `2014`
# <dbl> <chr>                                             <dbl>  <dbl>  <dbl>  <dbl>  <dbl>    <dbl>  <dbl>
# 1     117743 Nickel and compounds (as Ni)                       94.7     64    193    593    177     47    43.9 
# 2     117743 Zinc and compounds (as Zn)                        200      320    172    208     NA     NA    NA   
# 3     117743 Copper and compounds (as Cu)                       NA       NA     NA    132     NA     NA    NA   
# 4     117743 Total organic carbon (TOC) (as total C or COD/3)   NA       NA     NA 178000 104000 110000    NA   
# 5     117743 Chromium and compounds (as Cr)                     NA       NA     NA     NA    106     NA    NA   
# 6     117743 Arsenic and compounds (as As)                      NA       NA     NA     NA     NA     14.2   7.83
# 7     117743 Mercury and compounds (as Hg)                      NA       NA     NA     NA     NA     NA    21.5
1 Like

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.