tidyr::nest() isn't nesting on the column I expect


#1

I have a sample of a United Nations data set. I was trying to nest it so that there are two columns, one with the country name and one holding a list of data for each country, like so:

# A tibble: 200 x 2
   country                         data             
   <chr>                           <list>           
 1 Afghanistan                     <tibble [34 x 3]>
 2 Argentina                       <tibble [34 x 3]>
 3 Australia                       <tibble [34 x 3]>
 4 Belarus                         <tibble [34 x 3]>
 5 Belgium                         <tibble [34 x 3]>

When I run the command inside DataCamp, it works as expected.

However, when I run it in RStudio, each year shows up as a row. I tried nest(-country), as well as simply listing all the variables I wanted to nest (leaving country out); in both cases year was left out of the nest instead of country.

library(tidyverse)
#> Warning: package 'tidyr' was built under R version 3.4.4
#> Warning: package 'purrr' was built under R version 3.4.4
#> Warning: package 'dplyr' was built under R version 3.4.4
#> Warning: package 'stringr' was built under R version 3.4.4
#> Warning: package 'forcats' was built under R version 3.4.4
library(countrycode)
#> Warning: package 'countrycode' was built under R version 3.4.4
library(broom)
#> Warning: package 'broom' was built under R version 3.4.4
library(tidyr)
library(reprex)

#subset of data (put together using dput)

Sample_Data <-
    structure(list(
        year = structure(c(1997, 1997, 1997, 1999, 1999, 1999, 2001, 2001, 2001,
                           2003, 2003, 2003, 2005, 2005, 2005, 2007, 2007, 2007,
                           2009, 2009, 2009, 2011, 2011, 2011, 2013, 2013, 2013),
                         comment = ""),
        country = c("France", "United Kingdom", "United States",
                    "France", "United Kingdom", "United States",
                    "France", "United Kingdom", "United States",
                    "France", "United Kingdom", "United States",
                    "France", "United Kingdom", "United States",
                    "France", "United Kingdom", "United States",
                    "France", "United Kingdom", "United States",
                    "France", "United Kingdom", "United States",
                    "France", "United Kingdom", "United States"),
        total = c(69L, 69L, 69L, 68L, 68L, 68L, 65L, 67L, 67L,
                  76L, 76L, 76L, 74L, 74L, 73L, 76L, 77L, 77L,
                  69L, 69L, 69L, 65L, 65L, 65L, 64L, 64L, 64L),
        percent_yes = c(0.565217391304348, 0.565217391304348, 0.289855072463768,
                        0.558823529411765, 0.558823529411765, 0.235294117647059,
                        0.538461538461538, 0.537313432835821, 0.164179104477612,
                        0.565789473684211, 0.513157894736842, 0.171052631578947,
                        0.581081081081081, 0.581081081081081, 0.164383561643836,
                        0.526315789473684, 0.506493506493506, 0.116883116883117,
                        0.478260869565217, 0.492753623188406, 0.188405797101449,
                        0.569230769230769, 0.553846153846154, 0.261538461538462,
                        0.515625, 0.5, 0.203125)),
        class = c("grouped_df", "tbl_df", "tbl", "data.frame"),
        row.names = c(NA, -27L),
        vars = "year",
        drop = TRUE,
        .Names = c("year", "country", "total", "percent_yes"),
        indices = list(0:2, 3:5, 6:8, 9:11, 12:14, 15:17, 18:20, 21:23, 24:26),
        group_sizes = c(3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L),
        biggest_group_size = 3L,
        labels = structure(list(
            year = structure(c(1997, 1999, 2001, 2003, 2005, 2007, 2009, 2011, 2013),
                             comment = "")),
            class = "data.frame",
            row.names = c(NA, -9L),
            vars = "year",
            drop = TRUE,
            .Names = "year"))

# Attempt at nesting

nested <- Sample_Data %>%
    nest(-country)
#> Warning: package 'bindrcpp' was built under R version 3.4.4

Created on 2018-06-19 by the reprex package (v0.2.0).


#2

Hi,

try this:

Sample_Data %>% ungroup() %>% nest(-country)

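Sample_Data is still grouped by year (note the "grouped_df" class and vars = "year" in your dput output), so nest() nests by that grouping variable instead of by country. Calling ungroup() first removes the leftover grouping.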
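
If it helps, here is a minimal sketch of how one might check for leftover grouping before nesting. The `toy` tibble is a made-up stand-in for Sample_Data, not the real data, and `nest(data = -country)` is the spelling I would assume for tidyr 1.0 or later; on the older version used in this thread, nest(-country) is the equivalent call.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for Sample_Data, deliberately left grouped by year
toy <- tibble(
  year = rep(c(1997, 1999), each = 3),
  country = rep(c("France", "United Kingdom", "United States"), times = 2),
  percent_yes = c(0.57, 0.57, 0.29, 0.56, 0.56, 0.24)
) %>%
  group_by(year)

# Lists any grouping variables still attached to the data frame
group_vars(toy)

# Drop the grouping first, then nest everything except country
nested <- toy %>%
  ungroup() %>%
  nest(data = -country)

nested
```

A non-empty result from group_vars() is the hint that an earlier group_by() (or a grouped summarise()) is still in effect, which is easy to miss because the grouping travels with the object.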


#3

That was exactly my problem.

Thank you!