Nest, not as one-column tibbles, but as vectors

So I have a simple data frame: a bunch of names, each with ~2,000 rows of varying numeric values. There's a function I want to use in my analysis that takes numeric vectors only, and I want to apply it to each name's values in my data frame.

Using nest() creates a tibble for each name (see below). What I'm trying to do is have each element be a vector, not a one-column tibble.

library(tidyverse)

df <- tibble(
  name = c(rep("Adam",5), rep("Liz",5)),
  value = c(1:5, 21:25)
)

df
#> # A tibble: 10 x 2
#>    name  value
#>    <chr> <int>
#>  1 Adam      1
#>  2 Adam      2
#>  3 Adam      3
#>  4 Adam      4
#>  5 Adam      5
#>  6 Liz      21
#>  7 Liz      22
#>  8 Liz      23
#>  9 Liz      24
#> 10 Liz      25

df %>%
  nest(-name)
#> # A tibble: 2 x 2
#>   name  data            
#>   <chr> <list>          
#> 1 Adam  <tibble [5 × 1]>
#> 2 Liz   <tibble [5 × 1]>

### What I think I want: 

#> # A tibble: 2 x 2
#>   name  data             
#>   <chr> <list>           
#> 1 Adam  <dbl [5]>
#> 2 Liz   <dbl [5]>

Created on 2019-05-23 by the reprex package (v0.2.1)

I've played around with unlist(), pluck(), and pull() mixed with assorted map()s, but none of them gets me where I need to be. Thanks in advance!

try this bro


df %>% group_by(name) %>% summarize(magic_list = list(value))
# A tibble: 2 x 2
  name  magic_list
  <chr> <list>    
1 Adam  <int [5]> 
2 Liz   <int [5]>
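
Once the values sit in a list column of plain vectors, a function that only accepts a numeric vector can be applied per name with purrr. A minimal sketch, using mean() as a stand-in for the real analysis function; the column names `values` and `avg` are illustrative choices, not anything from the thread:

```r
library(dplyr)
library(purrr)

df <- tibble(
  name  = c(rep("Adam", 5), rep("Liz", 5)),
  value = c(1:5, 21:25)
)

# One row per name; `values` is a list column holding one integer vector per row
nested <- df %>%
  group_by(name) %>%
  summarise(values = list(value))

# map_dbl() passes each vector to the function and returns one double per row;
# mean() is only a placeholder for whatever vector-only function you need
result <- nested %>%
  mutate(avg = map_dbl(values, mean))

result$avg
#> [1]  3 23
```

If the function returns something more complex than a single number, map() (which returns a list) is the safer choice.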

You can also wrap @von_olaf's solution in a function that lets you group by multiple columns and nest a single column into a vector:

library(dplyr)
library(ggplot2)

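nest_vec <- function(data, value, ...){
  # Capture the column to nest and the grouping columns without evaluating them
  value <- enquo(value)
  value_name <- quo_name(value)
  groups <- enquos(...)
  
  # Note: `value_name :=` names the new column literally "value_name";
  # write `!!value_name :=` instead to name it after the nested column
  output <- data %>% 
    group_by(!!!groups) %>% 
    summarize(value_name := list(!!value))
  
  return(output)
}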

df <- tibble(
  name = c(rep("Adam",5), rep("Liz",5)),
  value = c(1:5, 21:25)
)

df %>% 
  nest_vec(value, name)
#> # A tibble: 2 x 2
#>   name  value_name
#>   <chr> <list>    
#> 1 Adam  <int [5]> 
#> 2 Liz   <int [5]>


diamonds %>% 
  nest_vec(depth, cut, color)
#> # A tibble: 35 x 3
#> # Groups:   cut [5]
#>    cut   color value_name 
#>    <ord> <ord> <list>     
#>  1 Fair  D     <dbl [163]>
#>  2 Fair  E     <dbl [224]>
#>  3 Fair  F     <dbl [312]>
#>  4 Fair  G     <dbl [314]>
#>  5 Fair  H     <dbl [303]>
#>  6 Fair  I     <dbl [175]>
#>  7 Fair  J     <dbl [119]>
#>  8 Good  D     <dbl [662]>
#>  9 Good  E     <dbl [933]>
#> 10 Good  F     <dbl [909]>
#> # ... with 25 more rows

Created on 2019-05-23 by the reprex package (v0.2.0).
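
As the output above shows, nest_vec() always calls its list column value_name, whatever column was passed in. With newer tidyverse versions the same idea can be written more compactly using the embrace operator. This is only a sketch: `nest_vec2` is a hypothetical name, and it assumes dplyr >= 1.0.0 and rlang >= 0.4.3 (for the glue-style "{{ value }}" name and the `.groups` argument).

```r
library(dplyr)

# Hypothetical variant of nest_vec(); assumes dplyr >= 1.0.0 and rlang >= 0.4.3
nest_vec2 <- function(data, value, ...) {
  data %>%
    group_by(...) %>%
    # "{{ value }}" := names the list column after the column that was passed in;
    # .groups = "drop" returns an ungrouped tibble
    summarise("{{ value }}" := list({{ value }}), .groups = "drop")
}

df <- tibble(
  name  = c(rep("Adam", 5), rep("Liz", 5)),
  value = c(1:5, 21:25)
)

out <- df %>% nest_vec2(value, name)
names(out)
#> [1] "name"  "value"
```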


So easy, thank you! I was fixating on using nest() (which seemed to make sense in this context).

Next question. I'm trying to use those new vectors in a map() call with a function that requires an atomic vector as input, but it throws the error "(list) object cannot be coerced to type 'double'". How do I make the function grab the vector inside each list item?

# Your solution! 
df2 <- df %>%
  group_by(name) %>%
  summarise(magic_list = list(value))

# This is what I tried next: 
df2 %>%
  group_by(name) %>%
  mutate(outcome = map(magic_list, crqa::optimizeParam(., rhand_y, mlpar)))
#> Error in crqa::optimizeParam(., rhand_y, mlpar) : 
#>  (list) object cannot be coerced to type 'double'

# But this works fine (double square brackets): 
crqa::optimizeParam(df2$magic_list[[1]], rhand_y, mlpar)

# This won't work, though (single square brackets): 
crqa::optimizeParam(df2$magic_list[1], rhand_y, mlpar)
#> Error in min(series) : invalid 'type' (list) of argument
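
The difference between the two calls comes down to base R list indexing, independent of crqa. A small sketch with a stand-in list (the `magic_list` object here is rebuilt by hand to mirror the df2$magic_list column above):

```r
# A list shaped like the `magic_list` column from above
magic_list <- list(1:5, 21:25)

# Single brackets keep the list wrapper: the result is a list of length 1
class(magic_list[1])
#> [1] "list"

# Double brackets extract the element itself: a plain integer vector
class(magic_list[[1]])
#> [1] "integer"

# Functions that need an atomic vector work on the extracted element ...
min(magic_list[[1]])
#> [1] 1

# ... but fail on the one-element list
failed <- inherits(try(min(magic_list[1]), silent = TRUE), "try-error")
failed
#> [1] TRUE
```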

That is useful @tbradley - nominate that for an upcoming dplyr release! nest_*() similar to map_*() would be useful.

@tbradley - would you perhaps know an answer to my new question above? Thank you in advance!

Never mind! I figured it out by using anonymous function syntax:

df2 %>%
  group_by(name) %>%
  mutate(outcome = map(magic_list, ~ crqa::optimizeParam(.x, rhand_y, mlpar)))

Apparently that is better than the version below, which is what I tried before:

df2 %>%
  group_by(name) %>%
  mutate(outcome = map(magic_list, crqa::optimizeParam(., rhand_y, mlpar)))

I'm not really sure why the first works but the second doesn't, but I'll run with it.


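The two lines are completely different (hence the different results).

When you use an anonymous function (the one with ~), map() calls it once per element of magic_list, and .x is that element, i.e. a vector (what you want).
When you use . in your second example, it is the magrittr pipe's shorthand for the result of the previous step, so in your case it is the whole grouped data frame (not what you want). On top of that, crqa::optimizeParam(., rhand_y, mlpar) without the ~ is a call, not a function, so it is evaluated once right away instead of being handed to map().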
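
A small sketch of that difference, with mean() standing in for crqa::optimizeParam() and a hand-built `df2`; the column names `dot_is_df` and `outcome` are only illustrative:

```r
library(dplyr)
library(purrr)

df2 <- tibble(
  name       = c("Adam", "Liz"),
  magic_list = list(1:5, 21:25)
)

# Inside a pipe step, a `.` nested in another call is the whole piped-in object
checked <- df2 %>% mutate(dot_is_df = is.data.frame(.))
checked$dot_is_df
#> [1] TRUE TRUE

# The formula builds a function; map_dbl() calls it once per list element,
# with .x bound to that element (here an integer vector)
res <- df2 %>% mutate(outcome = map_dbl(magic_list, ~ mean(.x)))
res$outcome
#> [1]  3 23

# The same thing written as an explicit anonymous function
res2 <- df2 %>% mutate(outcome = map_dbl(magic_list, function(v) mean(v)))
```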


I see, thanks for the assist!
