Different implementations of NSE in `dplyr` and `tidyr`?

I have a few package functions that use non-standard evaluation (NSE). They work fine when the function internally uses only dplyr, but run into issues when tidyr is also involved. Since both packages belong to the tidyverse ecosystem, I had assumed that their implementations of NSE were similar.

Is this not true? Are the following differences expected, or am I doing something wrong here?
Thanks.

setup

set.seed(123)
library(tidyverse)
df <- as.data.frame(Titanic)

# function definition
foo <- function(data, x) {
  # dplyr
  print(as_tibble(dplyr::select(data, {{x}})))
  
  # tidyr
  print(as_tibble(tidyr::uncount(data = data, weights = {{ x }})))
}

with symbol

Works as expected for both dplyr and tidyr.

foo(df, Freq)
#> # A tibble: 32 x 1
#>     Freq
#>    <dbl>
#>  1     0
#>  2     0
#>  3    35
#>  4     0
#>  5     0
#>  6     0
#>  7    17
#>  8     0
#>  9   118
#> 10   154
#> # ... with 22 more rows

#> # A tibble: 2,201 x 4
#>    Class Sex   Age   Survived
#>    <fct> <fct> <fct> <fct>   
#>  1 3rd   Male  Child No      
#>  2 3rd   Male  Child No      
#>  3 3rd   Male  Child No      
#>  4 3rd   Male  Child No      
#>  5 3rd   Male  Child No      
#>  6 3rd   Male  Child No      
#>  7 3rd   Male  Child No      
#>  8 3rd   Male  Child No      
#>  9 3rd   Male  Child No      
#> 10 3rd   Male  Child No      
#> # ... with 2,191 more rows

with string

Works as expected with dplyr, but not tidyr.

foo(df, "Freq")
#> # A tibble: 32 x 1
#>     Freq
#>    <dbl>
#>  1     0
#>  2     0
#>  3    35
#>  4     0
#>  5     0
#>  6     0
#>  7    17
#>  8     0
#>  9   118
#> 10   154
#> # ... with 22 more rows

#> Error: `weights` must evaluate to a numeric vector

with a column name

Works as expected with dplyr, but not tidyr.

foo(df, names(df)[5])
#> # A tibble: 32 x 1
#>     Freq
#>    <dbl>
#>  1     0
#>  2     0
#>  3    35
#>  4     0
#>  5     0
#>  6     0
#>  7    17
#>  8     0
#>  9   118
#> 10   154
#> # ... with 22 more rows

#> Error: `weights` must evaluate to a numeric vector

with NULL

Works as expected for both dplyr and tidyr.

foo(df, NULL)
#> # A tibble: 32 x 0

#> # A tibble: 2,201 x 5
#>    Class Sex   Age   Survived  Freq
#>    <fct> <fct> <fct> <fct>    <dbl>
#>  1 3rd   Male  Child No          35
#>  2 3rd   Male  Child No          35
#>  3 3rd   Male  Child No          35
#>  4 3rd   Male  Child No          35
#>  5 3rd   Male  Child No          35
#>  6 3rd   Male  Child No          35
#>  7 3rd   Male  Child No          35
#>  8 3rd   Male  Child No          35
#>  9 3rd   Male  Child No          35
#> 10 3rd   Male  Child No          35
#> # ... with 2,191 more rows

Created on 2020-06-11 by the reprex package (v0.3.0)

I don't think this is a dplyr vs. tidyr thing so much as a select() vs. other functions thing. select() allows strings to select columns, though I believe that is not currently encouraged. I believe current best practice is to wrap character vectors in the tidyselect helpers any_of() or all_of().
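As a rough sketch of what I mean by any_of() and all_of() (the variable names `col`, `strict` and `lenient` are just made up for this example, and it assumes dplyr 1.0.0 or later):

```r
library(dplyr)

df <- as.data.frame(Titanic)
col <- "Freq"  # column name stored in a character vector

# all_of() is strict: it errors if any of the named columns is missing
strict <- select(df, all_of(col))

# any_of() is lenient: names that are not in the data are silently ignored
lenient <- select(df, any_of(c(col, "not_a_column")))

names(strict)
names(lenient)
```

Both calls should return only the Freq column here. The difference only shows up when a name is missing from the data.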

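uncount(), by contrast, expects a bare variable name (more precisely, an expression that evaluates to a numeric vector), so the string "Freq" fails. If you add arrange() and mutate() for comparison, you can see that "Freq" doesn't really work with other dplyr functions either: arrange() leaves the rows in their original order, and mutate() adds a constant character column named `"Freq"`. select() is special in this way.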

set.seed(123)
library(tidyverse)
df <- as.data.frame(Titanic)

# function definition
foo <- function(data, x) {
  # dplyr
  print(as_tibble(dplyr::select(data, {{x}})))
  print(as_tibble(dplyr::arrange(data, {{x}})))
  print(as_tibble(dplyr::mutate(data, {{x}})))
  
  # tidyr
  print(as_tibble(tidyr::uncount(data = data, weights = {{ x }})))
}

foo(df, Freq)
#> # A tibble: 32 x 1
#>     Freq
#>    <dbl>
#>  1     0
#>  2     0
#>  3    35
#>  4     0
#>  5     0
#>  6     0
#>  7    17
#>  8     0
#>  9   118
#> 10   154
#> # … with 22 more rows
#> # A tibble: 32 x 5
#>    Class Sex    Age   Survived  Freq
#>    <fct> <fct>  <fct> <fct>    <dbl>
#>  1 1st   Male   Child No           0
#>  2 2nd   Male   Child No           0
#>  3 Crew  Male   Child No           0
#>  4 1st   Female Child No           0
#>  5 2nd   Female Child No           0
#>  6 Crew  Female Child No           0
#>  7 Crew  Male   Child Yes          0
#>  8 Crew  Female Child Yes          0
#>  9 1st   Female Child Yes          1
#> 10 Crew  Female Adult No           3
#> # … with 22 more rows
#> # A tibble: 32 x 5
#>    Class Sex    Age   Survived  Freq
#>    <fct> <fct>  <fct> <fct>    <dbl>
#>  1 1st   Male   Child No           0
#>  2 2nd   Male   Child No           0
#>  3 3rd   Male   Child No          35
#>  4 Crew  Male   Child No           0
#>  5 1st   Female Child No           0
#>  6 2nd   Female Child No           0
#>  7 3rd   Female Child No          17
#>  8 Crew  Female Child No           0
#>  9 1st   Male   Adult No         118
#> 10 2nd   Male   Adult No         154
#> # … with 22 more rows
#> # A tibble: 2,201 x 4
#>    Class Sex   Age   Survived
#>    <fct> <fct> <fct> <fct>   
#>  1 3rd   Male  Child No      
#>  2 3rd   Male  Child No      
#>  3 3rd   Male  Child No      
#>  4 3rd   Male  Child No      
#>  5 3rd   Male  Child No      
#>  6 3rd   Male  Child No      
#>  7 3rd   Male  Child No      
#>  8 3rd   Male  Child No      
#>  9 3rd   Male  Child No      
#> 10 3rd   Male  Child No      
#> # … with 2,191 more rows
foo(df, "Freq")
#> # A tibble: 32 x 1
#>     Freq
#>    <dbl>
#>  1     0
#>  2     0
#>  3    35
#>  4     0
#>  5     0
#>  6     0
#>  7    17
#>  8     0
#>  9   118
#> 10   154
#> # … with 22 more rows
#> # A tibble: 32 x 5
#>    Class Sex    Age   Survived  Freq
#>    <fct> <fct>  <fct> <fct>    <dbl>
#>  1 1st   Male   Child No           0
#>  2 2nd   Male   Child No           0
#>  3 3rd   Male   Child No          35
#>  4 Crew  Male   Child No           0
#>  5 1st   Female Child No           0
#>  6 2nd   Female Child No           0
#>  7 3rd   Female Child No          17
#>  8 Crew  Female Child No           0
#>  9 1st   Male   Adult No         118
#> 10 2nd   Male   Adult No         154
#> # … with 22 more rows
#> # A tibble: 32 x 6
#>    Class Sex    Age   Survived  Freq `"Freq"`
#>    <fct> <fct>  <fct> <fct>    <dbl> <chr>   
#>  1 1st   Male   Child No           0 Freq    
#>  2 2nd   Male   Child No           0 Freq    
#>  3 3rd   Male   Child No          35 Freq    
#>  4 Crew  Male   Child No           0 Freq    
#>  5 1st   Female Child No           0 Freq    
#>  6 2nd   Female Child No           0 Freq    
#>  7 3rd   Female Child No          17 Freq    
#>  8 Crew  Female Child No           0 Freq    
#>  9 1st   Male   Adult No         118 Freq    
#> 10 2nd   Male   Adult No         154 Freq    
#> # … with 22 more rows
#> Error: `weights` must evaluate to a numeric vector

Created on 2020-06-11 by the reprex package (v0.3.0)
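If you do need to pass the column as a string, one possible workaround is to turn the string into a symbol and inject it into the call. This is only a sketch: `uncount_by_name()` and `expanded` are hypothetical names I made up, and it relies on `weights` supporting quasiquotation with `!!`.

```r
library(tidyr)
library(rlang)

df <- as.data.frame(Titanic)

# Hypothetical helper: takes the weights column as a single string
uncount_by_name <- function(data, col) {
  stopifnot(is.character(col), length(col) == 1)
  # sym() converts "Freq" to the symbol Freq; !! injects it into the call,
  # so uncount() sees a bare column name instead of a string
  uncount(data, weights = !!sym(col))
}

expanded <- uncount_by_name(df, "Freq")
nrow(expanded)
```

The number of rows should equal sum(df$Freq), the same as calling uncount(df, Freq) directly.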
