Creating New NA Columns With mutate_at worked in dplyr 0.8.1 but not in 0.8.2

A while ago, using an earlier version of dplyr, it took me some time to get this code working to create a template tibble. Eventually I had to settle on using the deprecated funs(), as I couldn't get list() and/or ~ to work.

I updated to dplyr 0.8.2 yesterday and the code no longer works. The error message is "could not find column 2013/14" (or similar). Can anybody help in getting the code working?

library(dplyr)
yrsToInclude = c("2013/14", "2014/15", "2015/16", "2016/17", "2017/18")

tibble(
  Pos = seq_len(10),
  Provider  = NA_character_
) %>% 
  mutate_at(yrsToInclude, funs(1 * NA))

This is the tibble that's produced in 0.8.1:

# A tibble: 10 x 7
     Pos Provider `2013/14` `2014/15` `2015/16` `2016/17` `2017/18`
   <int> <chr>        <dbl>     <dbl>     <dbl>     <dbl>     <dbl>
 1     1 <NA>            NA        NA        NA        NA        NA
 2     2 <NA>            NA        NA        NA        NA        NA
 3     3 <NA>            NA        NA        NA        NA        NA
 4     4 <NA>            NA        NA        NA        NA        NA
 5     5 <NA>            NA        NA        NA        NA        NA
 6     6 <NA>            NA        NA        NA        NA        NA
 7     7 <NA>            NA        NA        NA        NA        NA
 8     8 <NA>            NA        NA        NA        NA        NA
 9     9 <NA>            NA        NA        NA        NA        NA
10    10 <NA>            NA        NA        NA        NA        NA

I think you were relying on an unwanted behaviour that got fixed. Using mutate_at() this way doesn't seem natural, since it is meant to transform columns that already exist. This is an alternative approach; I'm sure there are more.

library(tidyverse)
yrsToInclude = c("2013/14", "2014/15", "2015/16", "2016/17", "2017/18")

tibble(Pos = seq_len(10),
       Provider  = NA_real_,
       yrsToInclude = rep(yrsToInclude, 2)) %>% 
    spread(yrsToInclude, Provider) %>% 
    mutate(Provider = NA_character_) %>% 
    select(Pos, Provider, everything())
#> # A tibble: 10 x 7
#>      Pos Provider `2013/14` `2014/15` `2015/16` `2016/17` `2017/18`
#>    <int> <chr>        <dbl>     <dbl>     <dbl>     <dbl>     <dbl>
#>  1     1 <NA>            NA        NA        NA        NA        NA
#>  2     2 <NA>            NA        NA        NA        NA        NA
#>  3     3 <NA>            NA        NA        NA        NA        NA
#>  4     4 <NA>            NA        NA        NA        NA        NA
#>  5     5 <NA>            NA        NA        NA        NA        NA
#>  6     6 <NA>            NA        NA        NA        NA        NA
#>  7     7 <NA>            NA        NA        NA        NA        NA
#>  8     8 <NA>            NA        NA        NA        NA        NA
#>  9     9 <NA>            NA        NA        NA        NA        NA
#> 10    10 <NA>            NA        NA        NA        NA        NA

Created on 2019-07-02 by the reprex package (v0.3.0)
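
If staying within dplyr verbs matters, another option is to build a named list of NA values and splice it into mutate() with !!!. This is only a minimal sketch, not something tested against every dplyr release: the names na_cols and template are made up for illustration, and it assumes all of the new columns should be doubles.

```r
library(dplyr)

yrsToInclude <- c("2013/14", "2014/15", "2015/16", "2016/17", "2017/18")

# Illustrative name (not from the thread): one NA_real_ entry per new column,
# named after the column it should create
na_cols <- setNames(rep(list(NA_real_), length(yrsToInclude)), yrsToInclude)

template <- tibble(
  Pos = seq_len(10),
  Provider = NA_character_
) %>%
  # !!! splices the named list into mutate() as `2013/14` = NA_real_, etc.
  mutate(!!!na_cols)

template
```

Because mutate() creates the columns here rather than selecting them, it does not matter that they don't exist in the input tibble.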

Is it imperative you do this with dplyr/mutate_at() or do everything in a pipe chain?

I saw several seemingly viable options in the answers to this Stack Overflow question about adding multiple columns to a tibble. However, not all of them would naturally work with pipes.

For example, you'll see in one answer you can add columns to an existing tibble based on a vector of strings using [ and assignment:

test = tibble(
    Pos = seq_len(10),
    Provider  = NA_character_
) 

test[ , yrsToInclude] = NA_real_

# A tibble: 10 x 7
     Pos Provider `2013/14` `2014/15` `2015/16` `2016/17` `2017/18`
   <int> <chr>        <dbl>     <dbl>     <dbl>     <dbl>     <dbl>
 1     1 NA              NA        NA        NA        NA        NA
 2     2 NA              NA        NA        NA        NA        NA
 3     3 NA              NA        NA        NA        NA        NA
 4     4 NA              NA        NA        NA        NA        NA
 5     5 NA              NA        NA        NA        NA        NA
 6     6 NA              NA        NA        NA        NA        NA
 7     7 NA              NA        NA        NA        NA        NA
 8     8 NA              NA        NA        NA        NA        NA
 9     9 NA              NA        NA        NA        NA        NA
10    10 NA              NA        NA        NA        NA        NA

This can be done in a pipe chain using [<- as follows:

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

yrsToInclude <- c("2013/14", "2014/15", "2015/16", "2016/17", "2017/18")

tibble(Pos = seq_len(10),
       Provider = NA_character_) %>%
  "[<-"(yrsToInclude, value = NA_real_)
#> # A tibble: 10 x 7
#>      Pos Provider `2013/14` `2014/15` `2015/16` `2016/17` `2017/18`
#>    <int> <chr>        <dbl>     <dbl>     <dbl>     <dbl>     <dbl>
#>  1     1 <NA>            NA        NA        NA        NA        NA
#>  2     2 <NA>            NA        NA        NA        NA        NA
#>  3     3 <NA>            NA        NA        NA        NA        NA
#>  4     4 <NA>            NA        NA        NA        NA        NA
#>  5     5 <NA>            NA        NA        NA        NA        NA
#>  6     6 <NA>            NA        NA        NA        NA        NA
#>  7     7 <NA>            NA        NA        NA        NA        NA
#>  8     8 <NA>            NA        NA        NA        NA        NA
#>  9     9 <NA>            NA        NA        NA        NA        NA
#> 10    10 <NA>            NA        NA        NA        NA        NA

Thanks for this solution. A chained solution is not currently imperative, but this might end up in a purrr workflow, so this should work nicely.

David
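
For the purrr case mentioned above, one way this could look is to wrap the [ assignment in a small helper and map it over a list of tibbles. This is a rough sketch under assumptions: add_na_columns() and the input list are hypothetical and only for illustration, not part of dplyr or purrr.

```r
library(dplyr)
library(purrr)

yrsToInclude <- c("2013/14", "2014/15", "2015/16", "2016/17", "2017/18")

# Hypothetical helper (not from any package): add the named columns,
# filled with `value`, and return the modified data frame
add_na_columns <- function(df, cols, value = NA_real_) {
  df[cols] <- value
  df
}

# Hypothetical input: a named list of template tibbles of different lengths
templates <- list(
  a = tibble(Pos = seq_len(3), Provider = NA_character_),
  b = tibble(Pos = seq_len(5), Provider = NA_character_)
) %>%
  map(add_na_columns, cols = yrsToInclude)

templates$a
```

The helper also works directly in a single pipe, for example tibble(Pos = seq_len(10), Provider = NA_character_) %>% add_na_columns(yrsToInclude).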

Oh, great. I thought that should work but I must have mis-typed something when I tried it. 🙂
