Trying to use dplyr::mutate_at() with purrr::map()


#1

I'm trying to replace NA values in a tibble with -9999, as is common in our field. Doing this to a single tibble with mutate_at() works, but doing it to a list of tibbles with map() does not. I have tried many variations of what you can see in the reprex, with and without funs(), and with and without the ~ operator, but I can't get the incantation right. I always seem to get an error about class or quosures.

I could mutate each column individually, but our actual dataset has many more variables. It would be cumbersome but not impossible.

Could somebody show me what I'm missing? Thank you!


library(tidyverse)

# create list of tibbles 'y'

x <- tibble(X1 = seq.Date(
  from = as.Date("2018-01-01"),
  to = as.Date("2018-01-10"),
  by = "1 day"
)) %>%
  mutate(X2 = as.numeric(sample(
    c(11:13, NA), size = 10, replace = T
  ))) %>%
  mutate(X3 = as.numeric(sample(
    c(14:16, NA), size = 10, replace = T
  ))) %>%
  mutate(X4 = as.numeric(sample(
    c(17:19, NA), size = 10, replace = T
  )))

y <- list(x, x)

# replace NAs with '-9999' as is customary in the field

y[[1]] %>% mutate_all(funs(case_when(is.na(.) ~ -9999, TRUE ~ .)))
#> Error in mutate_impl(.data, dots): Evaluation error: must be numeric, not Date.

# mutate_all() doesn't work because the Date column's type doesn't match the
# numeric replacement value, as one would expect.

y[[1]] %>% mutate_at(vars(-X1), funs(case_when(is.na(.) ~ -9999, TRUE ~ .)))
#> # A tibble: 10 x 4
#>    X1            X2    X3    X4
#>    <date>     <dbl> <dbl> <dbl>
#>  1 2018-01-01    13    15 -9999
#>  2 2018-01-02    12    14    19
#>  3 2018-01-03 -9999    15    19
#>  4 2018-01-04 -9999    15 -9999
#>  5 2018-01-05    11    15 -9999
#>  6 2018-01-06    12    16    19
#>  7 2018-01-07 -9999    15    19
#>  8 2018-01-08    13    16 -9999
#>  9 2018-01-09 -9999 -9999    17
#> 10 2018-01-10    11    14    19

# mutate_at() works, but it doesn't seem to scale with map

y %>% map(case_when(is.na(.) ~ -9999, TRUE ~ . ))
#> Error: must be type double, not list

Created on 2018-11-28 by the reprex package (v0.2.1)


#2

In map, you need to begin with ~ to let map know that a formula is coming, and then use .x, the "pronoun" that refers to the current list element (here, each tibble in y). Because map passes a whole tibble to the formula rather than a single column, the case_when has to be wrapped in mutate_at so that it is applied column by column.

y %>% map(~ .x %>% mutate_at(vars(-X1), funs(case_when(is.na(.) ~ -9999, 
                                                       TRUE ~ . ))))
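
To spell out what the formula shorthand is doing, here is a minimal sketch on a small made-up tibble. It uses a plain anonymous function in place of funs() (which later dplyr releases deprecate) and if_else() in place of case_when(); the names res_formula and res_function are only for illustration and are not from the original post.

```r
library(dplyr)
library(purrr)

# Small stand-in for the list of tibbles in the question (values are made up)
x <- tibble(
  X1 = as.Date("2018-01-01") + 0:2,
  X2 = c(11, NA, 13),
  X3 = c(NA, 15, 16)
)
y <- list(x, x)

# Formula shorthand: .x is the current list element, i.e. one whole tibble
res_formula <- map(
  y,
  ~ mutate_at(.x, vars(-X1), function(col) if_else(is.na(col), -9999, col))
)

# The same thing written as an explicit anonymous function
res_function <- map(y, function(df) {
  mutate_at(df, vars(-X1), function(col) if_else(is.na(col), -9999, col))
})

# Both spellings should give the same list of tibbles
identical(res_formula, res_function)
```

The original attempt, map(case_when(...)), fails because case_when() is evaluated against the whole list before map() ever sees a function; wrapping it in ~ (or function(df)) defers the work to each tibble.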

#3

Thank you so much! I never would have figured out the syntax on my own...


#4

Just wanted to add that tidyr::replace_na() can do the work you are currently doing with case_when. Alternatively, you could use base R's replace() if you don't want or need to specify which columns to operate on.

library(tidyverse)

x <- tibble(X1 = seq.Date(
  from = as.Date("2018-01-01"),
  to = as.Date("2018-01-10"),
  by = "1 day"
  )) %>%
  mutate(X2 = as.numeric(sample(
    c(11:13, NA), size = 10, replace = T
  ))) %>%
  mutate(X3 = as.numeric(sample(
    c(14:16, NA), size = 10, replace = T
  ))) %>%
  mutate(X4 = as.numeric(sample(
    c(17:19, NA), size = 10, replace = T
  )))

y <- list(x, x)

map(y, ~ mutate_at(.x, vars(-X1), replace_na, -9999))
#> [[1]]
#> # A tibble: 10 x 4
#>    X1            X2    X3    X4
#>    <date>     <dbl> <dbl> <dbl>
#>  1 2018-01-01    12    14    19
#>  2 2018-01-02 -9999    14 -9999
#>  3 2018-01-03 -9999    16 -9999
#>  4 2018-01-04    11    14    17
#>  5 2018-01-05    11 -9999    18
#>  6 2018-01-06    11    14 -9999
#>  7 2018-01-07    11    15 -9999
#>  8 2018-01-08    13    16    19
#>  9 2018-01-09    12    14    17
#> 10 2018-01-10    13    15    17
#> 
#> [[2]]
#> # A tibble: 10 x 4
#>    X1            X2    X3    X4
#>    <date>     <dbl> <dbl> <dbl>
#>  1 2018-01-01    12    14    19
#>  2 2018-01-02 -9999    14 -9999
#>  3 2018-01-03 -9999    16 -9999
#>  4 2018-01-04    11    14    17
#>  5 2018-01-05    11 -9999    18
#>  6 2018-01-06    11    14 -9999
#>  7 2018-01-07    11    15 -9999
#>  8 2018-01-08    13    16    19
#>  9 2018-01-09    12    14    17
#> 10 2018-01-10    13    15    17

map(y, ~ replace(.x, is.na(.x), -9999))
#> [[1]]
#> # A tibble: 10 x 4
#>    X1            X2    X3    X4
#>    <date>     <dbl> <dbl> <dbl>
#>  1 2018-01-01    12    14    19
#>  2 2018-01-02 -9999    14 -9999
#>  3 2018-01-03 -9999    16 -9999
#>  4 2018-01-04    11    14    17
#>  5 2018-01-05    11 -9999    18
#>  6 2018-01-06    11    14 -9999
#>  7 2018-01-07    11    15 -9999
#>  8 2018-01-08    13    16    19
#>  9 2018-01-09    12    14    17
#> 10 2018-01-10    13    15    17
#> 
#> [[2]]
#> # A tibble: 10 x 4
#>    X1            X2    X3    X4
#>    <date>     <dbl> <dbl> <dbl>
#>  1 2018-01-01    12    14    19
#>  2 2018-01-02 -9999    14 -9999
#>  3 2018-01-03 -9999    16 -9999
#>  4 2018-01-04    11    14    17
#>  5 2018-01-05    11 -9999    18
#>  6 2018-01-06    11    14 -9999
#>  7 2018-01-07    11    15 -9999
#>  8 2018-01-08    13    16    19
#>  9 2018-01-09    12    14    17
#> 10 2018-01-10    13    15    17

Created on 2018-11-28 by the reprex package (v0.2.1)
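
For readers on a newer dplyr, the same idea can probably be written with across(), which supersedes mutate_at(). This is a sketch that assumes dplyr 1.0.0 or later (not the version used in the reprex above); the small tibble and the name res_across are made up for illustration.

```r
library(dplyr)
library(purrr)
library(tidyr)

# Small stand-in for the list of tibbles (values are made up)
x <- tibble(
  X1 = as.Date("2018-01-01") + 0:2,
  X2 = c(11, NA, 13),
  X3 = c(NA, 15, 16)
)
y <- list(x, x)

# Select the numeric columns by type instead of excluding X1 by name,
# then replace NA with -9999 in each selected column
res_across <- map(y, function(df) {
  mutate(df, across(where(is.numeric), function(col) replace_na(col, -9999)))
})
```

Selecting with where(is.numeric) leaves the Date column untouched without having to name it, which may be handy when the list elements don't all share the same column names.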


#5

Thank you, that’s a great shortcut!

