Is there a tidyverse shortcut for something like `replace(x, is.na(x), filler)`?


#1

Is there a tidyverse shortcut for `replace(df, is.na(df), filler)`?

I’m tempted to write this:

fill_na <- function(x, filler = 0) {
  replace(x, is.na(x), filler)
}

#2

Hi @mauro_lepore, I think `dplyr::coalesce()` may be the closest to what you’re looking for. Element by element, it returns the first non-NA value among the arguments passed in `...`. The only catch I see is that you may have to coerce the variable so that its type matches the type of the default value:

> df <- NA
> coalesce(as.double(df), 1)
[1] 1
> df <- 3
> coalesce(as.double(df), 1)
[1] 3
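To make the "first non-NA value" rule concrete for whole vectors, here is a minimal base R sketch of the same idea. `first_non_na()` is a hypothetical helper written only for illustration; it is not dplyr's implementation and only mimics the element-wise behaviour.

```r
# Hypothetical helper (not part of dplyr): for each position, keep the
# first non-NA value found among the arguments, scanning left to right.
first_non_na <- function(...) {
  Reduce(function(a, b) ifelse(is.na(a), b, a), list(...))
}

x <- c(NA, 2, NA)
y <- c(10, 20, 30)

first_non_na(x, 0)  # a length-1 default is recycled over the NA positions
first_non_na(x, y)  # NA positions are filled from the matching positions of y
```

This sketch leans on `ifelse()` and base R coercion, whereas `dplyr::coalesce()` checks that its inputs have compatible types, which is why the `as.double()` appears in the calls above.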

#3

Thanks!!! Here I summarize the discussion.

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union



# With dplyr 

x <- data.frame(x = c(NA, 1), y = c("a", NA), stringsAsFactors = FALSE)

coalesce(x, list(0))
#>   x y
#> 1 0 a
#> 2 1 0
coalesce(x, list("missing"))
#>         x       y
#> 1 missing       a
#> 2       1 missing



# With my wrapper

fill_na <- function(x, filler = 0) {
  x[is.na(x)] <- filler
  x
}

x <- data.frame(x = c(NA, 1), y = c("a", NA), stringsAsFactors = FALSE)

a_dataframe <- x
fill_na(a_dataframe)
#>   x y
#> 1 0 a
#> 2 1 0
fill_na(a_dataframe, "")
#>   x y
#> 1   a
#> 2 1
fill_na(a_dataframe, "missing")
#>         x       y
#> 1 missing       a
#> 2       1 missing

a_matrix <- as.matrix(x)
fill_na(a_matrix)
#>      x    y  
#> [1,] "0"  "a"
#> [2,] " 1" "0"

a_vector <- x$x
fill_na(a_vector)
#> [1] 0 1

a_list <- list(x, x, x)
lapply(a_list, fill_na)
#> [[1]]
#>   x y
#> 1 0 a
#> 2 1 0
#> 
#> [[2]]
#>   x y
#> 1 0 a
#> 2 1 0
#> 
#> [[3]]
#>   x y
#> 1 0 a
#> 2 1 0
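One thing the output above shows: filling with "missing" turns the numeric column `x` into character, because a column can only hold one type. If that matters, a per-column variant is one option. This is a minimal sketch, assuming only numeric and character columns need filling; `fill_na_by_type()` and its argument names are hypothetical.

```r
# Hypothetical variant of fill_na(): choose the filler by column type so
# that numeric columns stay numeric and character columns stay character.
fill_na_by_type <- function(df, numeric_filler = 0, character_filler = "missing") {
  df[] <- lapply(df, function(col) {
    if (is.numeric(col)) {
      col[is.na(col)] <- numeric_filler
    } else if (is.character(col)) {
      col[is.na(col)] <- character_filler
    }
    col  # columns of any other type are returned unchanged
  })
  df
}

x <- data.frame(x = c(NA, 1), y = c("a", NA), stringsAsFactors = FALSE)
filled <- fill_na_by_type(x)
str(filled)
```

Assigning with `df[] <-` keeps the result a data frame with the original column names.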

#4

There is also a useful tool in tidyr: the `replace_na()` function.

library(dplyr, warn.conflicts = FALSE)
library(tidyr)

tab <- data.frame(x = c(NA, 1), y = c("a", NA), stringsAsFactors = FALSE)
tab
#>    x    y
#> 1 NA    a
#> 2  1 <NA>
# the replacement value can be set per column
tidyr::replace_na(tab, list(x = 0, y = 0))
#>   x y
#> 1 0 a
#> 2 1 0
tidyr::replace_na(tab, list(x = 0, y = "missing"))
#>   x       y
#> 1 0       a
#> 2 1 missing
tidyr::replace_na(tab, list(x = "missing", y = "missing"))
#>         x       y
#> 1 missing       a
#> 2       1 missing
# to use the same value for all columns
filler <- purrr::rerun(length(tab), "missing") %>% purrr::set_names(names(tab))
tidyr::replace_na(tab, filler)
#>         x       y
#> 1 missing       a
#> 2       1 missing

Created on 2018-02-13 by the reprex package (v0.2.0).
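A small follow-up on the last step: `purrr::rerun()` has been deprecated since purrr 1.0.0, so it may emit a warning on a current installation. The same named list can be built with base R alone. A minimal sketch, recreating the `tab` data frame from above:

```r
tab <- data.frame(x = c(NA, 1), y = c("a", NA), stringsAsFactors = FALSE)

# One "missing" entry per column, named after the columns of tab
filler <- setNames(rep(list("missing"), length(tab)), names(tab))
str(filler)

# filler plays the same role as in the replace_na(tab, filler) call above
```

Because the names are taken from `names(tab)`, the list stays in sync if columns are added or removed.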