Is there a way to skip NAs when binding columns?

Let's say I have 2 tables with columns of unequal length that I want to bind.

Here are the tables:

    library(dplyr, warn.conflicts = FALSE)
    
    table_1 <- tibble(var_1 = c(1, 2, NA, 3))
    
    table_1
#> # A tibble: 4 x 1
#>   var_1
#>   <dbl>
#> 1     1
#> 2     2
#> 3    NA
#> 4     3
    
    table_2 <- tibble(var_2 = c("a", "b", "c"))
    
    table_2
#> # A tibble: 3 x 1
#>   var_2
#>   <chr>
#> 1 a    
#> 2 b    
#> 3 c

Is there a way I can bind these columns where the binding skips over the NAs? I guess that would mean substituting an NA in the shorter table where there is an NA in the longer table.

So here's what I want to have in the end:

#> A tibble: 4 x 2
#>  var_1 var_2
#>  <dbl> <chr>
#> 1     1 a    
#> 2     2 b    
#> 3    NA NA   
#> 4     3 c    

Does that make any sense? There's probably a term for the type of joining I'm trying to do.

I'm sure (like everything else) there's a way to do this, but out of curiosity, what sort of data do you have where this makes sense? Seems like it may mess up your observations. Wouldn't it be better to keep all NAs? Do your original datasets have some sort of common key?

E.g:

library(dplyr, warn.conflicts = FALSE)

table_1 <- tibble(var_1 = c(1, 2, NA, 3))
table_2 <- tibble(var_2 = c("a", "b", "c"))

table_1$id <- 1:4
table_2$id <- 1:3
full_join(table_1, table_2) %>% select(-id)
#> Joining, by = "id"
#> # A tibble: 4 x 2
#>   var_1 var_2
#>   <dbl> <chr>
#> 1     1 a    
#> 2     2 b    
#> 3    NA c    
#> 4     3 <NA>

Created on 2018-12-30 by the reprex package (v0.2.1)

A more hacky solution that gives you what you want (but without NAs), but I'm not sure how this would work with larger datasets.

library(dplyr, warn.conflicts = FALSE)

table_1 <- tibble(var_1 = c(1, 2, NA, 3))
table_2 <- tibble(var_2 = c("a", "b", "c"))

table_1$id <- 1:nrow(table_1)
table_2$id <- 1:nrow(table_2)

inner_join(table_1, table_2) %>% 
  mutate(var_1 = if_else(is.na(var_1), id, as.integer(var_1))) %>% 
  select(-id)
#> Joining, by = "id"
#> # A tibble: 3 x 2
#>   var_1 var_2
#>   <int> <chr>
#> 1     1 a    
#> 2     2 b    
#> 3     3 c

Created on 2018-12-30 by the reprex package (v0.2.1)

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.