Replacing certain cells in column with cells from another column

Hi all, I have a dataframe as follows

ID Date Date2
1 1/1/2000 1/1/2000
2 NA 2/10/2000
3 2/2/1999 2/2/1999
4 NA 3/08/2020

The two date columns are dates obtained from two different dataframes. Because of the way the whole process works, sometimes a date is only entered into one of the dataframes.

What i did was merge the dataframes and above is an example of the result. My dataframes is a lot more than just 4 id's however. What i would like to do is replace the NAs with the dates of the column Date2, and leave the other dates alone.

I am dealing with a very large dataframe so typing out each individual cell doesn't work, i was wondering if there is a function that will do it in bulk based off the NA value, matching it to the corresponding Date2 column.

Thanks for the help everyone!

A base R solution:

to_replace <- is.na(df$Date)
df$Date[to_replace] <- df$Date2[to_replace]

with is.na(df$Date) you create a vector of which cells to replace, and you replace only these cells with the corresponding Date2.

Considering your tag, here is a tidyverse solution that might be less efficient:

df |>
  mutate(Date = if_else(is.na(Date), Date2, Date))

And the reprex:

library(tidyverse)

df <- read_table("ID Date Date2
1 1/1/2000 1/1/2000
2 NA 2/10/2000
3 2/2/1999 2/2/1999
4 NA 3/08/2020")
df
#> # A tibble: 4 × 3
#>      ID Date     Date2    
#>   <dbl> <chr>    <chr>    
#> 1     1 1/1/2000 1/1/2000 
#> 2     2 <NA>     2/10/2000
#> 3     3 2/2/1999 2/2/1999 
#> 4     4 <NA>     3/08/2020


df |>
  mutate(Date = if_else(is.na(Date), Date2, Date))
#> # A tibble: 4 × 3
#>      ID Date      Date2    
#>   <dbl> <chr>     <chr>    
#> 1     1 1/1/2000  1/1/2000 
#> 2     2 2/10/2000 2/10/2000
#> 3     3 2/2/1999  2/2/1999 
#> 4     4 3/08/2020 3/08/2020


to_replace <- is.na(df$Date)
df$Date[to_replace] <- df$Date2[to_replace]
df
#> # A tibble: 4 × 3
#>      ID Date      Date2    
#>   <dbl> <chr>     <chr>    
#> 1     1 1/1/2000  1/1/2000 
#> 2     2 2/10/2000 2/10/2000
#> 3     3 2/2/1999  2/2/1999 
#> 4     4 3/08/2020 3/08/2020

Created on 2022-07-13 by the reprex package (v2.0.1)

1 Like

coalesce() works too

library(dplyr)

# Sample data in a copy/paste friendly format
sample_df <- data.frame(
  stringsAsFactors = FALSE,
                ID = c(1, 2, 3, 4),
              Date = c("1/1/2000", NA, "2/2/1999", NA),
             Date2 = c("1/1/2000", "2/10/2000", "2/2/1999", "3/08/2020")
)

# Relevant code
sample_df %>% 
    mutate(Date = coalesce(Date, Date2))
#>   ID      Date     Date2
#> 1  1  1/1/2000  1/1/2000
#> 2  2 2/10/2000 2/10/2000
#> 3  3  2/2/1999  2/2/1999
#> 4  4 3/08/2020 3/08/2020

Created on 2022-07-12 by the reprex package (v2.0.1)

2 Likes

Thanks so much, works !!

1 Like

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.