Converting age in years --> months --> days with NA

graceahey · March 8, 2021, 4:09pm

Hi all, I'm trying to convert age in years to months and days while accounting for NA values in the respective columns (age in months was only recorded for children < 5 years). The following isn't working for me, for some reason.

##Creating an age in years variable for children whose age in months was recorded (preferred; more precise); 
df_temp<-df_temp %>% 
  mutate(age_years_correct = ifelse(floor(age_months/12)!=age_years,
                                    floor(age_months/12),
                                   age_years))
## Trying to substitute age in years recorded for those without an age_years_correct value. 
df_temp<-df_temp %>% 
  mutate(age_years_correct = ifelse(age_years_correct==NA,
                                    age_years,
                                    age_years_correct))
#age in months for those missing 
is.na(df_temp$age_months)
#GHHelp also not working. 
df_temp<-df_temp %>% 
  mutate(age_months_correct = ifelse(age_months==NA,
                                    age_years*12,
                                    age_months))
#Creating an age in days based on age_months
df_temp$age_days<-df_temp$age_months * (365.25 / 12)

joels · March 8, 2021, 4:19pm

For starters, change age_years_correct==NA to is.na(age_years_correct) and change age_months==NA to is.na(age_months) and let us know if that resolves the problem.

graceahey · March 8, 2021, 5:13pm

No, it does not solve the problem; I already tried this.

joels · March 8, 2021, 5:28pm

Can you provide a reproducible example with a small dataset that reproduces the errors you're getting?
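
For readers unsure what such an example might look like, here is a minimal sketch. The values are made up for illustration and only the column names come from the original post; dput() prints code that recreates the object, so its output can be pasted straight into a reply.

```r
# Hypothetical stand-in for the real data; the values are invented.
df_temp <- data.frame(
  age_years  = c(1, 3, 7, NA),
  age_months = c(14, 40, NA, NA)
)

# dput() prints R code that rebuilds df_temp exactly, ready to paste into a post.
dput(df_temp)
```

A handful of rows that include the problem cases (here, the missing values) is usually enough.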

graceahey · March 8, 2021, 6:04pm

Is this how you're suggesting to use is.na?

df_temp<-df_temp %>%
mutate(is.na(age_years_correct) = ifelse(is.na(age_years_correct==NA),
is.na(age_years),
is.na(age_years_correct)))

joels · March 8, 2021, 6:52pm

My first post was intended to show how to test for missing values. Here's part of your original example:

df_temp<-df_temp %>% 
  mutate(age_years_correct = ifelse(age_years_correct==NA,
                                    age_years,
                                    age_years_correct))

In order to properly test for NA, age_years_correct==NA should be changed to is.na(age_years_correct). So the code should be:

df_temp<-df_temp %>% 
  mutate(age_years_correct = ifelse(is.na(age_years_correct),
                                    age_years,
                                    age_years_correct))

Here's an example of what's going wrong when you test for NA using == vs. is.na():

x = c(1, 3, NA, 5)

x==NA
#> [1] NA NA NA NA

is.na(x)
#> [1] FALSE FALSE  TRUE FALSE

Created on 2021-03-08 by the reprex package (v1.0.0)
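
To connect this to ifelse(), here is a small sketch with made-up vectors (the names mirror the columns in the question, but the values are hypothetical). Because the == NA test is NA for every element, ifelse() has no TRUE or FALSE to act on and returns NA everywhere; with is.na() only the missing entry is replaced.

```r
# Hypothetical values; the third entry is missing after the first step.
age_years         <- c(2, 4, 3, 1)
age_years_correct <- c(2, 4, NA, 1)

# The test is NA for every element, so ifelse() returns NA for every element.
wrong <- ifelse(age_years_correct == NA, age_years, age_years_correct)

# is.na() returns TRUE/FALSE, so only the missing entry is filled from age_years.
right <- ifelse(is.na(age_years_correct), age_years, age_years_correct)

print(wrong)
print(right)
```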

Also, given your examples, I'm not certain, but you might be able to use the coalesce function to save some typing:

df_temp <- df_temp %>% 
    mutate(age_years_correct = coalesce(floor(age_months/12), age_years))

The code above will set age_years_correct to floor(age_months/12) when that value is not missing, or age_years otherwise. A coalesce example:

library(tidyverse)

x = c( 1,  3, NA,  5, NA,  7)
y = c(NA, 30, NA, 50, 60, NA)

coalesce(x,y)
#> [1]  1  3 NA  5 60  7

coalesce(y,x)
#> [1]  1 30 NA 50 60  7

Created on 2021-03-08 by the reprex package (v1.0.0)
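
Putting the pieces together, here is a rough sketch of the whole conversion on a made-up data frame. The data values are hypothetical, and two choices are assumptions rather than anything confirmed in the thread: months are filled in as age_years * 12 when not recorded, and age_days is computed from the filled-in months column so that it is not NA for children without a recorded age in months.

```r
library(dplyr)

# Hypothetical data: age_months is recorded only for the younger children.
df_temp <- data.frame(
  age_years  = c(1, 3, 7, 10),
  age_months = c(14, 40, NA, NA)
)

df_temp <- df_temp %>%
  mutate(
    # Prefer the month-based value; fall back to the recorded years.
    age_years_correct  = coalesce(floor(age_months / 12), age_years),
    # Assumption: use years * 12 when months were not recorded.
    age_months_correct = coalesce(age_months, age_years * 12),
    # Approximate days with an average month length of 365.25 / 12 days.
    age_days           = age_months_correct * (365.25 / 12)
  )

print(df_temp)
```

Note that months derived from whole years are only accurate to the nearest year, so the resulting age_days is coarser for those rows. If the real columns are integers, they may need converting with as.numeric() first, since older dplyr versions required coalesce() inputs to share a type.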
