Creating a new variable based on NA values in other variables

I have a set of 5 questions, each of which can be answered with a score from 1 to 5. Some are not answered, resulting in NA. I want to create a new variable that is "1" when all questions are answered (i.e. no NA in any of the 5 questions) and "0" when one or more questions are not answered (NA). Could anyone help?

The *apply functions (base R) and the map_* functions (purrr) loop over the columns of a data.frame, which makes it easy to check each column for NA.

library(tidyverse)  # tribble() and map_lgl() come from the tidyverse

ex_df <- tribble(~qu1,    ~qu2,   ~qu3,
                 4,        5,      1,
                 1,        4,      1,
                 1,        4,      2,
                 1,        2,      2,
                 NA,        3,      2,
                 2,        5,      1,
                 6 ,       5,      2,
                 2 ,       5,      3,
                 1 ,       4,      1,
                 NA ,       3,      1,
                 4 ,       2,      1,
                 2 ,       2,      2,
                 3 ,       4,      3,
                 3 ,       3   ,   3)

map_lgl(ex_df, ~any(is.na(.)))
#  qu1   qu2   qu3 
# TRUE FALSE FALSE 

Or you can force the output to be an integer with map_int().

Base R equivalent:

sapply(ex_df, function(x) 1L*any(is.na(x)))

Multiplying by 1L (the integer 1) converts to integer.
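
Note that the calls above return one value per column. For a per-respondent flag like the one asked about in the question, a minimal base R sketch could use complete.cases(), which is TRUE for rows without any NA. The small data frame and the column name all_answered are illustrative choices, not something from the original post:

```r
# Small example data; NA marks an unanswered question.
ex_df <- data.frame(qu1 = c(4, NA, 2),
                    qu2 = c(5, 3, NA),
                    qu3 = c(1, 2, 3))

# Columns that make up the questionnaire.
question_cols <- c("qu1", "qu2", "qu3")

# complete.cases() is TRUE for rows with no NA in the selected columns;
# as.integer() turns TRUE/FALSE into 1/0.
ex_df$all_answered <- as.integer(complete.cases(ex_df[question_cols]))

ex_df$all_answered
```

Only the first row has all three questions answered, so it is the only one flagged with 1.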

Thanks! But this just gives a list, not a new variable in the data frame?

I tried it like this:

# "complete" requires every question to be answered, so the conditions are combined with &
data.17q$part_complete[!is.na(data.17q$bd_hindrer) &
                       !is.na(data.17q$bd_styrer) &
                       !is.na(data.17q$bd_spiser) &
                       !is.na(data.17q$bd_matreg) &
                       !is.na(data.17q$bd_bekymre) &
                       !is.na(data.17q$bd_andre) &
                       !is.na(data.17q$bd_insulin) &
                       !is.na(data.17q$bd_male) &
                       !is.na(data.17q$bd_utstyr) &
                       !is.na(data.17q$bd_planleg)] <- "complete" # no NA in any question

# "incomplete" applies as soon as any single question is NA, so | is correct here
data.17q$part_complete[is.na(data.17q$bd_hindrer) |
                       is.na(data.17q$bd_styrer) |
                       is.na(data.17q$bd_spiser) |
                       is.na(data.17q$bd_matreg) |
                       is.na(data.17q$bd_bekymre) |
                       is.na(data.17q$bd_andre) |
                       is.na(data.17q$bd_insulin) |
                       is.na(data.17q$bd_male) |
                       is.na(data.17q$bd_utstyr) |
                       is.na(data.17q$bd_planleg)] <- "incomplete" # at least one NA
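
The same labelling can probably be written more compactly by counting the NAs in each row with rowSums(). The toy data frame below is made up for illustration and uses only three of the column names from the attempt above; with the real data.17q, the vector of names would list all ten columns.

```r
# Made-up stand-in for data.17q with three of the bd_ columns.
data.17q <- data.frame(bd_hindrer = c(1, NA, 3, NA),
                       bd_styrer  = c(2, 4, NA, NA),
                       bd_spiser  = c(5, 1, 2, NA))

# Names of the question columns to check.
bd_cols <- c("bd_hindrer", "bd_styrer", "bd_spiser")

# is.na() on a data frame gives a logical matrix; rowSums() counts the TRUEs,
# i.e. the number of unanswered questions in each row.
n_missing <- rowSums(is.na(data.17q[bd_cols]))

# "complete" only when no question is missing.
data.17q$part_complete <- ifelse(n_missing == 0, "complete", "incomplete")

data.17q$part_complete
```

Keeping the column names in one vector means the condition is written once, so adding or removing a question only changes bd_cols.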


library(tidyverse)

(ex_df <- tribble(~qu1,    ~qu2,   ~qu3,
                  4,        5,      1,
                  1,        4,      1,
                  1,        4,      2,
                  1,        2,      2,
                  NA,        3,      2,
                  2,        5,      1,
                  6 ,       5,      2,
                  2 ,       5,      3,
                  1 ,       4,      1,
                  NA ,       3,      1,
                  4 ,       2,      1,
                  2 ,       2,      2,
                  3 ,       4,      3,
                  3 ,       3   ,   3))

( vars_to_consider <- c("qu1","qu2","qu3"))

(result_df <- rowwise(ex_df) %>% mutate(
  qsum = !is.na(sum(!!!syms(vars_to_consider), na.rm=FALSE))) %>% ungroup)
# # A tibble: 14 x 4
#      qu1   qu2   qu3 qsum 
#    <dbl> <dbl> <dbl> <lgl>
#  1     4     5     1 TRUE 
#  2     1     4     1 TRUE 
#  3     1     4     2 TRUE 
#  4     1     2     2 TRUE 
#  5    NA     3     2 FALSE
#  6     2     5     1 TRUE 
#  7     6     5     2 TRUE 
#  8     2     5     3 TRUE 
#  9     1     4     1 TRUE 
# 10    NA     3     1 FALSE
# 11     4     2     1 TRUE 
# 12     2     2     2 TRUE 
# 13     3     4     3 TRUE 
# 14     3     3     3 TRUE
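
The rowwise() version above works because sum() with na.rm = FALSE returns NA as soon as any of its inputs is NA. As an alternative sketch, assuming dplyr 1.0.4 or later (where if_any() is available), the same flag can be computed without rowwise() and stored as 1/0. The shortened example data and the column name all_answered are illustrative choices:

```r
library(dplyr)

# Shortened example data; NA marks an unanswered question.
ex_df <- tibble(qu1 = c(4, NA, 2),
                qu2 = c(5, 3, 5),
                qu3 = c(1, 2, NA))

vars_to_consider <- c("qu1", "qu2", "qu3")

# if_any() is TRUE for rows where is.na() holds in at least one selected column;
# negating it gives "all answered", and as.integer() converts that to 1/0.
result_df <- ex_df %>%
  mutate(all_answered = as.integer(!if_any(all_of(vars_to_consider), is.na)))

result_df$all_answered
```

Because if_any() is vectorised over rows, this avoids the per-row evaluation that rowwise() implies, which may matter on larger data sets.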
