# Creating a new variable based on NA in others

I have a set of 5 questions, each of which can be answered with a score from 1 to 5. Some are not answered, resulting in NA. I want to create a new variable that is "1" when all questions are answered (no NA in any of the 5 questions) and "0" when one or more questions are not answered (NA). Could anyone help?

The `*apply()` and `map*()` functions loop over the columns of a data.frame, which makes this quite easy.

``````
library(tidyverse)

ex_df <- tribble(~qu1,    ~qu2,   ~qu3,
4,        5,      1,
1,        4,      1,
1,        4,      2,
1,        2,      2,
NA,        3,      2,
2,        5,      1,
6 ,       5,      2,
2 ,       5,      3,
1 ,       4,      1,
NA ,       3,      1,
4 ,       2,      1,
2 ,       2,      2,
3 ,       4,      3,
3 ,       3   ,   3)

# One logical per column: TRUE if that column contains at least one NA
map_lgl(ex_df, ~any(is.na(.)))
#  qu1   qu2   qu3
# TRUE FALSE FALSE
``````

You can also force the output to be an integer vector with `map_int()`. The base R equivalent is:

``````
sapply(ex_df, function(x) 1L*any(is.na(x)))
``````

Multiplying by `1L` (the integer 1) converts to integer.
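
A minimal base R sketch of the same per-column check, assuming a small stand-in data frame (`demo_df` is hypothetical, not the poster's data). It uses `vapply()` instead of `sapply()` so the result type is fixed, and `as.integer()` as an explicit alternative to multiplying by `1L`:

```r
# Hypothetical stand-in data: three questions, some unanswered (NA)
demo_df <- data.frame(
  qu1 = c(4, NA, 2),
  qu2 = c(5, 3, 5),
  qu3 = c(1, 2, 1)
)

# One value per COLUMN: 1L if the column contains any NA, otherwise 0L.
# integer(1) tells vapply() that each call must return a single integer.
col_has_na <- vapply(demo_df, function(x) as.integer(anyNA(x)), integer(1))
col_has_na
```

Note that this still returns one value per question (column), not one value per respondent (row).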

Thx! But this just gives a list, not a new variable in the data frame?

I tried it like this:

``````
# "complete": no NA in any of the items (all conditions must hold, hence &)
data.17q$part_complete[!is.na(data.17q$bd_hindrer) &
                       !is.na(data.17q$bd_styrer) &
                       !is.na(data.17q$bd_spiser) &
                       !is.na(data.17q$bd_matreg) &
                       !is.na(data.17q$bd_bekymre) &
                       !is.na(data.17q$bd_andre) &
                       !is.na(data.17q$bd_insulin) &
                       !is.na(data.17q$bd_male) &
                       !is.na(data.17q$bd_utstyr) &
                       !is.na(data.17q$bd_planleg)] <- "complete" # new variable defined by no NA

# "incomplete": at least one item is NA (any condition suffices, hence |)
data.17q$part_complete[is.na(data.17q$bd_hindrer) |
                       is.na(data.17q$bd_styrer) |
                       is.na(data.17q$bd_spiser) |
                       is.na(data.17q$bd_matreg) |
                       is.na(data.17q$bd_bekymre) |
                       is.na(data.17q$bd_andre) |
                       is.na(data.17q$bd_insulin) |
                       is.na(data.17q$bd_male) |
                       is.na(data.17q$bd_utstyr) |
                       is.na(data.17q$bd_planleg)] <- "incomplete" # new variable defined by at least one NA
``````

``````
library(tidyverse)

(ex_df <- tribble(~qu1,    ~qu2,   ~qu3,
4,        5,      1,
1,        4,      1,
1,        4,      2,
1,        2,      2,
NA,        3,      2,
2,        5,      1,
6 ,       5,      2,
2 ,       5,      3,
1 ,       4,      1,
NA ,       3,      1,
4 ,       2,      1,
2 ,       2,      2,
3 ,       4,      3,
3 ,       3   ,   3))

( vars_to_consider <- c("qu1","qu2","qu3"))

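# sum() with na.rm = FALSE returns NA as soon as any input is NA,
# so !is.na(sum(...)) is TRUE only for rows where every question is answered
(result_df <- rowwise(ex_df) %>% mutate(
  qsum = !is.na(sum(!!!syms(vars_to_consider), na.rm = FALSE))) %>% ungroup())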
# # A tibble: 14 x 4
# qu1   qu2   qu3 qsum
# <dbl> <dbl> <dbl> <lgl>
#  1     4     5     1 TRUE
#  2     1     4     1 TRUE
#  3     1     4     2 TRUE
#  4     1     2     2 TRUE
#  5    NA     3     2 FALSE
#  6     2     5     1 TRUE
#  7     6     5     2 TRUE
#  8     2     5     3 TRUE
#  9     1     4     1 TRUE
# 10    NA     3     1 FALSE
# 11     4     2     1 TRUE
# 12     2     2     2 TRUE
# 13     3     4     3 TRUE
# 14     3     3     3 TRUE
``````
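
For comparison, here is a base R sketch of the per-row flag the question asks for. It assumes a small stand-in data frame (`demo_df` and the `complete` column name are illustrative, not from the thread); `complete.cases()` returns `TRUE` for rows that have no `NA` in the selected columns:

```r
# Hypothetical stand-in data: one row per respondent
demo_df <- data.frame(
  qu1 = c(4, NA, 2, 3),
  qu2 = c(5, 3, NA, 3),
  qu3 = c(1, 2, 1, 3)
)
vars_to_consider <- c("qu1", "qu2", "qu3")

# TRUE for rows where every listed column is non-NA
is_complete <- complete.cases(demo_df[vars_to_consider])

# 1 = all questions answered, 0 = at least one NA
demo_df$complete <- as.integer(is_complete)

# The same information as a text label
demo_df$part_complete <- ifelse(is_complete, "complete", "incomplete")

demo_df
```

Because `complete.cases()` is vectorised over rows, no `rowwise()` step is needed. With a recent dplyr (1.0.4 or later, as far as I know), `mutate(complete = as.integer(if_all(all_of(vars_to_consider), ~ !is.na(.x))))` should give the same flag inside a pipeline.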
