In the example below, cancer, tb, and malaria are diseases. How can I mutate a variable disease that takes the value 1 when an observation has at least one disease and 0 otherwise? My real data set has many more diseases. Is there a better alternative to writing cancer == "Yes" | tb == "Yes" | malaria == "Yes"?
library(dplyr)

df <- tibble(
  id = c(1:3),
  age = c(14, 15, 16),
  city = c("A", "B", "C"),
  cancer = c("Yes", "No", "No"),
  tb = c("No", "No", "No"),
  malaria = c("No", "Yes", "No"),
)
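
library(dplyr)
library(tidyr)
library(stringr)

df <- tibble(
  id = c(1:3),
  age = c(14, 15, 16),
  city = c("A", "B", "C"),
  cancer = c("Yes", "No", "No"),
  tb = c("No", "No", "No"),
  malaria = c("No", "Yes", "No"),
)

df %>%
  # Paste the disease columns into one string per row (e.g. "Yes_No_No"),
  # keeping the original columns.
  unite(disease, cancer:malaria, remove = FALSE) %>%
  # 1 if the combined string contains "Yes" anywhere, otherwise 0.
  mutate(disease = as.numeric(str_detect(disease, "Yes")))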
#> # A tibble: 3 x 7
#>      id   age city  disease cancer tb    malaria
#>   <int> <dbl> <chr>   <dbl> <chr>  <chr> <chr>
#> 1     1    14 A           1 Yes    No    No
#> 2     2    15 B           1 No     No    Yes
#> 3     3    16 C           0 No     No    No
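
Another option, which is not part of the answer above, is to test the columns directly instead of pasting them into a string. This is a minimal sketch that assumes dplyr 1.0.4 or later, where if_any() is available, and that the disease columns sit next to each other so that cancer:malaria selects all of them. The name result is only for illustration.

```r
library(dplyr)

df <- tibble(
  id = 1:3,
  age = c(14, 15, 16),
  city = c("A", "B", "C"),
  cancer = c("Yes", "No", "No"),
  tb = c("No", "No", "No"),
  malaria = c("No", "Yes", "No")
)

# if_any() is TRUE for a row when the condition holds in at least one of the
# selected columns; as.integer() converts TRUE/FALSE to 1/0.
# Assumption: the disease columns are contiguous, so cancer:malaria covers them.
result <- df %>%
  mutate(disease = as.integer(if_any(cancer:malaria, ~ .x == "Yes")))

result$disease
#> [1] 1 1 0
```

If the disease columns are not contiguous in the real data, the selection could instead be written as c(cancer, tb, malaria) or as all_of() applied to a character vector of column names. Comparing each column to "Yes" exactly also avoids a false match if some other value in those columns happens to contain the text "Yes".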