I have tested a solution with a small dataset and it works. However, now I have 100 columns and my solution is not scalable. Can you please help me to achieve the result. Solution based on tidyverse would be lovely!
I want to Tag each row based on following condition:
Tags condition
Tag1 = col1 must contain either one value from the User_list and rest of the columns must not include any values from the User_list
Tag2 = col1 must NOT contain any value from the User_list and rest of the columns must have at least one value from the User_list
Tag3 = col1 must contain either one value from the User_list and rest of the columns must have at least one value from the User_list
final_result
#> # A tibble: 8 x 7
#> name col1 col2 col3 Tag1 Tag2 Tag3
#> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl>
#> 1 John cricket football basketball NA NA 3
#> 2 Scottish running football <NA> NA 2 NA
#> 3 Kobe <NA> <NA> basketball NA 2 NA
#> 4 jordan baseball basketball <NA> NA 2 NA
#> 5 shakira singing dancing acting NA NA NA
#> 6 Chris <NA> <NA> <NA> NA NA NA
#> 7 Tendulkar cricket <NA> <NA> 1 NA NA
#> 8 Axl basketball guitar cricket NA NA 3
Below is the sample data
library (tidyverse)
sample_data <- tibble::tribble(
~name, ~col1, ~col2, ~col3,
"John", "cricket", "football", "basketball",
"Scottish", "running", "football", NA,
"Kobe", NA, NA, "basketball",
"jordan", "baseball", "basketball", NA,
"shakira", "singing", "dancing", "acting",
"Chris", NA, NA, NA,
"Tendulkar", "cricket", NA, NA,
"Axl", "basketball", "guitar", "cricket"
)
This is the user list
User_list <- tibble(vars = c("basketball", "football", "cricket"))
This is my codes:
sample_data %>%
mutate(Tag1 = case_when ((col1 %in% codes$vars) & (!col2 %in% codes$vars) & (!col3 %in% codes$vars) ~ 1),
Tag2 = case_when (!(col1 %in% codes$vars) & ((col2 %in% codes$vars) | (col3 %in% codes$vars)) ~ 2),
Tag3 = case_when (col1 %in% codes$vars & ((col2 %in% codes$vars) | (col3 %in% codes$vars)) ~ 3))
Above code surprising work! However I have 100 columns. How can I achieve the above result with the same Tags condition which is column 1 is col1 and rest of the columns refers to column 2 to column 100