Create a conditional column

Hi everyone,

I want to create a column which is conditional-if a certain condition is met, I want the column to have "1" in a cell, and if it is not met, I want it to have "0".

There are a few columns that provide qualitative data, and then there are columns which are titled according to drug names. Under each drug named column, you can either have nothing/no value, NA, C, c, P, or p.

If, in a given row, you see "c" or "p" more than once (i.e. multiple drug columns are filled in), I'd like poly to say 1.

Any suggestions on how to do this? Thanks! Below, is a very large amount of code created via 'dput'.

structure(list(Year = c(2020, 2020, 2020, 2020, 2020, 2020, 2020, 
2020, 2020, 2020, 2020, 2020, 2020, 2020, 2020, 2020, 2020, 2020, 
2020, 2020, 2020, 2020, 2020, 2020, 2020, 2020, 2020, 2020, 2020, 
2020), RecID = c("20201", "20202", "20203", "20204", "20205", 
"20206", "20207", "20208", "20209", "202010", "202011", "202012", 
"202013", "202014", "202015", "202016", "202017", "202018", "202019", 
"202020", "202021", "202022", "202023", "202024", "202025", "202026", 
"202027", "202028", "202029", "202030"), ID = c(40178, 40179, 
40180, 40181, 40182, 40183, 40184, 40185, 40186, 40187, 40188, 
40189, 40190, 40191, 40192, 40193, 40194, 40195, 40196, 40197, 
40198, 40199, 40200, 40201, 40202, 40203, 40204, 40205, 40206, 
40207), `Case Number` = c("MLA 20-0020", "MLA 20-0040", "MLA 20-0022", 
"MLA 20-0073", "MLA 20-0001", "MLA 20-0156", "MLA 20-0170", "MLA 20-0120", 
"MLA 20-0017", "MLA 20-0002", "MLA 20-0204", "MLA 20-0110", "MLA 20-0124", 
"MLA 20-0138", "MLO 20-0113", "MLA 20-0111", "MLA 20-0092", "MLA 20-0207", 
"MLA 20-0102", "MLA 20-0014", "MLA 20-0217", "MLA 20-0006", "MLA 20-0100", 
"MLO 20-0007", "MLA 20-0031", "MLA 20-0051", "MLA 20-0052", "MLA 20-0121", 
"MLA 20-0123", "MLA 20-0126"), District = c(1, 1, 1, 1, 1, 1, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1), Month = c(1, 1, 1, 1, 1, 2, 2, 1, 1, 1, 2, 1, 1, 2, 
1, 1, 1, 2, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 2), County = c(46, 
46, 17, 17, 46, 46, 17, 46, 46, 57, 17, 17, 46, 46, 66, 17, 46, 
17, 46, 57, 17, 57, 17, 57, 17, 17, 17, 66, 17, 57), `Manner of Death` = c("A", 
"A", "S", "A", "S", "S", "H", "N", "H", "A", "A", "A", "A", "A", 
"A", "S", "A", "A", "A", "A", "S", "A", "A", "A", "S", "H", "A", 
"A", "N", "S"), `Cause of Death` = c("Complications of Multidrug Toxicity", 
"Multidrug Toxicity", "Hanging.", "Multidrug Toxicity", "Gunshot Wound of Head", 
"Hanging", "Gunshot Wound of Head", "Dilated Cardiomyopathy - Alcholic Type", 
"Multiple Blunt Traumatic Injuries.", "Acute Fentanyl Toxicity", 
"Multidrup toxicity", "Polysubstance toxicity (Cocaine and Fentanyl)   CC: Hypertensive Cardivascular disease", 
"Methamphetamine toxicity   CC: Chronic obtstructive pulmonary disease", 
"Multidrug toxicity   CC:Atherosclerotic Cardiovascular disease", 
"Multiple Blunt Traumatic Injuries", "Gunshot Wound of Head", 
"Combined toxic effects of Heroin, Fentanyl, Hydrocodone and Methamphetamine", 
"Multidrug toxicity  CC: Hypertensive cardiovascular disease, Diabetes mellitus", 
"Polysubstance toxicity (oxycodone, Methadone, Diazepam)   CC: Hypertensive Cardiovascular Disease", 
"Methamphetamine toxicity", "Hanging", "Acute Fentanyl and Methamphetamine toxicity  CC: Recent Cocaine Use", 
"Acute Fentanyl toxicity", "Acute Ethanol toxicity  CC: Hypertensive cardiovascular disease. Chronic obstructive pulmonary disease", 
"Butalbital toxicity  CC: Cerebrovascular Atherosclerosis, Cardiac valvular disease", 
"Multiple gunshot wounds", "Synthetic opiod toxicity  CC: Cocaine Abuse", 
"Multiple blunt traumatic injuries", "Atherosclerotic and hypertensive cardiovascular disease  CC: Chronic obstructive pumonary disease", 
"Gunshot wound of the head"), Age = c(26, 44, 28, 56, 32, 42, 
33, 66, 52, 55, 38, 47, 67, 42, 38, 32, 63, 73, 51, 20, 37, 37, 
34, 61, 90, 29, 24, 42, 83, 36), Sex = c("M", "M", "M", "M", 
"M", "M", "M", "M", "F", "M", "F", "M", "F", "F", "M", "F", "M", 
"M", "F", "M", "M", "M", "M", "M", "F", "M", "F", "F", "F", "F"
), Race = c("W", "W", "W", "W", "H", "W", "B", "W", "W", "W", 
"B", "W", "W", "W", "W", "W", "W", "B", "W", "W", "W", "W", "W", 
"W", "W", "W", "W", "W", "W", "W"), Alprazolam = c("C", NA, "P", 
"C", NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, "P", NA, NA, NA, NA), Amphetamine = c("C", 
"C", "P", NA, NA, NA, NA, NA, NA, NA, NA, NA, "P", NA, NA, NA, 
"P", NA, NA, "P", NA, "P", NA, NA, NA, NA, NA, NA, NA, NA), Buprenorphine = c(NA_character_, 
NA_character_, NA_character_, NA_character_, NA_character_, NA_character_, 
NA_character_, NA_character_, NA_character_, NA_character_, NA_character_, 
NA_character_, NA_character_, NA_character_, NA_character_, NA_character_, 
NA_character_, NA_character_, NA_character_, NA_character_, NA_character_, 
NA_character_, NA_character_, NA_character_, NA_character_, NA_character_, 
NA_character_, NA_character_, NA_character_, NA_character_), 
    Cannabinoids = c("P", NA, "P", NA, NA, "P", "P", "P", NA, 
    NA, NA, NA, NA, NA, NA, NA, "P", "P", NA, "P", NA, "P", "P", 
    "P", NA, NA, NA, NA, NA, NA), CarisoprodolMeprobamate = c(NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_), Cathinones = c(NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_
    ), Chlordiazepoxide = c(NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_), Clonazepam = c("C", 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, "P", NA, NA, NA, NA, NA, NA, NA, NA), 
    Cocaine = c(NA, "C", "P", "C", NA, NA, NA, NA, NA, "C", "C", 
    "C", NA, NA, NA, NA, "P", "C", NA, NA, NA, "C", NA, NA, NA, 
    "P", "C", NA, NA, NA), Codeine = c(NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "P", NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), Diazepam = c(NA, 
    NA, NA, "C", NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, "C", NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA), Ethanol = c(NA, NA, "P", NA, "P", "P", NA, NA, "P", 
    "P", NA, NA, NA, NA, "P", "P", NA, "C", NA, NA, "P", NA, 
    NA, "C", NA, NA, NA, "P", "P", "P"), Fentanyl = c("C", "C", 
    NA, NA, NA, NA, NA, NA, NA, "c", "C", "C", "P", "C", NA, 
    NA, "C", "C", NA, NA, NA, "C", "C", NA, NA, NA, "C", NA, 
    NA, NA), FentanylAnalogs = c(NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, "C", NA, NA, NA), Gabapentin = c(NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, "P", NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), GHB = c(NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_), HallucinogenicPhenethylaminesPiperazine = c(NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_), HallucinogenicTryptamines = c(NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), HalogenatedInhalants = c(NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_), Heroin = c(NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, "C", NA, NA, "C", NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA), HydrocarbonInhalants = c(NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), 
    Hydrocodone = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, "C", NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA), Hydromorphone = c(NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, "P", NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), Ketamine = c(NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_), Lorazepam = c(NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_
    ), Methadone = c(NA, NA, NA, "C", NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, "C", NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA), Methamphetamine = c("C", NA, "P", 
    "C", NA, NA, NA, NA, NA, NA, "C", NA, "C", NA, NA, NA, "C", 
    NA, NA, "C", NA, "C", NA, NA, NA, NA, NA, NA, NA, NA), Midazolam = c(NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_), Mitragynine = c(NA, "C", NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA), Morphine = c("C", NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "C", NA, NA, 
    "P", NA, NA, NA, NA, NA, NA, NA, "P", NA, NA, NA, NA, NA), 
    Nordiazepam = c(NA, NA, NA, "C", NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, "P", NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA), Oxazepam = c(NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_
    ), Oxycodone = c(NA, NA, NA, NA, NA, NA, "P", NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, "C", NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA), Oxymorphone = c(NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "P", 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), PCPandPCPAnalogs = c(NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), 
    SympathomimeticAmines = c(NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA), SyntheticCannabinoids = c(NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_, NA_character_, 
    NA_character_), Temazepam = c(NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "P", NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA), Tramadol = c(NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, "C", NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), 
    U47700 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA), Zolpidem = c(NA, "C", NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA), IllicitFentanyl = c("Yes", "Yes", 
    NA, NA, NA, NA, NA, NA, NA, NA, "Yes", "Yes", "Yes", "Yes", 
    NA, NA, "Yes", "No", NA, NA, NA, "Yes", "Yes", NA, NA, NA, 
    "Yes", NA, NA, NA), Analytes = c(NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, "Fentanyl Analogs = Acetyl Fentanyl", 
    NA, NA, NA), AgeGroupCensus = c("25-34", "35-44", "25-34", 
    "55-64", "25-34", "35-44", "25-34", "65-74", "45-54", "55-64", 
    "35-44", "45-54", "65-74", "35-44", "35-44", "25-34", "55-64", 
    "65-74", "45-54", "15-24", "35-44", "35-44", "25-34", "55-64", 
    "85+", "25-34", "15-24", "35-44", "75-84", "35-44"), AgeGroup = c("26-34", 
    "35-50", "26-34", "51-107", "26-34", "35-50", "26-34", "51-107", 
    "51-107", "51-107", "35-50", "35-50", "51-107", "35-50", 
    "35-50", "26-34", "51-107", "51-107", "51-107", "18-25", 
    "35-50", "35-50", "26-34", "51-107", "51-107", "26-34", "18-25", 
    "35-50", "51-107", "35-50"), RaceGroup = c("W", "W", "W", 
    "W", "H", "W", "B", "W", "W", "W", "B", "W", "W", "W", "W", 
    "W", "W", "B", "W", "W", "W", "W", "W", "W", "W", "W", "W", 
    "W", "W", "W"), MOD = c("Accident", "Accident", "Suicide", 
    "Accident", "Suicide", "Suicide", "Homicide", "Natural", 
    "Homicide", "Accident", "Accident", "Accident", "Accident", 
    "Accident", "Accident", "Suicide", "Accident", "Accident", 
    "Accident", "Accident", "Suicide", "Accident", "Accident", 
    "Accident", "Suicide", "Homicide", "Accident", "Accident", 
    "Natural", "Suicide")), row.names = c(NA, -30L), class = c("tbl_df", 
"tbl", "data.frame"))

Below is one way to tackle this problem. I called your dataset df and used the pivot_longer() function to reshape the data. Then, I created the check object by grouping RecID, ID, and Case Number and summing the times "c" or "p" appear, keeping only those cases where this sum is greater than 1 (since only 1 "c" exists in the data, I set the filter to >0 so we would see one record). The column "poly" is created in the check object, and this object is left joined to the original dataset, setting any NAs in the poly column to 0.

library(tidyverse)

check = df %>%
  pivot_longer(cols = names(df[13:59])) %>%
  group_by(RecID, ID, `Case Number`) %>%
  mutate(poly_check = sum(value %in% c('c', 'p'), na.rm = T)) %>%
  ungroup() %>%
  filter(poly_check > 0) %>% # for your purposes, make this poly_check > 1
  mutate(poly = 1) %>%
  distinct(RecID, ID, `Case Number`, poly)

check
#> # A tibble: 1 × 4
#>   RecID     ID `Case Number`  poly
#>   <chr>  <dbl> <chr>         <dbl>
#> 1 202010 40187 MLA 20-0002       1

df = left_join(df, check) %>%
  mutate(poly = ifelse(is.na(poly), 0, poly))
#> Joining, by = c("RecID", "ID", "Case Number")

# one record has poly = 1
df %>% count(poly)
#> # A tibble: 2 × 2
#>    poly     n
#>   <dbl> <int>
#> 1     0    29
#> 2     1     1

Created on 2022-11-09 with reprex v2.0.2.9000

Thank you for this! So, I have two quick questions:

  1. It seems as though my coworkers are getting different results from me (the dataset I have is much larger than what I posted here)-"poly" is greater than 1 more times than I obtain using the formula. Any potential explanations for why? Their numbers are the same using python as when they do the manual calculations in excel.

  2. Is there any reason you decided to do the long pivot, rather than just sum the number of "c"s and "ps" across each row? is there a way to do that?

Thanks!

In the example, I set the check to be > 0 (just to get a result from the data provided). Did you change it to > 1 for your purposes? If not, then you would likely get more "poly" cases. This is the first thing that comes to mind.

The second thing I would check is the dataset. The one provided had 59 columns, which is what I specify in pivot_longer. If your dataset is "much larger" in terms of columns, you will need to update this.

I decided to pivot to a long format because working on columns is generally easier than working on rows. Just a personal preference.

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.