Hi there.
(First post, and first time using reprex and dput, so I hope I've done this right. If not, the code is at https://gist.github.com/joeschofield0/99669d0850539e3dd2333c844813bb63)
I'm trying to melt a data frame that contains data (drug name, dose, date, etc.) on multiple treatment episodes per patient. I would like to end up with one row per treatment episode per patient. I can melt on a single id.var, but I'm struggling to carry over all the measure.vars of interest. The R script below generates a sample data frame and shows how far I've got with melt.
Here is an example of the structure I am aiming for:
structure(list(patient = c(1L, 1L, 2L, 2L, 3L, 3L), gender = structure(c(2L,
2L, 1L, 1L, 1L, 1L), .Label = c("F", "M"), class = "factor"),
age = c(39L, 39L, 29L, 29L, 59L, 59L), treatment = c(1L,
2L, 1L, 2L, 1L, 2L), drug = structure(c(1L, 3L, 1L, 2L, 1L,
3L), .Label = c("A", "B", "D"), class = "factor"), dose = c(83L,
19L, 84L, 111L, 38L, 55L), dosecat = structure(c(3L, 3L,
2L, 3L, 1L, 2L), .Label = c("high", "low", "med"), class = "factor"),
date = structure(c(3L, 1L, 4L, 2L, 6L, 5L), .Label = c("01/12/2019",
"08/01/2020", "09/01/2020", "10/11/2019", "23/11/2019", "29/11/2019"
), class = "factor")), class = "data.frame", row.names = c(NA,
-6L))
I've read the melt man page and checked several multi-column melt posts here and on StackExchange, but I'm still struggling to get the desired output format. Grateful for any pointers.
library(reshape2)
set.seed(3)
patient <- c(1:10)
gender <- sample(c("M", "F"), size = 10, replace = TRUE)
age <- sample(20:60, size = 10, replace = TRUE)
drug1 <- sample(c("A", "B", "C"), 10, replace = TRUE)
dose1 <- sample(10:200, size = 10, replace = TRUE)
dosecat1 <- sample(c("low", "med", "high"), size = 10, replace = TRUE)
date1 <- Sys.Date() + sample(-1:-90, 10)
drug2 <- sample(c("B", "D", "E"), 10, replace = TRUE)
dose2 <- sample(10:200, size = 10, replace = TRUE)
dosecat2 <- sample(c("low", "med", "high"), size = 10, replace = TRUE)
date2 <- Sys.Date() + sample(-1:-90, 10)
df <- data.frame(patient, gender, age, drug1, dose1, dosecat1, date1, drug2, dose2, dosecat2, date2)
# -- Melt attempt 1 --
# Warning: Attributes are not identical across variables (drug = chr, dose = int)
# df_long <- melt(df, id.vars = "patient", measure.vars = c("drug1", "dose1", "drug2", "dose2"))
# -- Melt attempt 2 --
# Getting closer, but this loses the gender, age, dose, dosecat and date columns
df_long <- melt(df, id.vars = "patient", measure.vars = c("drug1", "drug2"))
#> Warning: attributes are not identical across measure variables; they will be
#> dropped
df_long[order(df_long$patient), ]
#> patient variable value
#> 1 1 drug1 A
#> 11 1 drug2 D
#> 2 2 drug1 A
#> 12 2 drug2 B
#> 3 3 drug1 A
#> 13 3 drug2 D
#> 4 4 drug1 B
#> 14 4 drug2 B
#> 5 5 drug1 B
#> 15 5 drug2 B
#> 6 6 drug1 A
#> 16 6 drug2 D
#> 7 7 drug1 B
#> 17 7 drug2 B
#> 8 8 drug1 C
#> 18 8 drug2 D
#> 9 9 drug1 B
#> 19 9 drug2 E
#> 10 10 drug1 C
#> 20 10 drug2 E
Created on 2020-01-18 by the reprex package (v0.3.0)
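For reference, here is a minimal sketch of one way to reach the one-row-per-episode shape described above. It swaps in base R's `reshape()` rather than `reshape2::melt()`, so it is an alternative route, not a fix to the melt calls in the reprex. The data frame is rebuilt with the same column names as the sample script, and `stringsAsFactors = FALSE` is an assumption added here so the drug columns stack as plain character vectors.

```r
# Rebuild a wide sample data frame with the same column layout as the post
set.seed(3)
n <- 10
df <- data.frame(
  patient  = 1:n,
  gender   = sample(c("M", "F"), n, replace = TRUE),
  age      = sample(20:60, n, replace = TRUE),
  drug1    = sample(c("A", "B", "C"), n, replace = TRUE),
  dose1    = sample(10:200, n, replace = TRUE),
  dosecat1 = sample(c("low", "med", "high"), n, replace = TRUE),
  date1    = Sys.Date() - sample(1:90, n),
  drug2    = sample(c("B", "D", "E"), n, replace = TRUE),
  dose2    = sample(10:200, n, replace = TRUE),
  dosecat2 = sample(c("low", "med", "high"), n, replace = TRUE),
  date2    = Sys.Date() - sample(1:90, n),
  stringsAsFactors = FALSE  # assumption: keep text columns as character
)

# Stack each group of episode columns into one long column.
# Columns not listed in `varying` (gender, age) are carried along unchanged.
df_long <- reshape(
  df,
  direction = "long",
  idvar     = "patient",
  varying   = list(
    c("drug1", "drug2"),
    c("dose1", "dose2"),
    c("dosecat1", "dosecat2"),
    c("date1", "date2")
  ),
  v.names   = c("drug", "dose", "dosecat", "date"),
  timevar   = "treatment",
  times     = 1:2
)

# Sort so that each patient's episodes sit together, then tidy the row names
df_long <- df_long[order(df_long$patient, df_long$treatment), ]
rownames(df_long) <- NULL

head(df_long)
```

Each element of `varying` lists the wide columns that belong to one long column, in episode order, and `v.names` supplies the matching long names. With more than two episodes, the vectors in `varying` and `times` would need extending accordingly.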