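I have the following data frame in which pairs of columns share a base name (e.g. AA and AA.1); each pair holds the two outputs of a single feature. I'd like to one-hot encode it so that every feature/value combination becomes one indicator column, set to 1 if that value appears in either column of the pair for that row: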
Original dataframe:
data.table::data.table(
  AA = c("12", "11", "13"),
  AA.1 = c("11", "7", "13"),
  BB = c("3", "4", "7"),
  BB.1 = c("8", "9", "3")
)
Final dataframe:
data.table::data.table(
  AA.7 = c(0, 1, 0),
  AA.11 = c(1, 1, 0),
  AA.12 = c(1, 0, 0),
  AA.13 = c(0, 0, 1),
  BB.3 = c(1, 0, 1),
  BB.4 = c(0, 1, 0),
  BB.7 = c(0, 0, 1),
  BB.8 = c(1, 0, 0),
  BB.9 = c(0, 1, 0)
)
I tried to use dplyr and tidyr, but I don't know how to handle paired columns like these, where two columns feed into the same set of indicator columns.
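
One possible direction is a long-then-wide reshape, sketched below under assumptions rather than as a definitive solution. It assumes that a trailing numeric suffix such as .1 only marks the second output of a feature and can be stripped, and that the values are whole numbers. The helper columns row, feature, value and flag are names made up for this illustration.

```r
library(dplyr)
library(tidyr)

# Example input from the question, as a plain data frame.
df <- data.frame(
  AA = c("12", "11", "13"),
  AA.1 = c("11", "7", "13"),
  BB = c("3", "4", "7"),
  BB.1 = c("8", "9", "3")
)

encoded <- df %>%
  # Keep track of the original row (hypothetical helper column).
  mutate(row = row_number()) %>%
  # One line per (row, original column, value).
  pivot_longer(-row, names_to = "feature", values_to = "value") %>%
  mutate(
    # Assumption: "AA.1" belongs to feature "AA", so drop the ".<digits>" suffix.
    feature = sub("\\.\\d+$", "", feature),
    # Assumption: values are integers; converting lets columns sort numerically.
    value = as.integer(value)
  ) %>%
  # A value appearing in both columns of a pair should count only once.
  distinct(row, feature, value) %>%
  mutate(flag = 1) %>%
  # Spread each feature/value combination into its own 0/1 column.
  pivot_wider(
    id_cols = row,
    names_from = c(feature, value),
    values_from = flag,
    names_sep = ".",
    values_fill = 0,
    names_sort = TRUE
  ) %>%
  select(-row)
```

The distinct() step matters for rows like the third one, where AA and AA.1 are both 13: without it, pivot_wider() would see two entries for the same cell. The scalar values_fill = 0 assumes a reasonably recent tidyr (1.1.0 or later).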