I have 2 dataframe. Now, I want to get the common and uncommon rows of this two dataframe.
Here, common rows can be vice versa
. This means I am searching for common between the two datasets that can have changed in the order of matching columns. For example, row number 2
of the df_1
is ID_2 ID_3
and df_2
is ID_3 ID_2
. They are not common in terms of column value but they are common if we do not consider the column value.
After getting the common and uncommon I want to add another column label
as 1 for common
and 0 for uncommon
.
The code I am using
result <- df_2 %>%
dplyr::inner_join(df_1, by = c("nodeA" = "query", "nodeB" = "target")) %>%
dplyr::mutate(GROUP = 1) %>%
dplyr::union(df_2 %>%
dplyr::inner_join(df_1, by = c("nodeB" = "query", "nodeA" = "target")) %>%
dplyr::mutate(GROUP = 2)) %>%
mutate(label=1)
The output I am getting
nodeA nodeB new_ssp ssp GROUP label
1 ID_1 ID_2 0.50 0.50 1 1
2 ID_3 ID_4 0.80 0.80 1 1
3 ID_3 ID_2 0.90 0.90 2 1
4 ID_1 ID_6 0.09 0.09 2 1
Expected output (here, ? means I don't know. It can be 1 or 2 does not matter)
nodeA nodeB new_ssp ssp GROUP label
1 ID_1 ID_2 0.50 0.50 1 1
2 ID_3 ID_4 0.80 0.80 1 1
3 ID_3 ID_2 0.90 0.90 2 1
4 ID_1 ID_6 0.09 0.09 2 1
5 ID_4 ID_5 0.70 0.70 ? 0
6 ID_40 ID_50 0.70 0.70 ? 0
7 ID_5 ID_3 0.50 0.50 ? 0
8 ID_15 ID_31 0.50 0.50 ? 0
Reproducible Data
query <- c("ID_1", "ID_2", "ID_3", "ID_4", "ID_5", "ID_6")
target <- c("ID_2", "ID_3", "ID_4", "ID_5", "ID_3", "ID_1")
ssp <- c(0.5, 0.9, 0.8, 0.7, 0.5, 0.09)
df_1 <- data.frame(query, target, ssp)
nodeA <- c("ID_1", "ID_3", "ID_3", "ID_40", "ID_15", "ID_1")
nodeB <- c("ID_2", "ID_2", "ID_4", "ID_50", "ID_31", "ID_6")
new_ssp <- c(0.5, 0.9, 0.8, 0.7, 0.5, 0.09)
df_2 <- data.frame(nodeA, nodeB, new_ssp)