Hello,
I want to create a new dataframe from 2 dataframes (after matching only similar kinds of data).
My dataframe contains query, target, and weight 3 columns. If we have the same name in the query, and target for the same row then I want to copy this row and add it to the new dataframe.
Please, be noted, name order does not matter. Sometimes, the name can be vise versa. Like the 3rd and 5th rows of dataframe-1 and dataframe-2. Dataframe-1 contains A3 A1 0.75 but Dataframe-2 contains A1 A3 0.75. I would like to take these 2 rows as the same and want to copy in the new dataframe from the Dataframe-1 (not from Dataframe-2).
Dataframe-1 looks like
query target weight
1 A1 A2 0.60
2 A2 A5 0.50
3 A3 A1 0.75
4 A4 A5 0.88
5 A5 A3 0.99
6 A14 A15 0.88
7 A35 A3 0.99
Dataframe-2 looks like
query target weight
1 A1 A2 0.600
2 A2 A5 0.500
3 A1 A3 0.750 // Missing in Output
4 A4 A5 0.880
5 A3 A5 0.990 // Missing in Output
6 (+)-1(10),4-Cadinadiene Falcarinone 0.090
7 Leucodelphinidin (+)-1(10),4-Cadinadiene 0.876
8 Lignin (2E,7R,11R)-2-Phyten-1-ol 0.778
9 A1 (+)-1(10),4-Cadinadiene 1.000
10 A2 Lignin 1.000
11 A3 (2E,7R,11R)-2-Phyten-1-ol 1.000
12 Falcarinone A6 1.000
13 A4 Leucodelphinidin 1.000
I am trying to use the semi_join to perform the task. Code is given below
output <- semi_join(Dataframe-1, Dataframe-2)
query target weight
1 A1 A2 0.60
2 A2 A5 0.50
3 A4 A5 0.88
But, the challenges I am facing is, my code is not working for 3rd and 5th rows of dataframe-1 and dataframe-2 because the order is not matching.
I know semi_join is worked only for matched data. I also tried the in and merge function. But, nothing is working perfectly for my expected output like the given one.
query target weight
1 A1 A2 0.60
2 A2 A5 0.50
3 A3 A1 0.75
4 A4 A5 0.88
5 A5 A3 0.99
Any kind of suggestions is appreciated.