How to match similar type data (without considering the columns value order) from 2 dataframes

Hello,

I want to create a new dataframe from 2 dataframes (after matching only similar kinds of data).

My dataframe contains query, target, and weight 3 columns. If we have the same name in the query, and target for the same row then I want to copy this row and add it to the new dataframe.

Please, be noted, name order does not matter. Sometimes, the name can be vise versa. Like the 3rd and 5th rows of dataframe-1 and dataframe-2. Dataframe-1 contains A3 A1 0.75 but Dataframe-2 contains A1 A3 0.75. I would like to take these 2 rows as the same and want to copy in the new dataframe from the Dataframe-1 (not from Dataframe-2).

Dataframe-1 looks like

  query target weight
1    A1     A2   0.60
2    A2     A5   0.50
3    A3     A1   0.75
4    A4     A5   0.88
5    A5     A3   0.99
6   A14     A15   0.88
7    A35     A3   0.99

Dataframe-2 looks like

                     query                    target weight
1                       A1                        A2  0.600
2                       A2                        A5  0.500
3                       A1                        A3  0.750  // Missing in Output
4                       A4                        A5  0.880
5                       A3                        A5  0.990 // Missing in Output
6  (+)-1(10),4-Cadinadiene               Falcarinone  0.090
7         Leucodelphinidin   (+)-1(10),4-Cadinadiene  0.876
8                   Lignin (2E,7R,11R)-2-Phyten-1-ol  0.778
9                       A1   (+)-1(10),4-Cadinadiene  1.000
10                      A2                    Lignin  1.000
11                      A3 (2E,7R,11R)-2-Phyten-1-ol  1.000
12             Falcarinone                        A6  1.000
13                      A4          Leucodelphinidin  1.000

I am trying to use the semi_join to perform the task. Code is given below
output <- semi_join(Dataframe-1, Dataframe-2)

  query target weight
1    A1     A2   0.60
2    A2     A5   0.50
3    A4     A5   0.88

But, the challenges I am facing is, my code is not working for 3rd and 5th rows of dataframe-1 and dataframe-2 because the order is not matching.

I know semi_join is worked only for matched data. I also tried the in and merge function. But, nothing is working perfectly for my expected output like the given one.

  query target weight
1    A1     A2   0.60
2    A2     A5   0.50
3    A3     A1   0.75
4    A4     A5   0.88
5    A5     A3   0.99

Any kind of suggestions is appreciated.

Most probably, I found the answer (though I am not sure). Now, I am using in, or and and operation and it is working!

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.