How to find rows in one data frame based on another data frame

I want to merge two data sets. One is a subset of the bigger one. I want to select additional rows from the bigger one that have the same values in two columns as rows in the smaller one.

I want to get the rows in df2 that have the same values of SAMPN and PERNO as rows in df1.

Smaller one, df1:
>    SAMPN     PERNO    other.column
>      1         2          10
>      1         3           9
>      2         2           8
>      2         4           7

Bigger one, df2:
>    SAMPN     PERNO     other.column
>      1         2            10
>      1         2             7
>      1         2             9
>      1         3             7
>      1         3             9
>      1         4             7
>      1         5             5
>      1         6             9
>      2         2             9
>      2         2             8
>      2         4             3
>      2         3             1
>      3         3             0

Desired output:

> SAMPN     PERNO    other.column
>          1         2           10
>          1         2           7
>          1         2           9
>          1         3           9
>          1         3           7
>          2         2           8
>          2         2           9
>          2         4           7
>          2         4           3

suppressPackageStartupMessages(library(dplyr))
df1 <- data.frame(SAMPN = c(1,1,2,2), PERNO = c(2,3,2,4), other.column = 10:7)
df2 <- data.frame(SAMPN = c(1,1,1,1,1,1,1,1,2,2,2,2,3),
                  PERNO = c(2,2,2,3,3,4,5,6,2,2,4,3,3),
                  other.column = c(10,7,9,7,9,7,5,9,9,8,3,1,0))
# Keep the rows of df2 whose SAMPN/PERNO combination appears in df1
df2Filtered <- semi_join(df2, df1, by = c("SAMPN", "PERNO"))
# Append the rows of df1, then drop exact duplicates
AllDat <- rbind(df2Filtered, df1)
AllDatUnique <- unique(AllDat)
AllDatUnique
#>    SAMPN PERNO other.column
#> 1      1     2           10
#> 2      1     2            7
#> 3      1     2            9
#> 4      1     3            7
#> 5      1     3            9
#> 6      2     2            9
#> 7      2     2            8
#> 8      2     4            3
#> 12     2     4            7

Created on 2019-09-18 by the reprex package (v0.2.1)

The two data frames are not the same size, so I cannot rbind them.

I am sorry, I do not understand what you want to accomplish.

The semi_join() function can be used to get all the rows in df2 that have a SAMPN/PERNO combination that appears in df1. It seemed that you also wanted to add to the output the one row in df1 that does not appear in df2. That is why I used rbind(). If the two data frames have a different number of columns, what do you want to do with the extra row in df1?
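
In case it helps, here is a minimal sketch of one way the two steps could be handled when the data frames do not have the same columns. It assumes (hypothetically, since the real data was not posted) that df1 has one extra column, here called `extra`, that df2 lacks. It uses dplyr's anti_join() to pick out the df1 rows that are missing from df2 and bind_rows() in place of rbind(), because bind_rows() matches columns by name and fills the missing ones with NA. Whether NA is the right fill value depends on what you want to do with that extra row.

```r
suppressPackageStartupMessages(library(dplyr))

df1 <- data.frame(SAMPN = c(1, 1, 2, 2), PERNO = c(2, 3, 2, 4), other.column = 10:7)
df2 <- data.frame(SAMPN = c(1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 3),
                  PERNO = c(2, 2, 2, 3, 3, 4, 5, 6, 2, 2, 4, 3, 3),
                  other.column = c(10, 7, 9, 7, 9, 7, 5, 9, 9, 8, 3, 1, 0))

# Hypothetical extra column, only to make the column counts differ
df1$extra <- c("a", "b", "c", "d")

# Rows of df2 whose SAMPN/PERNO combination appears in df1;
# semi_join() returns only the columns of df2
df2Filtered <- semi_join(df2, df1, by = c("SAMPN", "PERNO"))

# Rows of df1 that have no exact match in df2 on the shared columns
df1Only <- anti_join(df1, df2, by = c("SAMPN", "PERNO", "other.column"))

# bind_rows() aligns columns by name; `extra` is NA for the rows from df2
AllDat <- bind_rows(df2Filtered, df1Only)
AllDat
```

With the sample data above, df2Filtered has 8 rows and df1Only has 1 row (SAMPN 2, PERNO 4, other.column 7), so AllDat has the 9 rows of the desired output plus the `extra` column.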
