Filter and remove rows of data

Hello all, I need your support please. I have a long database; the file named "market" is just an extract of it.

  1. From the "market" file, I want to create a database that keeps only the rows from source1 whose observations are identical to ones in source2 or source3.
  2. From the "market" file, I want to create another database that removes all the rows from source1, as well as the source2 and source3 rows that matched in the previous step.

Hello everyone, does anyone have an idea about these problems?

I am not sure I have understood your requirements. Is this what you want?

library(dplyr)  # provides filter(), anti_join() and semi_join()
#> Warning: package 'tibble' was built under R version 4.1.2
DAT <- data.frame(Source=c('s1','s1','s1','s2','s2','s2','s2',
                  Territoire = c('t1','t2','t3','t3', 't2','t2', 't3',

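# Split the data: rows from sources 2 and 3, and rows from source 1
S2and3 <- DAT |> filter(Source %in% c('s2', 's3'))
S1 <- DAT |> filter(Source == 's1')
# S1 rows with no match in s2/s3 on the three key columns
S1_keep <- anti_join(S1, S2and3, by = c("Province", "Territoire", "Marche"))
# S1 rows that do have a match in s2/s3 on the three key columns
S1_remove <- semi_join(S1, S2and3, by = c("Province", "Territoire", "Marche"))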

DAT2 <- rbind(S1_keep, S2and3)
DAT2
#>    Source Province Territoire Marche
#> 1      s1       p1         t1     m1
#> 2      s2       p4         t3     m4
#> 3      s2       p2         t2     m5
#> 4      s2       p2         t2     m6
#> 5      s2       p3         t3     m3
#> 6      s3       p2         t2     m2
#> 7      s3       p1         t4     m7
#> 8      s3       p2         t5     m8
#> 9      s3       p1         t1     m9
#> 10     s2       p2         t6    m10

DAT3 <- anti_join(S2and3, S1_remove, by = c("Province", "Territoire", "Marche"))
DAT3
#>   Source Province Territoire Marche
#> 1     s2       p4         t3     m4
#> 2     s2       p2         t2     m5
#> 3     s2       p2         t2     m6
#> 4     s3       p1         t4     m7
#> 5     s3       p2         t5     m8
#> 6     s3       p1         t1     m9
#> 7     s2       p2         t6    m10

Created on 2022-05-23 by the reprex package (v2.0.1)

Thank you, FJCC, for your feedback. To clarify, the results I want are as follows:

  1. For the first database: the script should keep, from the "market" file, the source1 rows that are also found in source2 or source3 with respect to these three variables (Province, Territoire and Marche).
  2. For the second database: the script should delete, from the "market" file, all the source1 rows, plus the source2 or source3 rows that are in common with source1 with respect to these three variables (Province, Territoire and Marche).

The first result is named S1_remove in my code and the second is named DAT3.
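
To make the mapping concrete, here is a minimal sketch on made-up data. The `market` data frame, its values, and the names `keys`, `s1`, `others`, `db1` and `db2` are all hypothetical and only stand in for the real file; the joins are the same dplyr functions used above.

```r
library(dplyr)

# Hypothetical toy data, NOT the original "market" extract
market <- data.frame(
  Source     = c("s1", "s1", "s2", "s2", "s3"),
  Province   = c("p1", "p2", "p2", "p4", "p1"),
  Territoire = c("t1", "t2", "t2", "t3", "t4"),
  Marche     = c("m1", "m2", "m2", "m4", "m7")
)
keys <- c("Province", "Territoire", "Marche")

s1     <- filter(market, Source == "s1")
others <- filter(market, Source %in% c("s2", "s3"))

# Database 1: source1 rows whose key also appears in source2 or source3
# (plays the role of S1_remove)
db1 <- semi_join(s1, others, by = keys)

# Database 2: all source1 rows are already excluded from `others`;
# additionally drop the source2/3 rows that matched source1
# (plays the role of DAT3)
db2 <- anti_join(others, db1, by = keys)

print(db1)
print(db2)
```

In this toy case only the source1 row with (p2, t2, m2) has a twin in source2, so `db1` holds that single row and `db2` keeps the two source2/source3 rows with no counterpart in source1.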

Hello FJCC, you are right. When I checked by running the scripts one by one, I realized everything was fine. Your help solved my problem.
Thanks again.