I'm fairly new to R and am currently writing my bachelor's thesis. For it I use several datasets, and I am now looking for a way to extract only certain rows from one of them.

Let me explain what exactly I mean:
I'm working with the ParlGov dataset and the Seki-Williams dataset on government cabinets. In the latter (Seki-Williams) I have used "drop_na(variable)" to remove rows I cannot use because they contain NA. That dataset also references the individual government cabinets by their "ParlGov cabinet IDs". The ParlGov dataset naturally contains these IDs as well.

Now I am looking for a way to keep only those rows in ParlGov whose IDs are still contained in the Seki-Williams dataset after the unusable rows have been removed.

So is there a way to filter rows based on the values of a variable from another dataset?
So far I have only known filter(), which, as far as I know, only works within a single dataset.

Something like this should work:


p_g_filt <- p_g %>%
  filter(id %in% s_w$id)
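
To see the idea on something small, here is a minimal sketch with made-up data. The tibbles and the columns duration and cabinet_name are hypothetical stand-ins, not the real ParlGov or Seki-Williams variables; only the id column and the object names p_g and s_w follow the snippet above.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for Seki-Williams: cabinet IDs plus a variable with gaps
s_w <- tibble(
  id = c(1, 2, 3, 4),
  duration = c(100, NA, 250, NA)
)

# Hypothetical stand-in for ParlGov: one row per cabinet
p_g <- tibble(
  id = c(1, 2, 3, 4, 5),
  cabinet_name = c("A", "B", "C", "D", "E")
)

# Drop the unusable rows first, as described in the question
s_w <- s_w %>% drop_na(duration)

# Keep only the ParlGov rows whose id is still present in s_w
p_g_filt <- p_g %>%
  filter(id %in% s_w$id)

p_g_filt
```

Only the cabinets with id 1 and 3 remain here, because those are the only IDs left in s_w after drop_na().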

You have my thanks! It worked out perfectly.

Another way of doing this is with dplyr::semi_join. You can use the by argument to match columns that have different names in the two data frames.


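library(dplyr)

df1 <- tibble(
  month = rep(month.abb, 2),
  somenum = sample(100, length(month))
)

# Pick three months at random, so the rows kept below differ between runs
df2 <- tibble(
  month = month.abb %>% sample(3)
)

# Keep only the rows of df1 whose month also appears in df2
df1 %>% 
  semi_join(df2)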
#> Joining, by = "month"
#> # A tibble: 6 x 2
#>   month somenum
#>   <chr>   <int>
#> 1 Apr        63
#> 2 Aug        70
#> 3 Nov        57
#> 4 Apr        18
#> 5 Aug        99
#> 6 Nov        40
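
For the case where the ID column has a different name in each data frame, the by argument could be used roughly like this. This is only a sketch: the column names cabinet_id and parlgov_id and the toy values are assumptions for illustration, so check the actual names in your data.

```r
library(dplyr)

# Hypothetical ParlGov-like table, with the ID stored as cabinet_id
p_g <- tibble(
  cabinet_id = c(1, 2, 3, 4, 5),
  cabinet_name = c("A", "B", "C", "D", "E")
)

# Hypothetical Seki-Williams-like table, with the same ID stored as parlgov_id
# (ID 3 appears twice on purpose)
s_w <- tibble(
  parlgov_id = c(1, 3, 3)
)

# Left side of "=" names the column in p_g, right side the column in s_w
p_g_filt <- p_g %>%
  semi_join(s_w, by = c("cabinet_id" = "parlgov_id"))

p_g_filt
```

Unlike a regular join, semi_join() only filters the first data frame: it adds no columns from s_w and does not duplicate a row when an ID matches more than once.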

