Hello everybody, I am new to R and need to filter one dataset by an ID name, because one of my two datasets does not contain all the IDs of the other. I have to filter the first dataset (which has 82 rows) against the second dataset (which has 64 rows) so that it contains only the IDs found in the second one.
In the end, both datasets should have 64 rows and the same IDs.
Here are two ways to filter DF1 so that it has the same IDs as DF2.
DF1 <- data.frame(ID = LETTERS[1:8], Value = 1:8)
DF1
#> ID Value
#> 1 A 1
#> 2 B 2
#> 3 C 3
#> 4 D 4
#> 5 E 5
#> 6 F 6
#> 7 G 7
#> 8 H 8
DF2 <- data.frame(ID = c("A", "C", "D", "F", "H"), Value2 = 11:15)
DF2
#> ID Value2
#> 1 A 11
#> 2 C 12
#> 3 D 13
#> 4 F 14
#> 5 H 15
# with standard subsetting
DF1cln <- DF1[DF1$ID %in% DF2$ID,]
DF1cln
#> ID Value
#> 1 A 1
#> 3 C 3
#> 4 D 4
#> 6 F 6
#> 8 H 8
# with semi_join
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
DF1cln2 <- semi_join(DF1, DF2, by = "ID")
#> Warning: Column `ID` joining factors with different levels, coercing to
#> character vector
DF1cln2
#> ID Value
#> 1 A 1
#> 2 C 3
#> 3 D 4
#> 4 F 6
#> 5 H 8
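
If it helps, here is a minimal base R sketch for checking the result and for filtering in both directions, in case DF2 also has IDs that DF1 lacks. It reuses the toy DF1 and DF2 from above. The names `common` and `DF2cln` are just illustrative, and it assumes each ID appears only once per data frame (otherwise the row counts could differ even when the ID sets match). Building the data frames with `stringsAsFactors = FALSE` keeps ID as character, which should also avoid the factor-level warning shown for `semi_join()` above (in R versions before 4.0, `data.frame()` converted strings to factors by default).

```r
# Toy data as above, with ID kept as character rather than factor
DF1 <- data.frame(ID = LETTERS[1:8], Value = 1:8, stringsAsFactors = FALSE)
DF2 <- data.frame(ID = c("A", "C", "D", "F", "H"), Value2 = 11:15,
                  stringsAsFactors = FALSE)

# IDs present in both data frames ('common' is an illustrative name)
common <- intersect(DF1$ID, DF2$ID)

# Keep only the shared IDs in each data frame
DF1cln <- DF1[DF1$ID %in% common, ]
DF2cln <- DF2[DF2$ID %in% common, ]

# Check: same number of rows and the same set of IDs
nrow(DF1cln) == nrow(DF2cln)
#> [1] TRUE
setequal(DF1cln$ID, DF2cln$ID)
#> [1] TRUE

# Which IDs were dropped from DF1?
setdiff(DF1$ID, DF2$ID)
#> [1] "B" "E" "G"
```

With the real data, the same checks should report 64 rows in both data frames, provided every ID in the second dataset also occurs in the first.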