So I am trying to filter a data set to drop rows that contain a specific string. I want to filter out rows that contain the string "biological process", but not ones that have something like "negative regulation of biological process". When I use the following code it gets rid of both rows because they both contain that phrase. Is there a way to do an exact match to do this?
(d <- data.frame(
upper = LETTERS,
lower = letters)) |> head()
#> upper lower
#> 1 A a
#> 2 B b
#> 3 C c
#> 4 D d
#> 5 E e
#> 6 F f
d[which(d[1] == "A"),]
#> upper lower
#> 1 A a
d[which(d[1] != "A"),] |> head()
#> upper lower
#> 2 B b
#> 3 C c
#> 4 D d
#> 5 E e
#> 6 F f
#> 7 G g
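To drop rows whose value is exactly "biological process" while keeping rows with longer phrases like "negative regulation of biological process", you can use the grepl function with a pattern anchored to the start and end of the string, for example !grepl("^biological process$", x), where x stands for whichever column holds the terms. A plain comparison such as x != "biological process" gives the same result without a regular expression.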
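The filtering described above can be sketched as follows. This is only an illustration: the data frame `go` and its column `term` are made up, since the original data was not posted. Note that word boundaries (`\\b`) on their own still match the longer phrase, so the sketch anchors the pattern with `^` and `$` instead.

```r
# Hypothetical data; the names `go` and `term` are assumptions for illustration.
go <- data.frame(
  term = c("biological process",
           "negative regulation of biological process",
           "cellular component"),
  stringsAsFactors = FALSE
)

# Word boundaries alone still match inside the longer phrase.
grepl("\\bbiological process\\b", go$term)
#> [1]  TRUE  TRUE FALSE

# Anchoring with ^ and $ matches only when the whole string is the phrase.
exact <- grepl("^biological process$", go$term)
exact
#> [1]  TRUE FALSE FALSE

# Plain equality gives the same answer without a regular expression.
identical(exact, go$term == "biological process")
#> [1] TRUE

# Keep every row except the exact match.
kept <- go[!exact, , drop = FALSE]
kept$term
#> [1] "negative regulation of biological process" "cellular component"
```

If the column may contain leading or trailing spaces, wrapping it in trimws() before matching is one option, though whether that applies depends on the actual data.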
The \\b represents a word boundary in regular expressions. On its own it is not enough for an exact match: the pattern "\\bbiological process\\b" still matches inside "negative regulation of biological process", because the phrase is bounded by whole words there too. Anchoring the pattern as "^biological process$" ties it to the start and end of the string, so only rows whose entire value is "biological process" match. grepl is the base R counterpart of stringr's str_detect, and either one works with an anchored pattern.