Your example matches the case where "probable" or "possible" is followed by one space and then "diabetes". Here are a couple of additional examples. I'm not an expert with regular expressions, so there may be better or more efficient ways to implement these.
Fake data
library(tidyverse)
dd = tibble(var=c('probable for diabetes',
                  'diabetes probable',
                  'Diabetes probable',
                  'diabetes is the most probable',
                  'possible for diabetes',
                  'diabetes is unlikely but insulin resistance is probable',
                  'probable',
                  'diabetes',
                  'something else'))
"diabetes" must appear somewhere before or after "probable" or "possible". (?i) makes the match case-insensitive (you can also get this with the ignore.case=TRUE argument to grepl):
dd %>%
  filter(grepl("(?i)diabetes.*(probable|possible)", var) |
         grepl("(?i)(probable|possible).*diabetes", var))
#> # A tibble: 6 x 1
#> var
#> <chr>
#> 1 probable for diabetes
#> 2 diabetes probable
#> 3 Diabetes probable
#> 4 diabetes is the most probable
#> 5 possible for diabetes
#> 6 diabetes is unlikely but insulin resistance is probable
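As a rough sketch of the ignore.case=TRUE alternative mentioned above, here is the same either-order check in base R. The vector `x` is a made-up subset of the fake data, and the lookahead version at the end is just one other way to write it (it needs perl=TRUE), not necessarily a better one:

```r
# Hypothetical sample vector, mirroring a few rows of the fake data
x <- c("probable for diabetes",
       "Diabetes probable",
       "diabetes",
       "something else")

# Same either-order logic, with case-insensitivity supplied by
# ignore.case = TRUE instead of the inline (?i) flag
hit <- grepl("diabetes.*(probable|possible)", x, ignore.case = TRUE) |
  grepl("(probable|possible).*diabetes", x, ignore.case = TRUE)

# Alternative: one pattern with two lookaheads, so the order does not matter
# (lookaheads require the PCRE engine, hence perl = TRUE)
hit_lookahead <- grepl("^(?=.*diabetes)(?=.*(?:probable|possible))", x,
                       ignore.case = TRUE, perl = TRUE)

x[hit]
```

Both `hit` and `hit_lookahead` should flag only the first two strings, since the last two lack one of the required words.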
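"diabetes" must appear with at most four words between it and "probable" or "possible".
- \\w+ means one or more word characters (letters, digits, and underscore; dashes are not word characters)
- \\W+ means one or more non-word characters (e.g., white space or punctuation)
- {0,4} means zero to four repetitions of the preceding group; the trailing ? makes the repetition lazy, so it tries as few repetitions as possible first
- (?:...) is a non-capturing group: it groups \\w+\\W+ so that {0,4} applies to a whole word plus its trailing separator, without storing the group as a separate submatch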
dd %>%
  filter(grepl("(?i)(probable|possible)\\W+(?:\\w+\\W+){0,4}?diabetes", var) |
         grepl("(?i)diabetes\\W+(?:\\w+\\W+){0,4}?(probable|possible)", var))
#> # A tibble: 5 x 1
#> var
#> <chr>
#> 1 probable for diabetes
#> 2 diabetes probable
#> 3 Diabetes probable
#> 4 diabetes is the most probable
#> 5 possible for diabetes
Created on 2020-07-10 by the reprex package (v0.3.0)
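If you want to vary the size of the word window, the two patterns can be built from a parameter. This is only a sketch: `near_diabetes` and `max_gap` are names I made up for illustration, and it uses perl=TRUE with ignore.case=TRUE rather than the inline (?i) flag.

```r
# Hypothetical helper (not from any package): TRUE where "diabetes" and
# "probable"/"possible" are separated by at most `max_gap` words, in either order
near_diabetes <- function(x, max_gap = 4) {
  # Separator, then up to max_gap "word + separator" pairs, matched lazily
  gap <- sprintf("\\W+(?:\\w+\\W+){0,%d}?", as.integer(max_gap))
  fwd <- paste0("(probable|possible)", gap, "diabetes")
  bwd <- paste0("diabetes", gap, "(probable|possible)")
  grepl(fwd, x, ignore.case = TRUE, perl = TRUE) |
    grepl(bwd, x, ignore.case = TRUE, perl = TRUE)
}

near_diabetes("diabetes is the most probable")               # 3 words in between
near_diabetes("diabetes is the most probable", max_gap = 2)  # window too small
near_diabetes("probable non-insulin-dependent diabetes", max_gap = 2)
```

The first call should return TRUE and the other two FALSE. The last one is FALSE because a dash is a non-word character, so "non-insulin-dependent" counts as three words; raise `max_gap` to 3 and it matches.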