Extract strings using fuzzy LR patterns in R

I have been struggling with this for a long time.

I can extract everything between my left and right patterns in a string, as the following example shows.


library(stringr)

data <- c("everything will be ok one day")

str_extract(string = data, pattern = "(?<=thing).*(?=ok one)")
#> [1] " will be "

As the code shows, I extract everything between "thing" and "ok one".

Now I need to allow for mismatches inside these two flanking patterns.
I want to allow a maximum of two mismatches, counting substitutions as well as insertions and deletions (indels).

This is just a simplified example. My actual data does not contain gaps and is more complicated. Any help or guidance would be appreciated.

I haven't tried these, but base::agrep() does approximate string matching, and there are packages such as fuzzyjoin (dgrtwo/fuzzyjoin on GitHub, "Join tables together on inexact matching") that you could try.
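
To make the idea a bit more concrete, here is a rough sketch in base R of extracting the text between two flanks while tolerating up to two edits in each flank. It relies on utils::adist() (generalized Levenshtein distance, which counts substitutions, insertions and deletions) rather than agrep(). The helpers fuzzy_locate() and fuzzy_between() are hypothetical names made up for this illustration, not functions from any package, and the brute-force window search is an assumption about what would be acceptable for short strings. I have only reasoned it through on the toy example, so treat it as a starting point.

```r
# Hypothetical helper: best approximate occurrence of `pattern` in `text`.
# Returns c(start, end, dist), or NULL if nothing is within `max_dist` edits.
fuzzy_locate <- function(pattern, text, max_dist = 2) {
  n <- nchar(text)
  m <- nchar(pattern)
  if (n == 0) return(NULL)
  # A match with up to `max_dist` indels can be shorter or longer than the pattern
  lens <- seq(max(1, m - max_dist), m + max_dist)
  # Every candidate window (start, length) that fits inside `text`
  win <- expand.grid(start = seq_len(n), len = lens)
  win$end <- win$start + win$len - 1
  win <- win[win$end <= n, ]
  if (nrow(win) == 0) return(NULL)
  # Edit distance between the pattern and each candidate window
  win$dist <- as.vector(utils::adist(pattern, substring(text, win$start, win$end)))
  win <- win[win$dist <= max_dist, ]
  if (nrow(win) == 0) return(NULL)
  # Prefer the smallest distance, then the leftmost, then the shortest window
  win <- win[order(win$dist, win$start, win$len), ]
  unlist(win[1, c("start", "end", "dist")])
}

# Hypothetical helper: text between an approximate `left` and `right` flank.
fuzzy_between <- function(text, left, right, max_dist = 2) {
  l <- fuzzy_locate(left, text, max_dist)
  if (is.null(l)) return(NA_character_)
  # Look for the right flank only in what follows the left flank
  rest <- substring(text, l[["end"]] + 1)
  r <- fuzzy_locate(right, rest, max_dist)
  if (is.null(r)) return(NA_character_)
  substring(rest, 1, r[["start"]] - 1)
}

fuzzy_between("everything will be ok one day", "thing", "ok one")
#> [1] " will be "

# One substitution in each flank ("thimg", "ok ome")
fuzzy_between("everythimg will be ok ome day", "thing", "ok one")
#> [1] " will be "
```

Note that the sketch always takes the single best occurrence of the left flank and then the best occurrence of the right flank after it, so it behaves differently from the greedy `.*` in the original regex when a flank occurs more than once. It also tests every window, which will be slow on long strings.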


This thread may be helpful.


