I am comparing two texts: t1 is the model (reference) text and t2 contains misspellings. I want to remove everything that appears in both t1 and t2, so that only the misspelled parts of t2 are left. I am struggling to achieve this. Below is the example script I have been working on.
library(stringr)  # str_split()
library(tm)       # removeWords()
library(tools)    # Rdiff()

t1 <- "This is a test. Weather is fine"
t2 <- "This text is a test. This wuither is fine. This blabalba That "

# Split each text into sentences and convert to lower case
t1 <- str_split(t1, "(?<=\\.)\\s")
t1 <- lapply(t1, tolower)
t2 <- str_split(t2, "(?<=\\.)\\s")
t2 <- lapply(t2, tolower)

# Write both texts to files and diff them
write.table(t1, file = "t1.txt", col.names = FALSE)
write.table(t2, file = "t2.txt", col.names = FALSE)
y <- Rdiff("t1.txt", "t2.txt", Log = TRUE)
y <- as.character(y$out)
y <- strsplit(y, "\\s")

# Attempt to remove the words common to both texts
commonWords <- intersect(t1, t2)
y2 <- removeWords(y, commonWords)
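
For reference, the kind of result I am after can be sketched in base R with setdiff(). This is only a minimal sketch under two assumptions that are not stated above: that a word-level comparison is acceptable, and that punctuation can be dropped. The helper to_words() is a hypothetical name introduced just for this illustration.

```r
t1 <- "This is a test. Weather is fine"
t2 <- "This text is a test. This wuither is fine. This blabalba That "

# Hypothetical helper (assumption): lower-case the text, strip punctuation,
# and split it into individual words
to_words <- function(x) {
  x <- tolower(x)
  x <- gsub("[[:punct:]]", "", x)
  words <- strsplit(x, "\\s+")[[1]]
  words[nzchar(words)]  # drop any empty strings left by the split
}

# Words that occur in t2 but never in t1
misspelled <- setdiff(to_words(t2), to_words(t1))
print(misspelled)
# [1] "text"     "wuither"  "blabalba" "that"
```

Note that this keeps every word of t2 that is absent from t1, so extra words such as "text" and "that" are returned alongside the actual misspelling "wuither".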