How do I run str_replace (or similar) between defined boundaries?

Dear All,

I'd like to find and replace strings within defined boundaries, e.g., to turn "/AString/a b c/ADiffString/" into "/AString/abc/ADiffString/".

Any ideas please?

Thanks,

James.

How are the boundaries defined? Do you know what /AString/ and /ADiffString/ will be, or can they be anything? Is "a b c" always the same, or can it be anything containing whitespace?

Thanks for the reply. I know what /AString/ and /BString/ will be. What's in between them could be anything.

Here is one method.

library(stringr)
x <- "/AString/a b c/ADiffString/"
Target <- str_extract(x, "(?<=/AString/).+(?=/ADiffString/)")
Trimmed <- str_replace_all(Target, " ", "")
paste0("/AString/", Trimmed, "/ADiffString/")
#> [1] "/AString/abc/ADiffString/"

Created on 2019-06-28 by the reprex package (v0.2.1)

Thank you. That’s almost it. Annoyingly, though, I have oversimplified the situation. (Apologies.) My string may be of the form "This is a sentence with /AString/c o nte nt/BString/ and /AString/o t h er stuff/BString/ in it"

This is probably too inflexible to handle anything other than your example but I hope it will get you started.

library(stringr)
x <- "This is a sentence with /AString/c o nte nt/BString/ and /AString/o t h er stuff/BString/ in it"
Target <- str_extract_all(x, "AString/[^/]+/BString/")
Target <- Target[[1]]
Trimmed <- str_replace_all(Target, " ", "")
for (i in seq_along(Target)){
  # fixed() treats each extracted piece as literal text rather than as a regex
  x <- str_replace(x, fixed(Target[i]), Trimmed[i])
}
x
#> [1] "This is a sentence with /AString/content/BString/ and /AString/otherstuff/BString/ in it"

Created on 2019-06-28 by the reprex package (v0.2.1)

Thank you!! The impression I'm getting here is that there's no generic way to apply a str_replace-type function within specific (detected) bounds; you simply have to break a sentence up into pieces yourself and then selectively apply str_replace to them.
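
Newer stringr releases appear to narrow this gap: as far as I know, str_replace_all() can take a function as its replacement and call it on the matched text. The following is only a minimal sketch, assuming a stringr version that has this feature and the /AString/ and /BString/ boundaries from the example above; strip_spaces is a hypothetical helper name, not part of stringr.

```r
library(stringr)

x <- "This is a sentence with /AString/c o nte nt/BString/ and /AString/o t h er stuff/BString/ in it"

# Hypothetical helper: remove spaces from each matched piece (vectorised)
strip_spaces <- function(m) str_replace_all(m, " ", "")

# The lazy .*? keeps each match inside a single /AString/.../BString/ pair
result <- str_replace_all(x, "/AString/.*?/BString/", strip_spaces)
result
```

Because the whole match is passed to the function, spaces inside the boundaries themselves would also be removed.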

I think you're right, but you may be able to do it with gregexpr, regmatches and lapply. I found this, and modified for your use it would be something like:

x <- "This is a sentence with /AString/content/BString/and/AString/otherstuff/BString/ in it"
gsubf <- function(pattern, x) {
    m <- gregexpr(pattern, x)
    regmatches(x, m) <- lapply(regmatches(x, m), gsub, pattern=" ", replacement="")
   x   
}
gsubf(x)

Of course, this only works if there are no spaces in /AString/ and /BString/; if there are, you would have to use capture groups.

Hold on, there are some issues with greedy matches that have to be fixed.

OK, I also managed to type the gsubf call wrong. Here's the right way, with an added ? to make the matching non-greedy:

x <- "This is a sentence with /AString/c o nte nt/BString/ and /AString/o t h er stuff/BString/ in it"
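gsubf <- function(pattern, x) {
    # Locate every match of `pattern` in x
    m <- gregexpr(pattern, x)
    # Remove the spaces inside each matched piece and write it back in place
    regmatches(x, m) <- lapply(regmatches(x, m), gsub, pattern = " ", replacement = "")
    x
}
gsubf("/AString/.*?/BString/", x)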

This still doesn't work if /AString/ or /BString/ contain spaces. You may also want to change the pattern argument in the lapply call from a hard space to a whitespace class ([[:blank:]] or [[:space:]]).
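
One possible way around both limitations is to match only the text between the boundaries, using lookarounds, so that the boundaries themselves are never passed to the replacement step. This is a rough base R sketch, not a tested general solution: replace_between and strip_ws are hypothetical names, and it assumes the boundaries are literal strings that do not contain "\E".

```r
# Hypothetical helper: apply f() only to the text between `open` and `close`.
# Assumes the boundaries are literal strings that do not contain "\E".
replace_between <- function(x, open, close, f) {
  # \Q...\E quotes the boundaries literally; the lookarounds keep them
  # out of the match, so f() never sees (or alters) the boundaries.
  pattern <- paste0("(?<=\\Q", open, "\\E).*?(?=\\Q", close, "\\E)")
  m <- gregexpr(pattern, x, perl = TRUE)
  regmatches(x, m) <- lapply(regmatches(x, m), f)
  x
}

# Remove any run of whitespace, not just hard spaces
strip_ws <- function(s) gsub("[[:space:]]+", "", s)

x <- "This is a sentence with /AString/c o nte nt/BString/ and /AString/o t h er stuff/BString/ in it"
out1 <- replace_between(x, "/AString/", "/BString/", strip_ws)

# Boundaries that contain spaces are left untouched
y <- "see /A String/a b\tc/B String/ end"
out2 <- replace_between(y, "/A String/", "/B String/", strip_ws)
```

Passing the transformation in as f means the same helper could be reused for other edits between the boundaries, for example toupper.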
