How to use regex to match up to the third forward slash in R using gsub?

This question is specifically about how R handles regex. I would like a regex to use with gsub that keeps everything before the 3rd forward slash and drops the rest.

Here are some string examples:

/google.com/images/video 
/msn.com/bing/chat
/bbc.com/video

I would like to obtain the following strings only:

/google.com/images
/msn.com/bing
/bbc.com/video

In other words, everything from the 3rd forward slash onward should be dropped; a string with only two forward slashes stays unchanged.

I cannot seem to get any regex working along with using gsub to solve this!

The closest I have got is:

gsub(pattern = "/[A-Za-z0-9_.-]/[A-Za-z0-9_.-]*$", replacement = "", x = the_data_above )

I think R has some issues regarding forward slashes and escaping them.

You can do this with stringr::str_extract:

text = c('/google.com/images/video', 
'/msn.com/bing/chat', 
'/bbc.com/video')

stringr::str_extract(text, '/.+?/[^/]*')
#> [1] "/google.com/images" "/msn.com/bing"      "/bbc.com/video"
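
Since the question asked for gsub specifically, here is a rough base R sketch of the same idea. It uses sub rather than gsub because only one replacement per string is needed; the variable name result is just for illustration. As far as I know, forward slashes have no special meaning in R regular expressions, so they do not need escaping.

```r
text <- c("/google.com/images/video",
          "/msn.com/bing/chat",
          "/bbc.com/video")

# Capture "/segment/segment" at the start of the string, then match
# whatever follows and replace the whole string with the captured part.
result <- sub("^(/[^/]*/[^/]*).*$", "\\1", text)
result
#> [1] "/google.com/images" "/msn.com/bing"      "/bbc.com/video"
```

The attempt in the question probably fails because the first character class has no quantifier, so it can only match a single character between the two slashes.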