Remove specific string

# toy data
library(tibble)
df <- tibble(
  v1 = c(
    "average of blah",
    "average of clah",
    "Dlah"
  )
)
v1
average of blah
average of clah
Dlah

How do I remove the string "average of" and make the first letters of "blah" and "clah" capitalized?

Desired output:

v1
Blah
Clah
Dlah

I think this should do it with the stringr package, assuming the word you want to keep is always the last four characters of each cell in the data frame:

library(tidyverse)
df <- tibble(
    v1 = c(
        "average of blah",
        "average of clah",
        "Dlah"
    )
)

# Keep only the last four characters of each string
df <- data.frame(v1 = str_sub(df$v1, -4))
# Capitalise the first letter of each word
df$v1 <- str_to_title(df$v1)
df
#     v1
# 1 Blah
# 2 Clah
# 3 Dlah

Thanks a lot @jasongeslois !

I am posting a slightly modified version of the code for future reference. In the actual data set, the last words do not all have the same length, so taking the last four characters does not work. If you can think of a better solution, please post it!

library(tidyverse)
df <- tibble(
  v1 = c(
    "average of blah",
    "average of clah",
    "Dlah"
  )
)


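df %>% 
  mutate(
    # "average of " is 11 characters long, so the word starts at position 12
    v2 = if_else(stringr::str_detect(v1, "average of"),
                 stringr::str_sub(v1, 12), v1),
    # capitalise the first letter
    v2 = stringr::str_to_title(v2)
  )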
#> # A tibble: 3 x 2
#>   v1              v2   
#>   <chr>           <chr>
#> 1 average of blah Blah 
#> 2 average of clah Clah 
#> 3 Dlah            Dlah

Created on 2021-09-30 by the reprex package (v2.0.0)
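
Since a better solution was asked for, here is a minimal sketch of a variant that avoids hard-coding the offset 12. It assumes the prefix is always the literal text "average of" at the very start of the string, followed by at least one space; `out` is just a name picked for this example. Note that `str_to_title()` also lowercases the remaining letters of each word, which may or may not be what you want on real data.

```r
library(dplyr)
library(stringr)

df <- tibble(
  v1 = c(
    "average of blah",
    "average of clah",
    "Dlah"
  )
)

out <- df %>% 
  mutate(
    # drop a leading "average of " if present; other rows pass through unchanged
    v2 = str_remove(v1, "^average of\\s+"),
    # capitalise the first letter of each word
    v2 = str_to_title(v2)
  )

out$v2
```

Because `str_remove()` leaves strings without the prefix untouched, no `if_else()` is needed.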

There are many possible solutions using regular expressions. Here is one that extracts the last word of each string:

library(tidyverse)

df <- tibble(
    v1 = c(
        "average of blah",
        "average of clah",
        "Dlah"
    )
)

df %>% 
    # "\\w+$" matches the final run of word characters, i.e. the last word
    mutate(v2 = str_extract(v1, "\\w+$") %>% str_to_title())
#> # A tibble: 3 × 2
#>   v1              v2   
#>   <chr>           <chr>
#> 1 average of blah Blah 
#> 2 average of clah Clah 
#> 3 Dlah            Dlah

Created on 2021-09-30 by the reprex package (v2.0.1)
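
To see that the last-word approach does not depend on word length, here is a small sketch on a plain character vector. The inputs in `x` are made up for illustration (they are not from the real data set), and the pattern `\\w+$` simply means "one or more word characters anchored at the end of the string".

```r
library(stringr)

# hypothetical inputs whose last words differ in length
x <- c("average of blah", "average of clahclah", "Dlah")

# take the last word of each string, then capitalise its first letter
last_word <- str_to_title(str_extract(x, "\\w+$"))
last_word
```

One caveat: if a string ends in whitespace, the pattern finds no match and `str_extract()` returns `NA`, so it may be worth running `str_trim()` on the column first.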
