string function

tjcnnl1 · September 23, 2020, 7:37pm

I have a column in a data frame that contains something like student id, name, score, subject, and quarter. These items are joined by hyphens ("-"), for example:

Student$info: 1-john-80-math-4q19, 2-linda-90-art-4q10….

I want to extract the subject item, i.e. the text after the 3rd hyphen, such as "math" or "art". How can I do this? Is there a substring function in R based on a pattern's position?

jmcvw · September 23, 2020, 8:10pm

The following regex will work. Because the first .+ is greedy, it matches everything up to the second-to-last hyphen, so the captured group is the second-to-last field, and sub() replaces the whole string with that group.

s <- c('1-john-80-math-4q19', '2-linda-90-art-4q10')
sub('.+-(.+)-.+', '\\1', s)
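
If the subject always sits after the third hyphen but the number of fields can vary, a pattern that counts hyphens from the start of the string may be safer. This is only a sketch, assuming the same vector s as above; the pattern and the name subject are illustrative and not part of the original answer. Note that sub() returns a string unchanged when the pattern does not match.

```r
s <- c('1-john-80-math-4q19', '2-linda-90-art-4q10')

# Skip three "field-" groups from the start, capture the 4th field,
# then discard the rest of the string.
subject <- sub('^(?:[^-]*-){3}([^-]*).*$', '\\1', s, perl = TRUE)
subject
```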

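An alternative might be to use strsplit(), here combined with map() from purrr:

library(purrr)  # provides map() and re-exports %>%
strsplit(s, '-') %>% 
  map(4) %>%   # take the 4th piece of each split string
  unlist()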
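
The same idea can be sketched in base R without purrr, in a way that keeps one result per input even when a value has fewer than four fields. The value 'nohyphen' and the names parts and subject are made up for illustration.

```r
s <- c('1-john-80-math-4q19', '2-linda-90-art-4q10', 'nohyphen')

# Split on a literal hyphen; fixed = TRUE avoids regex interpretation.
parts <- strsplit(s, '-', fixed = TRUE)

# Indexing a character vector past its end yields NA, so short values
# produce NA instead of being dropped.
subject <- vapply(parts, function(p) p[4], character(1))
subject
```

Because the result has the same length as the input, it can be assigned directly to a new data frame column.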

FJCC · September 23, 2020, 8:20pm

Here are two ways to separate the info column into its components.

TEXT <- c("1-john-80-math-4q19", "2-linda-90-art-4q10")
DF <- data.frame(OtherCol = c(23, 45), info = TEXT)
DF
#>   OtherCol                info
#> 1       23 1-john-80-math-4q19
#> 2       45 2-linda-90-art-4q10
library(stringr)
TextMat <- str_split(DF$info,"-", simplify = TRUE)
DF$Subject <- TextMat[, 4]
DF
#>   OtherCol                info Subject
#> 1       23 1-john-80-math-4q19    math
#> 2       45 2-linda-90-art-4q10     art

#with tidyr
library(tidyr)
DF <- data.frame(OtherCol = c(23, 45), info = TEXT)
DF <- separate(DF, info, into = c("ID", "name", "score", "subject", "quarter"), sep = "-")
DF
#>   OtherCol ID  name score subject quarter
#> 1       23  1  john    80    math    4q19
#> 2       45  2 linda    90     art    4q10

Created on 2020-09-23 by the reprex package (v0.3.0)
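
If some rows of the info column might not contain all five pieces, separate() has a fill argument that may help. The following is a minimal sketch under that assumption; DF2 and the value "nohyphen" are hypothetical and not taken from the original data.

```r
library(tidyr)

# "nohyphen" is a hypothetical value with no separator.
DF2 <- data.frame(info = c("1-john-80-math-4q19", "nohyphen"))

# fill = "right" pads missing pieces on the right with NA, without a warning.
DF2 <- separate(DF2, info,
                into = c("ID", "name", "score", "subject", "quarter"),
                sep = "-", fill = "right")
DF2$subject
```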

tjcnnl1 · September 23, 2020, 8:40pm

Both work fine. Thank you!

tjcnnl1 · September 23, 2020, 8:49pm

Thank you very much. The first one doesn't work for me. The alternative doesn't work either if the column value doesn't contain a hyphen.
