Need to use sapply and strsplit to rename sequence files

Hi,

I am working on a microbiome analysis and I need to rename a list of sequence files using the sample IDs in the metadata file.
To split the sequence names, I used the following:
sample.names1 <- sapply(strsplit(fns, "_"), function(x) paste(x[9]))
sample.names1
and I got the names below
[1] "reverse.P1" "reverse.P2" "reverse.P4b" "reverse.P5" "reverse.P6"
[6] "reverse.P7" "reverse.P8" "reverse.P9" "reverse.P10" "reverse.P11"
So, I need to remove the text before the period (reverse.) and keep only Pn.

When I repeated the step using
sample.names2 <- sapply(strsplit(sample.names1, "."), function(x1) paste(x1[2]))
sample.names2
I got this:
[1] "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""
[36] "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""

Any help, please?

The split argument of strsplit() is a regular expression, so "." means "any character". Try

sample.names2 <- sapply(strsplit(sample.names1, "\\."), function(x1) paste(x1[2]))

The \\ tells R to treat the . as a literal period rather than as the regular-expression wildcard that matches any character.
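
For comparison, here is a small sketch of three ways to get the same result. It assumes names shaped like the ones in the question ("reverse.Pn"); the variable names by_escape, by_fixed and by_sub are made up for illustration, and only a few of the sample names are used.

```r
# A few sample names copied from the question
sample.names1 <- c("reverse.P1", "reverse.P2", "reverse.P4b", "reverse.P10")

# Option 1: escape the dot so the regex matches a literal period
by_escape <- sapply(strsplit(sample.names1, "\\."), function(x) x[2])

# Option 2: fixed = TRUE treats the split string as plain text, not a regex
by_fixed <- sapply(strsplit(sample.names1, ".", fixed = TRUE), function(x) x[2])

# Option 3: skip the split and delete everything up to the first period
by_sub <- sub("^[^.]*\\.", "", sample.names1)

print(by_escape)
print(identical(by_escape, by_fixed))
print(identical(by_escape, by_sub))
```

All three should agree for names with a single period. If a name could contain more than one period, options 1 and 2 keep only the second piece, while option 3 keeps everything after the first period, so pick whichever matches your file names.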

Yes, it works! Much thanks!
