Hi FJCC,
Thank you so much for your advice!
I tried your example, but identical features do not end up in the same column. In the example, promoter_2000 lands in column F1 for row 1 but in column F2 for row 2:
> DF <- data.frame(Chrom = c("chr1", "chr1"), Start = c(136895, 181286),
+ End = c(136895, 181286), Dataset = "AML",
+ Feature = c("promoter_2000,promoter_1000,cpg_shore", "exon,promoter_2000"))
> DF
Chrom Start End Dataset Feature
1 chr1 136895 136895 AML promoter_2000,promoter_1000,cpg_shore
2 chr1 181286 181286 AML exon,promoter_2000
> DF <- DF %>% separate(col = Feature, into = c("F1", "F2", "F3"), fill = "right", sep = ",")
> DF
Chrom Start End Dataset F1 F2 F3
1 chr1 136895 136895 AML promoter_2000 promoter_1000 cpg_shore
2 chr1 181286 181286 AML exon promoter_2000 <NA>
I found another, somewhat clumsy, way to do it. The concat.split.multiple function in the splitstackshape package splits concatenated cells and reshapes the output. With direction="long", it splits the Feature column and creates a new row for each comma-separated string:
> DF <- data.frame(Chrom = c("chr1", "chr1"), Start = c(136895, 181286),
+ End = c(136895, 181286), Dataset = "AML",
+ Feature = c("promoter_2000,promoter_1000,cpg_shore", "exon,promoter_2000"))
> DF
Chrom Start End Dataset Feature
1 chr1 136895 136895 AML promoter_2000,promoter_1000,cpg_shore
2 chr1 181286 181286 AML exon,promoter_2000
> DF <- concat.split.multiple(DF, split.col="Feature", sep=",",direction="long")
> DF
Chrom Start End Dataset Feature
1: chr1 136895 136895 AML promoter_2000
2: chr1 136895 136895 AML promoter_1000
3: chr1 136895 136895 AML cpg_shore
4: chr1 181286 181286 AML exon
5: chr1 181286 181286 AML promoter_2000
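If the goal is still to have each feature in its own fixed column, one possible approach is to build one column per distinct feature and fill it only where that feature occurs. The following is a minimal base R sketch under that assumption, not a tested solution for the full dataset. The names feature_list, all_features, aligned and wide are made up for illustration.

```r
# Same example data as above
DF <- data.frame(Chrom = c("chr1", "chr1"), Start = c(136895, 181286),
                 End = c(136895, 181286), Dataset = "AML",
                 Feature = c("promoter_2000,promoter_1000,cpg_shore", "exon,promoter_2000"),
                 stringsAsFactors = FALSE)

# Split each Feature string into a character vector of feature names
feature_list <- strsplit(DF$Feature, ",", fixed = TRUE)

# All distinct feature names, in order of first appearance
all_features <- unique(unlist(feature_list))

# For every feature, build a column holding the feature name where the row
# contains it and NA where it does not
aligned <- lapply(all_features, function(f) {
  present <- vapply(feature_list, function(x) f %in% x, logical(1))
  ifelse(present, f, NA_character_)
})
names(aligned) <- all_features
aligned <- as.data.frame(aligned, stringsAsFactors = FALSE)

# Replace the original Feature column with the aligned columns
wide <- cbind(DF[setdiff(names(DF), "Feature")], aligned)
print(wide)
```

With the example data, promoter_2000 ends up in the same column for both rows, and the features a row lacks are NA. The columns are named after the features rather than F1, F2, F3, so the number of columns grows with the number of distinct features in the data.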