non-unique values when setting 'row.names'

I keep getting this problem.
I don't know why the heck I'm getting this error.
What's wrong with my command?
I've tried several methods, but I keep getting duplicate "row.names" errors.

cst <- read.csv("~/R/210825_DH_deseq.csv", header = TRUE, sep = ",", row.names = 1)
cst

                  EXP.DH_R1.ReadCount EXP.DH_R2.ReadCount EXP.DH_R3.ReadCount
001707F_pilonCG40                 102                 145                 169
002423F_pilonCG40                  22                  43                  34
002551F_pilonCG40                  56                  76                  46
002551F_pilonCG60               2,422               4,133               3,981
002656F_pilonCG40                 189                 981                 522
002917F_pilonCG10                   0                   0                   0
002917F_pilonCG20                   0                   0                   0
                  EXP.DH_B1.ReadCount EXP.DH_B2.ReadCount EXP.DH_B3.ReadCount
001707F_pilonCG40                 326                 202                 258
002423F_pilonCG40                  60                  28                  48
002551F_pilonCG40                  99                  28                  70
002551F_pilonCG60                   1                   0                   0
002656F_pilonCG40                 292                 245                 200
002917F_pilonCG10                   0                   0                   0
002917F_pilonCG20                   0                   0                   0
                  EXP.DH_L1.ReadCount EXP.DH_L2.ReadCount EXP.DH_L3.ReadCount
001707F_pilonCG40                 147                 687                 306
002423F_pilonCG40                  67                 107                  80
002551F_pilonCG40                  77                  64                  64
002551F_pilonCG60                   1                   0                   0
002656F_pilonCG40                   0                   1                   0
002917F_pilonCG10                   0                   0                   0
002917F_pilonCG20                   0                   0                   0

metaData = data.frame(
  row.names = colnames(cst),
  condition = c('R1','R2','R3','B1','B2','B3','L1','L2','L3'),
  libType = c("single-end","single-end","single-end","single-end","single-end","single-end","single-end","single-end","single-end"))

metaData

                    condition    libType
EXP.DH_R1.ReadCount        R1 single-end
EXP.DH_R2.ReadCount        R2 single-end
EXP.DH_R3.ReadCount        R3 single-end
EXP.DH_B1.ReadCount        B1 single-end
EXP.DH_B2.ReadCount        B2 single-end
EXP.DH_B3.ReadCount        B3 single-end
EXP.DH_L1.ReadCount        L1 single-end
EXP.DH_L2.ReadCount        L2 single-end
EXP.DH_L3.ReadCount        L3 single-end

dds <- DESeqDataSetFromMatrix(countData=cst,
                              colData=metaData,
                              design=~condition, tidy = TRUE)

Error in `.rowNamesDF<-`(x, value = value) :
  duplicate 'row.names' are not allowed

In addition: Warning message:
non-unique values when setting 'row.names': ‘0’, ‘4’

Hi @ioer0417,
Welcome to the RStudio Community Forum.

It is best to avoid storing "important information" from your data in the data frame row names; it is better to keep it in a named column for future use (e.g. splitting it to form factors). By default, row names are the integers 1, 2, 3, ..., so they are always unique.

So, try reading your data like this:

cst <- read.csv("~/R/210825_DH_deseq.csv", header = TRUE, sep = ",")
cst
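A minimal sketch of that idea, using an inline CSV in place of the real file (the "id" header and the three rows are made up for illustration). It keeps the IDs in a named column, checks them for duplicates, and converts counts written with thousands separators such as "2,422" into integers, since such values are read as text:

```r
# Inline CSV standing in for the poster's file; the "id" header and
# the values are hypothetical, not taken from the real data.
csv_text <- 'id,EXP.DH_R1.ReadCount,EXP.DH_R2.ReadCount
001707F_pilonCG40,102,145
002551F_pilonCG60,"2,422","4,133"
002917F_pilonCG10,0,0'

# No row.names argument: the IDs stay in an ordinary column.
cst <- read.csv(text = csv_text, header = TRUE, stringsAsFactors = FALSE)

# IDs have to be unique before they could ever be used as row names.
n_dup_ids <- sum(duplicated(cst$id))

# Quoted values like "2,422" arrive as character strings;
# strip the commas and convert every count column to integer.
count_cols <- setdiff(names(cst), "id")
cst[count_cols] <- lapply(cst[count_cols],
                          function(x) as.integer(gsub(",", "", x)))

print(n_dup_ids)
str(cst)
```

If `n_dup_ids` is greater than zero, the ID column itself has repeats and would need cleaning up before it is used as row names anywhere.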

If your header row has only 3 fields while the data has 4, then I suggest you ignore the header row when reading, and add it later manually:

cst <- read.csv("~/R/210825_DH_deseq.csv", header = FALSE, skip = 1, sep = ",")
cst

names(cst) <- c("id", "var1", "var2", "var3")
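For illustration, here is the same approach on a small inline example, assuming a header with 3 fields and data rows with 4 (the gene names and values are hypothetical):

```r
# Hypothetical file contents: the header has 3 fields, each data row has 4.
csv_text <- 'var1,var2,var3
geneA,1,2,3
geneB,4,5,6'

# Skip the short header line and read the data rows only.
cst <- read.csv(text = csv_text, header = FALSE, skip = 1, sep = ",")

# Add the column names by hand, including one for the ID column.
names(cst) <- c("id", "var1", "var2", "var3")

print(cst)
```

With `header = TRUE` and a header that is one field short, `read.csv()` would instead use the first column as row names, which is what this approach avoids.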

Thank you for your answer.
However, the problem still occurs.
How can I solve it?

Error in `.rowNamesDF<-`(x, value = value) :
  duplicate 'row.names' are not allowed

In addition: Warning message:
non-unique values when setting 'row.names': ‘0’, ‘4’

The DESeqDataSetFromMatrix() function probably requires a matrix as input, not a data frame. I'm not familiar with this package, but I suggest it's time for a very close read of the documentation and examples:

help("DESeqDataSetFromMatrix")
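One thing worth checking in that help page is the tidy argument: as documented, tidy = TRUE means the first column of countData is taken as the row names. If the file was read with row.names = 1, the IDs are already row names and the first column holds counts, where values such as 0 and 4 repeat. The sketch below reproduces that step in base R only, with made-up counts (the gene names and numbers are hypothetical), so it runs without DESeq2 installed:

```r
# Toy counts with the IDs already stored as row names,
# as happens after read.csv(..., row.names = 1). Values are made up.
cst <- data.frame(
  R1 = c(102L, 0L, 0L, 4L, 4L),
  R2 = c(145L, 0L, 1L, 7L, 9L),
  row.names = c("geneA", "geneB", "geneC", "geneD", "geneE")
)

# What tidy = TRUE amounts to: treat the first column as the row names.
# Here that column contains counts, not IDs.
first_col <- as.character(cst[, 1])
dup_values <- unique(first_col[duplicated(first_col)])

# Try to use those values as row names on a copy and capture the error.
tidy_input <- cst
msg <- tryCatch(
  suppressWarnings({
    rownames(tidy_input) <- first_col
    "no error"
  }),
  error = function(e) conditionMessage(e)
)

print(dup_values)  # the repeated count values
print(msg)         # the error message raised by the assignment
```

If this is what is happening, there are two consistent options: keep row.names = 1 in read.csv() and drop tidy = TRUE, or keep tidy = TRUE and read the file without row.names = 1 so that the ID column comes first. This has not been tested against the original file.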
