non-unique values when setting 'row.names'

ioer0417 · September 7, 2021, 5:45am

I keep getting this problem.
I don't know why the heck I'm getting this error.
What's wrong with my command?
I've tried several methods, but I keep getting the duplicate "row.names" error.

cst <- read.csv("~/R/210825_DH_deseq.csv", header = TRUE, sep = ",", row.names = 1)
cst

                  EXP.DH_R1.ReadCount EXP.DH_R2.ReadCount EXP.DH_R3.ReadCount
001707F_pilonCG40                 102                 145                 169
002423F_pilonCG40                  22                  43                  34
002551F_pilonCG40                  56                  76                  46
002551F_pilonCG60               2,422               4,133               3,981
002656F_pilonCG40                 189                 981                 522
002917F_pilonCG10                   0                   0                   0
002917F_pilonCG20                   0                   0                   0
                  EXP.DH_B1.ReadCount EXP.DH_B2.ReadCount EXP.DH_B3.ReadCount
001707F_pilonCG40                 326                 202                 258
002423F_pilonCG40                  60                  28                  48
002551F_pilonCG40                  99                  28                  70
002551F_pilonCG60                   1                   0                   0
002656F_pilonCG40                 292                 245                 200
002917F_pilonCG10                   0                   0                   0
002917F_pilonCG20                   0                   0                   0
                  EXP.DH_L1.ReadCount EXP.DH_L2.ReadCount EXP.DH_L3.ReadCount
001707F_pilonCG40                 147                 687                 306
002423F_pilonCG40                  67                 107                  80
002551F_pilonCG40                  77                  64                  64
002551F_pilonCG60                   1                   0                   0
002656F_pilonCG40                   0                   1                   0
002917F_pilonCG10                   0                   0                   0
002917F_pilonCG20                   0                   0                   0

metaData <- data.frame(
  row.names = colnames(cst),
  condition = c("R1", "R2", "R3", "B1", "B2", "B3", "L1", "L2", "L3"),
  libType = rep("single-end", 9)
)

metaData

                    condition    libType
EXP.DH_R1.ReadCount        R1 single-end
EXP.DH_R2.ReadCount        R2 single-end
EXP.DH_R3.ReadCount        R3 single-end
EXP.DH_B1.ReadCount        B1 single-end
EXP.DH_B2.ReadCount        B2 single-end
EXP.DH_B3.ReadCount        B3 single-end
EXP.DH_L1.ReadCount        L1 single-end
EXP.DH_L2.ReadCount        L2 single-end
EXP.DH_L3.ReadCount        L3 single-end

dds <- DESeqDataSetFromMatrix(countData = cst,
                              colData = metaData,
                              design = ~condition, tidy = TRUE)

Error in `.rowNamesDF<-`(x, value = value) :
  duplicate 'row.names' are not allowed
In addition: Warning message:
non-unique values when setting 'row.names': ‘0’, ‘4’

DavoWW · September 7, 2021, 8:30am

Hi @ioer0417,
Welcome to the RStudio Community Forum.

It is best to avoid storing "important information" from your data in the data frame row names - better to keep it in a named column for future use (e.g. splitting it to form factors). Row names, by default, are allocated numbers from 1, so remain unique.

So, try reading your data like this:

cst <- read.csv("~/R/210825_DH_deseq.csv", header = TRUE, sep = ",")
cst

If your header row has only 3 fields while the data has 4, then I suggest you ignore the header row when reading, and add it later manually:

cst <- read.csv("~/R/210825_DH_deseq.csv", header = FALSE, skip = 1, sep = ",")
cst

names(cst) <- c("id", "var1", "var2", "var3")
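
As a minimal sketch of the advice above: the inline CSV text and the column names `id`, `R1`, `R2` below are made up for illustration and are not taken from your file. The idea is to read the identifier as an ordinary column and check it for duplicates before it is ever used as row names.

```r
# Hypothetical stand-in for the CSV file; the values are invented.
csv_text <- "id,R1,R2
geneA,10,12
geneB,0,3
geneA,7,9"

# Read without row.names = 1, so the identifier stays in a named column
# and the row names default to 1, 2, 3, ...
cst <- read.csv(text = csv_text, header = TRUE, stringsAsFactors = FALSE)

# anyDuplicated() returns 0 when all values are unique, otherwise the
# index of the first duplicated element.
first_dup <- anyDuplicated(cst$id)

# The identifiers that occur more than once.
dup_ids <- unique(cst$id[duplicated(cst$id)])

# One option (an assumption about what is wanted, not the only choice):
# make the identifiers unique by appending suffixes, then use them as
# row names.
rownames(cst) <- make.unique(cst$id)
```

If `first_dup` is 0, the column is safe to use as row names directly; otherwise `dup_ids` shows which identifiers need attention before anything downstream tries to set them as row names.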

ioer0417 · September 7, 2021, 8:37am

Thank you for your answer.
However, the problem still occurs.
How can I solve it?

Error in `.rowNamesDF<-`(x, value = value) :
  duplicate 'row.names' are not allowed
In addition: Warning message:
non-unique values when setting 'row.names': ‘0’, ‘4’

DavoWW · September 7, 2021, 9:04am

The DESeqDataSetFromMatrix() function probably requires a matrix as input, not a data frame. I'm not familiar with this package, but I suggest it's time for a very close read of the documentation and examples:

help("DESeqDataSetFromMatrix")
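
One plausible reading of the error, offered as an assumption based on how the DESeq2 documentation describes `tidy` rather than something confirmed in this thread: with `tidy = TRUE`, the first column of `countData` is taken as the row names. If the file was already read with `row.names = 1`, the first column is a column of counts, and counts such as 0 repeat. The base-R sketch below imitates that step on a made-up data frame (the gene names `g1` to `g4` and the counts are invented); it does not call DESeq2 itself.

```r
# Made-up counts; the gene IDs are already the row names, as they would
# be after read.csv(..., row.names = 1).
cst <- data.frame(
  R1 = c(0, 4, 0, 4),
  R2 = c(1, 5, 2, 6),
  row.names = c("g1", "g2", "g3", "g4")
)

# Imitate the documented tidy = TRUE behaviour: use the first column as
# row names. Here that column holds counts, so the values repeat and
# base R refuses to set them. suppressWarnings() hides the accompanying
# "non-unique values" warning so that only the error is captured.
err <- tryCatch(
  suppressWarnings({
    tmp <- cst[, -1, drop = FALSE]
    rownames(tmp) <- as.character(cst[, 1])
    NULL
  }),
  error = function(e) e
)

# A consistent alternative, sketched: keep the IDs as row names and
# convert the data frame to a plain numeric matrix.
count_matrix <- as.matrix(cst)

# The call would then omit tidy = TRUE (left commented out because
# DESeq2 and metaData are not part of this sketch):
# dds <- DESeqDataSetFromMatrix(countData = count_matrix,
#                               colData = metaData,
#                               design = ~condition)
```

If this reading is right, the two consistent combinations are either `row.names = 1` when reading and no `tidy` argument, or no `row.names` when reading and `tidy = TRUE`, but not both at once.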
