Reading a CSV file leads to an error

I am using Seurat and trying to analyze a data set that I downloaded from GEO, but I am getting the following message: In storage.mode(from) <- "double" : NAs introduced by coercion. Could you please tell me how to fix it? Thanks.

library(dplyr)
library(Seurat)
library(ggplot2)
cancer.rna <- read.csv2(file = "/Users/desktop/GSE84133/GSM2230757_human1_umifm_countsnew3.csv", sep = ",",header = TRUE, row.names = NULL)
dim(cancer.rna)
[1] 16381 1651
cancer.rna[1:5, 1:5]
X TGCCTCAC.TGGGATTC AGAGACTA.GATTGCGA GAGGGAGGTA.GAAGGCTT GAGAGAGTAT.CCTATTCA
1 A1BG 0 0 0 0
2 A1CF 0 0 0 2
3 A2M 4 0 0 0
4 A2ML1 0 0 0 0
5 A4GALT 0 0 0 0
cancer <- CreateSeuratObject(counts = cancer.rna, project = "GSE84133", min.cells = 3, min.features = 200)
Warning message:
In storage.mode(from) <- "double" : NAs introduced by coercion

Just to clarify, this is a warning, as opposed to an error. A warning means that your code ran, but it is giving you extra information that you might want to attend to.

In this case, it looks like the warning does not come from reading the file in, but rather from the call to CreateSeuratObject().

From the documentation, it looks like the input expected is:

Create a Seurat object from a feature (e.g. gene) expression matrix. The expected format of the input matrix is features x cells.

This section on Explicit Coercion explains it in more detail, but the gist of it is that the code is trying to turn one type of variable (e.g. a character string) into another (e.g. a double) in a way that is not compatible.
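
As a rough illustration of that coercion, here is a minimal base R sketch. The `toy` data frame is a made-up stand-in for the first rows of `cancer.rna` (the gene names and counts are copied from the post, while the column names `cell1` and `cell2` are shortened placeholders). It does not call Seurat at all, so it only assumes that something similar happens to the gene-name column `X` inside CreateSeuratObject().

```r
# Toy stand-in for cancer.rna: gene names sit in a regular column called X.
toy <- data.frame(
  X     = c("A1BG", "A1CF", "A2M"),
  cell1 = c(0, 0, 4),
  cell2 = c(0, 2, 0),
  stringsAsFactors = FALSE
)

# Which columns are not numeric? Here, only the gene-name column.
non_numeric <- names(toy)[!sapply(toy, is.numeric)]
print(non_numeric)

# A data frame with mixed column types becomes a character matrix.
m <- as.matrix(toy)
print(typeof(m))

# Forcing that matrix to "double" cannot convert the gene names, so they
# become NA and R raises a coercion warning. The handler records the
# warning text instead of printing it.
msgs <- character(0)
coerced <- m
withCallingHandlers(
  {
    storage.mode(coerced) <- "double"
  },
  warning = function(w) {
    msgs <<- c(msgs, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)
print(msgs)
print(coerced)
```

In the printed matrix the `X` column is all NA while the count columns survive, which suggests the counts themselves are fine and only the gene-name column is in the wrong place.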

Thanks for the advice. It's exactly as you said: I'm trying to create a t-SNE plot using a matrix of genes and cells.
I think I need to resolve this warning, because the next steps, NormalizeData and FindVariableFeatures, both return error messages.
Is there a way to fix the "double" and "NA" problem?

cancer <- CreateSeuratObject(counts = cancer.rna, project = "GSE84133", min.cells = 3, min.features = 200)
Warning message:
In storage.mode(from) <- "double" : NAs introduced by coercion
cancer
An object of class Seurat
11768 features across 1650 samples within 1 assay
Active assay: RNA (11768 features)
cancer <- NormalizeData(cancer)
Performing log-normalization
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Error: Attempting to add a different number of cells and/or features
cancer <- FindVariableFeatures(cancer)
Calculating gene variances
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Calculating feature variances of standardized and clipped values
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Error: Cannot add more or fewer meta.features information without values being named with feature names

I think that would have to happen in the Seurat package (with which, I should note, I'm not terribly familiar).

There seem to be some posts/tutorials out there with worked examples, and you might try going through one of those to see where you're diverging with your own data. For example, in the post below, it looks like a matrix with rownames is passed into CreateSeuratObject(), as opposed to having the first column as the name (which it looks like you have in your cancer.rna object).

It's also a helpful way to figure out if there's a bug in the code itself, though it's always good to check the NEWS to make sure that doesn't reflect an intentional breaking change inside the package.

https://davetang.org/muse/2017/08/01/getting-started-seurat/
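
To make the rownames suggestion concrete, here is a minimal base R sketch of moving the gene names out of the first column. The `toy` data frame is again a made-up stand-in for `cancer.rna` with placeholder column names, and `counts` is a name chosen for illustration. I have not run this against the real file, so treat it as a starting point rather than a confirmed fix.

```r
# Toy stand-in for cancer.rna: gene names in column X, counts in the rest.
toy <- data.frame(
  X     = c("A1BG", "A1CF", "A2M"),
  cell1 = c(0, 0, 4),
  cell2 = c(0, 2, 0),
  stringsAsFactors = FALSE
)

# Keep only the count columns and convert them to a numeric matrix.
counts <- as.matrix(toy[, -1, drop = FALSE])

# Use the gene names as row names (features x cells, as the docs describe).
rownames(counts) <- toy$X

# Sanity checks: the matrix is numeric and contains no NA values.
stopifnot(is.numeric(counts), !anyNA(counts))
print(counts)

# Not run here: with the real data, this matrix would take the place of
# cancer.rna in the call from the original post, e.g.
# cancer <- CreateSeuratObject(counts = counts, project = "GSE84133",
#                              min.cells = 3, min.features = 200)
```

When reading the real file, passing `row.names = 1` to read.csv() instead of `row.names = NULL` should achieve the same thing in one step, provided the gene names in the first column are unique. If they are not, read.csv() will stop with an error about duplicate row names, and something like make.unique() on the names would be needed first.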
