Hi,
I am analysing large datasets in R and came across an issue. I have a column in a data frame (df) that contains blank cells; here the -which
command works as expected (see example below).
dput(df)
structure(list(Sample_title = c("Healthy Control, biological rep1",
"Healthy Control, biological rep2", "Healthy Control, biological rep3",
"Healthy Control, biological rep4"), Sample_accession = c("GSM542941",
"GSM542942", "GSM542943", "GSM542944"), Patients = c("control",
"control", "control", "control"), strain = c("none", "none",
"none", "none"), Tissue = c("Blood", "", "", "Blood")), class = "data.frame", row.names = c(NA,
-4L))
df = df[-which(df$Tissue == ""),]
dim(df)
[1] 2 5
I have a column in another data frame (df_1) without any blank cells, and when I run the same -which
command, all rows get deleted. How do I avoid this and proceed to the next step with the data frame unchanged? (see example below)
dput(df_1)
structure(list(Sample_title = c("Healthy Control, biological rep1",
"Healthy Control, biological rep2", "Healthy Control, biological rep3",
"Healthy Control, biological rep4"), Sample_accession = c("GSM542941",
"GSM542942", "GSM542943", "GSM542944"), Patients = c("control",
"control", "control", "control"), strain = c("none", "none",
"none", "none"), Tissue = c("Blood", "Blood", "Blood", "Blood"
)), class = "data.frame", row.names = c(NA, -4L))
df_1 = df_1[-which(df_1$Tissue == ""),]
dim(df_1)
[1] 0 5
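As far as I can tell, this happens because which() returns an empty integer vector when nothing matches, and negating an empty vector still gives an empty vector, so the subset selects zero rows instead of dropping zero rows. Below is a minimal sketch of that behaviour, with logical indexing as one possible alternative. The reduced two-column data frame and the name `kept` are only illustrative, and it assumes the column has no NA values.

```r
# Small stand-in for df_1: no blank cells in Tissue
df_1 <- data.frame(
  Sample_accession = c("GSM542941", "GSM542942", "GSM542943", "GSM542944"),
  Tissue = c("Blood", "Blood", "Blood", "Blood")
)

idx <- which(df_1$Tissue == "")  # empty integer vector: nothing matched
length(idx)                      # 0
identical(-idx, idx)             # TRUE: negating an empty vector changes nothing

# Logical indexing keeps the rows where the condition is TRUE,
# so when no cell is blank, no row is dropped
kept <- df_1[df_1$Tissue != "", ]
dim(kept)
```

If the column may contain NA, the comparison `!= ""` yields NA for those rows, so this sketch would need an extra check such as `!is.na(df_1$Tissue)`.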
Or is checking for blank cells before using -which
to remove them the only method?
sum(df$Tissue=="")
which(df$Tissue=="", arr.ind=TRUE)
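The check-first approach could be wrapped up roughly as follows. This is only a sketch: `drop_blank_rows` is a hypothetical helper name, the data frame is a reduced version of the example, and "blank" is assumed to mean an empty string.

```r
# Reduced version of df: two blank cells in Tissue
df <- data.frame(
  Sample_accession = c("GSM542941", "GSM542942", "GSM542943", "GSM542944"),
  Tissue = c("Blood", "", "", "Blood")
)

# Hypothetical helper: remove rows only when which() actually found blanks
drop_blank_rows <- function(data, column) {
  idx <- which(data[[column]] == "")
  if (length(idx) > 0) {
    data <- data[-idx, , drop = FALSE]
  }
  data
}

with_blanks_removed <- drop_blank_rows(df, "Tissue")              # two rows dropped
unchanged <- drop_blank_rows(with_blanks_removed, "Tissue")       # nothing left to drop
dim(with_blanks_removed)
dim(unchanged)
```

The guard on `length(idx)` is what keeps the second call from emptying the data frame.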
Thank you,
Toufiq