How to exclude microarray outlier data after PCA?

I ran quality-control checks (PCA and hierarchical clustering) and found one outlier, Cancer5.CEL. How can I remove this sample before the differential gene expression analysis? I don't know what code to use, so any help is appreciated. Thank you!

Here is my code:
library(affy)  # provides read.affybatch(), mas5calls.AffyBatch() and rma()

# Sample annotation: 7 cancer and 6 healthy arrays
crc.fac <- factor(c(rep("Cancer", 7), rep("Healthy", 6)))
crc.df <- data.frame(crc = crc.fac, row.names = paste(crc.fac, 1:13, sep = ""))
crc.mData <- data.frame(labelDescription = c("gene regulation"))
crc.mData
crc.pData <- new("AnnotatedDataFrame", data = crc.df, varMetadata = crc.mData)
validObject(crc.pData)
# [1] TRUE
list.files(path = ".", pattern = ".CEL")
crc.df <- data.frame(crc.fac, filename = list.files(path = ".", pattern = ".CEL"), row.names = paste(crc.fac, 1:13, sep = ""))
# Read all CEL files in the working directory into one AffyBatch
crc.affy <- read.affybatch(filenames = list.files(path = ".", pattern = ".CEL", full.names = TRUE),
                           phenoData = crc.pData)
View(crc.affy)
# MAS5 present/absent calls and RMA-normalised log2 expression values
crc_calls.eSet <- mas5calls.AffyBatch(crc.affy)
crc_calls.mx <- exprs(crc_calls.eSet)
crc.eSet <- rma(crc.affy)
crc_log2.mx <- exprs(crc.eSet)
head(crc_log2.mx)
boxplot(as.data.frame(crc_log2.mx), xlab = "", ylab = "Log2 rma signal", las = 2, main = "Sample Distributions")
# Fraction of probe sets called "P" (present) on each array
crc_P_rate.nv <- apply(crc_calls.mx == "P", 2, sum) / nrow(crc_calls.mx)

# 4. Quality controls

# Check for potential physical defects in the arrays

image(crc.affy[, 1])

# PCA
pca <- prcomp(t(crc_log2.mx))
eigs <- pca$sdev^2
varexplained <- eigs/sum(eigs)
varexplained
barplot(varexplained * 100, ylab="% variance explained", xlab="principal components")
box()
plot(pca$x[, 1], pca$x[, 2], col = rainbow(2)[crc.fac], xlab = "PC1", ylab = "PC2", cex = 3)
text(pca$x[, 1], pca$x[, 2], labels = colnames(crc_log2.mx))
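
As a complement to reading the PCA plot by eye, the sketch below ranks samples by their distance from the median position in the PC1/PC2 plane. It is only an illustration on simulated data: the matrix `toy.mx`, the sample names, and the artificial shift applied to one array are assumptions, and the distance rule is one simple heuristic, not an established outlier test.

```r
# Toy stand-in for crc_log2.mx: 50 probes x 13 samples of random values.
set.seed(42)
sample_names <- c(paste0("Cancer", 1:7, ".CEL"),
                  paste0("Healthy", 1:6, ".CEL"))  # assumed file names
toy.mx <- matrix(rnorm(50 * 13, mean = 8), nrow = 50,
                 dimnames = list(paste0("probe", 1:50), sample_names))

# Shift one array so that it behaves like an outlier (illustration only).
toy.mx[, "Cancer5.CEL"] <- toy.mx[, "Cancer5.CEL"] + 10

# Same PCA call as above: samples in rows, probes in columns.
toy.pca <- prcomp(t(toy.mx))
scores <- toy.pca$x[, 1:2]

# Distance of each sample from the median position in the PC1/PC2 plane.
centre <- apply(scores, 2, median)
pc_dist <- sqrt(rowSums(sweep(scores, 2, centre)^2))

# Samples ordered from most to least distant; the first one is the candidate outlier.
sort(pc_dist, decreasing = TRUE)
flagged <- names(which.max(pc_dist))
flagged
```

On real data the same steps could be run on `pca$x` from the code above, but a large distance only suggests a sample worth inspecting; it does not prove the array is bad.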

In total I have 13 .CEL files: 7 Cancer and 6 Healthy samples.
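
One possible way to drop the outlier is to build a logical vector that marks the samples to keep and apply it to both the expression matrix and the group factor. The sketch below uses a simulated matrix (`toy.mx`) and assumed sample names in place of `crc_log2.mx`, so it runs without any CEL files.

```r
# Toy stand-in for crc_log2.mx: 5 probes x 13 samples of random values.
set.seed(1)
sample_names <- c(paste0("Cancer", 1:7, ".CEL"),
                  paste0("Healthy", 1:6, ".CEL"))  # assumed file names
toy.mx <- matrix(rnorm(5 * 13), nrow = 5,
                 dimnames = list(paste0("probe", 1:5), sample_names))
toy.fac <- factor(c(rep("Cancer", 7), rep("Healthy", 6)))

# TRUE for every sample except the outlier.
outlier <- "Cancer5.CEL"
keep <- colnames(toy.mx) != outlier

# Drop the outlier column and the matching entry of the group factor,
# so that samples and labels stay aligned for the downstream analysis.
toy_clean.mx <- toy.mx[, keep]
toy_clean.fac <- droplevels(toy.fac[keep])

dim(toy_clean.mx)
table(toy_clean.fac)
```

With the real objects, the same idea would probably be applied one step earlier, for example `crc.affy[, which(sampleNames(crc.affy) != "Cancer5.CEL")]`, followed by running `rma()` again, because RMA normalises across arrays and the outlier would otherwise still influence the remaining samples. This assumes the sample names in `crc.affy` are the file names; checking `sampleNames(crc.affy)` first would confirm the exact label to exclude.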
