Hello,
I am trying to filter out peptides with more than 45% zero values. I have already imputed the NA values, and now want to subset the data to the peptides with < 45% zeros.
Here is an example of what my data looks like (P on the left stands for participant).
#>     Peptide 1  Peptide 2  Peptide 3  Peptide 4  Peptide 5
#> P1 0.06637717 0.22459742 0.05149364 0.12442481 0.23367631
#> P2 0.09303097 0.23616882 0.04413919 0.17940463 0.05303563
#> P3 0          0.16519945 0.17175571 0.24797652 0.16291844
#> P4 0          0.15727851 0.09602593 0.09500879 0.03138877
#> P5 0          0.01544657 0.19246035 0.19436131 0.06680517
FJCC
May 7, 2020, 4:35pm
Here is a way to do it using functions from base R.
#Make some data
DF <- data.frame(Subject = paste0("P", 1:10), stringsAsFactors = FALSE)
DataMat <- matrix(sample(0:1, size = 100, replace = TRUE), nrow = 10)
colnames(DataMat) <- paste0("Peptide", 1:10)
DF <- cbind(DF, as.data.frame(DataMat))
DF
#> Subject Peptide1 Peptide2 Peptide3 Peptide4 Peptide5 Peptide6 Peptide7
#> 1 P1 1 0 0 1 1 1 0
#> 2 P2 1 0 0 0 1 1 0
#> 3 P3 0 1 1 0 1 1 0
#> 4 P4 1 0 0 1 0 0 0
#> 5 P5 1 1 0 1 0 0 1
#> 6 P6 1 1 1 1 1 1 1
#> 7 P7 1 0 1 0 1 0 0
#> 8 P8 0 0 1 1 1 1 0
#> 9 P9 0 0 0 0 1 0 1
#> 10 P10 0 1 0 0 0 0 1
#> Peptide8 Peptide9 Peptide10
#> 1 1 0 1
#> 2 1 1 1
#> 3 0 1 1
#> 4 1 1 1
#> 5 0 1 1
#> 6 1 1 1
#> 7 0 0 0
#> 8 1 1 1
#> 9 1 0 0
#> 10 1 1 1
#Count the zeros in each row
CountZeros <- function(x) sum(x == 0)
Zeros <- apply(X = DF[, 2:11], 1, CountZeros)
Zeros
#> [1] 4 4 4 5 4 0 7 3 7 5
#Filter the rows with fewer than 5 zeros (with 10 peptides, that is under 45% zeros)
DF_filtered <- DF[Zeros < 5, ]
Created on 2020-05-07 by the reprex package (v0.3.0)
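The code above filters rows (participants) by a zero count. The original question asks about peptides, which are columns in the example data, and about a percentage rather than a count. A minimal sketch of that column-wise variant follows, again using only base R. The toy data frame, the name `peptide_cols`, and the decision to keep a peptide sitting at exactly 45% are assumptions made for illustration and are not from the original post.

```r
# Hypothetical toy data: 5 participants (rows) and 4 peptides (columns)
DF <- data.frame(
  Subject  = paste0("P", 1:5),
  Peptide1 = c(0.07, 0.09, 0, 0, 0),            # 3 of 5 values are zero
  Peptide2 = c(0.22, 0.24, 0.17, 0.16, 0.02),   # no zeros
  Peptide3 = c(0, 0.04, 0, 0.10, 0.19),         # 2 of 5 values are zero
  Peptide4 = c(0, 0, 0, 0, 0.07),               # 4 of 5 values are zero
  stringsAsFactors = FALSE
)

# Every column except the participant label is treated as a peptide
peptide_cols <- setdiff(names(DF), "Subject")

# Comparing a data frame with 0 gives a logical matrix;
# colMeans() of that matrix is the fraction of zeros per peptide
zero_frac <- colMeans(DF[, peptide_cols] == 0)

# Drop peptides with more than 45% zeros (assumption: exactly 45% is kept)
keep <- names(zero_frac)[zero_frac <= 0.45]
DF_filtered <- DF[, c("Subject", keep), drop = FALSE]
DF_filtered
```

If the intent is to filter participants instead, `rowMeans(DF[, peptide_cols] == 0)` gives the fraction of zeros per row and can be used for row subsetting in the same way.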
Thank you very much for your help.