Hi,
I am working with a dataset of 1 500 00 rows, and I handled the NA values the old way: I replace NA with 0 and then run a for loop. It is far too slow.
I would like to know whether I can replace this method with a function from dplyr.
Description:
If a row has NA in all 4 columns, I have to put 0 in each cell.
If a row has NA in only 2 columns (e.g. X2, Y2), I have to put 0 in those cells and also put 0 in (X1, Y1).
If (X1, Y1) and (X2, Y2) are all different from NA, I keep the values.
Thanks in advance for your help!
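# Replace every NA in the table with 0
tab.na <- replace(tab, is.na(tab), 0)
# Go through the rows one by one (this is the slow part)
for (i in seq_len(nrow(tab.na))) {
  # If any of the four cells was NA (now 0), set the whole row to 0
  if (tab.na$X1[i] == 0 || tab.na$Y1[i] == 0 || tab.na$X2[i] == 0 || tab.na$Y2[i] == 0) {
    tab.na$X1[i] <- 0
    tab.na$Y1[i] <- 0
    tab.na$X2[i] <- 0
    tab.na$Y2[i] <- 0
  }
}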
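For comparison, the same rule can probably be written without a loop in base R. This is only a sketch: the small `tab` below is made-up data in the same layout as my table, and `cols` and `has_na` are names I chose for the example. It checks the NA values in the original table rather than testing for 0 afterwards, so a genuine 0 would not trigger the reset.

```r
# Hypothetical sample data in the same layout as the real table
tab <- data.frame(
  X1 = c(NA, 6002, 5000),
  Y1 = c(NA, 6002, 500),
  X2 = c(NA, NA, 4000),
  Y2 = c(NA, NA, 5440)
)

cols <- c("X1", "Y1", "X2", "Y2")

# TRUE for every row with at least one NA in the four columns
has_na <- rowSums(is.na(tab[cols])) > 0

# Copy the table, then zero the four columns of those rows in one assignment
tab.na <- tab
tab.na[has_na, cols] <- 0

print(tab.na)
```

Because `rowSums()` and the assignment work on whole columns at once, this should scale much better than the row-by-row loop.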
| X1   | Y1   | X2   | Y2   |
| NA   | NA   | NA   | NA   |
| NA   | NA   | NA   | NA   |
| NA   | NA   | NA   | NA   |
| 6002 | 6002 | NA   | NA   |
| 5000 | 500  | 4000 | 5440 |
| 5001 | 501  | 4001 | 5441 |
| 5002 | 502  | 4002 | 5442 |
| 5003 | 503  | 4003 | 5443 |
| 5004 | 504  | 4004 | 5444 |
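With dplyr, I imagine something along these lines could work. It is a sketch, not a tested solution for the full dataset: it assumes dplyr 1.0.4 or later (for `if_any()`), `tab` is rebuilt here from a few rows of the table above, and `has_na` is a temporary helper column I made up for the example.

```r
library(dplyr)

# A few rows taken from the sample table above
tab <- data.frame(
  X1 = c(NA, NA, NA, 6002, 5000, 5001),
  Y1 = c(NA, NA, NA, 6002, 500, 501),
  X2 = c(NA, NA, NA, NA, 4000, 4001),
  Y2 = c(NA, NA, NA, NA, 5440, 5441)
)

tab.na <- tab %>%
  mutate(
    # TRUE when at least one of the four columns is NA in that row
    has_na = if_any(c(X1, Y1, X2, Y2), is.na),
    # In those rows, overwrite each of the four columns with 0
    across(c(X1, Y1, X2, Y2), ~ replace(.x, has_na, 0))
  ) %>%
  # Drop the temporary helper column
  select(-has_na)

print(tab.na)
```

Rows with any NA in the four columns become all 0, and complete rows are left unchanged.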