lapply(.SD, mean) returns NAs

Delphine · February 1, 2023, 3:46pm

Hi!

I am trying to obtain the mean value of several traits per genotype.
My table has a lot of NAs, but it should have enough values to give me a result.

> dt_short
         EVA ID EC_CRA EC_DC EC_FW
 1: EVA_DC_0001      0     0     3
 2: EVA_DC_0001     NA    NA     3
 3: EVA_DC_0001     NA     0     3
 4: EVA_DC_0001     NA    NA    NA
 5: EVA_DC_0001     NA    NA    NA
 6: EVA_DC_0001      0     0    NA
 7: EVA_DC_0001      5     0     3
 8: EVA_DC_0002      0     0     5
 9: EVA_DC_0002      0     0    NA
10: EVA_DC_0002     NA    NA     7
11: EVA_DC_0002      0     0    NA
12: EVA_DC_0002      0    NA    NA
13: EVA_DC_0002      0     0     5
14: EVA_DC_0002     NA    NA    NA
15: EVA_DC_0002      0     0     7
16: EVA_DC_0002      0     0     7
17: EVA_DC_0002      3     0    NA
18: EVA_DC_0002     NA    NA     7
19: EVA_DC_0002     NA    NA    NA
20: EVA_DC_0003      0     0     3
21: EVA_DC_0003      3     0    NA
22: EVA_DC_0003     NA    NA    NA
23: EVA_DC_0003      0    NA    NA
24: EVA_DC_0003     NA    NA    NA
25: EVA_DC_0003     NA    NA    NA
26: EVA_DC_0003      0     0    NA
27: EVA_DC_0003     NA    NA    NA
28: EVA_DC_0003     NA    NA    NA

geno <-  dt_short[ , lapply(.SD, mean), by = `EVA ID`]
geno

> geno
        EVA ID EC_CRA EC_DC EC_FW
1: EVA_DC_0001     NA    NA    NA
2: EVA_DC_0002     NA    NA    NA
3: EVA_DC_0003     NA    NA    NA

Do you know why this problem happens?

I can try to replace the NA values with the missMDA package, but I'm afraid this only makes sense for PCAs.

Thanks a lot for your help!

FactOREO · February 1, 2023, 3:51pm

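You just have to pass the additional argument na.rm = TRUE inside the lapply() call, i.e. lapply(.SD, mean, na.rm = TRUE). By default, mean() returns NA as soon as its input contains a single NA. lapply() forwards na.rm = TRUE to mean(), which then ignores the missing values in the calculation.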
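
Here is a minimal sketch of what I mean. The small table is only a stand-in for your dt_short: I copied rows 1, 2, 7, 8, 10 and 17 from your post and stored them as numeric, so your real column types may differ.

```r
library(data.table)

# Stand-in for dt_short: rows 1, 2, 7, 8, 10 and 17 of the table in the question
dt_short <- data.table(
  `EVA ID` = c("EVA_DC_0001", "EVA_DC_0001", "EVA_DC_0001",
               "EVA_DC_0002", "EVA_DC_0002", "EVA_DC_0002"),
  EC_CRA   = c(0, NA, 5, 0, NA, 3),
  EC_DC    = c(0, NA, 0, 0, NA, 0),
  EC_FW    = c(3, 3, 3, 5, 7, NA)
)

# Without na.rm, a single NA in a group makes that group's mean NA
geno_na <- dt_short[, lapply(.SD, mean), by = `EVA ID`]

# Extra arguments to lapply() are forwarded to mean(), so NAs are dropped first
geno <- dt_short[, lapply(.SD, mean, na.rm = TRUE), by = `EVA ID`]
print(geno)
```

One thing to keep in mind: if a genotype has only NAs for a trait, mean(..., na.rm = TRUE) returns NaN for that cell rather than NA, so you may still see some non-numeric results in very sparse columns.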

Delphine · February 1, 2023, 4:00pm

You're a lifesaver, thanks! (love the username)
