Aggregate by sum of group in the dataframe

Hi,

I am trying to aggregate a data frame by group using sum, but I get an error message (see below). Aggregating by mean works; the sum function does not.

dput(Features)
structure(list(
  Symbol = c("GeneA", "GeneA", "GeneB", "GeneC", "GeneM", "GeneM"),
  Sample_1 = c(9L, 6L, 5L, 1843L, 348L, 8L),
  Sample_2 = c(6L, 3L, 0L, 199L, 49L, 3L),
  Sample_3 = c(6L, 2L, 1L, 460L, 84L, 1L),
  Sample_4 = c(18L, 19L, 81L, 103L, 30L, 154L),
  Sample_5 = c(49L, 13L, 0L, 2157L, 522L, 6L)
), class = "data.frame", row.names = c(NA, -6L))
#>   Symbol Sample_1 Sample_2 Sample_3 Sample_4 Sample_5
#> 1  GeneA        9        6        6       18       49
#> 2  GeneA        6        3        2       19       13
#> 3  GeneB        5        0        1       81        0
#> 4  GeneC     1843      199      460      103     2157
#> 5  GeneM      348       49       84       30      522
#> 6  GeneM        8        3        1      154        6

Data_Mean = aggregate(Features,FUN = mean,by=list(Features$Symbol))

dput(Data_Mean)

structure(list(
  Group.1 = c("GeneA", "GeneB", "GeneC", "GeneM"),
  Symbol = c(NA_real_, NA_real_, NA_real_, NA_real_),
  Sample_1 = c(7.5, 5, 1843, 178),
  Sample_2 = c(4.5, 0, 199, 26),
  Sample_3 = c(4, 1, 460, 42.5),
  Sample_4 = c(18.5, 81, 103, 92),
  Sample_5 = c(31, 0, 2157, 264)
), row.names = c(NA, -4L), class = "data.frame")
#>   Group.1 Symbol Sample_1 Sample_2 Sample_3 Sample_4 Sample_5
#> 1   GeneA     NA      7.5      4.5      4.0     18.5       31
#> 2   GeneB     NA      5.0      0.0      1.0     81.0        0
#> 3   GeneC     NA   1843.0    199.0    460.0    103.0     2157
#> 4   GeneM     NA    178.0     26.0     42.5     92.0      264

Data_Sum = aggregate(Features,FUN = sum,by=list(Features$Symbol))

`Error in FUN(X[[i]], ...) : invalid 'type' (character) of argument`

Created on 2021-09-22 by the reprex package (v2.0.1)

Thank you,
Toufiq

Try the formula version:

aggregate(. ~ Symbol, Features, sum)

If you do this for mean as well, you can get rid of that NA column.
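
For anyone wondering why mean ran but sum did not: with `by =`, `aggregate()` applies `FUN` to every column of the data frame it is given, including the character column `Symbol`. `mean()` on a character vector returns `NA` with a warning (hence the `NA` column), while `sum()` stops with an error. Below is a small sketch of two ways around that, using the data from the question rebuilt with `data.frame()`. The names `sum_formula` and `sum_by` are just illustrative, not from the original post.

```r
# Same data as in the question, rebuilt with data.frame() for brevity.
Features <- data.frame(
  Symbol   = c("GeneA", "GeneA", "GeneB", "GeneC", "GeneM", "GeneM"),
  Sample_1 = c(9L, 6L, 5L, 1843L, 348L, 8L),
  Sample_2 = c(6L, 3L, 0L, 199L, 49L, 3L),
  Sample_3 = c(6L, 2L, 1L, 460L, 84L, 1L),
  Sample_4 = c(18L, 19L, 81L, 103L, 30L, 154L),
  Sample_5 = c(49L, 13L, 0L, 2157L, 522L, 6L)
)

# Formula interface: Symbol is the grouping variable and "." stands for
# every other column, so sum() never sees the character column.
sum_formula <- aggregate(. ~ Symbol, data = Features, FUN = sum)

# by= interface: drop the character column from the data being summarised
# and name the grouping list so the result gets a "Symbol" column.
sum_by <- aggregate(Features[-1], by = list(Symbol = Features$Symbol), FUN = sum)

print(sum_formula)
#>   Symbol Sample_1 Sample_2 Sample_3 Sample_4 Sample_5
#> 1  GeneA       15        9        8       37       62
#> 2  GeneB        5        0        1       81        0
#> 3  GeneC     1843      199      460      103     2157
#> 4  GeneM      356       52       85      184      528
```

One difference to keep in mind: the formula method uses `na.action = na.omit` by default, so rows containing an `NA` in any column are dropped before summing. The data here has no missing values, so both calls give the same sums.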

@Yarnabrina, thank you very much. This worked.
