Mean function returns NA

Hi! Does anyone have any idea why (the hell) this code doesn't work?

penguins %>% filter(species == "Gentoo") %>% select(bill_length_mm) %>% mean(na.rm = TRUE)

R generates this output:

[1] NA
Warning message:
In mean.default(., na.rm = TRUE) :
argument is not numeric or logical: returning NA

And why does this code work fine?
adelie_penguins <- filter(penguins, species == "Adelie")
mean(adelie_penguins$bill_length_mm, na.rm = TRUE)
penguins %>% filter(species == "Gentoo") %>% select(bill_length_mm) %>% max(na.rm = TRUE)

Returns:

[1] 38.79139 (mean)

[1] 59.6 (max)

Thanks a lot! Hope it was clear.

$ bill_length_mm : num [1:344] 39.1 39.5 40.3 NA 36.7 39.3 38.9 39.2 34.1 42

Yes, it is because select() returns a data frame even when you select a single column, and mean() doesn't coerce a single-column data frame into a vector. I think it is better to explicitly pass a vector to a function that requires one instead of hoping the function will do it for you (as max() does).
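To make the difference concrete, here is a minimal base R sketch. The data frame `df` and its values are made up for illustration and only stand in for the single selected column. It suggests why mean() gives up while max() goes through: as far as I know, base R has no mean() method for data frames, whereas max() belongs to the Summary group generic, which does have one.

```r
# Toy single-column data frame (hypothetical values, not the penguins data)
df <- data.frame(bill_length_mm = c(39.1, 39.5, NA, 40.3))

# A data frame is a list of columns, so it is not itself numeric
df_is_numeric  <- is.numeric(df)                  # FALSE
col_is_numeric <- is.numeric(df$bill_length_mm)   # TRUE

# mean() falls back to mean.default(), which returns NA with the
# "argument is not numeric or logical" warning (suppressed here)
mean_of_df <- suppressWarnings(mean(df, na.rm = TRUE))

# max() dispatches to the data frame method of the Summary group generic
max_of_df <- max(df, na.rm = TRUE)

# Passing the column vector explicitly works for both functions
mean_of_col <- mean(df$bill_length_mm, na.rm = TRUE)
max_of_col  <- max(df$bill_length_mm, na.rm = TRUE)
```

So the max() call only works because that function happens to accept data frames. Checking class() or is.numeric() on whatever comes out of the pipe is a quick way to see what is actually being passed.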

You can use dplyr::pull() to extract a column from a data frame as a vector:

library(dplyr)
library(palmerpenguins)

penguins %>% 
    filter(species == "Gentoo") %>% 
    pull(bill_length_mm) %>% 
    mean(na.rm = TRUE)
#> [1] 47.50488

Created on 2022-08-07 by the reprex package (v2.0.1)
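If you would rather stay inside a data frame, pull() is not the only option. The sketch below shows two alternatives I would consider: computing the mean inside summarise(), and extracting the column with base R's `[[`. The `toy` data frame and the column name `mean_bill` are hypothetical stand-ins, not the real penguins data.

```r
library(dplyr)

# Hypothetical toy data standing in for palmerpenguins::penguins
toy <- data.frame(
  species = c("Gentoo", "Gentoo", "Gentoo", "Adelie"),
  bill_length_mm = c(46.0, 50.0, NA, 39.0)
)

# summarise() evaluates mean() on the column vector, so no pull() is needed;
# the result is a one-row data frame rather than a bare number
gentoo_summary <- toy %>%
  filter(species == "Gentoo") %>%
  summarise(mean_bill = mean(bill_length_mm, na.rm = TRUE))

# Base R: [[ ]] extracts the column as a plain numeric vector
gentoo_rows <- toy[toy$species == "Gentoo", ]
gentoo_mean <- mean(gentoo_rows[["bill_length_mm"]], na.rm = TRUE)
```

summarise() is handy when you want the result to stay in a data frame (for example, one row per species after group_by()), while pull() or `[[` is the shorter route when you just need the number.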
