Problems with tibble size?

SebastianC.Fortuno · March 25, 2020, 10:17am

I want to count the number of existing zeros in each column of a database of 181,000 rows and 330 variables.

I just ran the following code:

db %>%
  summarise_all(funs(sum(. == 0)))

The result is NA for every column, but when I run the same procedure on a fragment of the database, it works correctly.

Note that when I run the same procedure to count the NA values instead:

db %>%
  summarise_all(funs(sum(is.na(.))))

and in this case it works perfectly.

nirgrahamuk · March 25, 2020, 10:21am

library(tidyverse)
db <- mtcars %>% select_if(is.numeric)
# Put an NA in the first row of the vs column
db$vs[[1]] <- NA

# Without na.rm, the count for vs comes back as NA
db %>%
  summarise_all(funs(sum(. == 0)))

# With na.rm = TRUE, the NA is ignored and vs gets a proper count
db %>%
  summarise_all(funs(sum(. == 0, na.rm = TRUE)))
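
For anyone reading this with a newer dplyr: funs() was deprecated in dplyr 0.8.0, and across() (available from dplyr 1.0.0) is the usual replacement for the summarise_all() pattern. A rough sketch of the same comparison, assuming dplyr 1.0.0 or later; the names with_na and zero_counts are made up for illustration:

```r
library(dplyr)

# Same toy data as above: mtcars with one NA planted in vs
db <- mtcars
db$vs[1] <- NA

# Without na.rm, the NA in vs makes that column's count NA
with_na <- db %>%
  summarise(across(everything(), ~ sum(.x == 0)))

# With na.rm = TRUE, the NA is skipped and the zeros are counted
zero_counts <- db %>%
  summarise(across(everything(), ~ sum(.x == 0, na.rm = TRUE)))
```

Only columns that actually contain an NA are affected in with_na; the rest still get a normal count.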

siddharthprabhu · March 25, 2020, 10:25am

Do your columns contain any NA values? If yes, then you need to also supply the na.rm = TRUE argument to sum().
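
To illustrate why this happens, here is a small base R sketch on a made-up vector x (not the original data). Comparing NA with 0 gives NA, and sum() returns NA as soon as one of its inputs is NA. is.na() only ever returns TRUE or FALSE, which would explain why the NA count worked without na.rm. Presumably the fragment that worked simply contained no NA values.

```r
# Hypothetical example vector with one missing value
x <- c(0, 3, NA, 0)

cmp <- x == 0                        # TRUE FALSE NA TRUE
n_zero_raw <- sum(cmp)               # NA: the missing comparison propagates
n_zero <- sum(cmp, na.rm = TRUE)     # 2: the NA is dropped before summing
n_missing <- sum(is.na(x))           # 1: is.na() never returns NA
```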

SebastianC.Fortuno · March 25, 2020, 10:26am

I had indeed forgotten that detail, thanks.
