Percentage of NA

Hello everybody,
I am new to R, and I know this may seem like a stupid question.
I use this code to get the number of NAs for a specific variable:

sum(is.na(df$SureAboutHeight))

But I want to know the percentage of NAs for each variable.

What should I do? I tried this:

sum(is.na(df$SureAboutHeight))/nrow(df$SureAboutHeight)
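
A note on the attempt above, as a minimal sketch: nrow() is meant for data frames and matrices and returns NULL for a single column extracted with $, so the division yields numeric(0) rather than a percentage. Dividing by length() of the column (or by nrow(df)) gives the expected number. The data frame below is made up for illustration; only the column name SureAboutHeight comes from the question.

```r
# Hypothetical data; only the column name is taken from the question
df <- data.frame(SureAboutHeight = c("yes", NA, "no", NA, "yes"))

# nrow() of a plain vector is NULL, so the original attempt returns numeric(0)
attempt <- sum(is.na(df$SureAboutHeight)) / nrow(df$SureAboutHeight)
length(attempt)
#> [1] 0

# length() gives the number of elements in the column
pct_na <- 100 * sum(is.na(df$SureAboutHeight)) / length(df$SureAboutHeight)
pct_na
#> [1] 40
```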

suppressPackageStartupMessages(library("dplyr"))

# Duplicate mtcars dataset as example
mtcars_na <- mtcars

# Add in some missing data

mtcars_na[8, c(1, 3, 4)] <- NA
mtcars_na[9, c(1, 2, 3)] <- NA

# In base R
vapply(mtcars_na, function(x) {
  100 * sum(is.na(x)) / length(x)
}, FUN.VALUE = double(1L))
#>   mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb 
#> 6.250 3.125 6.250 3.125 0.000 0.000 0.000 0.000 0.000 0.000 0.000

# Using tidyverse approach

# Specify the function you want applied to each column
fns <- list(pct_missing = ~ 100 * sum(is.na(.)) / length(.))

# For all columns
(mtcars_summary <- mtcars_na %>%
  summarise(across(everything(), .fns = fns)))
#>   mpg_pct_missing cyl_pct_missing disp_pct_missing hp_pct_missing
#> 1            6.25           3.125             6.25          3.125
#>   drat_pct_missing wt_pct_missing qsec_pct_missing vs_pct_missing
#> 1                0              0                0              0
#>   am_pct_missing gear_pct_missing carb_pct_missing
#> 1              0                0                0

Created on 2021-10-02 by the reprex package (v2.0.1)


Another option is to use the mean function. is.na(x) returns TRUE if a value is NA and FALSE otherwise. mean treats TRUE as equal to 1 and FALSE as equal to 0, so the mean is the fraction of values that are NA. For example:

x <- c(1, 2, 3, NA, NA)

is.na(x)
#> [1] FALSE FALSE FALSE  TRUE  TRUE

# Fraction NA
mean(is.na(x))
#> [1] 0.4

# Percentage NA
mean(is.na(x)) * 100
#> [1] 40
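
The same idea extends to every column at once. As a sketch, assuming the data frame is the mtcars_na example from the earlier reply (rebuilt here so the snippet runs on its own): is.na() applied to a data frame returns a logical matrix, and colMeans() averages each column of that matrix.

```r
# Rebuild the example data from the earlier reply
mtcars_na <- mtcars
mtcars_na[8, c(1, 3, 4)] <- NA
mtcars_na[9, c(1, 2, 3)] <- NA

# Fraction of NA per column, scaled to a percentage
pct_na <- colMeans(is.na(mtcars_na)) * 100
pct_na

# Keep only the columns that actually contain missing values
pct_na[pct_na > 0]
```

The result is a named numeric vector, so it should agree with the vapply() output shown earlier in the thread.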

Many thanks for your help.
