Visualize missing values in your data

rene_at_coco · October 4, 2022, 9:03pm

I saw an example of visualizing missing data in Python and wondered how it would work in R. Of course there are already several examples out there, but here is one more.

r
library(tidyverse)

# create random data 
df <- tibble(
  x = factor(sample(
    x = seq(1,9,1),
    size = 1000,
    replace = TRUE
  )),
  y = factor(sample(
      x = seq(1,9,1),
      size = 1000,
      replace = TRUE
    )),
  z = factor(sample(
    x = seq(1,9,1),
    size = 1000,
    replace = TRUE
  ))
)

df %>%
  mutate(
    id = row_number() # id serves as y value
  ) %>%
  pivot_longer(
    cols = -id,
    names_to = "variable", # serves as x value
    values_to = "value"
  ) %>%
  mutate( # create a new variable for fill
    na_value = na_if(
      x = value,
      y = "9"), # in this example, "9" is missing (quoted because value is a factor)
    isna = is.na(na_value)
  ) %>%
  ggplot(
    mapping = aes(
      x = variable,
      y = id,
      fill = isna
    )
  ) +
  geom_tile() +
  scale_fill_viridis_d() +
  labs(
    title = "Missing Data",
    x = "Variable",
    y = "Record",
    fill = "Missing"
  )

Created on 2022-10-04 with reprex v2.0.2
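
The same idea also works when the data holds real NA values instead of a sentinel code such as 9. This is a minimal sketch, assuming the built-in airquality dataset as a stand-in for your own data. The name miss_long is only illustrative. Converting every column to logical with is.na() before pivoting avoids mixing column types in the long format.

```r
library(tidyverse)

# airquality ships with base R and has real NAs in Ozone and Solar.R
# (assumption: used here only as a stand-in for your own data)
miss_long <- airquality %>%
  mutate(across(everything(), is.na)) %>% # TRUE where a value is missing
  mutate(id = row_number()) %>%           # id serves as y value
  pivot_longer(
    cols = -id,
    names_to = "variable", # serves as x value
    values_to = "isna"
  )

ggplot(miss_long, aes(x = variable, y = id, fill = isna)) +
  geom_tile() +
  scale_fill_viridis_d() +
  labs(
    title = "Missing Data",
    x = "Variable",
    y = "Record",
    fill = "Missing"
  )
```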

M_AcostaCH · October 4, 2022, 9:11pm

Maybe this library could help you.

https://cran.r-project.org/web/packages/naniar/vignettes/naniar-visualisation.html
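
For instance, a minimal sketch of what that vignette covers, assuming naniar is installed and again using the built-in airquality dataset as a stand-in for your own data:

```r
library(naniar)

# airquality is a base R dataset with real NAs; swap in your own data frame
# heatmap of missing vs. present values, one column per variable
p <- vis_miss(airquality)
p

# number of missing values per variable
gg_miss_var(airquality)
```

Note that these functions expect real NA values, so a sentinel code like 9 would need to be converted first, for example with na_if().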

smouksassi · October 9, 2022, 10:39am

What is really your question?

R has many libraries for visualizing missing data, for example:

https://cran.r-project.org/web/packages/VIM/vignettes/VisualImp.html
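
A minimal sketch of one plot from that package, assuming VIM is installed. It uses the sleep dataset that ships with VIM, which is not the same as the sleep dataset in base R:

```r
library(VIM)

# load VIM's own sleep data, which contains missing values
data(sleep, package = "VIM")

# left panel: proportion missing per variable
# right panel: combinations of missing variables, with counts
a <- aggr(sleep, numbers = TRUE, prop = c(TRUE, FALSE))
```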

rene_at_coco · October 11, 2022, 4:17pm

Not a question, but thanks for these resources. I appreciate the input.
