Plotting the "concordance" of two categorical binomial variables



Alright, so this might be a little bit abstract (or maybe completely obvious?), but it would be interesting to get an opinion nonetheless.

  1. Does it make any sense to plot this?
  2. If question 1 == TRUE, how would I go about it? What am I looking for?

Let's say I have two one-dimensional data frames.

df1 <- data.frame(a = 0, b = 0, c = 1, d = 1)
df2 <- data.frame(a = 0, b = 1, c = 1, d = 0)

In words (since I have no idea how to do this):
I'd like to visualise how each value in df1 matches the corresponding value in df2, and vice versa. E.g. if df1["a"] == df2["a"], add something to the plot. If they are unequal, add something else that shows which of df1 or df2 was 1 and which was 0.

My initial thought was something along the lines of a geom_point() plot using the size argument for matches, but I guess that would just cluster the values at 0 and 1, unless I'm able to somehow set the x and y axes to a, b, c and d?

Thank you!


It's possible to do exactly what you describe and use the categories a, b, c, and d as a categorical variable in the plot.
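Before plotting, it can help to work out the match/mismatch status for each category explicitly. The following is only a minimal base R sketch; the names `comparison`, `value1`, `value2` and `status` are made up here for illustration and are not part of the question.

```r
df1 <- data.frame(a = 0, b = 0, c = 1, d = 1)
df2 <- data.frame(a = 0, b = 1, c = 1, d = 0)

# One row per category, with the value from each data frame side by side.
# Columns of df2 are picked by name so both vectors follow the order of df1.
comparison <- data.frame(
  category = names(df1),
  value1 = unlist(df1[1, ], use.names = FALSE),
  value2 = unlist(df2[1, names(df1)], use.names = FALSE)
)

# Label each category: a match (both 0 or both 1), or which side holds the 1.
comparison$status <- with(comparison, ifelse(
  value1 == value2,
  ifelse(value1 == 1, "both 1", "both 0"),
  ifelse(value1 == 1, "only df1 is 1", "only df2 is 1")
))

print(comparison)
```

A column like `status` could then be mapped to colour or shape in a plot with `category` on the x axis, if that turns out to be the view you want.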

library(tidyverse) # provides %>%, rowid_to_column(), gather(), spread() and ggplot()
library(ggrepel) # handy for placing data labels

df1 <- data.frame(a = 0, b = 0, c = 1, d = 1)
df2 <- data.frame(a = 0, b = 1, c = 1, d = 0)

combo <- 
  rbind(df1, df2) %>%
  rowid_to_column(var = "source") %>%
  # This will put it into a tidy format
  gather(category, value, a:d)

combo %>% 
  # ...but for this particular case it helps to spread by data source
  spread(source, value) %>%
  ggplot(aes(`1`, `2`, label = category)) + 
    geom_point() +