How to visualize frequencies for a factor with many levels?

Andrea · February 26, 2020, 10:30am

Suppose I have a factor with a lot of levels, such as

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(tidyr)

combinations <- expand_grid(a=LETTERS, b=LETTERS) %>%
  unite(combinations, sep = "") %>%
  slice(1:50) %>%
  pull(combinations)

probs <- sin(seq(0, pi, length.out = length(combinations)))/pi
combination <- sample(combinations, replace = TRUE, prob = probs)

dframe <- as.data.frame(table(combination))
dframe
#>    combination Freq
#> 1           AC    1
#> 2           AD    1
#> 3           AE    1
#> 4           AG    1
#> 5           AH    2
#> 6           AI    1
#> 7           AJ    1
#> 8           AL    1
#> 9           AM    1
#> 10          AN    1
#> 11          AO    2
#> 12          AP    2
#> 13          AR    2
#> 14          AS    3
#> 15          AT    3
#> 16          AU    2
#> 17          AV    1
#> 18          AW    1
#> 19          AX    2
#> 20          AY    2
#> 21          BA    2
#> 22          BB    2
#> 23          BD    1
#> 24          BF    3
#> 25          BI    2
#> 26          BL    1
#> 27          BM    1
#> 28          BO    3
#> 29          BP    1
#> 30          BQ    1
#> 31          BS    1
#> 32          BT    1

Created on 2020-02-26 by the reprex package (v0.3.0)

(Maybe the example is not amazing, but it's all I could come up with quickly.) The frequency table is hard to read because it's quite long. A plot might be better, but which visualization would you use?

nirgrahamuk · February 26, 2020, 10:32am

Possibly a word cloud?

Andrea · February 26, 2020, 11:02am

I've always been hesitant to use word clouds because in my mind they're associated with "hype" or "bad presentation practices" (such as, for example, pie charts). But it may be time to get out of my comfort zone. How do I make one in R? This is going in an R Markdown doc, so it had better look nice there.

nirgrahamuk · February 26, 2020, 11:23am

I've never used them myself for similar reasons.
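
That said, here is a minimal sketch of how it could look with the `wordcloud` package, assuming it is installed from CRAN. The small data frame is only a stand-in with made-up values shaped like `dframe` from the question, and the `scale` and `rot.per` settings are my own guesses at something readable, not recommendations.

```r
library(wordcloud)

# Illustrative stand-in for `dframe` from the question (values are made up)
dframe <- data.frame(
  combination = c("AC", "AH", "AO", "AS", "AT", "BF", "BO"),
  Freq        = c(1, 2, 2, 3, 3, 3, 3)
)

set.seed(42)  # word placement is random, so fix the seed for a stable layout

wordcloud(
  words        = as.character(dframe$combination),  # factor levels as text
  freq         = dframe$Freq,
  min.freq     = 1,          # the default of 3 would drop the rarer levels
  scale        = c(3, 0.8),  # largest and smallest text size
  random.order = FALSE,      # plot the most frequent levels first
  rot.per      = 0           # keep every label horizontal
)
```

Since this draws with base graphics, it should behave like any other static plot inside an R Markdown chunk. With counts this close together (1 to 3), the size differences will be subtle.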

mara · February 26, 2020, 1:32pm

Also wordcloud2! The R Graph Gallery is generally a really nice resource for finding packages for different types of visualization.
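
A rough sketch of what that might look like with `wordcloud2`, assuming the package is installed. The data frame below is a made-up stand-in shaped like the `dframe` in the question, and the `size` value is just a guess to tune.

```r
library(wordcloud2)

# Illustrative stand-in for `dframe` from the question (values are made up).
# wordcloud2() expects the words in the first column and the counts in the second.
freq_df <- data.frame(
  word = c("AC", "AH", "AO", "AS", "AT", "BF", "BO"),
  freq = c(1, 2, 2, 3, 3, 3, 3)
)

# Returns an htmlwidget; printing it in a chunk renders the cloud
wordcloud2(
  data         = freq_df,
  size         = 0.5,  # overall font scaling; adjust to taste
  shuffle      = FALSE,
  rotateRatio  = 0     # keep every label horizontal
)
```

Because the result is an htmlwidget, it should render directly in HTML output from R Markdown. For PDF or Word output it would most likely need to be captured as a static image first.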
