Drawing "Linked Diagrams" in R

I am using R and am trying to re-create something like this picture: https://i.redd.it/lrjdj45xrpo21.png

I did some research and saw that the "ggforce" package in R (see its geom_parallel_sets documentation page, "Create Parallel Sets diagrams") lets you make plots in a similar style. Here I want a plot built from the "first name", "middle name" and "last name" columns, showing that a given first name is very common, the same first name together with a middle name is less common, and the full combination of first, middle and last name is much less common:

library(ggforce)
library(reshape2)

 name_data <- data.frame(
    "City" = c("Paris", "Paris", "Paris", "Paris", "Paris", "London", "London", "London", "Paris", "London", "Paris"),
 
    "First_Name" = c("John", "John", "John", "John", "John", "John", "James", "James", "Adam", "Adam", "Henry"),
    "Middle_Name" = c("Claude", "Claude", "Claude", "Smith", "Smith", "Peters", "Stevens", "Stevens", "Ford", "Tom", "Frank"),
    "Last Name " = c("Tony", "Tony", "Frank", "Carson", "Phil", "Lewis", "Eric", "David", "Roberts", "Scott", "Xavier")
)

name_data$ID <- seq.int(nrow(name_data))
 
data <- reshape2::melt(name_data)
data <- gather_set_data(name_data)

ggplot(name_data, aes(x, id = ID, split = First_Name, value = value)) +
  geom_parallel_sets(aes( alpha = 0.3, axis.width = 0.1) +
  geom_parallel_sets_axes(axis.width = 0.1) +
  geom_parallel_sets_labels(colour = 'white'))

But this returns the following error:

Error in FUN(X[[i]], ...) : object 'x' not found

Can someone please show me what I am doing wrong? Is it possible to show the labels of each group on this graph, along with the numbers in each group?

Thanks

Some advice: in general, it's a mistake to review the errors your script throws in reverse order.
Errors near the bottom of your code may well be caused by errors further up, so addressing them in the order they appear will generally lead to better outcomes.

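In your case, the first error is that gather_set_data(name_data) fails because its second argument, x (the columns to use as axes), was not provided. The "object 'x' not found" error you quoted comes later, from ggplot(): it was given name_data, which has no column called x.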
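To make the role of the x argument concrete, here is a minimal sketch on a small made-up data frame (toy and its columns are hypothetical, chosen only for illustration). It assumes ggforce is installed and simply prints the gathered result so the added columns can be inspected.

```r
library(ggforce)

# Hypothetical toy data, only used to inspect the reshaped output
toy <- data.frame(
  first = c("John", "John", "Adam"),
  last  = c("Tony", "Frank", "Scott"),
  value = c(2, 1, 1)
)

# The second argument (x) selects the columns that become axes
long <- gather_set_data(toy, c("first", "last"))

# Each input row is repeated once per selected column;
# the columns id, x and y are added for use in aes()
print(long)
```

The gathered data frame is what ggplot() should receive, since aes(x, id = id, split = y) refers to the columns that gather_set_data() adds.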

Please have a look at the example provided in the ggforce documentation:

data <- reshape2::melt(Titanic)
data2 <- gather_set_data(data, 1:4)

ggplot(data2, aes(x, id = id, split = y, value = value)) +
  geom_parallel_sets(aes(fill = Sex), alpha = 0.3, axis.width = 0.1) +
  geom_parallel_sets_axes(axis.width = 0.1) +
  geom_parallel_sets_labels(colour = 'white')

(I named the gathered data data2 to keep the two datasets apart and to make the result of the transformation easy to inspect.)

What you did wrong:

  1. gather_set_data(name_data) should not gather all columns, only the categorical columns you want as axes.
  2. ggplot() must then be given the gathered data, not the initial data set (name_data).
  3. Your input doesn't contain any counts that can be shown (there is no value column).

library(ggforce)  # also attaches ggplot2

name_data <- data.frame(
  "City" = c("Paris", "Paris", "Paris", "Paris", "Paris", "London", "London", "London", "Paris", "London", "Paris"),

  "First_Name" = c("John", "John", "John", "John", "John", "John", "James", "James", "Adam", "Adam", "Henry"),
  "Middle_Name" = c("Claude", "Claude", "Claude", "Smith", "Smith", "Peters", "Stevens", "Stevens", "Ford", "Tom", "Frank"),
  "Last_Name" = c("Tony", "Tony", "Frank", "Carson", "Phil", "Lewis", "Eric", "David", "Roberts", "Scott", "Xavier"),
  # Placeholder counts: random values, for illustration only
  "value" = round(runif(11, 0, 12), 0)
)

# Gather only columns 2:4 (the name columns); City is left out of the axes
data2 <- gather_set_data(name_data, 2:4)

ggplot2::ggplot(data2, aes(x, id = id, split = y, value = value)) +
  geom_parallel_sets(aes(fill = City),
                         alpha = 0.3, axis.width = 0.1) +
  geom_parallel_sets_axes(axis.width = 0.1) +
  geom_parallel_sets_labels(colour = 'white')
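
The value column above is random, so the ribbon widths do not reflect the real data. As a hedged alternative, the sketch below assumes each row of name_data is one person and derives actual counts with base R's aggregate(). It also keeps the axes in the order they were selected (a character x would be sorted alphabetically by ggplot2) and appends the total of each group to its label, which is one possible way to show "the numbers in each group". The label format is an illustrative choice, not something ggforce provides by itself.

```r
library(ggforce)  # also attaches ggplot2

name_data <- data.frame(
  City        = c("Paris", "Paris", "Paris", "Paris", "Paris", "London",
                  "London", "London", "Paris", "London", "Paris"),
  First_Name  = c("John", "John", "John", "John", "John", "John",
                  "James", "James", "Adam", "Adam", "Henry"),
  Middle_Name = c("Claude", "Claude", "Claude", "Smith", "Smith", "Peters",
                  "Stevens", "Stevens", "Ford", "Tom", "Frank"),
  Last_Name   = c("Tony", "Tony", "Frank", "Carson", "Phil", "Lewis",
                  "Eric", "David", "Roberts", "Scott", "Xavier")
)

# Assumption: every row is one person, so each row gets a weight of 1 ...
name_data$value <- 1
# ... and identical combinations are collapsed by summing their weights
counts <- aggregate(value ~ City + First_Name + Middle_Name + Last_Name,
                    data = name_data, FUN = sum)

data2 <- gather_set_data(counts, c("First_Name", "Middle_Name", "Last_Name"))

# Rows come back grouped by axis in the order selected, so unique()
# preserves that order when x is turned into a factor
data2$x <- factor(data2$x, levels = unique(data2$x))

# Total per axis and category, appended to the label, e.g. "John (6)"
totals <- ave(data2$value, data2$x, data2$y, FUN = sum)
data2$y <- paste0(data2$y, " (", totals, ")")

p <- ggplot(data2, aes(x, id = id, split = y, value = value)) +
  geom_parallel_sets(aes(fill = City), alpha = 0.3, axis.width = 0.1) +
  geom_parallel_sets_axes(axis.width = 0.1) +
  geom_parallel_sets_labels(colour = "white")
p
```

With only eleven people, many groups have a count of 1, so the labels on the last-name axis will be crowded; a larger data set or a smaller label size may be needed for a readable plot.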
