Plot a list of two lists containing data frames with plot_ly (boxplot)

Hi :slight_smile:
So after some work I created a list with two lists containing data frames (every data frame has only one column with numeric values but different number of rows):

df1 <- data.frame(c(1:10))
df2 <- data.frame(c(2:9))
df3 <- data.frame(c(3:14))
df4 <- data.frame(c(4:13))
df5 <- data.frame(c(1:10))
df6 <- data.frame(c(2:20))
df7 <- data.frame(c(3:16))
df8 <- data.frame(c(4:9))

list_1 <- list(df1,df2,df3,df4)
list_2 <- list(df5,df6,df7,df8)
list_3 <- list(list_1, list_2)

The first list contains data frames of female probands, and the second list contains data frames of male probands. I want to compare df1 from the first list with df5 from the second list, df2 with df6, and so on, using boxplots made with plot_ly (two boxplots side by side in one plot). Does anyone have an idea how to do this? I know how it works in principle (https://plot.ly/r/box-plots/), but I don't know how to select the right entry from the list.
The first entry of list_3 is list_3[[1]], but in my case list_3[[1]] is a whole list of data frames. How do I choose the first data frame, df1, within the first entry of list_3?

Thank you! :slight_smile:

The specifics of the necessary code will depend on what form your data is in when you load it into R. For now, I've assumed that we're starting with list_1 and list_2.

library(tidyverse)

# Change the name of the column with the data values to "data"
# Add a column with the sex of the individuals
# (list_1 holds the female probands and list_2 the male probands)
list_1 = list_1 %>% map(~.x %>% set_names("data") %>% mutate(sex="Female"))
list_2 = list_2 %>% map(~.x %>% set_names("data") %>% mutate(sex="Male"))

# Combine each list into a single data frame with an added column (called "source")
#  giving the number of the list element (which we'll use to plot corresponding
#  data series of male and female) and then bind the two separate data frames
#  into a single data frame
df = list(list_1, list_2) %>% 
  map_df(~bind_rows(.x, .id="source"))

Now we have a single data frame for plotting.

p = df %>% 
  ggplot(aes(source, data, colour=sex)) +
  geom_boxplot(width=0.5)

p


However, when converting it to a plotly plot, the paired boxplots now overlap. I'm not sure how to fix that, as I haven't used plotly very much, but hopefully someone will come along with a solution.

plotly::ggplotly(p)

As a hack for ggplotly, you can manually dodge the boxes by shifting their x positions:

p = df %>% 
  mutate(source = as.numeric(source) + ifelse(sex=="Male", 0.2, -0.2)) %>% 
  ggplot(aes(source, data, colour=sex, group=interaction(sex, source))) +
  geom_boxplot()

plotly::ggplotly(p)
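
To answer the indexing question directly: [[ ]] can be chained, so list_3[[1]][[1]] is the first data frame of the first inner list. Here is a minimal sketch, assuming the list layout from the question (first inner list female, second male). The names female_1 and male_1 are just mine for illustration, and only the first two data frames per group are rebuilt here:

```r
library(plotly)

# Rebuild a small version of the nested list from the question
list_1 <- list(data.frame(c(1:10)), data.frame(c(2:9)))   # female: df1, df2
list_2 <- list(data.frame(c(1:10)), data.frame(c(2:20)))  # male: df5, df6
list_3 <- list(list_1, list_2)

# Chained [[ ]]: the first index picks the inner list,
# the second index picks the data frame inside it
female_1 <- list_3[[1]][[1]]  # df1
male_1   <- list_3[[2]][[1]]  # df5

# Each data frame has a single column, so [[1]] extracts it as a vector
p <- plot_ly() %>%
  add_boxplot(y = female_1[[1]], name = "Female") %>%
  add_boxplot(y = male_1[[1]], name = "Male")

p
```

The same pattern with list_3[[1]][[2]] and list_3[[2]][[2]] would give the df2/df6 pair, and so on.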


I'm not sure why you are creating a list of lists of data frames. Try putting all your data in a single data frame with the appropriate categorical variables to start with. You will see that plot_ly() takes a data frame as input.
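
A minimal sketch of that approach, assuming one long data frame with the columns data, sex and source (names borrowed from the earlier reply; to_long is a hypothetical helper written for this example). With color = ~sex and layout(boxmode = "group"), plot_ly should place the female and male boxes for each source side by side, so no manual dodging is needed:

```r
library(plotly)

# Values from the question, kept as plain vectors instead of data frames
female <- list(1:10, 2:9, 3:14, 4:13)
male   <- list(1:10, 2:20, 3:16, 4:9)

# Hypothetical helper: stack a list of vectors into one long data frame,
# recording which list element each value came from
to_long <- function(values, sex) {
  data.frame(
    data   = unlist(values),
    sex    = sex,
    source = factor(rep(seq_along(values), lengths(values)))
  )
}

df <- rbind(to_long(female, "Female"), to_long(male, "Male"))

# One box per sex within each source, grouped side by side
p <- plot_ly(df, x = ~source, y = ~data, color = ~sex, type = "box") %>%
  layout(boxmode = "group")

p
```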

plotly doesn't have facets like ggplot, but it does have subplots:
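
For instance, one panel per female/male pair could be built with subplot(). This is only a rough sketch using the first two pairs from the question; pair_plot is a hypothetical helper, not a plotly function:

```r
library(plotly)

# First two pairs from the question (df1/df5 and df2/df6) as plain vectors
female <- list(1:10, 2:9)
male   <- list(1:10, 2:20)

# Hypothetical helper: one plot holding a female box and a male box
pair_plot <- function(f, m) {
  plot_ly() %>%
    add_boxplot(y = f, name = "Female") %>%
    add_boxplot(y = m, name = "Male")
}

# Build one plot per pair, then arrange them in a single row
plots <- Map(pair_plot, female, male)
p <- subplot(plots, nrows = 1, shareY = TRUE)

p
```

shareY = TRUE keeps the panels on a common y-axis so the pairs can be compared with each other as well.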

Thank you guys! :slight_smile: With your help I was able to solve the problem.

I have several data frames (40) inside a list, and I need a way to create a chart that changes according to the selected data frame, using plotly or ggplot2.