Plot a list of two lists containing data frames with plot_ly (boxplot)

Hi :slight_smile:
So after some work I created a list of two lists containing data frames (each data frame has a single column of numeric values, but the number of rows differs):

df1 <- data.frame(c(1:10))
df2 <- data.frame(c(2:9))
df3 <- data.frame(c(3:14))
df4 <- data.frame(c(4:13))
df5 <- data.frame(c(1:10))
df6 <- data.frame(c(2:20))
df7 <- data.frame(c(3:16))
df8 <- data.frame(c(4:9))

list_1 <- list(df1,df2,df3,df4)
list_2 <- list(df5,df6,df7,df8)
list_3 <- list(list_1, list_2)

The first list contains data frames of female probands, while the second list contains data frames of male probands. I want to compare df1 from the first list with df5 from the second list, df2 from the first list with df6 from the second list, and so on, using boxplots made by plot_ly (so two boxplots side by side in one plot). Does anyone have an idea how this works? I know how it works in theory (Box plots in R), but I don't know how to select the right entry from the list.
So the first entry from my list_3 is list_3[[1]]. But in my case list_3[[1]] is a whole list of data frames. How do I choose the first data frame df1 within the first entry of my list_3?

Thank you! :slight_smile:

The specifics of the necessary code will depend on what form your data is in when you load it into R. For now, I've assumed that we're starting with list_1 and list_2.
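On the indexing question itself: nested lists can be indexed by chaining `[[ ]]`, so the first data frame inside the first list is `list_3[[1]][[1]]`. A minimal sketch, using a smaller stand-in for `list_3` (two data frames per group instead of four; the variable names on the left are only for illustration):

```r
# Smaller stand-in for the list_3 from the question
list_1 <- list(data.frame(c(1:10)), data.frame(c(2:9)))   # female
list_2 <- list(data.frame(c(1:10)), data.frame(c(2:20)))  # male
list_3 <- list(list_1, list_2)

# First index picks the inner list, second index picks the data frame
first_female <- list_3[[1]][[1]]  # corresponds to df1
first_male   <- list_3[[2]][[1]]  # corresponds to df5

# Each data frame has one column; [[1]] extracts it as a numeric vector
first_female_values <- first_female[[1]]
first_male_values   <- first_male[[1]]
```

Selecting pairs this way works, but it gets repetitive with many data frames, which is why the rest of this answer reshapes everything into one data frame.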

library(tidyverse)

# Change the name of the column with the data values to "data"
# Add a column with the sex of the individuals
# (per the question, list_1 holds the female data and list_2 the male data)
list_1 = list_1 %>% map(~.x %>% set_names("data") %>% mutate(sex="Female"))
list_2 = list_2 %>% map(~.x %>% set_names("data") %>% mutate(sex="Male"))

# Combine each list into a single data frame with an added column (called "source")
#  giving the number of the list element (which we'll use to plot corresponding
#  data series of male and female) and then bind the two separate data frames
#  into a single data frame
df = list(list_1, list_2) %>% 
  map_df(~bind_rows(.x, .id="source"))

Now we have a single data frame for plotting.
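Before plotting, it may be worth checking that every data frame ended up under the right `source` and `sex`. The sketch below rebuilds the example data and does the reshaping with a small helper (`tag_list` is a hypothetical name introduced here, not part of any package), then counts rows per group. It assumes, as in the question, that the first list is female and the second is male.

```r
library(dplyr)

list_1 <- list(data.frame(c(1:10)), data.frame(c(2:9)),
               data.frame(c(3:14)), data.frame(c(4:13)))
list_2 <- list(data.frame(c(1:10)), data.frame(c(2:20)),
               data.frame(c(3:16)), data.frame(c(4:9)))

# Hypothetical helper: rename the single column to "data", stack the
# data frames with their list position in "source", and label the sex
tag_list <- function(lst, sex) {
  lst %>%
    lapply(setNames, "data") %>%
    bind_rows(.id = "source") %>%
    mutate(sex = sex)
}

df <- bind_rows(tag_list(list_1, "Female"), tag_list(list_2, "Male"))

# One row per sex/source combination, with the number of observations in n
counts <- count(df, sex, source)
counts
```

If the result has one row for each of the eight original data frames and the `n` column matches their row counts, the combined data frame is ready to plot.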

p = df %>% 
  ggplot(aes(source, data, colour=sex)) +
  geom_boxplot(width=0.5)

p


However, when converting it to a plotly plot, the paired boxplots now overlap. I'm not sure how to fix that, as I haven't used plotly very much, but hopefully someone will come along with a solution.

plotly::ggplotly(p)

As a hack for ggplotly, you can manually dodge the boxes by shifting their x positions:

p = df %>% 
  mutate(source = as.numeric(source) + ifelse(sex=="Male", 0.2, -0.2)) %>% 
  ggplot(aes(source, data, colour=sex, group=interaction(sex, source))) +
  geom_boxplot()

plotly::ggplotly(p)


I'm not sure why you are creating a list of lists of data frames. Try just putting all your data in a single data frame with the correct categorical variables to start with. You will see that plot_ly() takes a data frame as input.
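For what it's worth, here is a minimal sketch of that approach with the example data from the question. It builds one long data frame directly (the column names `data`, `source` and `sex` are my choice, and the female/male labels assume the first list is female as stated in the question) and then asks plotly to group the boxes rather than overlay them via `layout(boxmode = "group")`.

```r
library(plotly)

# The eight series from the question, in the order df1..df8
values <- list(1:10, 2:9, 3:14, 4:13,   # list_1 (female)
               1:10, 2:20, 3:16, 4:9)   # list_2 (male)

# One row per measurement, with categorical columns for pairing and colour
df <- data.frame(
  data   = unlist(values),
  source = factor(rep(rep(1:4, 2), lengths(values))),
  sex    = rep(rep(c("Female", "Male"), each = 4), lengths(values))
)

# boxmode = "group" places the boxes for each source side by side
p <- plot_ly(df, x = ~source, y = ~data, color = ~sex, type = "box") %>%
  layout(boxmode = "group")

p
```

This should give four pairs of boxes, one pair per `source`, without needing the manual dodge used for `ggplotly()`.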

plotly doesn't have facets like ggplot, but it does have subplots (see plotly::subplot()).

Thank you guys! :slight_smile: With your help I was able to solve the problem.

I have several data frames (40) inside a list, and I need a way to create a chart that changes according to the selected data frame, using plotly or ggplot2.
