Help with making plot with multiple columns

Hi all,
I am hoping someone can show me how to make a plot from the following data frame:

group      season 1   season 2   season 3   season 4
bananas    1          4          5          7
apples     6          10         8          2
pears      3          5          10         4

What I want to create is a bar graph with the seasons on the x-axis: for season 1, the yields of bananas, apples and pears side by side, so three bars. I want the same for seasons 2, 3 and 4, ending up with four clusters of 3 bars, 12 bars in total. The y-axis should show the amount, so that the yields of the different fruits can be compared per season. I also want all bars of one species to have the same color, indicated in a legend. I have tried solving this myself using the melt function (which I did not get to work), the tidyverse gather function (which I also could not get to work) and barplot (which did not work either). Below is an example of the code I used, which I felt would probably be the easiest of my three attempts to fix:

ggplot(mean_all %>% gather(S1, S2, S3, S4, -group), 
       aes(x=group,y = seasons, fill = S1, S2, S3, S4)) + 
         geom_bar(stat = 'identity', position = 'dodge')
seasons <- c(mean_all$S1,mean_all$S2,mean_all$S3,mean_all$S4)

However, as a beginner in R, I just cannot seem to make it work...

I have searched for and tried many tutorials and examples without solving this, so I am really hoping someone here is willing to take a look and help me out.

It is best to do your reshaping outside of ggplot and then pipe that result into ggplot(). I saved the result in an intermediate data frame in the code below just to illustrate the result of gather().

library(tidyr)
library(ggplot2)
DF <- data.frame(group = c("bananas", "apples", "pears"),
                 season1 = c(1,6,3),
                 season2 = c(4,10,5),
                 season3 = c(5,8,10),
                 season4 = c(7,2,4))
DFtall <- DF %>% gather(key = Season, value = Value, season1:season4)
DFtall
#>      group  Season Value
#> 1  bananas season1     1
#> 2   apples season1     6
#> 3    pears season1     3
#> 4  bananas season2     4
#> 5   apples season2    10
#> 6    pears season2     5
#> 7  bananas season3     5
#> 8   apples season3     8
#> 9    pears season3    10
#> 10 bananas season4     7
#> 11  apples season4     2
#> 12   pears season4     4
ggplot(DFtall, aes(Season, Value, fill = group)) + geom_col(position = "dodge")

Created on 2020-01-27 by the reprex package (v0.3.0)

Thank you so much! I had tried the gather function before, but I see now that I made it too complex. The only problem I run into now is that I want to add error bars. I put the sd values in a data frame with exactly the same layout and dimensions as the season data, but because of the way my ggplot call is set up I cannot seem to get the error bars into the plot. What I have so far is:

ggplot(meanall_tall, aes(Season, Value, fill=group)) +geom_col(position = "dodge") + geom_errorbar(sdall_tall, aes(Seasonsd, sd, ymin=Value-sd, ymax=Value+sd), width=.2,position=position_dodge(.9))

Of course the 'Value' variable won't be recognized, since I specified a different data frame for geom_errorbar. However, I am not sure how else to make these error bars work with this way of using ggplot.

Your sdall_tall data frame should have a column named group and one named Season. You can then join meanall_tall and sdall_tall with the inner_join function from dplyr.

NewDF <- inner_join(meanall_tall, sdall_tall, by = c("group", "Season"))

NewDF will now have a Value column and an sd column, so you can use NewDF as the data argument of ggplot() and compute ymin and ymax from those columns in geom_errorbar().
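To make the join and the error bars concrete, here is a minimal sketch of the whole pipeline. The means are the ones from the first post; the sd values and the wide data frame names meanall and sdall are made up purely for illustration, so substitute your own data.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Means from the original question
meanall <- data.frame(group = c("bananas", "apples", "pears"),
                      season1 = c(1, 6, 3),
                      season2 = c(4, 10, 5),
                      season3 = c(5, 8, 10),
                      season4 = c(7, 2, 4))

# Hypothetical sd values, invented for this example only
sdall <- data.frame(group = c("bananas", "apples", "pears"),
                    season1 = c(0.3, 1.0, 0.5),
                    season2 = c(0.8, 1.5, 0.9),
                    season3 = c(0.6, 1.2, 1.4),
                    season4 = c(1.1, 0.4, 0.7))

# Reshape both to long format, giving the value columns different names
meanall_tall <- meanall %>% gather(key = Season, value = Value, season1:season4)
sdall_tall   <- sdall   %>% gather(key = Season, value = sd, season1:season4)

# One row per group/Season pair, with both Value and sd
NewDF <- inner_join(meanall_tall, sdall_tall, by = c("group", "Season"))

# Use the same dodge width for bars and error bars so they line up
p <- ggplot(NewDF, aes(Season, Value, fill = group)) +
  geom_col(position = position_dodge(0.9)) +
  geom_errorbar(aes(ymin = Value - sd, ymax = Value + sd),
                width = 0.2, position = position_dodge(0.9))
p
```

Because geom_errorbar() inherits the data and the fill = group mapping from ggplot(), it only needs ymin and ymax, and the dodging places each error bar on its own bar.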

A small suggestion: the new pivot_longer() function in tidyr 1.0.0 has made things easier to remember.
Here is a line that replaces gather() with pivot_longer() to get DFtall:

DFtall <- DF %>% pivot_longer(cols = season1:season4, names_to = "Season", values_to = "Value")
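If you want to convince yourself that the two approaches give the same data, a quick check along these lines should do. This is only a sketch using the DF from the earlier answer; the names tall_gather, tall_pivot and same are just for this example. Note that pivot_longer() returns a tibble and orders the rows differently from gather(), so both results are sorted before comparing.

```r
library(tidyr)
library(dplyr)

DF <- data.frame(group = c("bananas", "apples", "pears"),
                 season1 = c(1, 6, 3),
                 season2 = c(4, 10, 5),
                 season3 = c(5, 8, 10),
                 season4 = c(7, 2, 4))

tall_gather <- DF %>% gather(key = Season, value = Value, season1:season4)
tall_pivot  <- DF %>% pivot_longer(cols = season1:season4,
                                   names_to = "Season", values_to = "Value")

# Sort both results the same way, then compare column by column
a <- tall_gather %>% arrange(group, Season)
b <- tall_pivot  %>% arrange(group, Season)
same <- all(as.character(a$group) == as.character(b$group)) &&
  all(a$Season == b$Season) &&
  all(a$Value == b$Value)
same
```

The row order makes no difference to the plot, since ggplot() positions the bars by Season and group rather than by row.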