ggplot with conditions

sbl_bah · October 31, 2018, 9:55am

Hello,

I tried to fill the bars of my graph according to conditions in ggplot. The result is wrong (see the graph below).

Do you have any idea what is wrong?

Thanks in advance for your help!

tab<- read.csv2("test2.csv",header=TRUE,sep=";",stringsAsFactors=FALSE,encoding="UTF-8",dec=".",check.names=FALSE)  
perf.project<-tab
perf.project.tdy<-perf.project %>%
  gather("types","perf",6:6)

perf.project.tdy<-as.data.frame(perf.project.tdy)

p<-factor(perf.project.tdy$pas, levels = (unique(perf.project.tdy$pas)), ordered=TRUE)

barwidth = 0.95
ggplot(perf.project.tdy, aes(x=p , y=perf.project.tdy$perf,fill = types)) + 
  geom_col(fill = ifelse(perf.project.tdy$perf>0,"green",
                         ifelse(perf.project.tdy$perf<=-4,"red","orange")),
                          width = barwidth) +
  geom_text(aes(x=p , y=perf.project.tdy$perf,label=paste(perf.project.tdy$perf, "%")), vjust=1.6, color="black", size=3.5)

Capture_3

mara · October 31, 2018, 12:58pm

It's hard to tell without the data, but I think the p you're plotting might be outside of your data frame.

Could you please turn this into a self-contained reprex (short for reproducible example)? You're almost there, it's just that since no one else has your test2.csv file, we can't re-create it. It will help us help you if we can be sure we're all working with/looking at the same stuff.

You can probably just include the tidied dataframe, since the focus of your problem is on ggplot.
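
For instance, dput() prints R code that recreates an object, so others can paste it straight into their session. A minimal sketch, assuming a small stand-in data frame (perf_small and its values are hypothetical, not your real data):

```r
# Hypothetical stand-in for the tidied data frame (not the real test2.csv data)
perf_small <- data.frame(
  pas = c("200", "400", "600"),
  perf = c(3, -1, -5),
  stringsAsFactors = FALSE
)

# dput() prints R code that rebuilds the object; paste that output into your post
dput(perf_small)

# Sanity check: evaluating the printed code gives back the same data frame
rebuilt <- eval(parse(text = paste(capture.output(dput(perf_small)), collapse = "\n")))
identical(rebuilt, perf_small)
```

Only a handful of rows is usually needed, as long as they still show the problem.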

install.packages("reprex")

If you've never heard of a reprex before, you might want to start by reading the tidyverse.org help page. The reprex dos and don'ts are also useful.

What to do if you run into clipboard problems

If you run into problems with access to your clipboard, you can specify an outfile for the reprex, and then copy and paste the contents into the forum.

reprex::reprex(input = "fruits_stringdist.R", outfile = "fruits_stringdist.md")

For pointers specific to the community site, check out the reprex FAQ.

cderv · November 1, 2018, 1:57pm

Personally, I would create a new variable in the table (a fill_col column) whose value depends on your conditions. Then you can use this variable for the fill aesthetic.

It is a wild guess, as we need a reprex to be on the same page (follow @mara's advice), but I would modify your code like this:

library(tidyverse)
perf.project <- read.csv2("test2.csv",header=TRUE,sep=";",stringsAsFactors=FALSE,encoding="UTF-8",dec=".",check.names=FALSE)  
perf.project.tdy <- perf.project %>%
  gather("types", "perf", 6:6) %>%
  # Create the variable you need for the plot
  mutate(pas = factor(pas, levels = unique(pas), ordered=TRUE),
         fill_col = case_when(
           perf > 0   ~ "green",
           perf <= -4 ~ "red",
           TRUE       ~ "orange"
         ))

barwidth <- 0.95
# ggplot works on a data.frame and you need to provide variables (columns) of that DF (without any $)
ggplot(perf.project.tdy, aes(x = pas , y = perf)) + # you provide fill after 
  geom_col(fill = fill_col,
           width = barwidth) +
  # aes x and y are already provided
  geom_text(aes(label=paste(perf, "%")), vjust=1.6, color="black", size=3.5)

It may not work as I don't have your data to test. Basically, what I wanted to show:

tidyverse has tools like dplyr to work with data efficiently (like case_when for recoding into more than 2 levels and mutate to modify variables).
ggplot2 works on a data.frame (provided in the data argument); you then map variables (columns) to aesthetics and scales. You don't need to provide vectors with $, just the column names.
Knowing that, put everything in your data. You can add a column whose values are the colours for the fill aesthetic.

library(ggplot2)
df <- data.frame(trt = c("a", "b", "c"), outcome = c(2.3, 1.9, 3.2), fill_col = c("red", "green", "blue"))
ggplot(df, aes(trt, outcome)) +
  geom_col(aes(fill = fill_col))

Created on 2018-11-01 by the reprex package (v0.2.1)

Hope it helps.

sbl_bah · November 1, 2018, 5:53pm

Hello, thank you very much, that helps me a lot!

However, there is an error message:

Error in layer(data = data, mapping = mapping, stat = "identity", geom = GeomCol,  : 
  object 'fill_col' not found

In perf.project.tdy there is a column fill_col, which was created with mutate.

You will find a copy of my data below.

I'll try to make a reprex.

sbl_bah · November 1, 2018, 6:25pm

Your method is OK and runs very well! See the graph.
BUT it only runs with the code written as below.
Do you have any idea why?

library(tidyverse)

perf.project <- read.csv2("test2.csv",header=TRUE,sep=";",stringsAsFactors=FALSE,encoding="UTF-8",dec=".",check.names=FALSE)  
perf.project.tdy <- perf.project %>%
  gather("types", "perf", 6:6) %>%
  # Create the variable you need for the plot
  mutate(pas = factor(pas, levels = unique(pas), ordered=TRUE),
         fill_col = case_when(
           perf > 0   ~ "green",
           perf <= -4 ~ "red",
           TRUE       ~ "orange"
         ))
barwidth <- 0.95
ggplot()+
  geom_col(data = perf.project.tdy,
           mapping = aes(x=perf.project.tdy$pas, y=perf.project.tdy$perf),
           fill = perf.project.tdy$fill_col,
           #stat ="count", 
           width = barwidth) +
  geom_text(aes(x=perf.project.tdy$pas, y=perf.project.tdy$perf,label=paste(perf.project.tdy$perf, "%")), vjust=1.6, color="white", size=3.5)

sbl_bah · November 1, 2018, 7:24pm

I see there is still a mismatch between the graph and the values in perf.project.tdy.

My data is below:

     pas Heures_commandes Heures_engagees      RAF Bilan_heures Nombre_projet       types  perf fill_col
1    200         13253.73        10810.25  1996.16       447.32           141 Performance   3.0    green
2    400         21706.72        19270.50  2640.04      -203.82            75 Performance  -1.0   orange
3    600         13496.70        14048.86  1747.95     -2300.11            28 Performance  -5.0      red
4    800         15523.93        13565.75  2189.82      -231.64            22 Performance  10.0    green
5   1000          9728.80         8530.09  1890.20      -691.49            11 Performance -10.0      red
6   1200          5465.05         3330.15  1550.89       584.01             5 Performance  10.7    green
7 > 1200         54568.57        43955.41 17818.27     -7205.11            20 Performance  10.8    green

cderv · November 1, 2018, 9:05pm

You need to put the data in the ggplot() call and stop using $: just use the column names to map variables from the data to aesthetics. Put the mappings shared by all layers in the ggplot() call too, and the layer-specific ones in each geom (those could also go into the first ggplot() call).
You also need to map the fill_col variable to the fill aesthetic, so inside an aes() call. Otherwise ggplot2 won't map the variable; it only uses the value of an object named fill_col, which it looks for outside your data frame. Your issue may come from there.
You need to follow my example and try to understand it.

ggplot(data = perf.project.tdy, aes(x=pas, y=perf))+
  geom_col(aes(fill = fill_col),
           #stat ="count", 
           width = barwidth) +
  geom_text(aes(label=paste(perf, "%")), vjust=1.6, color="white", size=3.5)

Read the ggplot2 documentation to understand how to use it.
The data visualisation chapter of the R for Data Science book is a good place to start.

Please try this and see if it fixes the colours. Not mapping fill_col inside an aes() call could be the source of the error: fill = fill_col is not the same as aes(fill = fill_col).
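
To make the difference concrete, here is a minimal sketch on toy data (the df below is made up for illustration, and it assumes there is no object called fill_col in your workspace):

```r
library(ggplot2)

# Toy data, not the thread's data
df <- data.frame(
  trt = c("a", "b", "c"),
  outcome = c(2.3, 1.9, 3.2),
  fill_col = c("red", "green", "blue")
)

# Outside aes(): fill is a fixed setting, evaluated in the calling environment.
# No object named fill_col exists there, so creating the layer fails.
res <- try(
  ggplot(df, aes(trt, outcome)) + geom_col(fill = fill_col),
  silent = TRUE
)
inherits(res, "try-error")

# Inside aes(): fill_col is looked up as a column of df when the plot is built.
p <- ggplot(df, aes(trt, outcome)) +
  geom_col(aes(fill = fill_col))
```

The first call fails while the layer is being created, which matches the "Error in layer(...)" message reported above.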

sbl_bah · November 2, 2018, 9:35am

Hello, thanks for your help.
You will find the ggplot code and the graph below.
However, there is now a problem: the legend text doesn't match the colours.
The colour should be defined by the value of the variable fill_col (which seems correct).

barwidth = 0.95
ggplot(data = perf.project.tdy, aes(x=pas, y=perf))+
  geom_col(aes(fill = fill_col),
               width = barwidth) +
  geom_text(aes(label=paste(perf, "%")), vjust=1.6, color="white", size=3.5)

Rplot

mara · November 2, 2018, 11:08am

The way you currently have things, fill_col is just another variable, as opposed to an _aesthetic value_ that's evaluated by ggplot as a fill colour. See, for example, the StackOverflow thread below:

There's an example included in the community thread, below, that shows you how to use a colour palette with specific colours matched to specific values using a named vector:
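
The linked thread isn't reproduced here, but a minimal sketch of the named-vector idea could look like the following. The status column and its labels are made up for illustration; only the perf values and thresholds come from this thread.

```r
library(ggplot2)

# Three rows copied from the table posted above
df <- data.frame(
  pas  = factor(c("200", "400", "600"), levels = c("200", "400", "600")),
  perf = c(3, -1, -5)
)

# Hypothetical category labels, using the same thresholds as the case_when() above
df$status <- ifelse(df$perf > 0, "positive",
                    ifelse(df$perf <= -4, "strongly negative", "slightly negative"))

# Named vector: names are the data values, elements are the colours to use
status_cols <- c(
  "positive"          = "green",
  "slightly negative" = "orange",
  "strongly negative" = "red"
)

p <- ggplot(df, aes(pas, perf)) +
  geom_col(aes(fill = status)) +
  scale_fill_manual(values = status_cols)
```

With this approach the legend shows the category labels rather than colour names, and each label is tied to its colour by name rather than by position.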

cderv · November 3, 2018, 9:43am

Oh yes, I forgot about that! Thanks also @mara for the reminder.
There is a function inside ggplot2 for this kind of task, where the variable contains aesthetic values. That is the case in your example, as you have the colour value in the fill_col variable, as I advised. I was just missing a piece: add scale_fill_identity() to your ggplot. See the documentation for details; for example, you need to add guide = "legend" if you want to keep the legend.

Here is an example:

library(ggplot2)
df <- data.frame(trt = c("a", "b", "c"), outcome = c(2.3, 1.9, 3.2), fill_col = c("red", "green", "blue"))
ggplot(df, aes(trt, outcome)) +
  geom_col(aes(fill = fill_col)) +
  scale_fill_identity()

ggplot(df, aes(trt, outcome)) +
  geom_col(aes(fill = fill_col)) +
  scale_fill_identity(guide = "legend")

Created on 2018-11-03 by the reprex package (v0.2.1)

So, for your plot:

ggplot(data = perf.project.tdy, aes(x=pas, y=perf))+
  geom_col(aes(fill = fill_col),
               width = barwidth) +
  geom_text(aes(label=paste(perf, "%")), vjust=1.6, color="white", size=3.5) +
  scale_fill_identity()

sbl_bah · November 3, 2018, 10:57am

Hello,
Thank you so much for your help, it works!

Rplot02

mara · November 3, 2018, 12:18pm

If your question's been answered (even if by you), would you mind choosing a solution? (See FAQ below for how).

Having questions checked as resolved makes it a bit easier to navigate the site visually and see which threads still need help.

Thanks
