ggplot2 - problem with code

LdelK · February 17, 2021, 3:11am

```
install.packages("gridExtra")
install.packages("grid")
install.packages("extrafont")
install.packages("showtext", dependencies = TRUE)
install.packages("RColorBrewer")
install.packages("scales")

#packages for graphs
library(ggplot2)
library(tidyverse)
library(grid)
library(gridExtra)
library(showtext)
library(extrafont)
library(RColorBrewer)
library(scales)

#######################
#Treatment x Male vs Female Graphs 
#######################
a1 <- read.csv('Data.csv')

Treatment.Sex = ddply(a1,~Treatment*Sex,summarize,meanQ1=100*mean(Q1,na.rm=T)+
  Treatment.Sex$Treatment = factor(Treatment.Sex$Treatment,levels=c("CONTROL","A","B","C","D","E"))+

###Plot for Question Q1
ggplot(data=Treatment.Sex,aes(x=Treatment,y=meanQ1,fill=Sex))+
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),panel.background = element_blank(), axis.line = element_line(colour = "black"))+
  geom_bar(stat="identity",position=position_dodge(),colour="black")+
  geom_smooth(method='lm',color="BLACK")+
  scale_fill_manual(values=c("coral","cornflowerblue"))+
  xlab("Treatments")+
  ylab("Proportion of Correct Responses (%)")+
  theme(legend.title=element_blank())+
  ggtitle("Q1: Correct Responses by Sex")+
  theme(text=element_text(size=16))
```

Hi there! Can anyone help with my buggy code for generating ggplot2 plots?
I'm looking to create two versions of these plots:
1) One graph showing male and female responses to Question 1 together.
2) Two graphs showing male and female responses to Question 1 separately.

[Data for GGPLOT2](https://drive.google.com/drive/folders/1bGFKI89dU1NpKIVM2HaXAmRznLHGz0yE?usp=sharing)

FJCC · February 17, 2021, 3:51am

Can you post your data or a subset of it so we can run your ggplot code against the data you are working with? You can post the output of the dput() function. Use either

dput(Treatment.Sex)

or

dput(head(Treatment.Sex, 20))

Please put a line containing only three back ticks, ```, before and after the pasted output, like this:
```
Your output here
```
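
In case it helps to see what dput() produces, here is a minimal sketch. The data frame `example_df` and its values are made up for illustration and are not your data; only the column names are borrowed from your code.

```r
# Hypothetical stand-in for the summarised data; the values are made up.
example_df <- data.frame(
  Treatment = c("CONTROL", "CONTROL", "A", "A"),
  Sex       = c("F", "M", "F", "M"),
  meanQ1    = c(50, 60, 70, 80)
)

# dput() prints R code that rebuilds the object; this is the text to paste.
dput(example_df)

# Capture that text and evaluate it to check that it recreates the data.
dput_text <- paste(capture.output(dput(example_df)), collapse = "\n")
rebuilt   <- eval(parse(text = dput_text))
all.equal(example_df, rebuilt)
```

Anyone who pastes the dput() output into their own R session gets the same object back, which is why it is handy for sharing data in a post.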

LdelK · February 17, 2021, 5:09am

Hey, thanks so much for that. I've just anonymized the data (I'll find out how to attach it here?)

For the output, do you mean to cut and paste the output from RStudio? Cheers, L

FJCC · February 17, 2021, 5:23am

Yes, run the dput() function and copy and paste the output into a response here. Remember to put back ticks before and after the pasted output.

LdelK · February 18, 2021, 12:02am

Hi there, thanks for that. I've tried to run the dput() function to get the output, but my code is so buggy that it's not allowing me to do that. Would it work just to have the code and the data file link? Thanks so much!

FJCC · February 18, 2021, 1:06am

If you can post the data somewhere, that will work too.

LdelK · February 18, 2021, 1:11am

Oh great! I've linked the data on Google Drive now too, if that works? Thanks.
https://drive.google.com/drive/folders/1bGFKI89dU1NpKIVM2HaXAmRznLHGz0yE?usp=sharing

FJCC · February 18, 2021, 2:33am

The link you provided gives me an HTML page when I try to read it from R, and if I go to the link with my browser, it says I do not have access.

LdelK · February 18, 2021, 2:56am

Hi, sorry about that. It's a Google Drive link. I noticed on other threads that people linked data in this way? I've now changed the access to viewable by anyone with the link. Would that work, do you think? https://drive.google.com/drive/folders/1bGFKI89dU1NpKIVM2HaXAmRznLHGz0yE?usp=sharing

FJCC · February 18, 2021, 4:11am

Here is a partial solution. The colors of the fit lines are not right but I think you know how to fix that. The main problem was with your attempt to summarize the data with ddply and chaining that to other functions with +. You cannot generally chain functions together like that. You can do that in ggplot2 but that is a special case. There is a pipe operator in magrittr, %>%, that can chain functions together and I used that and the functions from dplyr. I have not used ddply in a long time and I didn't want to wrestle with it.

```
a1 <- openxlsx::read.xlsx("~/R/Play/Data.xlsx")

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
Treatment.Sex = a1 %>% group_by(Treatment, Sex) %>% 
  summarize(meanQ1 = 100*mean(Q1, na.rm = TRUE))
#> `summarise()` regrouping output by 'Treatment' (override with `.groups` argument)

Treatment.Sex$Treatment = factor(Treatment.Sex$Treatment,
                                 levels=c("CONTROL","A","B","C","D","E"))

###Plot for Question Q1
library(ggplot2)
ggplot(data=Treatment.Sex,aes(x=Treatment,y=meanQ1,fill=Sex))+
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),panel.background = element_blank(), axis.line = element_line(colour = "black"))+
  geom_bar(stat="identity",position=position_dodge(),colour="black")+
  geom_smooth(aes(group=Sex, color = Sex), method='lm', se = FALSE)+
  scale_fill_manual(values=c("coral","cornflowerblue"))+
  xlab("Treatments")+
  ylab("Proportion of Correct Responses (%)")+
  theme(legend.title=element_blank())+
  ggtitle("Q1: Correct Responses by Sex")+
  theme(text=element_text(size=16))
#> `geom_smooth()` using formula 'y ~ x'
```

Created on 2021-02-17 by the reprex package (v0.3.0)
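
For the second version of the plot (male and female responses on separate graphs), one possible approach is to split the plot into panels with facet_wrap(). The sketch below is only an illustration: `plot_data`, its values, and the "Female"/"Male" labels are made up, since I am assuming the real Sex column has two categories.

```r
library(ggplot2)

# Made-up summary values standing in for Treatment.Sex; not the real data.
treatment_levels <- c("CONTROL", "A", "B", "C", "D", "E")
plot_data <- data.frame(
  Treatment = factor(rep(treatment_levels, times = 2), levels = treatment_levels),
  Sex       = rep(c("Female", "Male"), each = 6),
  meanQ1    = c(40, 45, 50, 55, 60, 65, 35, 42, 48, 50, 58, 61)
)

# facet_wrap(~Sex) draws one panel per level of Sex.
p_facets <- ggplot(plot_data, aes(x = Treatment, y = meanQ1, fill = Sex)) +
  geom_col(colour = "black") +
  facet_wrap(~Sex) +
  scale_fill_manual(values = c("coral", "cornflowerblue")) +
  xlab("Treatments") +
  ylab("Proportion of Correct Responses (%)") +
  ggtitle("Q1: Correct Responses by Sex") +
  theme(legend.position = "none", text = element_text(size = 16))

p_facets
```

The legend is hidden here because the panel titles already identify each sex. If you want two fully separate plot objects instead, filtering the data by Sex and calling ggplot() once per subset should also work.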

LdelK · February 18, 2021, 4:47am

That is fantastic. Thank you so much!

Is dplyr a much more user-friendly, updated version of ddply? It's good to know that I can't chain functions together the way I had done. How did you find the pipe operator and know of its existence? Did you look for "Chaining functions" in the ? help? Thank you so much. I'm now working on getting one graph for females and one for males. Great solution, and the trend lines are an easy fix, as you say!

FJCC · February 18, 2021, 5:07am

I think dplyr is easier to work with but use whatever works for you.

I can't say how or when I learned about the %>% operator. An excellent place to learn about using dplyr and the rest of the tidyverse is R for Data Science. There is a lot of information available about R, but it is hard to find things if you don't know they exist. I frequently find new things and wish I had known about them earlier.
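
To show what %>% does in isolation, here is a small sketch comparing nested calls with piped calls. The `responses` data frame is made up for the example (I am assuming Q1 is coded 1 for correct and 0 for incorrect), so treat it as an illustration rather than your data.

```r
library(dplyr)

# Made-up responses (1 = correct, 0 = incorrect); not the thread's data.
responses <- data.frame(
  Treatment = c("A", "A", "A", "A", "B", "B", "B", "B"),
  Sex       = c("F", "F", "M", "M", "F", "F", "M", "M"),
  Q1        = c(1, 0, 1, 1, 0, 0, 1, NA)
)

# Nested calls: the steps are read from the inside out.
nested <- summarize(group_by(responses, Treatment, Sex),
                    meanQ1 = 100 * mean(Q1, na.rm = TRUE),
                    .groups = "drop")

# Piped calls: %>% passes the left-hand result in as the first argument
# of the next function, so the same steps are read top to bottom.
piped <- responses %>%
  group_by(Treatment, Sex) %>%
  summarize(meanQ1 = 100 * mean(Q1, na.rm = TRUE), .groups = "drop")

all.equal(nested, piped)
```

The + in ggplot2 is different: it adds layers and settings to a plot object rather than passing a result into the next function, which is why it cannot be used to join a summarizing step to a factor() assignment.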
