Using aggregate with ggplot


#1

Hello,
I am trying to use the aggregate command to get the mean of some of my data. My data is created by a for loop. I have to find the correlation when the intercept ranges from 1 to 5 and the coef ranges from 1 to 5 as well, so the data is repeated 5 times for each intercept. Now I have 3 columns (int, coef, and B00.tchsc), and the intercept is repeated 5 times for each coef. I tried the following command:

aggregate(B01.tchsc~coef, result, mean)

and I got this error

Error in get(as.character(FUN), mode = "function", envir = envir) : 
  object 'FUN' of mode 'function' was not found

Any ideas?


#2

Could you please turn this into a self-contained reprex (short for minimal reproducible example)? It will help us help you if we can be sure we're all working with/looking at the same stuff.

If you've never heard of a reprex before, you might want to start by reading the tidyverse.org help page. The reprex dos and don'ts are also useful.


#3

Without a reproducible example, I can't be sure what you're trying to do, but here's a generic example of how you can use stat_summary to calculate and plot means of your data within ggplot itself. The example below also includes 95% confidence intervals.

library(tidyverse)

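# Reshape iris to long format so each measurement gets its own facet
ggplot(iris %>% gather(key, value, -Species), aes(Species, value)) +
  # Bootstrapped 95% confidence interval of the mean (mean_cl_boot needs the Hmisc package)
  stat_summary(fun.data=mean_cl_boot, geom="errorbar", width=0.1, colour="red") +
  # Group means; `fun` replaces the deprecated `fun.y` argument (ggplot2 >= 3.3.0)
  stat_summary(fun=mean, geom="point", colour="blue") +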
  facet_grid(. ~ key) +
  theme_classic()

If you want to calculate means outside of ggplot, here's a tidyverse approach:

iris %>% 
  gather(Variable, value, -Species) %>% 
  group_by(Species, Variable) %>% 
  summarise(N=n(),
            mean=mean(value))

      Species     Variable  N  mean
1      setosa Petal.Length 50 1.462
2      setosa  Petal.Width 50 0.246
3      setosa Sepal.Length 50 5.006
4      setosa  Sepal.Width 50 3.428
5  versicolor Petal.Length 50 4.260
6  versicolor  Petal.Width 50 1.326
7  versicolor Sepal.Length 50 5.936
8  versicolor  Sepal.Width 50 2.770
9   virginica Petal.Length 50 5.552
10  virginica  Petal.Width 50 2.026
11  virginica Sepal.Length 50 6.588
12  virginica  Sepal.Width 50 2.974

Using aggregate, you could do:

aggregate(. ~ Species, data=iris, mean)

     Species Sepal.Length Sepal.Width Petal.Length Petal.Width
1     setosa        5.006       3.428        1.462       0.246
2 versicolor        5.936       2.770        4.260       1.326
3  virginica        6.588       2.974        5.552       2.026
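
As for the error in the original post: I can't see your session, so this is only a guess, but one way to get that message is to have a non-function object named mean in your workspace (for example, created earlier in your loop). aggregate then receives that object instead of the function. The sketch below reproduces the situation with iris; the assignment mean <- 5 is a made-up stand-in for whatever might be masking the function in your session.

```r
# Hypothetical session: a non-function object named `mean` masks base::mean
mean <- 5

# aggregate() now receives the number 5 as FUN, so looking up a function fails
res <- tryCatch(
  aggregate(. ~ Species, data = iris, mean),
  error = function(e) e
)
inherits(res, "error")  # TRUE when the call failed

# Removing the masking object restores access to the real function
rm(mean)
ok <- aggregate(. ~ Species, data = iris, FUN = mean)
ok
```

If that turns out to be the cause, restarting R or running rm(mean) should be enough. Writing FUN = base::mean also sidesteps the masking object.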

#4

Joel, thank you so much. That is helpful. I used gather instead of aggregate and it worked for me.


#5

Mara, I have learned how to create a reprex. However, I could not find a dataset similar to mine yet. I wonder if I can send my code, since it contains the way I created the data?


#6

If your data is completely created by code, then including the code is a fantastic way to include your data in a reproducible example.

The thing to watch out for is when your code depends on reading files on your local computer that other people won't have access to.
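
As a rough sketch of what such a code-only example might look like: the column names int, coef, and B01.tchsc are taken from the original post, but everything else here is an assumption. In particular, expand.grid stands in for the for loop, and the rnorm values are placeholders for whatever the real loop computes.

```r
set.seed(1)  # makes the simulated values reproducible

# Every combination of intercept and coefficient (5 x 5 = 25 rows)
result <- expand.grid(int = 1:5, coef = 1:5)

# Placeholder values; the real column would come from the poster's own loop
result$B01.tchsc <- rnorm(nrow(result))

# Mean of B01.tchsc for each value of coef
means <- aggregate(B01.tchsc ~ coef, data = result, FUN = mean)
means
```

Because nothing is read from disk, anyone can paste this into a fresh R session and get the same data frame.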