Adding more data (to Y axis) to a graph on ggplot

I want to add the year to my scatter graph, but I already have data on both axes. Can someone advise me on how to add the year as well? I can send my code so far; it's pretty simple, but I'm new :slight_smile: Thanks!

It is very difficult to give advice without having more details. Could you post a reproducible example, reprex, of your data and code? At least give more details of the kind of plot you have and the data.

OK, here is what I have so far:
library(tidyverse)
data <- read_csv("Giraffe_Data.csv")
head(data)
ggplot(data = data, mapping = aes(x = Habitat, y = Herd_size))
ggplot(data = data, mapping = aes(x = Habitat, y = Herd_size)) +
geom_point() +
geom_point(mapping = aes(colour = Habitat))

[Image: scatter plot of Herd_size against Habitat, with points coloured by Habitat]

Thank you, the plot image and code help a lot.
I understand you want to add information about the year. Does each point in the current plot have a single year associated with it? How many different years are there? If there are six or fewer, you could try changing the color of the points according to the year. You would probably want to treat the year as a factor in that case so the colors would be clearly distinct.
If you want to plot the year on an axis, please explain what data the year would be paired with. It is not clear to me how that would work.
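As a rough illustration of the colour-by-year suggestion, here is a minimal sketch. The data frame `giraffes` and its values are made up for the example, and the column name `Year` is an assumption about how the year is stored in the real file.

```r
library(ggplot2)

# Hypothetical stand-in for the giraffe data; values are invented and
# the real file's column names may differ.
giraffes <- data.frame(
  Habitat   = c("Plateau", "Plateau", "Hill_slope", "Hill_slope", "Plateau", "Hill_slope"),
  Herd_size = c(3, 5, 2, 4, 3, 6),
  Year      = c(2018, 2019, 2018, 2019, 2019, 2018)
)

# factor(Year) makes ggplot treat the year as discrete categories,
# so each year gets its own clearly distinct colour.
p_year <- ggplot(giraffes, aes(x = Habitat, y = Herd_size, colour = factor(Year))) +
  geom_point(size = 3) +
  labs(colour = "Year")

p_year
```

Without `factor()`, a numeric year would be mapped to a continuous colour gradient, which is harder to read when there are only a few years.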

The data that I am importing for the graph is in Excel. I believe each point has a year associated with it, but the scatter graph seems to have condensed the points?
Giraffe_Data_publicexposure.csv

Looking at your data, I see there are two years, 2018 and 2019 and only two of the habitats, Plateau and Hill_slope have more than three data points. The scatter plot seems to show fewer points than appear in the data set because many of the points have identical y values. For example, there are several points with Habitat = Plateau and Herd_size = 3 but they are all plotted on top of each other.
It would only make sense to compare the herd sizes vs. year for the two habitats that have more than three data points. I would do that like this. The plot clearly needs cosmetic adjustments but I think it conveys the idea.

library(ggplot2)
library(dplyr)

data <- read.csv("~/R/Play/Giraffe.csv", stringsAsFactors = FALSE)
Giraffes_summary2 <- filter(data, Habitat %in% c("Hill_slope", "Plateau")) %>% 
  group_by( Habitat, Year)
Giraffes_summary2 <- summarise(Giraffes_summary2, mean_Giraffe = mean(Herd_size),
                              sd_Giraffe = sd(Herd_size),
                              se_Giraffe = sd(Herd_size)/sqrt(n()))

ggplot(data = Giraffes_summary2, 
       mapping = aes(x = interaction(Habitat, Year), y = mean_Giraffe, fill = Habitat)) +
  geom_col() +
  geom_errorbar(aes(ymin = mean_Giraffe - sd_Giraffe, 
                    ymax = mean_Giraffe + sd_Giraffe), width = 0.3) 

Created on 2020-07-07 by the reprex package (v0.3.0)
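The overplotting described above (several points sharing the same Habitat and Herd_size) can also be made visible on the original scatter plot by jittering the points sideways. This is only a sketch: the data frame `giraffes_dup` and its values are invented for the example, and the column names are assumed to match the real file.

```r
library(ggplot2)

# Hypothetical data with deliberately repeated Habitat / Herd_size pairs
giraffes_dup <- data.frame(
  Habitat   = c("Plateau", "Plateau", "Plateau", "Hill_slope", "Hill_slope", "Hill_slope"),
  Herd_size = c(3, 3, 3, 2, 2, 4),
  Year      = c(2018, 2018, 2019, 2018, 2019, 2019)
)

# geom_jitter() adds a little random horizontal noise so identical points
# no longer sit on top of each other; height = 0 leaves Herd_size unchanged.
p_jitter <- ggplot(giraffes_dup, aes(x = Habitat, y = Herd_size, colour = factor(Year))) +
  geom_jitter(width = 0.2, height = 0, alpha = 0.7, size = 3) +
  labs(colour = "Year")

p_jitter
```

Because the jitter is random, the exact horizontal positions change each time the plot is drawn.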

Oh my, that's amazing!! TYSM!! Also, I was wondering whether you had any code I could copy and enter my data into to make a pie chart or a different kind of chart (I tried a line graph but it looked like a train route lol). Thanks again!

Also, sorry to be a pain: I tried copying that code into my script but it produces an error. I did make some adjustments to fit my own code, but I am not sure what to do to fix it now?

I need to see your code and the error to make any suggestions. Place a line containing only three back ticks, ```, before and after any code so it gets formatted nicely.

```
Your code here.
```

Pie charts can be done in ggplot, but it takes some work, and they are generally not a good way to represent data.
You might consider making histograms of the Plateau and Hill_slope data using geom_histogram.
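A minimal sketch of the histogram idea, one panel per habitat. The data frame `giraffes_hist` and its values are made up for illustration, and `binwidth = 1` is just a guess that suits small whole-number herd sizes.

```r
library(ggplot2)
library(dplyr)

# Hypothetical stand-in for the giraffe data; values are invented
giraffes_hist <- data.frame(
  Habitat   = c("Plateau", "Plateau", "Plateau", "Hill_slope", "Hill_slope", "Woodland"),
  Herd_size = c(3, 3, 5, 2, 4, 1)
)

# Keep only the two habitats with enough data, then draw one histogram
# of herd size per habitat.
p_hist <- giraffes_hist %>%
  filter(Habitat %in% c("Hill_slope", "Plateau")) %>%
  ggplot(aes(x = Herd_size)) +
  geom_histogram(binwidth = 1) +
  facet_wrap(~ Habitat)

p_hist
```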

OK, perfect! I think I will stick with the graphs I have because it's my first day and I don't want to go too OTT.
This is the error that shows up:
Error in ggplot(data = Giraffes_summary2, mapping = aes(x = interaction(Habitat, :
object 'Giraffes_summary2' not found

Error: attempt to use zero-length variable name

Did you run the equivalent of these lines of code before trying the ggplot command? The read.csv() command should be modified to however you are getting your data into R.

data <- read.csv("~/R/Play/Giraffe.csv", stringsAsFactors = FALSE)
Giraffes_summary2 <- filter(data, Habitat %in% c("Hill_slope", "Plateau")) %>% 
  group_by( Habitat, Year)
Giraffes_summary2 <- summarise(Giraffes_summary2, mean_Giraffe = mean(Herd_size),
                              sd_Giraffe = sd(Herd_size),
                              se_Giraffe = sd(Herd_size)/sqrt(n()))

I changed the .csv to mine, Giraffe_Data.csv, but it's still coming up with the 'zero-length variable name' error?

Hi, I'm the same person with a different account, because I had reached my max replies for my first day, whoops.
So, I fixed that issue by removing the back ticks, but now I have this issue:
Error in interaction(Habitat, Year) : object 'Year' not found

It seems Giraffes_summary2 does not have a column named Year. Did your group_by() include both Habitat and Year?

Yes, here, I'll just copy the code as it's probably easier for you to understand:
##Bar chart with years
data <- read.csv("Giraffe_Data.csv", stringsAsFactors = FALSE)
Giraffes_summary <- filter(data, Habitat %in% c("Hill_slope", "Plateau")) %>%
  group_by( Habitat, Year)
Giraffes_summary <- summarise(Giraffes_summary, mean_Giraffe = mean(Herd_size),
                              sd_Giraffe = sd(Herd_size),
                              se_Giraffe = sd(Herd_size)/sqrt(n()))

ggplot(data = Giraffes_summary,
       mapping = aes(x = interaction(Habitat, Year), y = mean_Giraffe, fill = Habitat)) +
  geom_col() +
  geom_errorbar(aes(ymin = mean_Giraffe - sd_Giraffe,
                    ymax = mean_Giraffe + sd_Giraffe), width = 0.3)

I've now got this issue, oh gosh, I'm sorry haha :sweat_smile:
Error: Problem with summarise() input mean_Giraffe.
x object 'Herd_size' not found
i Input mean_Giraffe is mean(Herd_size).
Run rlang::last_error() to see where the error occurred.

The code you pasted works for me after I changed the file name to match my file. Are you sure you are reading the correct file? I have to go to work now. If you are still getting errors, check the content of the data at each step. You can use the summary function to see the name, data type and summary of each column, summary(Giraffes_summary)

This worked for me.

data <- read.csv("Giraffe.csv", stringsAsFactors = FALSE)
Giraffes_summary <- filter(data, Habitat %in% c("Hill_slope", "Plateau")) %>%
  group_by( Habitat, Year)
Giraffes_summary <- summarise(Giraffes_summary, mean_Giraffe = mean(Herd_size),
                              sd_Giraffe = sd(Herd_size),
                              se_Giraffe = sd(Herd_size)/sqrt(n()))

ggplot(data = Giraffes_summary,
       mapping = aes(x = interaction(Habitat, Year), y = mean_Giraffe, fill = Habitat)) +
  geom_col() +
  geom_errorbar(aes(ymin = mean_Giraffe - sd_Giraffe,
                    ymax = mean_Giraffe + sd_Giraffe), width = 0.3)
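One way to follow the advice about checking the content of the data at each step is to look at the exact column names before summarising. The sketch below assumes the columns should be called `Habitat`, `Year` and `Herd_size`, as in the code above. The data frame `giraffes_raw` is an invented stand-in for the result of `read.csv()`, and `.groups = "drop"` requires dplyr 1.0.0 or later.

```r
library(dplyr)

# Hypothetical stand-in for read.csv("Giraffe_Data.csv", stringsAsFactors = FALSE);
# replace this with your own import.
giraffes_raw <- data.frame(
  Habitat   = c("Plateau", "Plateau", "Hill_slope", "Hill_slope"),
  Year      = c(2018, 2019, 2018, 2019),
  Herd_size = c(3, 5, 2, 4)
)

names(giraffes_raw)  # exact column names, including capitalisation
str(giraffes_raw)    # data type of each column

# Stop early with a readable message if an expected column is absent
needed <- c("Habitat", "Year", "Herd_size")
missing_cols <- setdiff(needed, names(giraffes_raw))
if (length(missing_cols) > 0) {
  stop("Missing columns: ", paste(missing_cols, collapse = ", "))
}

Giraffes_summary <- giraffes_raw %>%
  filter(Habitat %in% c("Hill_slope", "Plateau")) %>%
  group_by(Habitat, Year) %>%
  summarise(mean_Giraffe = mean(Herd_size),
            sd_Giraffe   = sd(Herd_size),
            .groups = "drop")

names(Giraffes_summary)  # should include Habitat and Year
```

If `names()` shows something like `herd_size` or `Herd.size` instead of `Herd_size`, that mismatch would explain an "object 'Herd_size' not found" error.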

It's still not working, idk what the issue is, but thank you, I have your image and that will be OK :slight_smile:

Hi,
So hopefully you will see this after work? The issue I'm having is that I need to add a title etc. to the graph; however, it keeps stating that the year is not recognized, despite my code being the same as yours?
This is my error:
Error in interaction(Habitat, year) : object 'year' not found
Thx
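Note that this error message refers to `year` in lower case, while the code earlier in the thread uses `Year`. R is case-sensitive, so the two are different names; a possible cause is that the capital letter was lost while editing the plot code. Below is a sketch of the bar chart with a title and axis labels added through `labs()`. The summary values and the label text are invented for illustration.

```r
library(ggplot2)

# Hypothetical summary table; the numbers are made up for the example
Giraffes_summary <- data.frame(
  Habitat      = c("Hill_slope", "Hill_slope", "Plateau", "Plateau"),
  Year         = c(2018, 2019, 2018, 2019),
  mean_Giraffe = c(3.5, 4.0, 3.0, 5.2),
  sd_Giraffe   = c(1.0, 1.2, 0.8, 1.5)
)

# The column is called Year (capital Y), so the aes() call must match exactly
p_titled <- ggplot(data = Giraffes_summary,
                   mapping = aes(x = interaction(Habitat, Year),
                                 y = mean_Giraffe, fill = Habitat)) +
  geom_col() +
  geom_errorbar(aes(ymin = mean_Giraffe - sd_Giraffe,
                    ymax = mean_Giraffe + sd_Giraffe), width = 0.3) +
  labs(title = "Mean giraffe herd size by habitat and year",  # placeholder title
       x = "Habitat and year",
       y = "Mean herd size")

p_titled
```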