Bargraph By Date Purchased

I have a dataset that looks very similar to this. I would like to build a graph (where I can choose the variable) that displays how many of each fruit were produced in each year. For example, if I just wanted apples, it would show apples in a bar graph for the years 2016, 2017, 2018 and 2019. This would use the columns Date Gathered and What Shipped; you can disregard the column Date Shipped, which serves another purpose.

So far, what I have brainstormed is to sort the dates chronologically, then search for the string "Apple" and start a count from there. Any help on this would be greatly appreciated.

Welcome to R and the RStudio Community! We're glad you're here. :slightly_smiling_face: You're in luck! The solution you've proposed would be a very reasonable (yet complex) solution in a spreadsheet environment like Excel, but it is trivially easy to do in R, using packages like dplyr and ggplot2.

A couple notes:

  1. I really appreciate your clear problem statement, and the fact that you've put some thought into solving this before posting.
  2. For future reference, it is really helpful if you can post code that generates your sample data. I did it for you in this case because it was pretty straightforward, but you'll have the best chance to get help with your problem if you're not asking people to re-create your problem themselves first. See here for some tips on how to do this.
  3. Finally, your solution!


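# Load the packages used below (year() comes from lubridate)
library(dplyr)
library(ggplot2)
library(lubridate)

# Simulate data: one record per month
dates <- seq.Date(as.Date("2016/1/1"), as.Date("2019/6/1"), by = "month")

df_data <- data.frame(
  date_gathered = dates,
  what_shipped = sample(
    c("apples", "oranges", "grapes"),
    size = length(dates),  # assumed: one randomly chosen item per date
    replace = TRUE
  )
)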

# Enter your desired category here
type <- "apples"

# Mutate the data to get a year column
df_year <- df_data %>% 
  mutate(yr = year(date_gathered)) %>% 
  filter(what_shipped == type)

# Generate the plot: one bar per year, bar height = number of rows
gg_fruit <- ggplot(df_year) +
  geom_bar(aes(yr), stat = "count")

# Display the plot
gg_fruit
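
If you expect to switch between fruits often, the filter-and-plot steps can be wrapped in a small function. The following is only a sketch: the name `plot_item_by_year()` and the stand-in data frame `df_demo` are made up for illustration, and the code assumes your columns are called `date_gathered` and `what_shipped` as in the simulated data. The `count()` call gives the same yearly totals as a table, which is handy for checking the bars.

```r
library(dplyr)
library(ggplot2)
library(lubridate)

# Hypothetical helper (not part of any package): keep one item,
# derive the year, and draw one bar per year.
plot_item_by_year <- function(data, item) {
  data %>%
    filter(what_shipped == item) %>%
    mutate(yr = year(date_gathered)) %>%
    ggplot(aes(x = factor(yr))) +
    geom_bar() +
    labs(x = "Year gathered", y = "Count", title = item)
}

# Stand-in data with the assumed column names
set.seed(42)
dates <- seq.Date(as.Date("2016-01-01"), as.Date("2019-06-01"), by = "month")
df_demo <- data.frame(
  date_gathered = dates,
  what_shipped = sample(c("apples", "oranges", "grapes"),
                        size = length(dates), replace = TRUE)
)

# Yearly counts for every item, as a table
yearly_counts <- df_demo %>%
  mutate(yr = year(date_gathered)) %>%
  count(what_shipped, yr)

# Pick the item to plot by changing the second argument
p <- plot_item_by_year(df_demo, "oranges")
print(p)
```

Using `factor(yr)` on the x-axis is a design choice: it keeps the years as discrete labels rather than a continuous numeric scale.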


One last note! The solution to a question like this, along with loads of other useful stuff, can be found in hadley's fantastic book, R for Data Science.
