# ggplot for time series of many, many lines?

#1

I am trying to make a graph showing the time series progression of rainfall for 1000 or so regions. My current plot looks like this:

As you can see, this is way too many lines to have on a plot. Ideally, I would like to plot the progression of deciles across time, so that we have only 9 lines on the graph.

I'm not 100% sure what the code for this would look like. I need a function that maps the `N` observations in each year to their deciles and is applied across a dataframe that is grouped by year.

Instead of worrying about this hypothetical function, I thought it would be best to first ask the community if there is a `ggplot2` function that does this for me. Most time-series graphs focus on having only one or two lines. Of the responses that deal with many different observations per year, people suggest three things:

1. Do what I do above and set both `size` and `alpha` to be very low. This results in pretty graphs, but it is difficult for me to get a true sense of the distribution of rainfall.
2. Plot a series of box and whisker plots across time. This seems like an inelegant solution, since it can get crowded with many years.
3. A ridge plot, but this also gets awkward with many years of data.

#2

Here is my solution, using `do()` with `group_by()`:

1. Make a function that returns a dataframe with 9 rows (one per decile) and 2 columns.

```r
get_quantiles <- function(x) {
  values <- x$rainfall_per_square_meter
  probs <- seq(.1, .9, .1)
  # One row per decile: the quantile value and the probability it corresponds to
  df <- data.frame(quantile_value = quantile(values, probs), quantile = probs)
  return(df)
}
```
2. Use `group_by()` with `do()` to make an output dataframe that has `get_quantiles` mapped across years and appended.

```r
library(dplyr)  # provides %>%, group_by() and do()

df2 <- df %>%
  group_by(year) %>%
  do(get_quantiles(.))
```

Next steps are to make my `get_quantiles` function more extensible to use different variables and sets of quantiles.
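A minimal sketch of what that more extensible version might look like. The arguments `var` and `probs` are my own naming, and the data frame here is made up to stand in for the real rainfall table, so treat this as one possible shape rather than a finished function.

```r
library(dplyr)

# Hypothetical generalisation of get_quantiles():
#   var   - name of the column to summarise (assumed to be numeric)
#   probs - which quantiles to compute
get_quantiles <- function(x, var, probs = seq(0.1, 0.9, 0.1)) {
  values <- x[[var]]
  data.frame(
    quantile = probs,
    quantile_value = unname(quantile(values, probs, na.rm = TRUE))
  )
}

# Made-up data standing in for the real rainfall table
set.seed(1)
df <- data.frame(
  year = rep(2000:2004, each = 100),
  rainfall_per_square_meter = rexp(500)
)

# Quartiles instead of deciles, computed per year
df2 <- df %>%
  group_by(year) %>%
  do(get_quantiles(., "rainfall_per_square_meter", probs = c(0.25, 0.5, 0.75))) %>%
  ungroup()
```

Passing the column as a string keeps the function free of tidy evaluation, at the cost of not accepting bare column names.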

My new plot now looks like this:
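The plotting code was not posted. A minimal sketch of how such a plot could be drawn, assuming a table with the same columns as `df2` (`year`, `quantile`, `quantile_value`); the data below is made up for illustration.

```r
library(ggplot2)

# Made-up quantile table with the same columns as df2
set.seed(1)
probs <- seq(0.1, 0.9, 0.1)
df2 <- do.call(rbind, lapply(2000:2010, function(y) {
  data.frame(
    year = y,
    quantile = probs,
    quantile_value = unname(quantile(rexp(1000), probs))
  )
}))

# One line per decile; `group` keeps each quantile as its own series
p <- ggplot(df2, aes(x = year, y = quantile_value,
                     group = quantile, colour = factor(quantile))) +
  geom_line() +
  labs(colour = "Quantile", y = "Rainfall per square meter")
p
```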

#3

Here's another option for calculating and plotting quantiles.

```r
library(tidyverse)

# Fake data: 1000 locations, each a random walk over 2000-2018
set.seed(2)
d = replicate(1000, data.frame(year=2000:2018, value=cumsum(rnorm(19))), simplify=FALSE) %>%
  bind_rows(.id="location")

# Summarise to get quantiles by year
prob = seq(0, 1, 0.1)
ds = d %>%
  group_by(year) %>%
  summarise(lab = list(paste0(prob*100, "%")),
            q = list(quantile(value, prob))) %>%
  unnest(cols = c(lab, q))

ggplot(ds, aes(year, q, colour=lab)) +
  geom_line() +
  # Label each line at the final year instead of using a legend
  geom_text(data=ds %>% filter(year==max(year)), aes(label=lab, x=max(ds$year)),
            position=position_nudge(x=0.2), hjust=0, size=3) +
  theme_classic() +
  guides(colour="none") +
  expand_limits(x=2018.6)
```