Advice on possible plots and stats

I am analyzing 10-year data by region. The variable of interest is median (service delay in days).
One possible way of analyzing the data is to generate line plots, which I have already done in the example below.
I am looking for more options and ideas on how to analyze the medians. It could be new plots, stats, or tables. Thanks!

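library(tidyverse)

# Toy data: yearly median service delay (days) for four regions
toy_data <- tibble(
  year = rep(2008:2017, times = 4),
  region = rep(c("A", "B", "C", "D"), each = 10),
  median = c(rnorm(20, mean = 30, sd = 7), rnorm(20, mean = 35, sd = 2))
)

# Line plot: one line per region
toy_data %>% 
  ggplot(aes(x = year, y = median, color = region)) +
  geom_line()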

Created on 2020-11-18 by the reprex package (v0.3.0)

Can you elaborate a bit more about what you hope to learn from the data? I think that will direct what sort of visualization/analysis would be appropriate.

You can try a clustered column chart:

# Clustered column chart: one bar per region within each year
ggplot(toy_data, aes(x = year, y = median)) +
  geom_bar(aes(fill = region), stat = "identity", position = "dodge")

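It might be worthwhile to look at region differences on a map, using a standardized score. Something like (x - \bar{x})/\sigma, where x is a region's median days to diagnosis and \bar{x} and \sigma are the mean and standard deviation of the medians.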

If you are unfamiliar with these sorts of plots, this post might be of interest.
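The standardized score suggested above can be sketched on the toy data from the question. This is only a rough illustration: it assumes the mean and standard deviation are taken across regions within each year, which is one possible reading, and the fixed seed and the names `z_scores` and `region_z` are made up here. Drawing the actual map would also need region boundaries, which are not part of this thread; the `mean_z` column is the kind of value that could be used as the fill color.

```r
library(dplyr)
library(tibble)

set.seed(1)  # assumption: fixed seed so the toy numbers are reproducible
toy_data <- tibble(
  year = rep(2008:2017, times = 4),
  region = rep(c("A", "B", "C", "D"), each = 10),
  median = c(rnorm(20, mean = 30, sd = 7), rnorm(20, mean = 35, sd = 2))
)

# Within each year, standardize each region's median against all regions
z_scores <- toy_data %>%
  group_by(year) %>%
  mutate(z = (median - mean(median)) / sd(median)) %>%
  ungroup()

# Average z-score per region: positive means slower than the typical region
region_z <- z_scores %>%
  group_by(region) %>%
  summarise(mean_z = mean(z), .groups = "drop")

print(region_z)
```

A region with a consistently positive `mean_z` takes longer than its peers in most years, regardless of whether delays overall are rising or falling.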

Thank you for your response.

Here is some more info:
median: days, i.e. how long it takes hospitals to diagnose a disease
region: all regions belong to one particular country
goal: to analyze the delays over time, by region

Let me know if you need more info. Thanks!

Looking for more ideas from you all. Thanks!

Thanks for your feedback. It looks interesting.
I hope others will share their thoughts too.

If it's a time series covering 10 years, you might be able to extract seasonal components and long-term components (e.g. in winter things get worse, but over the years they get better). Here are some examples with additive models.
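As a rough sketch of that idea, base R's `stl()` splits a series into seasonal, trend, and remainder parts that add up to the original values. Note the assumption: the toy data in the question is yearly, so a seasonal component cannot be estimated from it. The monthly series below is simulated purely for illustration, and the names `delay`, `delay_ts`, and `components` are hypothetical.

```r
set.seed(42)  # assumption: simulated data, for illustration only

# Hypothetical monthly median delay for one region, 2008-2017 (120 months)
months <- 1:120
delay <- 35 - 0.05 * months +        # slow long-term improvement
  3 * cos(2 * pi * months / 12) +    # yearly cycle peaking in December
  rnorm(120, sd = 1)                 # noise

delay_ts <- ts(delay, start = c(2008, 1), frequency = 12)

# Additive decomposition: delay = seasonal + trend + remainder
fit <- stl(delay_ts, s.window = "periodic")
components <- fit$time.series

head(components)
plot(fit)
```

With real data you would run this once per region and compare the trend panels, which is where "over the years they get better" would show up.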

You could also look for components that co-vary across regions, and others that are region-specific (e.g. all hospitals get better over time except when COVID hits, but that particular country had a war that degraded hospital response time for a few years). Not sure what the right statistical approach would be, perhaps a simple PCA could be a good start?
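A simple PCA along those lines could look like the sketch below, which treats each year as an observation and each region as a variable. It reuses the toy data from the question with a fixed seed (an assumption added here); since that data is random noise, no meaningful shared component should be expected from it. The names `wide` and `pca` are made up for the example.

```r
library(dplyr)
library(tidyr)
library(tibble)

set.seed(1)  # assumption: fixed seed so the toy numbers are reproducible
toy_data <- tibble(
  year = rep(2008:2017, times = 4),
  region = rep(c("A", "B", "C", "D"), each = 10),
  median = c(rnorm(20, mean = 30, sd = 7), rnorm(20, mean = 35, sd = 2))
)

# Reshape to one row per year and one column per region
wide <- toy_data %>%
  pivot_wider(names_from = region, values_from = median) %>%
  arrange(year)

# PCA on the region columns, centered and scaled to unit variance
pca <- prcomp(select(wide, -year), center = TRUE, scale. = TRUE)

summary(pca)   # share of variance explained by each component
pca$rotation   # loadings: how strongly each region contributes
pca$x[, 1]     # score of the first component for each year
```

If the regions on a first component all load with the same sign, that component behaves like a pattern shared across regions; components dominated by a single region point to something region-specific.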
