Advice on possible plots and stats

I am analyzing 10 years of data by region. The variable of interest is the median service delay in days.
One way to analyze the data is with line plots, which I have already done in the example below.
I am looking for more options and ideas on how to analyze the medians. These could be new plots, stats, or tables. Thanks!

suppressWarnings(suppressMessages(library(tidyverse)))
set.seed(1234)
toy_data <- tibble(
  year = rep(2008:2017, times = 4),
  region = rep(c("A", "B", "C", "D"), each = 10),
  median = c(rnorm(20, mean = 30, sd = 7), rnorm(20, mean = 35, sd = 2))
)

toy_data %>% 
  ggplot(aes(x = year, y = median, color = region)) +
  geom_line()

Created on 2020-11-18 by the reprex package (v0.3.0)

Can you elaborate a bit more about what you hope to learn from the data? I think that will direct what sort of visualization/analysis would be appropriate.

Thank you for your response.

Here is some more info:
median: days, i.e. how long it takes hospitals to diagnose a disease
region: the regions belong to a particular country
goal: to analyze how the delays change over time, by region

Let me know if you need more info. Thanks!

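It might be worthwhile to look at region differences on a map. For example, you could color each region by a standardized score such as (x - \bar{x})/\sigma, where x is the region's median days to diagnosis and \bar{x} and \sigma are the mean and standard deviation of those medians.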

If you are unfamiliar with these sorts of plots, this post might be of interest.
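As a rough illustration of the standardized score suggested above, here is a minimal sketch on the toy data from the original post. It assumes the score is computed within each year across regions, which is only one possible choice. A real map would need region boundaries (e.g. a shapefile), which are not part of this thread, so a heatmap stands in for the map here.

```r
library(tidyverse)

# Toy data copied from the original post
set.seed(1234)
toy_data <- tibble(
  year = rep(2008:2017, times = 4),
  region = rep(c("A", "B", "C", "D"), each = 10),
  median = c(rnorm(20, mean = 30, sd = 7), rnorm(20, mean = 35, sd = 2))
)

# Assumption: standardize within each year, so z shows how far a region
# sits from that year's cross-region mean, in standard deviations
z_data <- toy_data %>%
  group_by(year) %>%
  mutate(z = (median - mean(median)) / sd(median)) %>%
  ungroup()

# Heatmap as a stand-in for a map: one tile per region and year
p_heat <- ggplot(z_data, aes(x = year, y = region, fill = z)) +
  geom_tile() +
  scale_fill_gradient2(low = "blue", mid = "white", high = "red", midpoint = 0) +
  scale_x_continuous(breaks = 2008:2017)
p_heat
```

With real boundaries, the same z column could be joined to the geometry and used as the fill of a choropleth instead.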

Thanks for your feedback. It looks interesting.
I hope others will share their thoughts too.

Looking for more ideas from you all. Thanks!

You can try a clustered column chart:

# Clustered column chart: one bar per region within each year
ggplot(toy_data, aes(x = year, y = median)) +
  geom_bar(aes(fill = region), stat = "identity", position = "dodge") +
  scale_y_continuous(name = "Median")

If it's a time series over 10 years, you might be able to extract seasonal components and long-term components (e.g. in winter things get worse, but over the years they get better). Here are some examples with additive models.
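To make the additive-model idea concrete, here is a minimal sketch using the mgcv package, which is my own choice and not something named in the thread. The toy data has one value per region per year, so no seasonal component can be estimated from it; the sketch only fits a smooth long-term trend per region. The basis size k = 5 is an arbitrary assumption for such a short series.

```r
library(tidyverse)
library(mgcv)

# Toy data copied from the original post
set.seed(1234)
toy_data <- tibble(
  year = rep(2008:2017, times = 4),
  region = rep(c("A", "B", "C", "D"), each = 10),
  median = c(rnorm(20, mean = 30, sd = 7), rnorm(20, mean = 35, sd = 2))
) %>%
  mutate(region = factor(region))  # `by =` smooths need a factor

# One smooth trend in year per region, plus a region-level offset.
# Assumption: k = 5 keeps the smooths modest with only 10 years each.
fit <- gam(median ~ region + s(year, by = region, k = 5),
           data = toy_data, method = "REML")
summary(fit)

# Overlay the fitted trends on the observed medians
toy_data %>%
  mutate(fitted = fitted(fit)) %>%
  ggplot(aes(x = year, color = region)) +
  geom_point(aes(y = median)) +
  geom_line(aes(y = fitted))
```

With monthly or quarterly data, a cyclic smooth over month could be added alongside the year trend to capture the seasonal part.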

You could also look for components that co-vary across regions, and others that are region-specific (e.g. all hospitals get better over time except when COVID hits, but that particular country had a war that degraded hospital response time for a few years). Not sure what the right statistical approach would be, perhaps a simple PCA could be a good start?
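A simple PCA along these lines could look like the sketch below. It assumes years are treated as observations and regions as variables, so that the first component picks up variation shared across regions. With only 10 years this is purely exploratory, and the toy data is random, so no real structure should be expected from it.

```r
library(tidyverse)

# Toy data copied from the original post
set.seed(1234)
toy_data <- tibble(
  year = rep(2008:2017, times = 4),
  region = rep(c("A", "B", "C", "D"), each = 10),
  median = c(rnorm(20, mean = 30, sd = 7), rnorm(20, mean = 35, sd = 2))
)

# Reshape to one row per year and one column per region
wide <- toy_data %>%
  pivot_wider(names_from = region, values_from = median)

# PCA on the region columns, centered and scaled
pca <- prcomp(select(wide, -year), center = TRUE, scale. = TRUE)

summary(pca)   # variance explained by each component
pca$rotation   # loadings: how each region contributes to each component
```

Regions with similar loadings on the first component move together over time; a component dominated by a single region would point to something region-specific.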
