I am analyzing 10 years of data by region. The variable of interest is the median service delay, in days.
One way to analyze the data is with line plots, which I have already done in the example below.
I am looking for more options and ideas on how to analyze the medians. These could be new plots, statistics, or tables. Thanks!
suppressWarnings(suppressMessages(library(tidyverse)))
set.seed(1234)
toy_data <- tibble(
  year = rep(2008:2017, times = 4),
  region = rep(c("A", "B", "C", "D"), each = 10),
  median = c(rnorm(20, mean = 30, sd = 7), rnorm(20, mean = 35, sd = 2))
)
toy_data %>%
  ggplot(aes(x = year, y = median, color = region)) +
  geom_line()
Can you elaborate a bit more about what you hope to learn from the data? I think that will direct what sort of visualization/analysis would be appropriate.
Here is some more information:
median: the number of days it takes hospitals to diagnose a disease
region: regions belonging to a particular country
goal: to analyze how the delays change over time in each region
If it's a time series over 10 years, you might be able to extract seasonal and long-term components (e.g. things get worse in winter, but improve over the years). Note that seasonal effects can only be estimated if the data are more finely resolved than one value per year. Here are some examples with additive models.
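As a rough illustration of the long-term part of that idea, here is a minimal sketch that fits one smooth trend per region to the toy data from the question. It assumes the `mgcv` package for the additive model; the choices `k = 5` and `method = "REML"` are my own guesses for such a short series, not something from the thread. With one value per year there is no seasonal term to fit.

```r
library(mgcv)     # additive models; a recommended package in standard R installs
library(ggplot2)

# Recreate the toy data from the question (region as a factor, needed for `by =`).
set.seed(1234)
toy_data <- data.frame(
  year   = rep(2008:2017, times = 4),
  region = factor(rep(c("A", "B", "C", "D"), each = 10)),
  median = c(rnorm(20, mean = 30, sd = 7), rnorm(20, mean = 35, sd = 2))
)

# Region-specific baseline plus a separate smooth of year for each region.
# k = 5 (an assumption) keeps each smooth small: there are only 10 points per region.
fit <- gam(median ~ region + s(year, by = region, k = 5),
           data = toy_data, method = "REML")

summary(fit)  # per-region smooth terms and their effective degrees of freedom

# Fitted long-term trend for every observed year/region pair.
toy_data$trend <- as.numeric(predict(fit))

# Observed medians as points, fitted trends as lines.
p <- ggplot(toy_data, aes(x = year, colour = region)) +
  geom_point(aes(y = median)) +
  geom_line(aes(y = trend))
p
```

Because the toy medians are pure noise around a constant mean, the fitted trends should come out close to flat; with real data the shape of each region's curve is the thing to look at.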
You could also look for components that co-vary across regions and others that are region-specific (e.g. all hospitals improve over time except when COVID hits, but one particular country had a war that degraded hospital response times for a few years). I'm not sure what the right statistical approach would be; perhaps a simple PCA would be a good start?
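To make the PCA suggestion concrete, here is a small sketch using base R's `prcomp` on the toy data from the question, reshaped so that each row is a year and each column is a region. Treating regions as the variables and standardizing them (`scale. = TRUE`) are assumptions on my part; other setups are possible.

```r
# Recreate the toy data from the question.
set.seed(1234)
toy_data <- data.frame(
  year   = rep(2008:2017, times = 4),
  region = rep(c("A", "B", "C", "D"), each = 10),
  median = c(rnorm(20, mean = 30, sd = 7), rnorm(20, mean = 35, sd = 2))
)

# Year-by-region matrix: one row per year, one column per region.
# Each cell holds a single value, so mean() just passes it through.
wide <- tapply(toy_data$median, list(toy_data$year, toy_data$region), mean)

# PCA with regions as variables, centred and scaled to unit variance.
pca <- prcomp(wide, center = TRUE, scale. = TRUE)

summary(pca)    # proportion of variance explained by each component
pca$rotation    # loadings: how strongly each region weights on each component
pca$x[, 1]      # PC1 scores: one value per year
```

If several regions load on the first component with the same sign, its scores can be read as a time course they share, while components dominated by a single region point to region-specific behaviour. The toy regions are generated independently, so no meaningful shared component should be expected here, and with only 10 years per region any result on real data deserves caution.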