Pivot_wider to include two names_from= to add characteristics to ggplot

Hi,

I want to add another binary variable, "CMM_binary", alongside "MM_binary" so that the plot shows two lines. I can plot the points but want to draw lines between them; adding geom_path() after geom_point() produces an error.

alt_summary <- trimmed_df %>%
  group_by(CarsQuintile_group, MM_binary) %>%
  summarise(n = n()) %>%
  pivot_wider(id_cols = "CarsQuintile_group",
              names_from = "MM_binary",
              values_from = "n") %>%
  # pivot_wider() names the new columns `0` and `1`, so they need backticks
  mutate(frac = `1` / (`1` + `0`),
         labels = scales::percent(frac))

alt_summary

alt_summary %>%
  ggplot(aes(x = CarsQuintile_group, y = frac)) +
  geom_point() +
  scale_y_continuous(limits = c(0, 1),
                     breaks = scales::pretty_breaks(),
                     labels = scales::percent_format(scale = 100)) +
  theme(legend.position = "top") +
  labs(title = "Prevalence of MM_binary",
       y = "% of population",
       x = "SIMD", color = "grey20", size = 20, angle = 90) -> figure_1

figure_1

Any ideas?

Thanks

Hello.
Thanks for providing code, but you could take further steps to make it more convenient for other forum users to help you.

Share some representative data that will enable your code to run and show the problematic behaviour.

You might use tools such as the datapasta package, or the base function dput(), to share a portion of your data in code form, i.e. in a form that can be copied from the forum and pasted into an R session.
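As a rough sketch of the dput() route: the built-in mtcars data set stands in for your own data frame here (an assumption, since I can't see your data), and example_df and rebuilt_df are just illustrative names.

```r
# Stand-in data: mtcars ships with R; swap in your own data frame instead.
example_df <- head(mtcars[, c("mpg", "cyl", "gear")], 5)

# dput() prints R code that recreates the object; paste that output into your post.
dput(example_df)

# Round trip: evaluating the deparsed text rebuilds the data frame.
rebuilt_df <- eval(parse(text = paste(deparse(example_df), collapse = "\n")))
all.equal(example_df, rebuilt_df)
```

Keeping only a handful of rows and the columns that matter usually makes the post easier to read.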

Hello, thank you for the advice. Here is a reprex of my df. Each row is a person, and each condition variable records whether the person has that condition or not (1/0). I want to plot age or age_group on x and the prevalence of several diseases along y, one line per disease. I think I need to reshape the df, but I can't work out how to have one variable holding all diseases for each person.

Thanks

conditions_df <- tibble::tribble(
~ID, ~Age, ~Sex, ~CarsQuintile, ~age_group, ~CarsQuintile_group, ~Diabetes, ~Asthma, ~Stroke, ~Heart.attack, ~COPD, ~Hypertension, ~Eczema, ~Depression,
1L, 18L, 1L, 2L, "18 - 24", 3L, 1L, 1L, 0L, 1L, 0L, 0L, 0L, 0L,
2L, 77L, 1L, 1L, "65 - 74", 1L, 0L, 1L, 1L, 1L, 0L, 0L, 1L, 0L,
3L, 25L, 1L, 3L, "25 - 34", 4L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L,
4L, 30L, 1L, 1L, "25 - 34", 3L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L,
5L, 54L, 1L, 1L, "55 - 64", 1L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 0L,
6L, 78L, 1L, 5L, "75 - 84", 5L, 1L, 1L, 0L, 1L, 0L, 0L, 0L, 0L,
7L, 69L, 1L, 1L, "65 - 74", 1L, 1L, 1L, 0L, 1L, 0L, 1L, 0L, 0L,
8L, 62L, 1L, 1L, "55 - 64", 2L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L,
9L, 68L, 1L, 5L, "55 - 64", 1L, 1L, 1L, 0L, 1L, 0L, 0L, 1L, 0L,
10L, 63L, 1L, 1L, "55 - 64", 3L, 1L, 0L, 0L, 1L, 0L, 0L, 0L, 0L
)

I didn't understand your comment about CMM_binary, and your example didn't include anything called MM_binary, so I couldn't infer your intentions beyond your follow-up post, which said you wanted to plot the proportion of disease cases by age. I think this delivers that:


library(tidyverse)
conditions_df <- tribble(
  ~ID, ~Age, ~Sex, ~CarsQuintile, ~age_group, ~CarsQuintile_group, ~Diabetes, ~Asthma, ~Stroke, ~Heart.attack, ~COPD, ~Hypertension, ~Eczema, ~Depression,
  1L, 18L, 1L, 2L, "18 - 24", 3L, 1L, 1L, 0L, 1L, 0L, 0L, 0L, 0L,
  2L, 77L, 1L, 1L, "65 - 74", 1L, 0L, 1L, 1L, 1L, 0L, 0L, 1L, 0L,
  3L, 25L, 1L, 3L, "25 - 34", 4L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L,
  4L, 30L, 1L, 1L, "25 - 34", 3L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L,
  5L, 54L, 1L, 1L, "55 - 64", 1L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 0L,
  6L, 78L, 1L, 5L, "75 - 84", 5L, 1L, 1L, 0L, 1L, 0L, 0L, 0L, 0L,
  7L, 69L, 1L, 1L, "65 - 74", 1L, 1L, 1L, 0L, 1L, 0L, 1L, 0L, 0L,
  8L, 62L, 1L, 1L, "55 - 64", 2L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L,
  9L, 68L, 1L, 5L, "55 - 64", 1L, 1L, 1L, 0L, 1L, 0L, 0L, 1L, 0L,
  10L, 63L, 1L, 1L, "55 - 64", 3L, 1L, 0L, 0L, 1L, 0L, 0L, 0L, 0L
)

# add more here as needed
dis_vec <- c("Diabetes", "Asthma", "Stroke")


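# proportion of people in each age group who have each condition
(smry_cond <- conditions_df %>%
  group_by(age_group) %>%
  summarise(across(
    .cols = all_of(dis_vec),  # all_of() selects columns named in a character vector
    .fns  = ~ sum(. == 1) / n()
  )))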

(smry_long <- pivot_longer(smry_cond,
                            cols = -age_group
                          ) %>%
                            mutate(age_group = 
                                     factor(age_group)))

ggplot(data = smry_long) +
  aes(x = age_group, y = value,
      group = name, color = name) +
  geom_line(linewidth = 2, alpha = .7)  # linewidth replaces size for lines in ggplot2 >= 3.4.0
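
On your question about having one variable for all diseases per person: a possible alternative, sketched below, is to reshape the person-level data to long form first and summarise afterwards. The data is a cut-down copy of your reprex (ID, age_group and three diseases only), and the names people, long_df, disease, has_condition and prevalence are ones I made up for illustration.

```r
library(tidyverse)

# Cut-down copy of the reprex: one row per person, three 0/1 condition columns
people <- tribble(
  ~ID, ~age_group, ~Diabetes, ~Asthma, ~Stroke,
  1L, "18 - 24", 1L, 1L, 0L,
  2L, "65 - 74", 0L, 1L, 1L,
  3L, "25 - 34", 0L, 0L, 0L,
  4L, "25 - 34", 0L, 0L, 0L,
  5L, "55 - 64", 0L, 0L, 0L,
  6L, "75 - 84", 1L, 1L, 0L,
  7L, "65 - 74", 1L, 1L, 0L,
  8L, "55 - 64", 0L, 0L, 0L,
  9L, "55 - 64", 1L, 1L, 0L,
  10L, "55 - 64", 1L, 0L, 0L
)

# One row per person per disease: `disease` holds the name, `has_condition` the 0/1 flag
long_df <- people %>%
  pivot_longer(cols = c(Diabetes, Asthma, Stroke),
               names_to = "disease",
               values_to = "has_condition")

# The mean of a 0/1 flag is the proportion of people with the condition
prevalence <- long_df %>%
  group_by(age_group, disease) %>%
  summarise(prevalence = mean(has_condition), .groups = "drop")

ggplot(prevalence,
       aes(x = age_group, y = prevalence, group = disease, colour = disease)) +
  geom_line() +
  geom_point() +
  scale_y_continuous(labels = scales::percent)
```

Both routes should give the same proportions; the long form just keeps a person-level table with a single disease column, which can be handy for further grouping.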

This is brilliant, thank you so much! This is exactly what I was trying to achieve. Apologies about the MM_binary comment; in the reprex I changed the variable name to make it easier to build an example. I really appreciate your help.