Finding the Sum and Count to make a percentage for groups

Hello All,

I am in need of your help.

I cannot seem to find a direct scenario for what I am facing.

I need to compute viral suppression rates as a percentage (the sum divided by the count), grouped by gender with group_by().

In this scenario, I have 3 columns: Gender, last quantitative lab value, and viral suppression numeric.

To explain further, the viral suppression numeric column was created with case_when(), based on the last quantitative lab value:

Demographic.Data <- Demographic.Data %>%
  mutate(Viral_Suppression_Numeric = case_when(
    Last.Quantitative.Lab.Value < 199 ~ "0",
    Last.Quantitative.Lab.Value > 200 ~ "1",
    T ~ NA_character_
  ))

That is how I have 3 columns.

The end result I would like is something along these lines, but by the viral suppression percentage and gender:

Male 34%
Female 57%
Transgender MTF 87%

I know that to do this I would take the sum of the viral suppression column, divide it by the count of that column to get the percentage, and group by gender. However, I keep getting errors when I try.

One error I get is: x invalid 'type' (character) of argument

Any help is greatly appreciated.

Below you will find a sample data set. I don't know how to make a reproducible table yet. I apologize in advance, as that is something I am still learning.

gender           last_quantitative_lab_value  viral_suppression_numeric
Male             20                           0
Male             20                           0
Male             20                           0
Male             22                           0
Male             20                           0
Female           20                           0
Male             20                           0
Male             20                           0
Transgender MtF  1490                         1
Female           1000                         1
Male             20                           0
Male             205                          1
Male             20                           0
Male             22                           0
Male             208                          1
Female           20                           0
Female           201                          1
Female           207                          1
Transgender MtF  20                           0
Female           1000                         1

Thank you,

-Cody

With your data:

library(tidyverse)
df <- tibble::tribble(
              ~gender, ~last_quantitative_lab_value, ~viral_suppression_numeric,
               "Male",                          20L,                         0L,
               "Male",                          20L,                         0L,
               "Male",                          20L,                         0L,
               "Male",                          22L,                         0L,
               "Male",                          20L,                         0L,
             "Female",                          20L,                         0L,
               "Male",                          20L,                         0L,
               "Male",                          20L,                         0L,
        "Transgender",                        1490L,                         1L,
             "Female",                        1000L,                         1L,
               "Male",                          20L,                         0L,
               "Male",                         205L,                         1L,
               "Male",                          20L,                         0L,
               "Male",                          22L,                         0L,
               "Male",                         208L,                         1L,
             "Female",                          20L,                         0L,
             "Female",                         201L,                         1L,
             "Female",                         207L,                         1L,
        "Transgender",                          20L,                         0L,
             "Female",                        1000L,                         1L
        )

df %>% 
  group_by(gender) %>% 
  count(viral_suppression_numeric) %>% 
  mutate(prop = n/sum(n))

# A tibble: 6 × 4
# Groups:   gender [3]
  gender      viral_suppression_numeric     n  prop
  <chr>                           <int> <int> <dbl>
1 Female                              0     2 0.333
2 Female                              1     4 0.667
3 Male                                0    10 0.833
4 Male                                1     2 0.167
5 Transgender                         0     1 0.5  
6 Transgender                         1     1 0.5  
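
If only one percentage per gender is wanted, as in the desired output in the question, one possible shortcut is to take the mean of the 0/1 column, since the mean of a 0/1 column equals its sum divided by its count. The sketch below also assumes the error quoted in the question comes from calling sum() on a character column, because the case_when() in the question returns "0" and "1" as strings. The names df_small, n_nonmissing, n_flagged and pct_flagged are made up for illustration, and the cutoff of 200 is an assumption: the original conditions (< 199 and > 200) leave the values 199 and 200 as NA.

```r
library(dplyr)

# Small illustrative subset of the sample data (df_small is a hypothetical name)
df_small <- tibble::tribble(
  ~gender,       ~last_quantitative_lab_value,
  "Male",        20,
  "Male",        205,
  "Female",      20,
  "Female",      1000,
  "Female",      201,
  "Transgender", 1490,
  "Transgender", 20
)

pct_by_gender <- df_small %>%
  # Return numbers rather than "0"/"1" strings so sum() and mean() work.
  # Assumption: 200 is the cutoff, so no lab value falls through to NA.
  mutate(viral_suppression_numeric = case_when(
    last_quantitative_lab_value < 200  ~ 0,
    last_quantitative_lab_value >= 200 ~ 1,
    TRUE ~ NA_real_
  )) %>%
  group_by(gender) %>%
  summarise(
    # count of non-missing flags (the denominator)
    n_nonmissing = sum(!is.na(viral_suppression_numeric)),
    # number of rows flagged 1 (the numerator)
    n_flagged = sum(viral_suppression_numeric, na.rm = TRUE),
    # mean of a 0/1 column = n_flagged / n_nonmissing
    pct_flagged = 100 * mean(viral_suppression_numeric, na.rm = TRUE)
  )

pct_by_gender
```

Run on the full 20-row data instead of this subset, pct_flagged should match the prop values shown above for the rows where viral_suppression_numeric is 1, multiplied by 100.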

For making it reproducible next time:
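
One common way to share data in a copy-pasteable form, not necessarily the approach originally pointed to here, is base R's dput(), which prints R code that recreates an object. A minimal sketch, with sample_df as a hypothetical stand-in for your own data frame:

```r
# sample_df is a hypothetical stand-in for the data you want to share
sample_df <- data.frame(
  gender = c("Male", "Female", "Transgender"),
  last_quantitative_lab_value = c(20, 1000, 1490)
)

# Prints R code that rebuilds the object; paste that output into your post.
# head() keeps the post short when the real data set is large.
dput(head(sample_df, 20))

# Round trip: capture the printed code and evaluate it to check that it
# recreates an equivalent data frame
rebuilt <- eval(parse(text = capture.output(dput(sample_df))))
all.equal(rebuilt, sample_df)
```

The tribble() call earlier in this reply is another such copy-pasteable form.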
