Comparing rates of binary test results by region

Hi, community! I have a dataset where my main goal is to compare rates of positive test results between regions. I've already calculated the positive rate for each region, but I'd like to compare regions in a way that accounts for differences in sample size. Here's what my dataset looks like:

Species   ID#          Site   Status
KAEL      1821-21283   HPK    positive
KAEL      1821-21284   HPK    negative
JAWE      1821-21285   HPK    negative
KAAM      1821-21286   UUK    negative

And so on and so forth. How should I proceed with comparing the frequency of "positive" by region?

Hello. I'm not sure which variable refers to region in your sample data, but here's the general tidyverse way of generating frequency tables. In the example below I count Status by Site and keep only the positive cases.

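library(dplyr, warn.conflicts = FALSE)

# Small stand-in for the posted data (Site and Status columns only)
df <- tibble(Site = c("HPK", "HPK", "HPK", "UUK"),
             Status = c("positive", "negative", "negative", "negative"))

# Count rows for each Site/Status combination, then keep the positives
df %>%
  count(Site, Status) %>%
  filter(Status == "positive")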
#> # A tibble: 1 x 3
#>   Site  Status       n
#>   <chr> <chr>    <int>
#> 1 HPK   positive     1

Created on 2020-09-03 by the reprex package (v0.3.0)
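
If the goal is to compare the rates themselves while accounting for how many animals were tested at each site, one possible next step is to summarise tested and positive counts per site and pass them to a test of equal proportions. The sketch below is only an illustration: it assumes Site is the region variable, and the counts in it are made up, not taken from the posted data.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical data: 40 tests at HPK (10 positive), 60 at UUK (6 positive).
# These numbers are invented for illustration only.
df <- tibble(
  Site   = rep(c("HPK", "UUK"), times = c(40, 60)),
  Status = c(rep(c("positive", "negative"), times = c(10, 30)),
             rep(c("positive", "negative"), times = c(6, 54)))
)

# One row per site: number tested, number positive, and the positive rate
rates <- df %>%
  group_by(Site) %>%
  summarise(n_tested   = n(),
            n_positive = sum(Status == "positive"),
            rate       = n_positive / n_tested)

rates

# Test whether the positive proportion is the same across sites;
# the per-site sample sizes enter through the n argument
test <- prop.test(x = rates$n_positive, n = rates$n_tested)
test
```

With very small counts per site, prop.test() warns that its chi-squared approximation may be unreliable. In that case fisher.test(table(df$Site, df$Status)) is a commonly used alternative.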
