Hi All,

I am learning about probability, the chi-square test, and Fisher's exact test. My question is as follows:

Knowing that 50 people out of 300 have a disease, that 100 of those 300 are women, and assuming that females and males have the same probability of having the disease:

1. What is the probability that those 50 people would be distributed this unevenly between females and males (10 women and 40 men, as in the table in my code below)?

2. If we selected 50 people at random, what is the probability that 40 or more of them would be in the group of 200 men and only 10 or fewer in the group of 100 women?
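A rough sketch of how question 2 might be computed directly: if 50 people are drawn at random without replacement from 100 women and 200 men, the number of women among them should follow a hypergeometric distribution, so base R's `phyper` and `dhyper` can give the tail probability. The variable names here are just my own labels, and this assumes every person is equally likely to be selected.

```r
# Population: 100 women and 200 men; 50 people drawn without replacement
n_women <- 100
n_men   <- 200
n_drawn <- 50

# P(10 or fewer women among the 50), which is the same event as 40 or more men
p_le_10 <- phyper(10, m = n_women, n = n_men, k = n_drawn)

# Same probability, summed term by term from the hypergeometric pmf
p_sum <- sum(dhyper(0:10, m = n_women, n = n_men, k = n_drawn))

# Number of women we would expect among the 50 if sex made no difference
expected_women <- n_drawn * n_women / (n_women + n_men)

print(c(p_le_10 = p_le_10, p_sum = p_sum, expected_women = expected_women))
```

If this reasoning is right, the result should match the p-value of a one-sided Fisher exact test (`alternative = "less"`) on the 2x2 table.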

Any help would be highly appreciated, thank you. I just want to add that this is not homework, just self-learning and curiosity.

What I have done so far:

```
library(Publish)
#> Warning: package 'Publish' was built under R version 4.0.5
#> Loading required package: prodlim
#> Warning: package 'prodlim' was built under R version 4.0.5
df <- data.frame(Female = c(10,90), Male = c(40,160))
rownames(df) <- c("Diseased","Not_Diseased")
df
#>              Female Male
#> Diseased         10   40
#> Not_Diseased     90  160
table2x2(df)
#> _____________________________
#>
#> 2x2 contingency table
#> _____________________________
#>
#>              Female Male Sum
#> Diseased         10   40  50
#> Not_Diseased     90  160 250
#> --               --   --  --
#> Sum             100  200 300
#>
#> _____________________________
#>
#> Statistics
#> _____________________________
#>
#>
#> a= 10
#> b= 40
#> c= 90
#> d= 160
#>
#> p1=a/(a+b)= 0.2
#> p2=c/(c+d)= 0.36
#>
#> _____________________________
#>
#> Risk difference
#> _____________________________
#>
#> Risk difference = RD = p1-p2 = -0.1600
#> Standard error = SE.RD = sqrt(p1*(1-p1)/(a+b)+p2*(1-p2)/(c+d)) = 0.0642
#> Lower 95%-confidence limit: = RD - 1.96 * SE.RD = -0.2858
#> Upper 95%-confidence limit: = RD + 1.96 * SE.RD = -0.03417
#>
#> The estimated risk difference is -16.0% (CI_95%: [-28.6;-3.4]).
#>
#> _____________________________
#>
#> Risk ratio
#> _____________________________
#>
#> Risk ratio = RR = p1/p2 = 0.5556
#> Standard error = SE.RR = sqrt((1-p1)/a+(1-p2)/c)= 0.5556
#> Lower 95%-confidence limit: = RR * exp(- 1.96 * SE.RR) = 0.3115
#> Upper 95%-confidence limit: = RR * exp(1.96 * SE.RR) = 0.9907
#>
#> The estimated risk ratio is 0.556 (CI_95%: [0.312;0.991]).
#>
#> _____________________________
#>
#> Odds ratio
#> _____________________________
#>
#> Odds ratio = OR = (p1/(1-p1))/(p2/(1-p2)) = 0.4444
#> Standard error = SE.OR = sqrt((1/a+1/b+1/c+1/d)) = 0.3773
#> Lower 95%-confidence limit: = OR * exp(- 1.96 * SE.OR) = 0.2122
#> Upper 95%-confidence limit: = OR * exp(1.96 * SE.OR) = 0.9311
#>
#> The estimated odds ratio is 0.444 (CI_95%: [0.212;0.931]).
#>
#> _____________________________
#>
#> Chi-square test
#> _____________________________
#>
#>
#> Pearson's Chi-squared test with Yates' continuity correction
#>
#> data: table2x2
#> X-squared = 4.107, df = 1, p-value = 0.04271
#>
#>
#> _____________________________
#>
#> Fisher's exact test
#> _____________________________
#>
#>
#> Fisher's Exact Test for Count Data
#>
#> data: table2x2
#> p-value = 0.03242
#> alternative hypothesis: true odds ratio is not equal to 1
#> 95 percent confidence interval:
#> 0.1893543 0.9610353
#> sample estimates:
#> odds ratio
#> 0.4455355
```

Created on 2021-04-11 by the reprex package (v2.0.0)
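As a cross-check that does not need the Publish package, here is a minimal sketch using only base R's `fisher.test`. I am assuming that "so unevenly distributed" in question 1 corresponds to the two-sided test (any table at least as unlikely as the observed one), and that question 2 corresponds to the one-sided test with `alternative = "less"`. The names `tab`, `probs`, and `p_manual` are my own.

```r
# Same 2x2 table as in the reprex, built as a base R matrix
tab <- matrix(c(10, 90, 40, 160), nrow = 2,
              dimnames = list(c("Diseased", "Not_Diseased"),
                              c("Female", "Male")))

# Two-sided test: all tables at least as unlikely as the observed one
p_two_sided <- fisher.test(tab)$p.value

# One-sided test: 10 or fewer diseased women (odds ratio below 1)
p_one_sided <- fisher.test(tab, alternative = "less")$p.value

# Hand check of the two-sided value: add up the hypergeometric
# probabilities that do not exceed the probability of the observed table
# (the small factor guards against floating-point ties)
probs    <- dhyper(0:50, m = 100, n = 200, k = 50)
p_manual <- sum(probs[probs <= dhyper(10, 100, 200, 50) * (1 + 1e-7)])

print(c(two_sided = p_two_sided, one_sided = p_one_sided, manual = p_manual))
```

The two-sided value should agree with the Fisher p-value of 0.03242 reported in the output, and the one-sided value can be no larger than that.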

I am having a bit of difficulty telling whether I am going in the right direction. Am I?