Can the mean() function be used to compute the probability given by a cumulative distribution function?

I was working on my assignment and ran into something I don't understand.
I wrote this code for question #1:
x <- heights$height[heights$sex=="Male"]

The next question reads:
"We will define a function "CDF" as follows:
CDF <- function(a) {mean(x<=a)}
Explain why the CDF function is a cumulative distribution function."

I understand the idea of a cumulative distribution function, but I don't see why mean() is used there.
For example, CDF(70) returns 0.623..., which is the cumulative probability at 70. How does mean() produce a probability in this function?

Hi, I think the key here is to work through it step by step.

What will happen when you enter x<=a? You will get a logical vector of TRUE/FALSE values.

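Then you take the mean of it. Since TRUE is treated as 1 and FALSE as 0, the mean of that vector is the share of TRUE values, i.e. the proportion of values in x that are less than or equal to a. That proportion is the empirical estimate of P(X <= a), which is exactly what a cumulative distribution function returns at a.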

set.seed(12345)

x <- rnorm(20)

# Let's say we want the probability that x <= a, where a = 0
x
#>  [1]  0.5855288  0.7094660 -0.1093033 -0.4534972  0.6058875 -1.8179560
#>  [7]  0.6300986 -0.2761841 -0.2841597 -0.9193220 -0.1162478  1.8173120
#> [13]  0.3706279  0.5202165 -0.7505320  0.8168998 -0.8863575 -0.3315776
#> [19]  1.1207127  0.2987237
x<=0
#>  [1] FALSE FALSE  TRUE  TRUE FALSE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE FALSE
#> [13] FALSE FALSE  TRUE FALSE  TRUE  TRUE FALSE FALSE
mean(x<=0) # this gives a proportion because it is the mean of 0s and 1s
#> [1] 0.5

Created on 2021-10-12 by the reprex package (v2.0.1)
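
As a cross-check, here is a minimal sketch that compares the assignment's CDF function with two other ways of getting the same number: counting the TRUE values by hand, and base R's ecdf(), which builds the empirical cumulative distribution function for a sample. It reuses the simulated x from above rather than the heights data, and the names manual and Fn are only for illustration.

```r
set.seed(12345)
x <- rnorm(20)

# The function from the assignment: proportion of values in x that are <= a
CDF <- function(a) mean(x <= a)

# Same proportion by hand: count the TRUE values, divide by the sample size
manual <- sum(x <= 0) / length(x)

# ecdf() from the stats package returns the empirical CDF as a function
Fn <- ecdf(x)

CDF(0)
manual
Fn(0)

# All three should agree
all.equal(CDF(0), manual)
all.equal(CDF(0), Fn(0))
```

One difference to keep in mind: Fn accepts a vector of values, while CDF as written handles a single a at a time, so something like sapply(c(-1, 0, 1), CDF) would be needed to evaluate it at several points.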
