# (new to R) How can I automate calculating the relative frequency of values in a vector that are greater than each of a sequence of values?

Hi

I apologize in advance for being a complete newbie at R.

I have a data frame from a Monte Carlo simulation with 1 million rows and 8 columns (values range from 0 to over 100 million).

I'm trying to find a way to get the relative frequency of values in each column that are greater than certain thresholds.

With the function below I can specify a column and a value. For example, I can get the relative frequency of observations in "column1" with a value greater than 10,000, which is 10.6%.

Relative.percent <- function(x, n){ 100*length((which(x > n))) / length(x) }
For example:
Relative.percent(Results$Event1, 10000)
10.6

However, it would take me weeks to write this out for every value I want, as I'm trying to get the relative frequency of all values greater than 1, 10,000, 20,000, 30,000 ... 100,000,000, so that I can draw a detailed graph, which would probably look similar to an S-curve.

My ideal output would be:

| Value | Event1 | Event2 | Event3 | ... |
|---|---|---|---|---|
| 1 | 0.40 | 0.36 | 0.76 | |
| 10,000 | 0.38 | 0.32 | 0.55 | |
| 20,000 | 0.27 | 0.19 | 0.48 | |
| ... | | | | |

Each Event column shows the relative frequency of values that are over 1, over 10,000, over 20,000, etc.

Maybe this will help? https://stat.ethz.ch/pipermail/r-help/2012-July/319703.html

Thank you, it helped a lot.

Does this help?

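``````
set.seed(seed = 44552)

# dimensions and distribution parameters of the simulated data
no_rows <- 20
no_columns <- 5

mean_sim <- 50
sd_sim <- 10

# one column of normal draws per replicate, named A to E
fake_data <- replicate(n = no_columns,
                       expr = rnorm(n = no_rows,
                                    mean = mean_sim,
                                    sd = sd_sim))
colnames(x = fake_data) <- LETTERS[seq_len(length.out = no_columns)]

# thresholds to compare against
values_sim <- seq(from = 30,
                  to = 70,
                  by = 5)

# percentage of values in one column that exceed one threshold
get_relative_frequency <- function(column_name, value)
{
  100 * mean(x = (fake_data[, column_name] > value))
}

# rows are columns A to E, columns are the thresholds 30, 35, ..., 70
outer(X = colnames(x = fake_data),
      Y = values_sim,
      FUN = Vectorize(FUN = get_relative_frequency))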
#>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
#> [1,]  100   90   70   60   40   35   20    5    5
#> [2,]  100  100  100   85   70   20   15   10    5
#> [3,]   95   95   85   70   45   20   20    0    0
#> [4,]  100   90   90   70   45   15   10    5    5
#> [5,]  100  100  100   75   60   35   20   10    0
``````

Created on 2019-11-12 by the reprex package (v0.3.0)
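
For a data frame shaped like the one in the question, here is a minimal sketch that builds the table with one row per threshold and one column per event. The `Results` data below is a made-up stand-in (the exponential draws and the shortened threshold range are assumptions for illustration only), and `percent_greater` is a hypothetical helper name. It sorts each column once and counts with `findInterval()`, which may be noticeably faster than rescanning 1 million rows for every threshold.

```r
# Hypothetical stand-in for the real simulation output; the column names
# follow the question, the exponential distribution is an assumption.
set.seed(1)
Results <- data.frame(
  Event1 = rexp(n = 1000, rate = 1 / 20000),
  Event2 = rexp(n = 1000, rate = 1 / 15000),
  Event3 = rexp(n = 1000, rate = 1 / 40000)
)

# Thresholds: 1, then 10,000, 20,000, ... (shortened here to 100,000;
# use to = 100000000 for the full range from the question).
thresholds <- c(1, seq(from = 10000, to = 100000, by = 10000))

# Percentage of values in x strictly greater than each threshold.
# findInterval() returns, for each threshold, how many sorted values
# are less than or equal to it. Assumes x contains no NA values.
percent_greater <- function(x, thresholds) {
  sorted_x <- sort(x)
  n_at_or_below <- findInterval(thresholds, sorted_x)
  100 * (length(sorted_x) - n_at_or_below) / length(sorted_x)
}

# sapply() applies the helper to every column and returns a matrix
# with one column per event; data.frame() keeps those column names.
freq_table <- data.frame(
  Value = thresholds,
  sapply(Results, percent_greater, thresholds = thresholds)
)

head(freq_table)
```

Each entry should match what `Relative.percent()` returns for the same column and threshold. To draw the curves, something like `matplot(freq_table$Value, as.matrix(freq_table[, -1]), type = "l")` should work.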