Calculation of mean and standard deviation on variable range

My goal is to calculate a value given by the mean plus 3 standard deviations. The mean and standard deviation must be calculated over an increasing range:

  • the first value over the first 10 data points
  • the second over the first 20

and so on, as in the code below:
db_data <- iris
average1 <- mean(iris[1:10,2])
dev_std1 <- sd(iris[1:10,2])
value1 <- average1 + 3* dev_std1

average2 <- mean(iris[1:20,2])
dev_std2 <- sd(iris[1:20,2])
value2 <- average2 + 3* dev_std2

average3 <- mean(iris[1:30,2])
dev_std3 <- sd(iris[1:30,2])
value3 <- average3 + 3* dev_std3

I need these values to compare against a column in my dataset. For example, with the iris dataset I want to check when the value in the second column (Sepal.Width) is greater than my calculated value. I tried writing this function, but I can't work out how to get the different values into it.

controll <- function(db_data){
  average <- mean(iris[1:10,2])
  dev_std <- sd(iris[1:10,2])
  value <- average + 3* dev_std
  if (db_data$Sepal.Width > value) {
    db_data$controll <- 0
  }else{
    db_data$controll <- 1
  }
}

controll(db_data)

Is there a better approach? Is there already a function in R that does this?
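
A likely reason the function above does not behave as intended: `if` expects a single TRUE/FALSE, while `db_data$Sepal.Width > value` is a vector with one element per row (R 4.2.0 and later raise an error here; earlier versions used only the first element and gave a warning). The function also never returns the modified data frame. Below is a minimal sketch of a vectorised version, assuming the same 0/1 coding as the original. The name `flag_outliers` and its arguments are hypothetical, chosen only for illustration.

```r
# Hypothetical helper: flags rows against mean + 3 * sd of the first n rows.
flag_outliers <- function(data, column = "Sepal.Width", n = 10) {
  reference <- data[[column]][seq_len(n)]
  threshold <- mean(reference) + 3 * sd(reference)
  # ifelse() is vectorised, so every row is compared with the threshold.
  # Same coding as the original function: 0 when above, 1 otherwise.
  data$controll <- ifelse(data[[column]] > threshold, 0, 1)
  data  # return the modified data frame so the caller can keep it
}

db_data <- flag_outliers(iris)
table(db_data$controll)
```

The result has to be assigned back (`db_data <- ...`), because R functions work on a copy of their arguments.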


some_avg_plus_3sd <- function(dataset,positions,colnum){
  
  average <- mean(dataset[positions,colnum])
  dev_std <- sd(dataset[positions,colnum])
  average + 3* dev_std
}

(v1 <- some_avg_plus_3sd(iris,1:10,2))
(v2 <- some_avg_plus_3sd(iris,1:20,2))
(v3 <- some_avg_plus_3sd(iris,1:30,2))


library(dplyr)

mutate(iris,
       test1 = Sepal.Width > v1,
       test2 = Sepal.Width > v2,
       test3 = Sepal.Width > v3)

OK, this works! But what if I have many ranges? For example, the iris dataset has 150 rows, so in this case I would have 150 ranges. Writing them all out by hand seems impractical; isn't there a function for this?


some_avg_plus_3sd <- function(dataset,positions,colnum){
  
  average <- mean(dataset[positions,colnum])
  dev_std <- sd(dataset[positions,colnum])
  average + 3* dev_std
}


wrapfunc <- function(x,n){
  x > some_avg_plus_3sd(iris,1:n,2)
}
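library(purrr)
# Build one function per range end (10, 20, ..., 150);
# each one compares x with the threshold of that range
functionlist <- map(10*(1:15),
                    ~eval(parse(text=paste0("function(x)wrapfunc(x,",.x,")"))))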

library(dplyr)

(results <- mutate(iris,
       across(Sepal.Width,
              functionlist)))
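
As an alternative to building the functions with `eval(parse(...))`, the same comparison can probably be done by computing the thresholds first and then comparing the column against each one. This is only a sketch in base R under the same assumptions as above (the iris data, column `Sepal.Width`, ranges ending at 10, 20, ..., 150). The names `avg_plus_3sd`, `range_ends`, `thresholds` and the `test_*` column names are hypothetical.

```r
# Threshold for one range: mean + 3 * sd over the first n values.
avg_plus_3sd <- function(values, n) {
  head_values <- values[seq_len(n)]
  mean(head_values) + 3 * sd(head_values)
}

range_ends <- seq(10, nrow(iris), by = 10)  # 10, 20, ..., 150

# One threshold per range, named after the range end.
thresholds <- vapply(range_ends,
                     function(n) avg_plus_3sd(iris$Sepal.Width, n),
                     numeric(1))
names(thresholds) <- paste0("test_", range_ends)

# One logical column per threshold: TRUE where Sepal.Width exceeds it.
flags <- lapply(thresholds, function(threshold) iris$Sepal.Width > threshold)
results <- cbind(iris, as.data.frame(flags))

head(results)
```

To get one range per row instead of one per 10 rows, `range_ends` could be set to `2:nrow(iris)`. It has to start at 2 because `sd()` returns `NA` for a single value.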