Error in for loop: Number of items to replace is not a multiple of replacement length

Hydrenile · February 2, 2021, 11:44am

Hi there. First post, please pardon any inaccuracies.

So I've created a function which accepts a matrix as an argument, does some computations and stores the results in another matrix. Unfortunately, the code fails with the infamous error:
Error in result_matrix[i, ] <- b3 :
  number of items to replace is not a multiple of replacement length

Below is the code for the function in question

descriptive_stats <- function(x){
  n <- ncol(x)
  result_matrix <- matrix(0,nrow = n, ncol=8)
  for (i in 1:n) {
    y <- x[,i]
    N <- length(y)
    M <- mean(y, na.rm = T)
    s.d <-  sd(y, na.rm = T)
    cv <- 100 * s.d/M
    Min <- min(y, na.rm = T)
    Max <- max(y, na.rm = T)
    std_error <- sd(y)/sqrt(length(y))
    range_val <- range(y, na.rm = T)
    b3 <- c(N,M,s.d,cv,Min,Max,std_error,range_val)
    result_matrix[i,] <- b3
  }
  colnames(result_matrix) <- c("N","Mean","sd","cv","Min","Max","Stderror","Range")
  rownames(result_matrix) <- colnames(x) # row names for results = column names of data
  result_matrix <- round(result_matrix,2)
  print(result_matrix)
}

The code that's generating the error

dataset <- matrix(rnorm(500, mean = 2, sd = 3), nrow = 100, ncol = 5)
descriptive_stats(dataset)

Any help will be deeply appreciated. Thanks in advance!

ADernild · February 2, 2021, 12:34pm

Your problem is that the range() function in R returns a numeric vector of length 2 containing the minimum and maximum values, rather than a single number. This means that b3 holds 9 values, and you are trying to fit them into a matrix row with only 8 columns, which is why you get that error. You already have a Min and Max, so what exactly do you want range_val to be?
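To see the length mismatch directly, a quick check along these lines may help. It is only a sketch: the vector `y` is made-up sample data standing in for one column of the matrix, and `r` and `res` are names introduced just for this illustration.

```r
# Hypothetical sample data standing in for one column of the input matrix
y <- c(4, 1, 7, 3)

# range() returns c(min, max), so its result has length 2
r <- range(y, na.rm = TRUE)
length(r)

# Seven single values plus the two-element range gives 9 values in total
b3 <- c(length(y), mean(y), sd(y), 100 * sd(y) / mean(y),
        min(y), max(y), sd(y) / sqrt(length(y)), r)
length(b3)

# Assigning those 9 values into an 8-column row raises the error from the question
result_matrix <- matrix(0, nrow = 1, ncol = 8)
res <- try(result_matrix[1, ] <- b3, silent = TRUE)
inherits(res, "try-error")
```

Checking length(b3) against ncol(result_matrix) before the assignment is a cheap way to catch this kind of mismatch.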

If you want range_val to be the difference between the highest and the lowest value, you can do this:

descriptive_stats <- function(x){
  n <- ncol(x)
  result_matrix <- matrix(0,nrow = n, ncol=8)
  for (i in 1:n) {
    y <- x[,i]
    N <- length(y)
    M <- mean(y, na.rm = T)
    s.d <-  sd(y, na.rm = T)
    cv <- 100 * s.d/M
    Min <- min(y, na.rm = T)
    Max <- max(y, na.rm = T)
    std_error <- sd(y)/sqrt(length(y))
    range_val <- diff(range(y, na.rm = T))
    b3 <- c(N,M,s.d,cv,Min,Max,std_error,range_val)
    result_matrix[i,] <- b3
  }
  colnames(result_matrix) <- c("N","Mean","sd","cv","Min","Max","Stderror","Range")
  rownames(result_matrix) <- colnames(x) # row names for results = column names of data
  result_matrix <- round(result_matrix,2) 
  print(result_matrix)
}

What I did was wrap the range() call in diff(), so that range_val is a single number (the maximum minus the minimum) instead of a two-element vector.
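As a small illustration of why this works, here is a sketch on made-up data (the vector `y` is hypothetical and not taken from the post):

```r
# Hypothetical sample data standing in for one column of the input matrix
y <- c(4, 1, 7, 3)

# diff() on the two-element range collapses it to a single value
range_val <- diff(range(y, na.rm = TRUE))
length(range_val)

# The result matches max minus min
same_as_max_minus_min <- range_val == max(y) - min(y)
same_as_max_minus_min
```

With range_val reduced to one value, b3 has exactly 8 elements and fits the 8-column row.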

Hope this helps

Hydrenile · February 2, 2021, 2:48pm

Before anything, thank you so much for the quick response! Didn't realize the range function was going to output values I already had (newbie stuff). Including the difference of the range is a perfect suggestion and will be very helpful. Will modify my code accordingly.

Once again, thank you so much. Really appreciated!
