Creating a for loop to sum column values; how to increment between columns?

hayes · April 26, 2019, 10:59pm

Hello. I currently have a 1920x1080 matrix that is read into R using read.csv. The column names are all V1, V2, etc. I am trying to sum the first 10 items of each column, save it to a variable, then move to the next column, and repeat. However, I am not sure how to increment this in a for loop. I tried adding 1 to V1 (in hopes that it would take me to the next column, V2), but all that does is add 1 to every value in that column. Is there a different way to access these individual values, or a way to increment these columns to access different ones?

FJCC · April 26, 2019, 11:31pm

Rather than using a for loop, I would use one of the functions designed to iterate over a list or matrix. Below are two solutions, one using the apply function from base R and the other using one of the map functions from the purrr package. In the apply function, setting MARGIN to 2 means the function is applied over the columns.

df <- data.frame(V1 = seq(1,15), V2 = seq(2,16), V3 = seq(3,17))
df
#>    V1 V2 V3
#> 1   1  2  3
#> 2   2  3  4
#> 3   3  4  5
#> 4   4  5  6
#> 5   5  6  7
#> 6   6  7  8
#> 7   7  8  9
#> 8   8  9 10
#> 9   9 10 11
#> 10 10 11 12
#> 11 11 12 13
#> 12 12 13 14
#> 13 13 14 15
#> 14 14 15 16
#> 15 15 16 17

#Define function to add first 10 elements
Sum10 <- function(x) {
  sum(x[1:10])
}

#with apply()
SUMS <- apply(df, MARGIN = 2, FUN = Sum10)
SUMS
#> V1 V2 V3 
#> 55 65 75

#With map_dbl() function
library(purrr)
SUMS2 <- map_dbl(df, Sum10)
SUMS2
#> V1 V2 V3 
#> 55 65 75

Created on 2019-04-26 by the reprex package (v0.2.1)
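
To make the MARGIN argument concrete, here is a small sketch on a toy matrix (the names `toy`, `row_sums` and `col_sums` are made up for illustration). MARGIN = 1 applies the function to each row, MARGIN = 2 to each column.

```r
# Toy 2 x 3 matrix, filled column by column; `toy` is an illustrative name
toy <- matrix(1:6, nrow = 2)

# MARGIN = 1: one result per row
row_sums <- apply(toy, MARGIN = 1, FUN = sum)
row_sums
#> [1]  9 12

# MARGIN = 2: one result per column
col_sums <- apply(toy, MARGIN = 2, FUN = sum)
col_sums
#> [1]  3  7 11
```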

andresrcs · April 26, 2019, 11:34pm

Another option would be to use the tidyverse tools

library(dplyr)
df <- data.frame(V1 = seq(1,15), V2 = seq(2,16), V3 = seq(3,17))
head(df, 10) %>% 
    summarise_all(sum)
#>   V1 V2 V3
#> 1 55 65 75
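
In dplyr 1.0.0 and later, summarise_all() is superseded by across(). A sketch of the same calculation in the newer style, assuming an installed dplyr version that provides across() and slice_head() (`first10_sums` is an illustrative name):

```r
library(dplyr)

df <- data.frame(V1 = seq(1, 15), V2 = seq(2, 16), V3 = seq(3, 17))

# Keep the first 10 rows, then sum every column
first10_sums <- df %>%
  slice_head(n = 10) %>%
  summarise(across(everything(), sum))

first10_sums
#>   V1 V2 V3
#> 1 55 65 75
```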

FJCC · April 26, 2019, 11:39pm

And to answer your original question about a for loop:

df <- data.frame(V1 = seq(1,15), V2 = seq(2,16), V3 = seq(3,17))
df
#>    V1 V2 V3
#> 1   1  2  3
#> 2   2  3  4
#> 3   3  4  5
#> 4   4  5  6
#> 5   5  6  7
#> 6   6  7  8
#> 7   7  8  9
#> 8   8  9 10
#> 9   9 10 11
#> 10 10 11 12
#> 11 11 12 13
#> 12 12 13 14
#> 13 13 14 15
#> 14 14 15 16
#> 15 15 16 17

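# Preallocate one slot per column
SUMS3 <- vector(mode = "numeric", length = ncol(df))
for (i in seq_len(ncol(df))) {
  # i is the column index; sum rows 1 to 10 of column i
  SUMS3[i] <- sum(df[1:10, i])
}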

Created on 2019-04-26 by the reprex package (v0.2.1)
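
On the "adding 1 to V1" attempt from the question: `df$V1 + 1` is arithmetic on the column's values, so it never moves to another column. To step between columns, increment an index or build the column name instead. A minimal sketch, assuming the column names follow the V1, V2, ... pattern described in the question (`sums_by_name` and `col_name` are illustrative names):

```r
df <- data.frame(V1 = 1:15, V2 = 2:16, V3 = 3:17)

# One named slot per column
sums_by_name <- numeric(ncol(df))
names(sums_by_name) <- names(df)

for (i in seq_len(ncol(df))) {
  col_name <- paste0("V", i)  # builds "V1", "V2", "V3" from the index
  # df[[col_name]] extracts that column as a vector; [1:10] keeps the first 10 values
  sums_by_name[col_name] <- sum(df[[col_name]][1:10])
}

sums_by_name
#> V1 V2 V3 
#> 55 65 75
```

Using `df[[i]]` directly would work just as well; building the name is only useful when the columns are not in a known order.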

Yarnabrina · April 27, 2019, 2:27am

I'd also like to add another option, colSums, provided in base R. I'm actually surprised that no one has suggested it yet.

Here's how to use it:

# defining the same dummy dataset
df <- data.frame(V1 = 1:15, V2 = 2:16, V3 = 3:17)

# finding column sums of the reduced dataset with only 1st 10 rows
colSums(x = head(x = df, n = 10))
#> V1 V2 V3 
#> 55 65 75

Hope this helps.

Leon · April 27, 2019, 7:30am

Generally, loops should be avoided in R when a vectorised function is available, so I suggest e.g.:

# Define dummy data
m = matrix(data = rnorm(2500), nrow = 50, ncol = 50)

# Index the first 10 rows, then sum column-wise
colSums(m[1:10,])

Hope it helps
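
As a quick sanity check, the loop and colSums() approaches from this thread can be compared on the same matrix. This is only a sketch: set.seed() is added here so the random data is reproducible, and `vec_sums` and `loop_sums` are illustrative names.

```r
set.seed(1)  # added for reproducibility; not part of the post above
m <- matrix(data = rnorm(2500), nrow = 50, ncol = 50)

# Vectorised: sum the first 10 rows of every column
vec_sums <- colSums(m[1:10, ])

# Explicit loop over column indices
loop_sums <- numeric(ncol(m))
for (j in seq_len(ncol(m))) {
  loop_sums[j] <- sum(m[1:10, j])
}

# Compare with a tolerance, since floating-point sums may differ slightly
all.equal(vec_sums, loop_sums)
#> [1] TRUE
```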
