Hello. I currently have a 1920x1080 matrix that is read into R using read.csv. The column names are all V1, V2, etc. I am trying to sum the first 10 items of each column, save the result to a variable, then move to the next column and repeat. However, I am not sure how to increment the column in a for loop. I tried adding 1 to V1 (in hopes that it would take me to the next column, V2), but all that does is add 1 to every value in that column. Is there a different way to access these individual values, or a way to step through the columns one at a time?
Rather than using a for loop, I would use one of the functions designed to iterate over a list or matrix. Below are two solutions, one using the apply function from base R and the other using one of the map functions from the purrr package. In the apply function, setting MARGIN to 2 means the function is applied over the columns.
df <- data.frame(V1 = seq(1,15), V2 = seq(2,16), V3 = seq(3,17))
df
#> V1 V2 V3
#> 1 1 2 3
#> 2 2 3 4
#> 3 3 4 5
#> 4 4 5 6
#> 5 5 6 7
#> 6 6 7 8
#> 7 7 8 9
#> 8 8 9 10
#> 9 9 10 11
#> 10 10 11 12
#> 11 11 12 13
#> 12 12 13 14
#> 13 13 14 15
#> 14 14 15 16
#> 15 15 16 17
#Define function to add first 10 elements
Sum10 <- function(x) {
  sum(x[1:10])
}
#with apply()
SUMS <- apply(df, MARGIN = 2, FUN = Sum10)
SUMS
#> V1 V2 V3
#> 55 65 75
#With map_dbl() function
library(purrr)
SUMS2 <- map_dbl(df, Sum10)
SUMS2
#> V1 V2 V3
#> 55 65 75
Created on 2019-04-26 by the reprex package (v0.2.1)
Another option would be to use the tidyverse tools:
library(dplyr)
df <- data.frame(V1 = seq(1,15), V2 = seq(2,16), V3 = seq(3,17))
head(df, 10) %>%
  summarise_all(sum)
#> V1 V2 V3
#> 1 55 65 75
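One difference between the head() approach and indexing with x[1:10] matters when a column might have fewer than 10 values. Here is a minimal sketch, using a made-up vector called short (not part of the original data) to show how the two behave:

```r
# Hypothetical short column with only 3 values
short <- c(1, 2, 3)

# Indexing past the end pads the result with NA, so the sum is NA
sum(short[1:10])
#> [1] NA

# head() returns at most the first 10 values, so the sum is defined
sum(head(short, 10))
#> [1] 6

# Alternatively, drop the NA padding explicitly
sum(short[1:10], na.rm = TRUE)
#> [1] 6
```

With 1080 rows in the real data this will not come up, but it is worth knowing if the same code is reused on smaller inputs.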
And to answer your original question about a for loop:
df <- data.frame(V1 = seq(1,15), V2 = seq(2,16), V3 = seq(3,17))
df
#> V1 V2 V3
#> 1 1 2 3
#> 2 2 3 4
#> 3 3 4 5
#> 4 4 5 6
#> 5 5 6 7
#> 6 6 7 8
#> 7 7 8 9
#> 8 8 9 10
#> 9 9 10 11
#> 10 10 11 12
#> 11 11 12 13
#> 12 12 13 14
#> 13 13 14 15
#> 14 14 15 16
#> 15 15 16 17
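# Pre-allocate one slot per column, then index columns by position
SUMS3 <- vector(mode = "numeric", length = ncol(df))
for (i in seq_len(ncol(df))) {
  SUMS3[i] <- sum(df[1:10, i])
}
SUMS3
#> [1] 55 65 75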
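To spell out why adding 1 to V1 did not move to V2: the addition is applied to the values in the column, not to its name. A small sketch of two ways to pick a column from a loop counter, reusing the same dummy data (the variable names i and col_name are only for illustration):

```r
df <- data.frame(V1 = 1:15, V2 = 2:16, V3 = 3:17)

# V1 + 1 adds 1 to every value in the column; it does not select V2
head(df$V1 + 1, 3)
#> [1] 2 3 4

# Select a column by position ...
sum(df[[2]][1:10])
#> [1] 65

# ... or by a name built from the loop counter
i <- 2
col_name <- paste0("V", i)
col_name
#> [1] "V2"
sum(df[[col_name]][1:10])
#> [1] 65
```

Indexing by position is usually simpler inside a loop; building the name with paste0() is handy when only some of the V columns are needed.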
I'd also like to add another option: colSums, provided in base R. I'm actually surprised that no one has suggested it yet.
Here's how to use it:
# defining the same dummy dataset
df <- data.frame(V1 = 1:15, V2 = 2:16, V3 = 3:17)
# finding column sums of the reduced dataset with only 1st 10 rows
colSums(x = head(x = df, n = 10))
#> V1 V2 V3
#> 55 65 75
Hope this helps.
In R, a vectorized function is usually simpler and faster than an explicit loop, so I suggest, e.g.:
# Define dummy data
m = matrix(data = rnorm(2500), nrow = 50, ncol = 50)
# Index the first 10 rows, then sum column-wise
colSums(m[1:10,])
Hope it helps
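For completeness, here is a rough end-to-end sketch closer to the original setup. It assumes the file has no header row and was read with header = FALSE, which is how read.csv() comes up with names like V1, V2. The inline text csv_text and the name dat are stand-ins for the real file and data:

```r
# Small stand-in for the real file: 12 rows, 3 columns, no header row
csv_text <- paste(sprintf("%d,%d,%d", 1:12, 2:13, 3:14), collapse = "\n")

# header = FALSE makes read.csv() name the columns V1, V2, ...
dat <- read.csv(text = csv_text, header = FALSE)
names(dat)
#> [1] "V1" "V2" "V3"

# read.csv() returns a data frame; colSums() accepts it directly
colSums(dat[1:10, ])
#> V1 V2 V3 
#> 55 65 75
```

For the real file, the text = csv_text argument would be replaced by the path to the CSV.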