Loop over columns and rows in a data frame

Hi,

I want to create four new columns whose values are a function of the row and column totals. For example, the value of cell ‘k’ = (row ‘j’ total * column ‘i’ total) / number of rows. I tried to loop over the rows and columns with the following for loop, but it does not work.

for (j in 1:nrow(df)) {
    for (i in 1:ncol(df)) {
        df[j,i] <- (rowSums(df[j,])*colSums(df[,i]))/nrow(df)
    }
}

A B C D
1 0 0 1
1 0 0 1
0 1 0 1
0 1 1 0
1 0 1 0
1 0 0 0
0 1 0 1
1 1 1 0
0 0 1 1
1 0 1 1

Any help is appreciated.

A plain vector in R has no dimensions.
rowSums() and colSums() work on a matrix or data frame (at least two dimensions), not on a vector.
df[,i] is a vector, not a matrix (and so is df[j,] when df is a matrix), so use plain sum() instead.

You can use the following code.

df[j,i] <- (sum(df[j,])*sum(df[,i]))/nrow(df)
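As a small illustration of the point about vectors, here is a sketch on a made-up three-row data frame (the data and the names col_i, row_j, and col_err are only for this example). It shows that a single column comes back as a plain vector, that colSums() rejects it, and that sum() handles both a column and a row.

```r
# Toy data, not the data from the question
df <- data.frame(
  A = c(1, 1, 0),
  B = c(0, 0, 1),
  C = c(0, 1, 1)
)

col_i <- df[, 1]   # a single column is dropped to a plain numeric vector
row_j <- df[1, ]   # a single row is still a one-row data frame

is.data.frame(col_i)  # FALSE
is.data.frame(row_j)  # TRUE

# colSums() needs at least two dimensions, so it fails on the vector;
# tryCatch() captures the error message instead of stopping the script
col_err <- tryCatch(colSums(col_i), error = function(e) conditionMessage(e))
print(col_err)

# sum() accepts both shapes
row_total <- sum(row_j)
col_total <- sum(col_i)
```

If you prefer to keep colSums(), subsetting with df[, i, drop = FALSE] keeps the column as a one-column data frame.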

Thank you for your quick response. I tried it, but it gave me a different result. The expected result should look like this:

AE BE CE DE
1.2 0.8 1 1.2
1.2 0.8 1 1.2
1.2 0.8 1 1.2
1.2 0.8 1 1.2
1.2 0.8 1 1.2
0.6 0.4 0.5 0.6
1.2 0.8 1 1.2
1.8 1.2 1.5 1.8
1.2 0.8 1 1.2
1.8 1.2 1.5 1.8

The result I actually got is below:

A B C D
1.8 1.52 2.66 4.788
2.72 3.15744 6.80011904 15.34519479
4.26 8.03530944 23.56321143 96.19134481
7.668 21.47618593 129.8275665 1944.652933
14.3136 73.51275576 1565.501162 342275.064
23.63312 325.0803378 61413.38755 2126916869
50.755248 2598.050796 16776332.59 3.56933E+15
117.8649648 38766.51975 65494427002 2.34E+25
246.4164261 1075932.512 7.05026E+15 1.65E+40
611.5607665 69697249.11 4.91E+22 8.10E+61

This is not what I am expecting.

df1 <- data.frame(
           A = c(1L, 1L, 0L, 0L, 1L, 1L, 0L, 1L, 0L, 1L),
           B = c(0L, 0L, 1L, 1L, 0L, 0L, 1L, 1L, 0L, 0L),
           C = c(0L, 0L, 0L, 1L, 1L, 0L, 0L, 1L, 1L, 1L),
           D = c(1L, 1L, 1L, 0L, 0L, 0L, 1L, 0L, 1L, 1L)
)
nr <- nrow(df1)
nc <- ncol(df1)
# every column of m_r repeats the row totals
(m_r <- matrix(rep(rowSums(df1), nc), ncol = nc))
# every row of m_c repeats the column totals
(m_c <- matrix(rep(colSums(df1), nr), nrow = nr, byrow = TRUE))

# element-wise product of the totals, divided by the number of rows
m_r * m_c / nr
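The same table can probably be built more compactly with outer(), which forms every row-total by column-total product in one call. This is only a sketch of an alternative, not a replacement for the code above; the names res and out are made up here, and the AE to DE column names are taken from the expected result in the question.

```r
df1 <- data.frame(
  A = c(1L, 1L, 0L, 0L, 1L, 1L, 0L, 1L, 0L, 1L),
  B = c(0L, 0L, 1L, 1L, 0L, 0L, 1L, 1L, 0L, 0L),
  C = c(0L, 0L, 0L, 1L, 1L, 0L, 0L, 1L, 1L, 1L),
  D = c(1L, 1L, 1L, 0L, 0L, 0L, 1L, 0L, 1L, 1L)
)

# outer(x, y)[j, i] is x[j] * y[i], so this is
# (row total j * column total i) / number of rows for every cell
res <- outer(rowSums(df1), colSums(df1)) / nrow(df1)

# Attach the results to the original data as four new columns
colnames(res) <- paste0(colnames(df1), "E")
out <- cbind(df1, res)
print(out)
```

Because the totals are computed once from the untouched df1, nothing is overwritten while the result is being built.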

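It is an initialization issue. The loop writes its results back into df, so later iterations compute their sums from cells that have already been overwritten.
Let's store the results in a separate object, df2, instead of df.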

df <- rbind(
    c(1,0,0,1),
    c(1,0,0,1),
    c(0,1,0,1),
    c(0,1,1,0),
    c(1,0,1,0),
    c(1,0,0,0),
    c(0,1,0,1),
    c(1,1,1,0),
    c(0,0,1,1),
    c(1,0,1,1))

#> initialization: a matrix of zeros with the same shape as df
df2 <- df*0

for (j in 1:nrow(df)) {
    for (i in 1:ncol(df)) {
        df2[j,i] <- (sum(df[j,])*sum(df[,i]))/nrow(df)
    }
}

#> print
df2

#> output
> df2
      [,1] [,2] [,3] [,4]
 [1,]  1.2  0.8  1.0  1.2
 [2,]  1.2  0.8  1.0  1.2
 [3,]  1.2  0.8  1.0  1.2
 [4,]  1.2  0.8  1.0  1.2
 [5,]  1.2  0.8  1.0  1.2
 [6,]  0.6  0.4  0.5  0.6
 [7,]  1.2  0.8  1.0  1.2
 [8,]  1.8  1.2  1.5  1.8
 [9,]  1.2  0.8  1.0  1.2
[10,]  1.8  1.2  1.5  1.8
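To see why writing into the original object matters, here is a minimal sketch on a made-up 2 x 2 matrix (the matrix and the names in_place and separate are assumptions for illustration only). It runs the same formula twice: once overwriting the matrix it reads from, and once writing into a separate result matrix.

```r
# Toy matrix, not the data from the question
m <- rbind(c(1, 0),
           c(1, 1))

# Variant 1: overwrite the matrix while still reading sums from it.
# Each assignment changes the totals seen by later iterations.
in_place <- m
for (j in seq_len(nrow(m))) {
  for (i in seq_len(ncol(m))) {
    in_place[j, i] <- sum(in_place[j, ]) * sum(in_place[, i]) / nrow(m)
  }
}

# Variant 2: read sums from the untouched m, write into a separate matrix
separate <- m * 0
for (j in seq_len(nrow(m))) {
  for (i in seq_len(ncol(m))) {
    separate[j, i] <- sum(m[j, ]) * sum(m[, i]) / nrow(m)
  }
}

print(in_place)
print(separate)
```

The two results already disagree in the last cell (2.25 in place versus 1 with a separate matrix). On a larger table the feedback compounds, which is most likely why the earlier output grew so quickly.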
