Sapply, lapply and indexing inside for loops

Andrzej · March 21, 2020, 11:23am

Hi All,
I stumbled upon this post while learning R: https://r.789695.n4.nabble.com/Loop-with-variable-index-td845011.html
The question was as follows: " I have a list of 20 values. The first time through a loop I want to find the mean and stnd.dev. of the first two values; the second time through the loop I want to find the mean and stnd. dev. of the first 3 values, etc. until the last time through the loop I want to find the mean and stnd. dev. of all 20 values, so I end up with 19 means and stnd. deviations.
How would I construct such a loop? "
And two answers were presented:

First solution, which I personally think is brilliant:

x <- rnorm(20)

sapply(c("sd", "mean"), function(fun) lapply(lapply(lapply(2:20, seq,
  from = 1), function(.x) x[.x]), fun))

But trying to understand the code, I am stuck.
I understand in general what for loops are for, but I was only able to break the code down as far as this:

lapply(2:20, seq, from=1)

So I would like to understand how all of that code works and why there are three lapply calls in a row. What do "(.x)" and "[.x]" mean, and is it possible to recreate this in the tidyverse, maybe in a simpler way?

Second solution:

Let the vector be "x".

        mns <- list()
        sds <- list()
        for(i in 2:20) {
                mns[[i-1]] <- mean(x[1:i])
                sds[[i-1]] <- sd(x[1:i])
        }
        mns <- unlist(mns)
        sds <- unlist(sds)

which I am trying to understand as well, especially the subsetting in "mns[[i-1]]", because in loops we sometimes see "vec[i]" and sometimes "vec[[i]]".
And finally, why here:

mns[[i-1]] <- mean(x[1:i])

do we have "i-1" in the brackets on the left-hand side of the assignment arrow, and "x[1:i]" on the right-hand side of <-? What does it do?

Any help regarding how to understand it correctly would be much appreciated.
Thank you in advance.
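
For readers following along, the nested call in the first solution can be unpacked one layer at a time. The sketch below is only an illustration: the seed, the intermediate names (idx, slices, means_list) and the flatter rewrite are assumptions made for this example and are not part of the original answer.

```r
set.seed(123)            # arbitrary seed so the run is reproducible
x <- rnorm(20)

# Step 1: for each n in 2:20 build the index vector 1, 2, ..., n.
# lapply passes n as the first unnamed argument, which seq() matches to `to`.
idx <- lapply(2:20, seq, from = 1)

# Step 2: turn each index vector into the matching slice of x.
# `.x` is just an argument name; x[.x] is ordinary subsetting.
slices <- lapply(idx, function(.x) x[.x])

# Step 3: apply one summary function to every slice.
# lapply accepts a function name as a string and looks it up via match.fun().
means_list <- lapply(slices, "mean")

# Step 4: the outer sapply repeats steps 1-3 for "sd" and "mean"
# and binds the two results into columns.
original <- sapply(c("sd", "mean"), function(fun)
  lapply(lapply(lapply(2:20, seq, from = 1), function(.x) x[.x]), fun))

# A flatter equivalent (hypothetical rewrite): one sapply over the end points.
flat <- sapply(c("sd", "mean"), function(fun)
  sapply(2:20, function(n) match.fun(fun)(x[1:n])))

dim(flat)   # 19 rows (n = 2..20) and 2 columns ("sd", "mean")
```

The three lapply calls are therefore three separate steps (build indices, slice, summarise) chained inside out; the flat version folds the first two into x[1:n].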

Andrzej · March 21, 2020, 6:22pm

Thank you @Yarnabrina for your detailed explanation.

I read it carefully and I would like to clarify a few things:
this works:

for (i in 2:20) {
  mns[[i - 1]] <- mean(x[1:i])
  sds[[i - 1]] <- sd(x[1:i])
}

but this does not:

for (i in x[2:20]) {
  mns[[i - 1]] <- mean(x[1:i])
  sds[[i - 1]] <- sd(x[1:i])
}

If x is defined as x <- rnorm(20), doesn't x[2:20] represent the 19 numbers from vector x that the for loop iterates over?
Why does it throw an error then?

About indexing/subsetting inside for loops, please have a look at the example below:

n = 10

log.vec = vector(length=n, mode="numeric")

for (i in 1:n) {
  log.vec[i] = log(i)
}

for (i in 1:n) {
  log.vec[[i]] = log(i)
}

log.vec

Regardless of using [ or [[ the results are the same. I suppose that it is not always like this.

I will of course study the SO topic you pointed out in your reply.

Best regards.

dromano · March 21, 2020, 6:53pm

I would recommend the following: 1) Open a new R file, 2) place the code below in it (your code wrapped in a dummy function), 3) source the file, 4) click to the left of line 2 to create a red break point, and then 5) run test_function() in the console. This puts you in debug mode, and all the values being created and assigned will appear in the upper right 'Environment' pane, and you can step through the function by hitting 'Enter'.

test_function <- function(n = 2){
  x <- rnorm(20)
  mns <- list()
  sds <- list()
  for(i in 2:n) {
    mns[[i-1]] <- mean(x[1:i])
    sds[[i-1]] <- sd(x[1:i])
  }
  for (i in x[2:n]) {
    mns[[i - 1]] <- mean(x[1:i])
    sds[[i - 1]] <- sd(x[1:i])
  }
}

Is that helpful?

nirgrahamuk · March 21, 2020, 7:39pm

It is because i in 2:20 takes the values 2, 3, 4 up to 20, and these are valid indexes into a vector,
whereas i in x[2:20] gives the contents of the random vector x (its 2nd, 3rd, 4th up to 20th elements). These are not valid indexes, as they are arbitrary decimal fractions, many of them negative. As i is being used as an index, these fractions are meaningless.
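
A small sketch of the difference, recording what i actually is in each loop. The seed and the names seen_positions, seen_values and positions are assumptions made for this illustration only.

```r
set.seed(42)             # arbitrary seed, only for reproducibility
x <- rnorm(20)

# Record what `i` actually is in each kind of loop.
seen_positions <- c()
for (i in 2:20) seen_positions <- c(seen_positions, i)

seen_values <- c()
for (i in x[2:20]) seen_values <- c(seen_values, i)

# Looping over 2:20 yields positions; i - 1 then runs over 1..19,
# which is why the results land in slots 1 to 19 of mns and sds.
all(seen_positions - 1 == 1:19)

# Looping over x[2:20] yields the random draws themselves.
all(seen_values == x[2:20])

# None of these draws is a whole number, so none is a usable position.
any(seen_values == round(seen_values))

# To loop over positions of an existing vector, derive them from it.
positions <- seq_along(x)[-1]   # 2, 3, ..., length(x)
mns <- numeric(length(positions))
for (i in positions) mns[i - 1] <- mean(x[1:i])
```

Using seq_along(x) keeps the loop tied to the length of x, so the same code still works if x later has a different number of elements.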

nirgrahamuk · March 21, 2020, 7:42pm

For a vector the difference is moot, but for a list, single square brackets [ ] return the value wrapped in a list, whereas double square brackets [[ ]] return it naked.
This is not really of any consequence on the left side of an assignment; it matters when the list is being accessed on the right side.
On the left, both forms of square brackets let you replace the value at the index with the right-hand-side value.
On the right, single brackets set whatever is on the left to a list containing the indexed value, while double brackets set it to the naked value itself.
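
A minimal sketch of these cases; the objects lst, wrapped, naked, via_single, via_double and v are made up for the illustration.

```r
lst <- list(a = 1, b = "two")

wrapped <- lst["a"]     # single brackets: still a list, of length 1
naked   <- lst[["a"]]   # double brackets: the element itself

is.list(wrapped)        # TRUE
is.list(naked)          # FALSE
is.numeric(naked)       # TRUE

# On the left of an assignment, both forms can set one element.
via_single <- list(); via_single[1]   <- 5
via_double <- list(); via_double[[1]] <- 5
identical(via_single, via_double)   # TRUE

# For an unnamed atomic vector, extracting one element is the same either way.
v <- c(10, 20, 30)
identical(v[2], v[[2]])   # TRUE
```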

Hope it helps!

Andrzej · March 21, 2020, 8:40pm

Thank you @dromano and @nirgrahamuk,
that was extremely helpful.
I really appreciate you taking the time to respond.
Best regards.

Andrzej · March 28, 2020, 7:42am

Hi @Yarnabrina and All,
I am still learning for loops.
I am trying to solve this very simple task:

x <- c(1:5)

for (i in x)
{ y <- i*i
print(y)
}

using the sapply and lapply solution presented in my first question:

sapply(c("sd", "mean"), function(fun) lapply(lapply(lapply(2:20, seq,
  from = 1), function(.x) x[.x]), fun))

because I rather like how it immediately gives me two columns of results, showing each and every iteration step.

Of course I can use the second solution presented there as well:

ID <- c(1,2,3,4,5)

DF <-  cbind(ID, expon2) %>% as.data.frame()

colnames(DF) <- c("ID", "result_of_each_Iterations")

and using reprex:

library(tidyverse)
#> Warning: package 'ggplot2' was built under R version 3.6.3
#> Warning: package 'tidyr' was built under R version 4.0.0
#> Warning: package 'forcats' was built under R version 3.6.3
library(magrittr)
#> 
#> Attaching package: 'magrittr'
#> The following object is masked from 'package:purrr':
#> 
#>     set_names
#> The following object is masked from 'package:tidyr':
#> 
#>     extract

x <- c(1:5)

expon2 <- list()

for (i in 1:5) {
  expon2[i] <- i**2
}

expon2 <- unlist(expon2)

ID <- c(1,2,3,4,5)


DF <-  cbind(ID, expon2) %>% as.data.frame()

colnames(DF) <- c("ID", "result_of_each_Iterations")

DF
#>   ID result_of_each_Iterations
#> 1  1                         1
#> 2  2                         4
#> 3  3                         9
#> 4  4                        16
#> 5  5                        25

Created on 2020-03-28 by the reprex package (v0.3.0)

but could you please help me do it with sapply and lapply?
Also, is there a way to make DF display with the header text less cramped, looking more like centre alignment (a bit like in MS Word):
(image)

kind regards.

nirgrahamuk · March 28, 2020, 11:40am

Hi,
I can use sapply, apply, etc., but most of the time I start with the purrr library for this type of work.
It has great functions that are easy to use, and there are also very good lessons for it.

x <- c(1:5)

#no purrr
for (i in x)
{ y <- i*i
print(y)
}


library(purrr)

purrr::walk(x,
            ~print(.^2))

nirgrahamuk · March 28, 2020, 12:36pm

Sure, as the function ^2 is vectorised, that's the most elegant solution. walk is slightly more general, as you could use your own non-vectorised function in its place. But yes, Yarnabrina, your solution is by far the most succinct and elegant approach to printing all the squared values of x in order.
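
To make the vectorised versus element-wise distinction concrete, here is a rough sketch. The referenced reply is not reproduced in this thread, so the vectorised form is assumed to be along the lines of x^2; describe_one is a hypothetical non-vectorised function invented for the example.

```r
x <- 1:5

# Vectorised: the operator is applied to every element at once.
squares_vec <- x^2

# Element-wise with sapply: same values, one function call per element.
squares_sapply <- sapply(x, function(i) i^2)

identical(squares_vec, squares_sapply)   # TRUE

# A hypothetical function that only handles one value at a time,
# which is where sapply (or purrr::walk / map) earns its keep.
describe_one <- function(i) if (i %% 2 == 0) "even" else "odd"
labels <- sapply(x, describe_one)
```

Calling describe_one(x) directly would not work as intended, because if() expects a single TRUE or FALSE rather than a vector of them.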

Andrzej · March 28, 2020, 12:56pm

Thank you @nirgrahamuk, for purrr solution, very quick.

Hi @Yarnabrina,

That solution with sapply and lapply would, I suppose, immediately give us two columns like this:
(image)
without cbind or any other intermediate steps.
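
One possible way to get named columns in a single step is sketched below. The function names square and cube, and the objects funs, m and DF, are illustrative assumptions rather than anything proposed earlier in the thread.

```r
x <- 1:5

# Two named columns straight from sapply: each function becomes a column.
# `square` and `cube` are illustrative names, not from the thread.
funs <- list(square = function(v) v^2, cube = function(v) v^3)
m <- sapply(funs, function(f) f(x))

# Or build the data frame directly, with no cbind() or colnames() step.
DF <- data.frame(ID = x, result_of_each_iteration = x^2)
```

sapply returns a matrix here because every function returns a vector of the same length; data.frame() names the columns from its argument names.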
