 # Summing and Taking average of elements in different vectors, which are part of different matrices, which are part of a vector of matrices.

Hi there,

I have created a simulation which generates data with different lengths. Each iteration `i` creates a dataframe with `tibble(x1,x2,x3,a,b)` and puts it in a empty vector `aggiter= c()`. I am trying to figure out how I can sum up, say, all of the third elements of `x3` (that are present) across all of the matrices present in aggiter.

Just as an update in case anyone new (to this post) also comes across this, this may be helpful to illustrate what I am trying to do:

If I run the above simulation twice: I get two `aggiter` data frames `aggiter[]` and `aggiter[]`.

Now when I call `aggiter[][]` (and, the dataframe in [] is `tibble(x1,x2,x3,a,b)` I get the second element of `x1`, so `x1`.

Now, what I am trying to do (for one example) is get the second element for all `x1`'s for all [[i]] in `aggiter`, so that I can take the average of all those values, as in like "the average x1 value out of all the iterations"

Here is my setup and simulation that I am trying to aggregate the results of everything in the actual simulation works fine, I just am trying to aggregate the results of each iteration and do some averaging of elements across runs of the simulation:

``````
#SETUP

D = 5 ## monetary value of harm from

maxw = 3

minw = 0

wbar = (maxw+minw)/2  ##average cost

xstar = 1 - wbar/(2*D) #true legal threshold given that the court does not create multiple rules for each cost type, but instead makes one for all types, based on the average.
xstar

iteration = (1:2)  ## Number of periods in the iteration of the loop.

aggiter = c()  ## Vector of matrices generated by the while loop that constitute an iteration of choices and legal threshold

aggnai = c() ## vector of matrices generate by the while loop that constitute an iteration of number of periods between accidents.

w1 = c()
w2 = c()
w3 = c()

``````
``````
#loop until all of x1,x2,x3 equal b or a

for (i in iteration) {

reset = {
t = 1

a = c() #lower bound of belief in xstar

b = c() #upper bound of belief in xstar

#w1star = runif(1, min = 0, max = 3)
#w2star = runif(1, min = 0, max = 3)

x1 = c()
x2 = c()
x3 = c()
x  = c()

w1[i] = runif(1,min = 0, max = 1) #cost of effort for person 1
w2[i] = runif(1,min = 2, max = 3)
w3[i] = runif(1,min = 2, max = 3)

na = c()   #number of periods that take place before an accident takes place.
na1 = c()
na2 = c()
na3 = c()

}

while (TRUE){
if(t==1){a[t]=0.3;b[t]=0.95}
if(t==1){na1[t]=0;na2[t]=0;na3[t]=0}

na[t]=0

x[t] = 0

##Tortfeasors problem: pick the x that will minimize your cost --- piecewise optimization. First object is if (x<=a), second is if (a<x<b), and third is when (x>=b).

x11 = (1-(w1[i]/(2*D)))
x12 = ((2+b[t])-sqrt((b[t]^2)-2*b[t]+1+3*(w1[i]/D)*(b[t]-a[t])))/3
x13 = b[t]

c11 = (w1[i]*x11)+(((x11-1)^2)*D)
c12 = (w1[i]*x12)+(((x12-1)^2)*((b[t]-x12)/(b[t]-a[t]))*D)
c13 = w1[i]*x13

x21 = (1-(w2[i]/(2*D)))
x22 = ((2+b[t])-sqrt((b[t]^2)-2*b[t]+1+3*(w2[i]/D)*(b[t]-a[t])))/3
x23 = b[t]

c21 = (w2[i]*x21)+(((x21-1)^2)*D)
c22 = (w2[i]*x22)+(((x22-1)^2)*((b[t]-x22)/(b[t]-a[t]))*D)
c23 = w2[i]*x23

x31 = (1-(w3[i]/(2*D)))
x32 = ((2+b[t])-sqrt((b[t]^2)-2*b[t]+1+3*(w3[i]/D)*(b[t]-a[t])))/3
x33 = b[t]

c31 = (w3[i]*x31)+(((x31-1)^2)*D)
c32 = (w3[i]*x32)+(((x32-1)^2)*((b[t]-x32)/(b[t]-a[t]))*D)
c33 = w3[i]*x33

c1star = min(c11,c12,c13)
c2star = min(c21,c22,c23)
c3star = min(c31,c32,c33)

x1[t] = if(c1star==c11){x11} else if(c1star==c12 & (x12 < x13)){x12} else if(c1star==c13 || (x12 >= x13)){x13}

x2[t] = if(c2star==c21){x21} else if(c2star==c22 & (x22 < x23)){x22} else if(c2star==c23 || (x22 >= x23)){x23}

x3[t] = if(c3star==c31){x31} else if(c3star==c32 & (x32 < x33)){x32} else if(c3star==c33 || (x32 >= x33)){x33}

proba1 = function(x1){(x1[t]-1)^2}  #probability of accident only given driver1's effort level.
show(proba1(x1))
probn1 = 1 - proba1(x1)      #probability of not getting in an accident given driver1's effort level
probn1

proba2 = function(x2){(x2[t]-1)^2}
show(proba2(x2))
probn2= 1 - proba2(x2)
probn2

proba3 = function(x3){(x3[t]-1)^2}
show(proba3(x3))
probn3= 1 - proba3(x3)
probn3

while(x[t]!="x1[t]" & x[t]!="x2[t]" & x[t] != "x3[t]"){x[t] = sample(c("x1[t]","x2[t]","x3[t]",0,0,0),size = 1, replace = TRUE, prob = c(proba1(x1),proba2(x2), proba3(x3), probn1, probn2, probn3))
if(x[t]== "0"){
na[t] = na[t]+1
na1[t] = na1[t]+1
na2[t] = na2[t]+1
na3[t] = na3[t]+1}}

show(x[t])

{if(x[t]=="x1[t]"){
print("Driver 1 Accident")
na1[t+1] = 0
na2[t+1] = na2[t]+1
na3[t+1] = na3[t]+1}
else if(x[t]=="x2[t]"){
print("Driver 2 Accident")
na2[t+1] = 0
na1[t+1] = na1[t]+1
na3[t+1] = na3[t]+1}
else if(x[t]=="x3[t]"){
print("Driver 3 Accident")
na3[t+1] = 0
na2[t+1] = na2[t]+1
na1[t+1] = na1[t]+1}}

if (x[t]=="x1[t]"){x[t] = x1[t];x = as.numeric(x)} else if (x[t]=="x2[t]"){x[t] = x2[t];x = as.numeric(x)} else if (x[t]=="x3[t]"){x[t] = x3[t];x = as.numeric(x)}

probg = function(x,a,b){(b[t]-x[t])/(b[t]-a[t])} ### probability of being found guilty given x in the ambiguous region
show(probg(x,a,b))

probi = function(x,a,b){1-probg(x,a,b)} ### probability of being found innocent given x in the ambiguous region
show(probi(x,a,b))

rulings = if(x[t]>=b[t]){ruling = print("Not Negligent")

} else if(x[t]<=a[t]) {ruling = print("Negligent")

} else if(x[t]>a[t] & x[t]<b[t]) {ruling = if(x[t]<xstar){"guilty"} else if(x[t]>=xstar){"not guilty"}; show(ruling)}

if(ruling == "guilty") {a[t+1] = x[t]; b[t+1] = b[t]} else if (ruling == "not guilty") {b[t+1] = x[t]; a[t+1]=a[t]} else {a[t+1]=a[t]; b[t+1] = b[t]; print("thresholds unchanged")}

x = round(x,digits= 6)
x1 = round(x1,digits = 6)
x2 = round(x2, digits = 6)
x3 = round(x3, digits = 6)
a = round(a, digits = 6)
b = round(b, digits = 6)

if((a[t+1]==a[t]) && (b[t+1] == b[t]) && all(round(c(x1[t],x2[t], x3[t]),4) %in% round(x,4)) && ((x1[t] <= a[t]) | (x1[t] >= b[t])) && ((x2[t] <= a[t]) | (x2[t] >= b[t])) && ((x3[t] <= a[t]) | (x3[t] >= b[t]))){break} else {print("not equilibrium")}

t= t+1

}

if((a[t+1]==a[t]) && (b[t+1] == b[t])){print("pls r")}

if(all(c(x1[t],x2[t],x3[t]) %in% x)){print("pls 1r")}

if(((x1[t] <= a[t]) || (x1[t] >= b[t])) & ((x2[t] <= a[t]) || (x2[t] >= b[t])) & ((x3[t] <= a[t]) || (x3[t] >= b[t]))){print("pls 2r")}

x1[(length(a))] = NA
x2[(length(a))] = NA
x3[(length(a))] = NA
na[(length(a))] = NA

aggiter[[i]] = tibble(x1,x2,x3,a,b)
aggnai[[i]] = tibble(na,na1,na2,na3)

}

``````

In case the simulation doesnt work.. here are some sample iterations

``````aggiter= c()

x1 = c(0.796399, 0.856540, 0.881432, 0.900808, 0.857664,NA)
x2 = c(0.695912, 0.788911, 0.827514, 0.857664, 0.857664, NA)
x3 = c(0.653771, 0.760512, 0.804842, 0.750000, 0.857664, NA)
a = c(0.300000, 0.653771, 0.760512, 0.827514, 0.827514, 0.827514)
b = c(0.950000, 0.950000, 0.950000, 0.950000, 0.857664, 0.857664)

aggiter[] = tibble(x1,x2,x3,a,b)

x1 = c(0.87827, 0.91340, 0.92754, 0.92065, 0.92754, 0.92754, 0.92754, 0.92754, 0.92754, 0.92754, 0.92754, 0.92754, 0.92754, 0.92754, 0.85643, NA)
x2 =  c(0.68152, 0.78534, 0.82775, 0.82965, 0.85643, 0.85643, 0.85643, 0.85643, 0.85643, 0.85643, 0.85643, 0.85643, 0.85643, 0.85643, 0.85643, NA)
x3 =  c(0.67141, 0.77874, 0.82259, 0.82487, 0.77612, 0.77612, 0.77612, 0.77612, 0.77612, 0.77612, 0.77612, 0.77612, 0.77612, 0.77612, 0.85643, NA)
a = c(0.30000, 0.67141, 0.77874, 0.77874, 0.82965, 0.82965, 0.82965, 0.82965, 0.82965, 0.82965, 0.82965, 0.82965, 0.82965, 0.82965, 0.82965, 0.82965)
b = c(0.95000, 0.95000, 0.95000, 0.92754, 0.92754, 0.92754, 0.92754, 0.92754, 0.92754, 0.92754, 0.92754, 0.92754, 0.92754, 0.92754, 0.85643, 0.85643)
na = c(0, 28,  6, 57, 32, 24, 13, 92, 37, 32, 58, 17, 11, 40,  1)

aggiter[] = tibble(x1,x2,x3,a,b)
``````

Please see the FAQ: What's a reproducible example (`reprex`) and how do I do one? Using a reprex, complete with representative data will attract quicker and more answers. Having to reverse engineer the problem is a big drag.

Without representative objects that go into making up `aggiter`, the best I can suggest is creating rows of `NA` to rbind the data with different lengths to avoid the notational brain damage of a list of matrices.

1 Like

I have edited the original post! The simulation should work fine its just the aggregation of results that I am trying to figure out and play with!

1 Like

I am now also trying to use `tibble` instead of `cbind` for `aggiter` (edit made in reprex)

1 Like

Thanks! Although I still have some tweaking to do before getting back to you, I'll share some experience that may be helpful generally.

Like you, I came to `R` with programming experience based on imperative/procedural languages like `C`, `PERL`, `Ruby` and `Python`. And, surprise, I was dismayed that `R` seemed under-developed using the approach that I knew: *do this, then do that, but if the other ... ."

It wasn't until I dabbled in `Haskell` that I realized that `R` presents to the user as a functional language, as in

I don't care how you do it but please give me that in exchange for this

just like school algebra, f(x) = y, with functions, arguments and return values.

There is nothing wrong with the imperative style. Indeed, many functions rely on it under the hood. Since `R` is so wealthy in off-the-shelf functions, though, there is almost always a way to chain these together to reach the result more easily.

1 Like

Thank you for the advice I appreciate it!

Just as an update in case this is helpful to illustrate what I am trying to do:

If I run the above simulation twice: I get two `aggiter` data frames `aggiter[]` and `aggiter[]`.

Now when I call `aggiter[][]` (and, the dataframe in [] is `tibble(x1,x2,x3,a,b)` ) I get the second element of `x1`, so `x1`.

Now, what I am trying to do (for one example) is get the second element for all `x1`'s for all [[i]] in `aggiter`, so that I can take the average of all those values, as in like "the average x1 value out of all the iterations"

sorry to bug you as I am sure you have other questions you are trying to answer, but perhaps do you know somewhere else I should look in the meantime? Im just trying to figure this out but I am not so sure where to go from here.

1 Like

Sorry, I was just looking at this. No worries.

I'm stuck with cutting and pasting the code and not getting it to work due to difficult-to-track down unpaired delimiters. Could you edit to get an object (or just the objects that get stuffed into the list of matrices)?

1 Like

If I am understanding you correctly, are you wanting me to run the simulation and just give you the results of `aggiter`? Or something else. Sorry

I think I (hopefully) made the right edit? Near the bottom of OP

1 Like

OK, here's what I came up with, comments after `reprex`

``````suppressPackageStartupMessages(library(dplyr))
suppressPackageStartupMessages(library(tibble))

# aggiter

aggiter <- list(structure(list(
x1 =
c(0.897102, 0.926416, 0.938169, 0.943873, 0.943873, 0.943873, 0.943873, 0.943873, 0.943873, 0.943873, 0.943873, 0.943873, 0.943873, 0.943873, 0.943873, 0.943873, 0.943873, 0.943873, 0.943873, 0.943873, 0.943873, 0.943873, 0.943873, 0.943873, 0.943873, 0.943873, 0.943873, 0.943873, 0.943873, 0.943873, 0.943873, 0.943873, 0.943873, 0.943873, 0.943873, 0.943873, 0.941771, 0.941771, 0.941771, 0.941771, 0.941771, 0.941771, 0.941025, 0.85703, NA),
x2 =
c(0.683423, 0.789996, 0.833832, 0.793072, 0.793072, 0.793072, 0.793072, 0.793072, 0.793072, 0.793072, 0.793072, 0.793072, 0.793072, 0.793072, 0.793072, 0.793072, 0.793072, 0.793072, 0.793072, 0.793072, 0.793072, 0.793072, 0.793072, 0.793072, 0.793072, 0.793072, 0.793072, 0.793072, 0.793072, 0.793072, 0.793072, 0.793072, 0.793072, 0.793072, 0.793072, 0.793072, 0.85667, 0.85667, 0.85667, 0.85667, 0.85667, 0.85667, 0.85703, 0.85703, NA),
x3 =
c(0.681053, 0.788475, 0.832662, 0.78978, 0.78978, 0.78978, 0.78978, 0.78978, 0.78978, 0.78978, 0.78978, 0.78978, 0.78978, 0.78978, 0.78978, 0.78978, 0.78978, 0.78978, 0.78978, 0.78978, 0.78978, 0.78978, 0.78978, 0.78978, 0.78978, 0.78978, 0.78978, 0.78978, 0.78978, 0.78978, 0.78978, 0.78978, 0.78978, 0.78978, 0.78978, 0.78978, 0.78978, 0.78978, 0.78978, 0.78978, 0.78978, 0.78978, 0.78978, 0.85703, NA),
a =
c(0.3, 0.681053, 0.789996, 0.833832, 0.833832, 0.833832, 0.833832, 0.833832, 0.833832, 0.833832, 0.833832, 0.833832, 0.833832, 0.833832, 0.833832, 0.833832, 0.833832, 0.833832, 0.833832, 0.833832, 0.833832, 0.833832, 0.833832, 0.833832, 0.833832, 0.833832, 0.833832, 0.833832, 0.833832, 0.833832, 0.833832, 0.833832, 0.833832, 0.833832, 0.833832, 0.833832, 0.833832, 0.833832, 0.833832, 0.833832, 0.833832, 0.833832, 0.833832, 0.833832, 0.833832),
b =
c(0.95, 0.95, 0.95, 0.95, 0.95, 0.95, 0.95, 0.95, 0.95, 0.95, 0.95, 0.95, 0.95, 0.95, 0.95, 0.95, 0.95, 0.95, 0.95, 0.95, 0.95, 0.95, 0.95, 0.95, 0.95, 0.95, 0.95, 0.95, 0.95, 0.95, 0.95, 0.95, 0.95, 0.95, 0.95, 0.95, 0.943873, 0.943873, 0.943873, 0.943873, 0.943873, 0.943873, 0.941771, 0.85703, 0.85703)),
row.names =
c(NA, -45L),
class =
c("tbl_df", "tbl", "data.frame")), structure(list(
x1 =
c(0.797041, 0.851597, 0.874718, 0.861926, 0.874718, 0.874718, NA),
x2 =
c(0.627584, 0.732504, 0.7771, 0.786855, 0.708586, 0.708586, NA),
x3 =
c(0.626078, 0.731444, 0.776231, 0.78617, 0.706107, 0.706107, NA),
a =
c(0.3, 0.627584, 0.732504, 0.732504, 0.78617, 0.78617, 0.78617),
b =
c(0.95, 0.95, 0.95, 0.874718, 0.874718, 0.874718, 0.874718)),
row.names =
c(NA, -7L),
class =
c("tbl_df", "tbl", "data.frame")))

# function for two element list of tibbles to perform binary argument
# operator functions on corresponding column rows

f <- function(x,y,z){

first_tib <- x
second_tib <- x
colno <- y
rowno <- y
operand_one <- first_tib[][colno,rowno]
operand_two <- second_tib[][colno,rowno]
z(operand_one[],operand_two[])
}

f(aggiter,c(3,3),sum)
#>  1.608893
f(aggiter,c(3,3),mean)
#>  0.832662
``````

Created on 2020-03-30 by the reprex package (v0.3.0)

This suffers, of course, from being too hardwired to the immediate problem. Its primary virtue is to lift the bracketing burden. There's an underlying burden that this doesn't address, which the necessity of digging the bare numbers out of a subset of a tibble in a list. (Because until getting down to the bare numeric, there is no guarantee that binary functions will work; some, like `sum` don't mind; others, like `mean` do.)

There's also the nuisance of namespace clutter with the mass of temporary objects

``````c("a", "aggiter", "aggnai", "b", "c11", "c12", "c13", "c1star", "c21", "c22", "c23", "c2star", "c31", "c32", "c33", "c3star", "D", "i", "iteration", "maxw", "minw", "na", "na1", "na2", "na3", "proba1", "proba2", "proba3", "probg", "probi", "probn1", "probn2", "probn3", "reset", "ruling", "rulings", "specimen", "t", "w1", "w2", "w3", "wbar", "x", "x1", "x11", "x12", "x13", "x2", "x21", "x22", "x23", "x3", "x31", "x32", "x33", "xstar")
#>   "a"         "aggiter"   "aggnai"    "b"         "c11"       "c12"
#>   "c13"       "c1star"    "c21"       "c22"       "c23"       "c2star"
#>  "c31"       "c32"       "c33"       "c3star"    "D"         "i"
#>  "iteration" "maxw"      "minw"      "na"        "na1"       "na2"
#>  "na3"       "proba1"    "proba2"    "proba3"    "probg"     "probi"
#>  "probn1"    "probn2"    "probn3"    "reset"     "ruling"    "rulings"
#>  "specimen"  "t"         "w1"        "w2"        "w3"        "wbar"
#>  "x"         "x1"        "x11"       "x12"       "x13"       "x2"
#>  "x21"       "x22"       "x23"       "x3"        "x31"       "x32"
#>  "x33"       "xstar"
``````

Created on 2020-03-30 by the reprex package (v0.3.0)

(all of them except for my mini-dput function specimen).

The reason that the crowded namespace is problematic comes from the nature of `R` as an interactive environment. What should be given consideration may seem like working backwards: First design the object structure that is easy to map functions to, then work backwards to populate it.

1 Like

Hi @NathanS,

I'm not sure if reshaping the data is an option for you, but in a case like this I'd recommend working on the data in long format and aggregate by group.

I have no expirience working with tibbles so I switched to data.table, please check the following:

``````library(tibble)
library(data.table)

aggiter= c()

x1 = c(0.796399, 0.856540, 0.881432, 0.900808, 0.857664,NA)
x2 = c(0.695912, 0.788911, 0.827514, 0.857664, 0.857664, NA)
x3 = c(0.653771, 0.760512, 0.804842, 0.750000, 0.857664, NA)
a = c(0.300000, 0.653771, 0.760512, 0.827514, 0.827514, 0.827514)
b = c(0.950000, 0.950000, 0.950000, 0.950000, 0.857664, 0.857664)

aggiter[] = tibble(x1,x2,x3,a,b)

x1 = c(0.87827, 0.91340, 0.92754, 0.92065, 0.92754, 0.92754, 0.92754, 0.92754, 0.92754, 0.92754, 0.92754, 0.92754, 0.92754, 0.92754, 0.85643, NA)
x2 =  c(0.68152, 0.78534, 0.82775, 0.82965, 0.85643, 0.85643, 0.85643, 0.85643, 0.85643, 0.85643, 0.85643, 0.85643, 0.85643, 0.85643, 0.85643, NA)
x3 =  c(0.67141, 0.77874, 0.82259, 0.82487, 0.77612, 0.77612, 0.77612, 0.77612, 0.77612, 0.77612, 0.77612, 0.77612, 0.77612, 0.77612, 0.85643, NA)
a = c(0.30000, 0.67141, 0.77874, 0.77874, 0.82965, 0.82965, 0.82965, 0.82965, 0.82965, 0.82965, 0.82965, 0.82965, 0.82965, 0.82965, 0.82965, 0.82965)
b = c(0.95000, 0.95000, 0.95000, 0.92754, 0.92754, 0.92754, 0.92754, 0.92754, 0.92754, 0.92754, 0.92754, 0.92754, 0.92754, 0.92754, 0.85643, 0.85643)
na = c(0, 28,  6, 57, 32, 24, 13, 92, 37, 32, 58, 17, 11, 40,  1)

aggiter[] = tibble(x1,x2,x3,a,b)

# use data.table over tibble
lapply(aggiter, setDT)

# row bind them to long format
DT <- rbindlist(aggiter)

# assign iteration groups
iteration_rows <- lapply(aggiter, nrow)
DT[, iteration := rep(seq_along(iteration_rows), iteration_rows)]
DT[, iteration_row := seq_len(.N), by = iteration]

# aggregate
sum(DT[iteration_row %in% 3]\$x1, na.rm = TRUE)
#  1.808972
mean(DT[iteration_row %in% 3]\$x1, na.rm = TRUE)
#  0.904486

# alternative: add results to existing DT
DT[, c("x1_sum", "x1_mean") := .(sum(x1, na.rm = TRUE), mean(x1, na.rm = TRUE)), by = iteration_row]
print(head(DT))
# x1       x2       x3        a        b iteration iteration_row   x1_sum   x1_mean
# 1: 0.796399 0.695912 0.653771 0.300000 0.950000         1             1 1.674669 0.8373345
# 2: 0.856540 0.788911 0.760512 0.653771 0.950000         1             2 1.769940 0.8849700
# 3: 0.881432 0.827514 0.804842 0.760512 0.950000         1             3 1.808972 0.9044860
# 4: 0.900808 0.857664 0.750000 0.827514 0.950000         1             4 1.821458 0.9107290
# 5: 0.857664 0.857664 0.857664 0.827514 0.857664         1             5 1.785204 0.8926020
# 6:       NA       NA       NA 0.827514 0.857664         1             6 0.927540 0.9275400
``````

Good question!

I see `0.832662` and ` 0.776231` there, so (hold your head in your hands!)

``````sum(c(aggiter[][3,3][],aggiter[][3,3][]))
``````

should produce the right number. If, there's no `set.seed()`, because I can't eyeball either of the numbers in your calculation.

However,

Cockroaches seldom travel alone. --Me

the return of `mean` is bogus, because I'm feeding two numbers instead of a vector

``````mean(2,4)
#>  2
mean(c(2,4))
#>  3
``````

Created on 2020-03-30 by the reprex package (v0.3.0)

Right, I just was checking your answer, which uses x1 where I was using x3.

This creates a far better data structure than the list of tibbles. With `data.table` under the hood will will speed along quickly for really big iterations. I haven't used `data.table` much, but the `reprex` is good incentive to get more into it. Thanks.

1 Like

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.