Finding elements that exceed a threshold?

tamas1 · October 20, 2021, 10:23pm

Hi,
I would like to speed up the following script somehow.

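empiria<-function(zz){
 # positions below 2 where the run of such positions is interrupted
 greater<-which(zz[1:nrow(zz)]<2)
 greater<-greater[diff(greater)!=1]
 # same idea for positions above -2
 smaller<-which(zz[1:nrow(zz)]> -2)
 smaller<-smaller[diff(smaller)!=1]
 cross<-sort(c(greater,smaller))
 # move to the element right after each interruption
 cross<-cross+1
 cross<-cross[cross>2]
 t_1nap<-c()
 # absolute change from the previous element at each crossing
 for(val in cross){
   t_1nap<-c(t_1nap,abs(zz[[val]]-zz[[val-1]]))}
 return(t_1nap)}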

zz is a time series. With this function I am trying to find the elements where the time series becomes greater than 2 or smaller than -2. For instance, in the series 1, 3, 5 I am interested only in the second element, not the third. Once I have such an element, I subtract the previous element from it. I suppose that both the for loop and the way I am adding these values to a vector one by one make the code considerably slower. Could someone suggest a way to avoid them? Thank you,

technocrat · October 21, 2021, 12:42am

for loops have their place, even though they have a reputation for being slow; that slowness typically shows up when interim results aren't written to a pre-allocated object, etc. For users coming from a procedural language background, the simpler functional approach often does not readily come to mind.
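To make the pre-allocation point concrete, here is a minimal sketch on made-up data (the vector `x` and the names `steps_loop` and `steps_vec` are only for illustration, not from the original script). It fills a result vector of known length inside the loop instead of growing it with c(), and compares that with the vectorized diff().

```r
# Made-up data; not the poster's zz
x <- c(1, 3, 5, 1, -3, -1)
n <- length(x) - 1

# for loop writing into a pre-allocated vector (no repeated c() calls)
steps_loop <- numeric(n)
for (i in seq_len(n)) {
  steps_loop[i] <- abs(x[i + 1] - x[i])
}

# vectorized equivalent: absolute difference between neighbours
steps_vec <- abs(diff(x))

all.equal(steps_loop, steps_vec)
```

Both versions compute the same absolute step sizes; the second simply lets diff() do the iteration.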

In this case, having the indices of the elements of zz that satisfy the conditions provides a starting point for further processing.

I've illustrated one solution using made-up data in the absence of zz. See the FAQ: How to do a minimal reproducible example (reprex) for beginners.

set.seed(137)
the_series <- ts(runif(120,min = -5, max = 5), frequency = 12, start = c(2010,1))
the_series[which(the_series < -2 | the_series > 2)]
#>  [1]  4.141790  2.636114  4.576050  4.113563  2.720766 -2.466418  3.043643
#>  [8]  3.735802  2.831292 -2.490212  3.707835 -2.280705 -2.258403  4.240154
#> [15]  4.221956 -4.472150 -2.899526  3.221924  2.189642  4.753328 -2.436642
#> [22]  3.341706 -4.841652 -3.325084  2.899013 -4.701774 -4.506472  2.873523
#> [29] -2.228279 -3.103191  3.616198 -4.472914  2.873050  4.287013 -3.440023
#> [36] -4.113404  2.982510 -3.183971 -4.010218  3.287020 -2.874006  4.324638
#> [43]  3.905566  4.635970 -2.791794 -4.778578 -4.809622  4.495908  2.608036
#> [50] -3.716669 -4.776359  2.303230 -3.892756  3.834854 -2.011916 -2.746332
#> [57] -2.250228  4.210272  3.397424 -3.849455  2.658914 -3.495754 -3.520557
#> [64] -2.944492 -4.975042  2.897202 -3.662780  4.416238 -3.892748  4.062615
#> [71]  4.954913  2.748661  3.597827  2.046638 -4.808251 -2.415291  2.955273
#> [78]  4.588577  2.076286
# the indices
which(the_series < -2 | the_series > 2)
#>  [1]   3   4   6   7   8  10  12  13  14  16  17  18  19  20  23  26  28  29  31
#> [20]  32  33  34  36  38  40  41  42  44  46  47  48  49  53  54  57  58  61  62
#> [39]  64  65  66  68  69  70  72  73  74  75  78  79  80  81  85  86  87  88  89
#> [58]  90  91  92  93  94  96  97  98  99 100 103 104 105 106 107 108 112 114 115
#> [77] 118 119 120
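
Building on those indices, the following is a rough sketch of how the rest of the question might be handled without a loop. It rests on one reading of the question: an element counts only if it lies beyond the threshold while the element before it does not lie beyond the same threshold, so in 1, 3, 5 only the second element qualifies. The function name `crossing_steps` and its `threshold` argument are hypothetical, and the data are assumed to contain no NA values.

```r
# Sketch only: crossing_steps() and threshold are illustrative names
crossing_steps <- function(x, threshold = 2) {
  x <- as.numeric(x)            # drop ts attributes, keep the values
  n <- length(x)
  if (n < 2) return(numeric(0))

  above <- x > threshold        # TRUE where the series is above +threshold
  below <- x < -threshold       # TRUE where the series is below -threshold

  prev <- seq_len(n - 1)        # positions 1 .. n-1
  curr <- prev + 1              # positions 2 .. n

  # an element "enters" a region if it is in it and its predecessor is not
  entered <- (above[curr] & !above[prev]) | (below[curr] & !below[prev])
  idx <- curr[entered]

  # absolute change from the previous element at each entry point
  abs(x[idx] - x[idx - 1])
}

crossing_steps(c(1, 3, 5))
#> [1] 2
```

The same call should work on the made-up series above, e.g. crossing_steps(the_series). Note that this is not guaranteed to reproduce the original script element for element, since that script also drops crossings at the second position.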
