Parallel Computing problem

Dear Experts,

First of all thanks for reading my question.

I have a series, say "x", comprising 1.8 million observations collected at 5-minute intervals over a period of 18 years. I would like to shuffle this series, say 100 times, such that the shuffled series x1 to x100 are generated by a function and stored in an array object abc:
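The generating step looked roughly like this (a minimal sketch; the small x below is a stand-in for my real 1.8-million-point series, the shuffling is assumed to be a plain random permutation via sample(), and the seed is only for reproducibility):

```r
# Stand-in for the real 5-minute series (1.8 million points in practice)
x <- rnorm(1000)

set.seed(42)                               # reproducibility (optional)
n   <- length(x)
abc <- array(NA_real_, dim = c(n, 1, 100)) # one shuffled copy per slice
for (i in 1:100) {
  abc[, , i] <- sample(x)                  # random permutation of x
}
```

Each slice abc[, , i] then holds one shuffled copy of x.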


Now I have a big data object abc with dimensions:

dim(abc)  # 1800000 1 100

Now I want to compute the Hurst exponent of these series using the DFA method, via the DFA() function from the fractal package. I wrote a wrapper called DFAfunction as follows:

DFAfunction <- function(x) {
  DFA(x, detrend = "poly1", sum.order = 0, overlap = 0,
      scale.max = trunc(length(x)/1), scale.min = NULL,
      scale.ratio = 2, verbose = FALSE)
}

I then tried to calculate this DFA Hurst exponent for all 100 series with a function like this:
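The serial version was essentially the following (a sketch with toy stand-ins so it runs standalone: in the real code, abc is the (1800000, 1, 100) array of shuffles and DFAfunction is the DFA wrapper above):

```r
# Toy stand-ins for the real objects described above
abc <- array(rnorm(1000 * 100), dim = c(1000, 1, 100))
DFAfunction <- function(x) mean(x)   # placeholder for the fractal::DFA call

# Serial computation: apply() with MARGIN = 3 hands each of the 100
# shuffled series to DFAfunction one at a time, giving 100 results.
# With the real DFAfunction this is the step that takes ~10 days.
DFA_serial <- apply(abc, 3, DFAfunction)
```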


But this function takes almost 10 days to compute the results.

Then I read somewhere about the parallel package and used the parApply function, like this:

parallel::clusterEvalQ(c1, { library(PerformanceAnalytics); library(fractal) })
parallel::clusterExport(c1, "DFAfunction")
DFA_SS_FS <- parallel::parApply(c1, abc, MARGIN = 3, FUN = DFAfunction)

Again it is taking too much time; it has now been running for the last 3 days.
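For completeness, here is a self-contained sketch of the whole parallel setup (with a toy stand-in for DFAfunction so it runs without fractal; swap in the real wrapper and load fractal on the workers):

```r
library(parallel)

## Toy stand-ins: replace with the real abc array and the DFA-based wrapper.
abc <- array(rnorm(1000 * 100), dim = c(1000, 1, 100))
DFAfunction <- function(x) mean(x)      # placeholder only

c1 <- makeCluster(2)                    # use detectCores() - 1 in practice
clusterExport(c1, "DFAfunction")        # ship the helper to every worker
# clusterEvalQ(c1, library(fractal))    # needed once DFAfunction uses DFA()

## MARGIN = 3 sends each of the 100 series to a worker in turn; the speedup
## is capped at the number of cores, since there are only 100 tasks.
DFA_SS_FS <- parApply(c1, abc, MARGIN = 3, FUN = DFAfunction)

stopCluster(c1)
```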

Can anyone help me by writing some code, or by pointing me in a direction that would make my code run faster?

Please bear in mind that I started using R only a couple of months ago, so I am a new user.

Please help,


Raheel Asif

A post was merged into an existing topic: Parallel Computing for Large Dataset

Please do not duplicate open topics