Compare sampling between R and Python

I am working on a validation exercise in which one programmer is using Python and the other R. Our hope is to obtain agreement to within roundoff error for a variety of analyses. Unfortunately, one of the analyses involves bootstrapping, which relies on sampling with replacement. We have become stuck on this one because the two platforms randomize differently. We can of course set the seed in each package, but the random number generation algorithms are different, and the way each one uses a seed may differ as well. Even if we could sync the random number generation, the sampling algorithms built on top of it are potentially different.

Perhaps there is a bootstrap package that gives the same results on the two platforms. But if we can just sync the sampling we can easily write the rest of the bootstrapping by hand.

Any suggestions for syncing these algorithms would be most appreciated.

You could pick one environment to do the sampling in, record which samples are chosen, and force those samples in the other environment.

You could even guard against bias toward either platform by doing it twice:
First run: use a side process in R to pick the samples, and have both the R and Python main testing processes use those samples.
Second run: use a side process in Python to pick the samples, and have both the R and Python main testing processes use those samples.
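
As a rough illustration of the Python-side version of this idea, here is a minimal sketch that draws the resample indices once, writes them to a CSV file, and then computes a bootstrap statistic from the stored indices rather than from a live random number generator. The function names, the file layout (one resample per row), and the choice to store 1-based indices so that R can use them directly are all assumptions made for this example, not something from the original posts.

```python
import csv
import random


def draw_bootstrap_indices(n_obs, n_reps, seed):
    """Draw n_reps resamples of size n_obs, with replacement.

    Indices are stored 1-based (an assumption of this sketch) so that
    R can use them for subsetting without any adjustment.
    """
    rng = random.Random(seed)
    return [
        [rng.randrange(n_obs) + 1 for _ in range(n_obs)]
        for _ in range(n_reps)
    ]


def save_indices(indices, path):
    """Write one resample per row to a plain CSV file with no header."""
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(indices)


def load_indices(path):
    """Read the index file back as a list of lists of ints."""
    with open(path, newline="") as f:
        return [[int(v) for v in row] for row in csv.reader(f)]


def bootstrap_means(data, indices):
    """Compute the mean of each resample defined by the stored indices."""
    # Convert the stored 1-based indices back to Python's 0-based positions.
    return [sum(data[i - 1] for i in row) / len(row) for row in indices]
```

On the R side, something like reading the same file with read.csv(path, header = FALSE) and subsetting the data vector by each row should reproduce the same resamples, so any remaining differences would come from the analysis code and not from the sampling. For the second run, the roles are reversed and R writes the file.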

Thanks for the suggestion. I believe that will satisfy our purposes.
