I am trying to compute correlations on a large dataset with over 100,000 columns, and neither subsetting nor parallelization is working. Is there any other method that I can try?
Are these columns all numeric variables? About how many rows? Are you looking to do pairwise correlations?
How many pairing combinations are there across 100,000 possible inputs?
choose(100000,2) # 4999950000
That is 4,999,950,000, or roughly 5 billion pairs.
It's asking to do a lot of work.
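One thing that might help, as a rough sketch rather than a definitive fix: a full 100,000 x 100,000 correlation matrix of doubles would need about 80 GB, so instead of building it in one go you could correlate blocks of columns against each other and keep only the pairs you care about. The helper name `blockwise_cor`, the `block_size` and `threshold` arguments, and the toy data are all made up for illustration, and this assumes every column is numeric and that you only need pairs above some cutoff.

```r
# Hypothetical helper: correlate columns block by block and keep only
# pairs whose absolute correlation reaches `threshold`.
blockwise_cor <- function(x, block_size = 1000, threshold = 0.9) {
  p <- ncol(x)
  starts <- seq(1, p, by = block_size)
  hits <- list()
  for (i in seq_along(starts)) {
    cols_i <- starts[i]:min(starts[i] + block_size - 1, p)
    for (j in i:length(starts)) {
      cols_j <- starts[j]:min(starts[j] + block_size - 1, p)
      # cor(x, y) returns a length(cols_i) x length(cols_j) matrix
      r <- cor(x[, cols_i, drop = FALSE], x[, cols_j, drop = FALSE])
      idx <- which(abs(r) >= threshold, arr.ind = TRUE)
      a <- cols_i[idx[, "row"]]
      b <- cols_j[idx[, "col"]]
      # a < b drops self-pairs and mirrored duplicates in diagonal blocks
      keep <- a < b
      if (any(keep)) {
        hits[[length(hits) + 1]] <- data.frame(
          col_a = a[keep], col_b = b[keep], r = r[idx][keep]
        )
      }
    }
  }
  if (length(hits) == 0) {
    return(data.frame(col_a = integer(), col_b = integer(), r = numeric()))
  }
  do.call(rbind, hits)
}

# Toy data (an assumption for illustration): 50 random columns,
# with column 7 built to track column 3 closely.
set.seed(1)
x <- matrix(rnorm(200 * 50), nrow = 200)
x[, 7] <- x[, 3] + rnorm(200, sd = 0.01)
res <- blockwise_cor(x, block_size = 10, threshold = 0.9)
print(res)
```

This still touches all ~5 billion pairs, so it will not make the job fast, but memory stays bounded by the block size, and each pair of blocks is independent if you want to farm them out to workers.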