I am a newbie in R programming.
I am having issues with large datasets in R.
I work with the rehh package.
My R version is 3.4.4.
When I try to create an object of class haplohh (from the rehh package), I get the following error:
> chr21 <- data2haplohh(hap_file="chr21.hap", map_file="chr21_replaceAA.map.inp", haplotype.in.columns=TRUE)
Map file seems OK: 1054416 SNPs declared for chromosome 21
Haplotype are in columns with no header
Error in scan(hap_file, what = "character", quiet = TRUE) : too many items
The *.hap file has 1054416 lines and 5008 columns (approx. 10 GB), and the *.map.inp file has 1054416 lines and 5 columns (15 MB). Both files are tab-delimited.
My system has 64 GB of RAM and 8 TB of storage space.
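For scale, here is a rough back-of-the-envelope check of how many items scan() would have to read from the .hap file, compared with the largest 32-bit integer in R. It is only a sketch: the variable names are made up for illustration, and that this limit is what triggers "too many items" is my assumption, not something confirmed by the rehh documentation. The memory figure likewise assumes 8 bytes per element pointer on a 64-bit build and ignores the strings themselves.

```r
# Dimensions reported above for chr21.hap (hypothetical variable names)
n_snps <- 1054416   # lines in the .hap file
n_cols <- 5008      # columns in the .hap file

# Numeric literals are doubles in R, so this product does not overflow
n_items <- n_snps * n_cols

# Largest value of R's 32-bit signed integer type (2^31 - 1)
int_max <- .Machine$integer.max

cat("items to read:      ", format(n_items, big.mark = ",", scientific = FALSE), "\n")
cat("32-bit integer max: ", format(int_max, big.mark = ","), "\n")
cat("exceeds the limit:  ", n_items > int_max, "\n")

# Rough lower bound on RAM for one character vector of that length,
# ASSUMING 8 bytes per element pointer (64-bit R), strings not included
cat("pointer storage, GiB:", round(n_items * 8 / 2^30, 1), "\n")
```

If this reading is right, the file holds more fields than a single scan() call can count, regardless of how much RAM is free.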
I have the option of splitting the files into smaller pieces, but that would introduce errors in my calculations, so I need to load the input files as a whole.
I wrote to the authors of the rehh package and posted this issue on Stack Overflow and in other communities, but have not received a response from anywhere.
Any reply or suggestions would be really appreciated.