Hi guys,
Is there a way to get the quantile of a column? My dataset has 130 million rows. I tried with this instruction, but I can't get the quantile in a reasonable time.
Quantile calculations require sorting the data, at least partially, so you won't get a quantile much faster than you can sort the data, and sorting 130 million values is time-consuming.
I'm not sure where the trade-offs in time and complexity lie, but here is one possibility.
If you can be satisfied with less precision, you could round the data to a few decimal places and count how many rows fall on each rounded value, then pass the much smaller set of weighted rows to a function such as 'weighted_quantile' in the MetricsWeighted package.
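
To make the round-then-count idea concrete, here is a minimal sketch in plain Python rather than R. It is not the MetricsWeighted implementation: the function name `binned_quantile` and the `digits` parameter are made up for illustration, and it uses a simple inverse-CDF rule (the smallest rounded value whose cumulative count reaches the requested share), so results can differ slightly from R's default interpolating quantile.

```python
from collections import Counter


def binned_quantile(values, prob, digits=2):
    """Approximate the `prob` quantile of `values` via rounding and counting.

    Hypothetical helper for illustration only: values are rounded to
    `digits` decimal places, the rounded values are counted, and only the
    distinct rounded values (usually far fewer than the rows) are sorted.
    """
    if not 0.0 <= prob <= 1.0:
        raise ValueError("prob must be between 0 and 1")

    # One pass over the data: the count of each rounded value is its weight.
    counts = Counter(round(v, digits) for v in values)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("values must not be empty")

    # Walk the distinct values in order until the cumulative weight
    # reaches the requested share of all rows.
    target = prob * total
    cumulative = 0
    for value in sorted(counts):
        cumulative += counts[value]
        if cumulative >= target:
            return value
    return max(counts)  # safety net; not expected to be reached
```

For example, `binned_quantile(column, 0.5, digits=2)` would give an approximate median that is accurate to about the rounding step. The sort now runs over the distinct rounded values instead of all 130 million rows, which is where any time saving would come from; how much you gain depends on how many distinct values remain after rounding.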