This is a discussion on Re: Group-count estimation statistics within the pgsql Hackers forums, part of the PostgreSQL category.
> From: Sailesh Krishnamurthy <sailesh@cs.berkeley.edu>
> >>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
>
> Tom> The only real solution, of course, is to acquire cross-column
> Tom> statistics, but I don't see that happening in the near
> Tom> future.
>
> Another approach is a hybrid hashing scheme where we use a hash table
> until we run out of memory at which time we start spilling to disk. In
> other words, no longer use SortAgg at all ..
>
> Under what circumstances will a SortAgg consume more IOs than a
> hybrid hash strategy ?

Goetz Graefe did a heck of a lot of analysis of this, prior to his being snapped up by Microsoft. He also worked out a lot of the nitty-gritty for hybrid hash algorithms, extending the Grace hash for spill-to-disk, and adding a kind of recursion for really huge sets.

The figures say that hybrid hash beats sort-aggregate, across the board.
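
For anyone who wants to see the shape of the idea, here is a minimal sketch of hash aggregation that spills to disk and recurses on the spilled partitions. It is a simplified illustration, not Graefe's algorithm and not anything in the PostgreSQL source: the names `hybrid_hash_count`, `max_groups` and `fanout` are made up for the example, "memory" is approximated as a cap on the number of groups held in the hash table, and the aggregate is a plain count per key.

```python
import pickle
import tempfile
import zlib


def _partition_of(key, fanout, depth):
    # Deterministic hash, salted with the recursion depth so that a
    # re-partitioned spill file is spread out differently at each level.
    return zlib.crc32(repr((depth, key)).encode()) % fanout


def _read_keys(f):
    # Stream keys back from a spill file, one pickled object at a time.
    while True:
        try:
            yield pickle.load(f)
        except EOFError:
            return


def hybrid_hash_count(keys, max_groups=1000, fanout=4, _depth=0):
    """Yield (key, count) for an iterable of group keys.

    Assumption: max_groups (>= 1) stands in for the memory budget.
    """
    table = {}                # groups aggregated in memory
    spill = [None] * fanout   # spill files, created only when needed

    for key in keys:
        if key in table:
            table[key] += 1
        elif len(table) < max_groups:
            table[key] = 1
        else:
            # Table is full and this group is not resident: write the
            # row to the partition chosen by its hash.
            p = _partition_of(key, fanout, _depth)
            if spill[p] is None:
                spill[p] = tempfile.TemporaryFile()
            pickle.dump(key, spill[p])

    # Resident groups are complete: a key that made it into the table
    # is never spilled afterwards.
    yield from table.items()

    # Each spill file holds a disjoint set of keys; aggregate each one
    # with the same procedure (the "recursion for really huge sets").
    for f in spill:
        if f is None:
            continue
        f.seek(0)
        yield from hybrid_hash_count(_read_keys(f), max_groups, fanout, _depth + 1)
        f.close()
```

Usage would look like `dict(hybrid_hash_count(rows, max_groups=8, fanout=3))`. Note that no sort is needed at any point, and input that fits in the table never touches disk. A real implementation would budget by bytes rather than group count and choose the partition fanout from the estimated number of groups, which is where the group-count statistics in this thread come in.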