Use more efficient hashtable for execGrouping.c to speed up hash aggregation. faster-hashtable
author Andres Freund <[email protected]>
Thu, 21 Jul 2016 19:51:14 +0000 (12:51 -0700)
committer Andres Freund <[email protected]>
Sat, 1 Oct 2016 00:24:21 +0000 (17:24 -0700)
commit 70a82c0808d6c9a6cf2fd6448bbdfc17345b977a
tree 0cdb4034f56437de1cb28cd6acb616734f5661d8
parent 65fdd6a193eb9ca83134d95c37ab2f0f4519baa0
Use more efficient hashtable for execGrouping.c to speed up hash aggregation.

The more efficient hashtable significantly speeds up hash aggregations
with more than a few hundred groups. Improvements of over 120% have been
measured.

Due to the different hash table, queries whose result order is not
fully determined (e.g. GROUP BY without ORDER BY) may change their
result order.

The conversion is largely straightforward, except that, due to the
static element types of simplehash.h style hash tables, the additional
data some users store in elements (e.g. the per-group working data for
hash aggregates) is now stored in TupleHashEntryData->additional.  The
meaning of BuildTupleHashTable's entrysize (renamed to additionalsize)
has been changed to only be about the additionally stored size.  That
size is only used for the initial sizing of the hash table.
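
As a rough illustration of the layout described above, here is a
minimal, self-contained sketch of a table with fixed-size elements
whose caller-specific data hangs off an "additional" pointer.  It is
not the execGrouping.c or simplehash.h code: DemoEntry, DemoTable,
demo_create, demo_lookup and demo_count_row are hypothetical names, the
key is a plain integer rather than a tuple, the table never grows, and
additionalsize is not modelled at all.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a fixed-size hash element. */
typedef struct DemoEntry
{
    uint32_t key;        /* stand-in for the grouping key */
    int      in_use;     /* 0 = empty slot, 1 = occupied */
    void    *additional; /* caller-owned per-group data */
} DemoEntry;

typedef struct DemoTable
{
    size_t     nbuckets; /* assumed to be a power of two */
    DemoEntry *buckets;  /* every element has the same static size */
} DemoTable;

static DemoTable *
demo_create(size_t nbuckets)
{
    DemoTable *t = malloc(sizeof(DemoTable));

    assert(t != NULL);
    t->nbuckets = nbuckets;
    t->buckets = calloc(nbuckets, sizeof(DemoEntry));
    assert(t->buckets != NULL);
    return t;
}

/* Find or insert key using linear probing; returns NULL if full. */
static DemoEntry *
demo_lookup(DemoTable *t, uint32_t key, int *isnew)
{
    size_t mask = t->nbuckets - 1;
    size_t i = (key * 2654435761u) & mask;
    size_t probes;

    for (probes = 0; probes < t->nbuckets; probes++, i = (i + 1) & mask)
    {
        DemoEntry *e = &t->buckets[i];

        if (!e->in_use)
        {
            /* New group: the table itself leaves additional unset. */
            e->in_use = 1;
            e->key = key;
            e->additional = NULL;
            *isnew = 1;
            return e;
        }
        if (e->key == key)
        {
            *isnew = 0;
            return e;
        }
    }
    return NULL;
}

/* Caller side: keep a per-group row counter behind additional. */
static long
demo_count_row(DemoTable *t, uint32_t key)
{
    int        isnew;
    DemoEntry *e = demo_lookup(t, key, &isnew);

    assert(e != NULL);
    if (isnew)
        e->additional = calloc(1, sizeof(long));
    assert(e->additional != NULL);
    return ++*(long *) e->additional;
}
```

In this sketch the table only ever sees elements of one static size,
and each caller decides what the additional pointer refers to, here a
single counter per group.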

TODO:
* Should hash agg size estimation for the planner consider the
  fillfactor?
13 files changed:
src/backend/executor/execGrouping.c
src/backend/executor/nodeAgg.c
src/backend/executor/nodeRecursiveunion.c
src/backend/executor/nodeSetOp.c
src/backend/executor/nodeSubplan.c
src/include/executor/executor.h
src/include/nodes/execnodes.h
src/test/regress/expected/matview.out
src/test/regress/expected/psql.out
src/test/regress/expected/tsrf.out
src/test/regress/expected/union.out
src/test/regress/expected/with.out
src/test/regress/sql/psql.sql