Tomas Vondra [Sat, 4 Mar 2017 04:02:52 +0000 (05:02 +0100)]
resolve issues in create_index regression suite (obsolete error messages)
The expected output still included error messages that became
obsolete some time ago, when the table was made replicated.
Tomas Vondra [Sat, 4 Mar 2017 03:48:01 +0000 (04:48 +0100)]
fix regression failures in insert_conflict by replicating the table
The table 'insertconflict' has to be replicated, because the tests
need a unique index on (coalesce(a,0)). That is not possible on
distributed tables, because the index expression has to match the
distribution key and we only support columns there.
Tomas Vondra [Sat, 4 Mar 2017 01:17:42 +0000 (02:17 +0100)]
resolve failures in the 'alter_table' regression suite
The failures were due to foreign keys conflicting with distribution.
Resolved by replicating the tables used in the test suite.
Tomas Vondra [Sat, 4 Mar 2017 00:20:47 +0000 (01:20 +0100)]
accept plan changes due to Sort pushdown in xc_FQS_join
redistribute_path() makes sure the Sort is always pushed down, so
the plans generally change from this:
   ->  Sort
         Output: tab3_rep.val, tab3_rep.val2
         Sort Key: tab3_rep.val, tab3_rep.val2
         ->  Remote Subquery Scan on all
               Output: tab3_rep.val, tab3_rep.val2
               ->  Seq Scan on public.tab3_rep
                     Output: tab3_rep.val, tab3_rep.val2
                     Filter: (tab3_rep.val > 2)
to this:
   ->  Remote Subquery Scan on all
         Output: tab3_rep.val, tab3_rep.val2
         ->  Sort
               Output: tab3_rep.val, tab3_rep.val2
               Sort Key: tab3_rep.val, tab3_rep.val2
               ->  Seq Scan on public.tab3_rep
                     Output: tab3_rep.val, tab3_rep.val2
                     Filter: (tab3_rep.val > 2)
The "Remote Subquery" node only shows "Sort Key" in verbose mode.
Tomas Vondra [Sat, 4 Mar 2017 00:10:23 +0000 (01:10 +0100)]
fix plan change in xl_join due to proper costing of the path
Due to accounting for the extra Sort node in redistribute_path(), the
costing now works properly. Originally the Sort node was injected in
createplan.c after the planning was done (so all the costing was
already done, and tweaking the costs would have been pointless).
Tomas Vondra [Fri, 3 Mar 2017 23:03:01 +0000 (00:03 +0100)]
accept plan (added RemoteSubplan) in LockRows query ('select' suite)
Expected plan change for a new query added in the upstream, due to
adding RemoteSubplan at the top.
Tomas Vondra [Fri, 3 Mar 2017 16:08:43 +0000 (17:08 +0100)]
accept plan changes caused by Aggregate -> Partial/Finalize Aggregate
Most of the plan changes are obviously correct, caused by trivial
changes due to upstream now supporting partial aggregation. There are
a few larger changes (including e.g. switch from hash to sort), but
those are supported by also checking the query results.
Pavan Deolasee [Fri, 24 Feb 2017 10:05:24 +0000 (15:35 +0530)]
Look into the initPlans attached to lefttree of RemoteSubplan while deciding
whether correct variables are being referenced
This helps us to fix issue #81. It's not immediately clear if we should handle
this in a more elegant manner than what we've done here. In PostgreSQL the
initplans are always attached to the top level plan, but in XL we add a
RemoteSubplan node on top of the top level plan. Unless we take into account
vars generated by the initPlans, we might incorrectly conclude that a certain
var is not accessible by the subquery.
Pavan Deolasee [Thu, 9 Feb 2017 08:05:46 +0000 (13:35 +0530)]
Handle locking correctly in a global session.
Rearrange code so that we first check if the backend belongs to a global
session and if another member of the same session is holding a conflicting
lock. In such cases, the requesting backend does not wait and is granted the
lock immediately.
PG 9.6 changed a few things in this area to support locking for parallel
sessions. Maybe we can reuse that code in XL, but for now just use XL's code
for supporting distributed sessions.
Tomas Vondra [Sun, 29 Jan 2017 18:15:29 +0000 (19:15 +0100)]
fix handling of inner/outer sortkeys in set_joinpath_distribution()
We must not reset inner/outer sortkeys unless the path is actually
redistributed. When a redistribution is not needed, the merge join
needs the original values. Otherwise this results in errors like
ERROR: inner pathkeys do not match mergeclauses
ERROR: outer pathkeys do not match mergeclauses
Tomas Vondra [Sun, 29 Jan 2017 17:45:55 +0000 (18:45 +0100)]
resolve most issues in the matview test suite
Those were fairly trivial issues - change of relation names to include
'mvtest_' prefix, ordering of SELECT results etc. There are a few
remaining differences in query plans - those may easily be correct,
but more detailed verification is needed.
Note: There seem to be quite a few differences when compared to
the matview.sql from commit b5bce6c. Not sure why, but we probably
need to compare the tests to upstream and minimize the difference.
Tomas Vondra [Sun, 29 Jan 2017 13:17:12 +0000 (14:17 +0100)]
fix ordering of results in the select_implicit regression test
A simple case of expected output not matching the ORDER BY.
Tomas Vondra [Sun, 29 Jan 2017 13:13:18 +0000 (14:13 +0100)]
remove GROUPING SETS from xl_limitations test for now
XL now handles grouping sets, although in a rather stupid way by
redistributing the data and executing them on the consumer node.
We may decide not to include that in 9.6 until we can distribute
(and possibly parallelize) the grouping sets, but for now let's
make the regression tests happy.
Tomas Vondra [Sun, 29 Jan 2017 02:14:36 +0000 (03:14 +0100)]
fix syntax error in foreign_key test suite
It's DISTRIBUTE BY HASH (x), not just DISTRIBUTE BY (x).
Tomas Vondra [Sun, 29 Jan 2017 02:06:51 +0000 (03:06 +0100)]
fix plans in tests broken by adding Sort Key to EXPLAIN(VERBOSE)
This fixes explain plans broken only because of the extra bit of
information added to EXPLAIN VERBOSE output. If the plan was broken
for some other reasons too, this does not fix it.
Tomas Vondra [Sun, 29 Jan 2017 01:37:33 +0000 (02:37 +0100)]
fix failures due to distributed update/delete plans in some tests
Some tests are updating multiple columns of input tables, which is
not supported by XL. So instead of failing due to this limitation,
simply replicate the whole table to test whatever was the original
intent of the test.
All the UPDATE/DELETE cases fixed by this commit were failing in
XL 9.5, so none of this should be a regression.
Tomas Vondra [Sun, 29 Jan 2017 01:08:08 +0000 (02:08 +0100)]
fix breakage in 'rowsecurity' due to renamed roles used in tests
The tests originally used role names adam, bob, eve, etc. But those
roles got renamed to regress_rls_adam, regress_rls_bob etc. So fix
the expected output accordingly. Also fix some minor breakage due
to modified wording of error messages and/or data tweaks.
Tomas Vondra [Sat, 28 Jan 2017 22:43:01 +0000 (23:43 +0100)]
redistribute Limit nodes nested in MinMaxAggPath
Make sure min/max path are redistributed. Doing it in createplan.c
is probably only a temporary solution, though. First, it breaks
the costing model (the path cost does not include the redistribution),
and it also probably breaks how distribution is propagated up the
path tree.
Tomas Vondra [Sat, 28 Jan 2017 15:56:34 +0000 (16:56 +0100)]
add xl_bugs to expected/.gitignore and sql/.gitignore
Tomas Vondra [Sat, 28 Jan 2017 15:27:20 +0000 (16:27 +0100)]
add pgxc_clean/pgxc_ctl/pgxc_monitor binaries to .gitignore
Make the 'git status' a bit less verbose by ignoring the binaries,
and also the generated .c/.h files in pgxc_ctl.
Tomas Vondra [Sat, 28 Jan 2017 15:17:37 +0000 (16:17 +0100)]
remove unnecessary expected output for regression tests
There were quite a few unnecessary files in src/test/regress/expected,
not present at upstream, probably due to previous merges. As those
are most likely stale, let's just get rid of them.
Tomas Vondra [Sat, 28 Jan 2017 13:16:37 +0000 (14:16 +0100)]
minor fixes in 'cluster' regression suite (CREATE TABLE, extra .out files)
The 'CREATE TABLE clstr_tst' was wrong, as it needs to reference the
column and not the foreign key constraint. The table can't have a PK
on 'a' (does not include the distribution key), so remove that from
the expected list of constraints etc.
Also remove the unnecessary cluster_1.out and cluster_2.out files.
Tomas Vondra [Sat, 28 Jan 2017 12:55:11 +0000 (13:55 +0100)]
start both coordinators in pg_regress with is_main=false
The only difference seems to be that with is_main=false pg_regress
does not override listen_addresses=*, which makes debugging of running
regression tests (e.g. when the tests get stuck) easier.
Tomas Vondra [Mon, 23 Jan 2017 00:54:30 +0000 (01:54 +0100)]
use log_line_prefix='%m ..' and log_error_verbosity='verbose' in pg_regress
This makes debugging of errors and crashes a tad easier. Timestamps
with milliseconds allow more precise correlation of events from
multiple log files (coordinators vs. datanodes). Verbose error log
messages include file and line number where the error was reported.
Tomas Vondra [Sun, 22 Jan 2017 23:51:29 +0000 (00:51 +0100)]
fix trivial failures in 'inherit' tests ('Sort Key' in EXPLAIN VERBOSE)
Tomas Vondra [Sun, 22 Jan 2017 22:57:31 +0000 (23:57 +0100)]
propagate distribution through UpperUniquePath and SetOpPath
Simply create a copy of the subpath's distribution, just like in most
other paths.
Tomas Vondra [Sun, 22 Jan 2017 22:54:39 +0000 (23:54 +0100)]
fix _readSetOp(), _readMergeAppend(), _readUnique() and _outSetOp()
Without this, the queries in 'union' test suite were crashing with
errors like "did not find '}' at end of input node" and so on. Fixed
mostly by reverting to XL 9.5 code.
Tomas Vondra [Sun, 22 Jan 2017 20:11:39 +0000 (21:11 +0100)]
properly redistribute WindowAgg input, fix plan change in window.out
We can't push down the whole WindowAgg (at least for now), so
we need to redistribute the input properly. Luckily thanks to the
parallel aggregates, we get plans like this:
                           QUERY PLAN
-----------------------------------------------------------------
 WindowAgg
   ->  Finalize GroupAggregate
         Group Key: (tenk1.ten + tenk1.four)
         ->  Remote Subquery Scan on all (datanode_1,datanode_2)
               ->  Sort
                     Sort Key: ((tenk1.ten + tenk1.four))
                     ->  Partial HashAggregate
                           Group Key: (tenk1.ten + tenk1.four)
                           ->  Seq Scan on tenk1
(9 rows)
That however means mild breakage in window.out, due to the partial
aggregates etc.
Tomas Vondra [Sun, 22 Jan 2017 19:40:00 +0000 (20:40 +0100)]
make sure LockRows paths preserve distribution, update plans in xc_for_update
LockRows paths were not preserving the subpath's distribution, so fix
that. Also update minor plan difference (VERBOSE now shows sort key
for the redistribution step).
Tomas Vondra [Sun, 22 Jan 2017 19:17:03 +0000 (20:17 +0100)]
resolve failures in copy2 regression tests, get rid of copy2_1.out
A simple error in column name. The copy2_1.out was removed (or more
precisely, copy2.out was removed and copy2_1.out was renamed).
Tomas Vondra [Sun, 22 Jan 2017 19:11:37 +0000 (20:11 +0100)]
add space at the end of log_line_prefix in pg_regress
The GUC was missing a space at the end, so the log lines looked like
...,global_session=coord1_8874STATEMENT: ...
which is not particularly readable. With the fix it's
...,global_session=coord1_8874 STATEMENT: ...
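The effect of the missing space can be sketched in a few lines of C. This
is an illustration only: format_log_line() is a hypothetical helper, not
a function from the server, and it merely assumes that the message is
appended directly after the expanded prefix.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: emit the already-expanded prefix immediately
 * followed by the message, with no separator added in between.
 */
int
format_log_line(char *buf, size_t len, const char *prefix, const char *msg)
{
    return snprintf(buf, len, "%s%s", prefix, msg);
}
```

Since nothing is inserted between the two parts, any separator has to be
part of the prefix itself.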
Tomas Vondra [Sun, 22 Jan 2017 18:06:42 +0000 (19:06 +0100)]
remove the 'distribution' field from SubqueryScanPath
Considering there's a distribution field in every Path (which
the SubqueryScanPath includes), this seems unnecessary and quite
confusing, because sometimes we work with (Path*) directly, and
sometimes through the 'path' field. It took me quite a bit of
time to realize that when create_subqueryscan_path does
pathnode->distribution = X
it's not the same as doing
pathnode->path.distribution = X
Note: It's possible that we in fact need the separate field for
somehow transformed distribution?
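The confusion between the two assignments can be reproduced with a
minimal sketch. The struct layouts below are simplified stand-ins and
not the actual XL definitions; only the field names 'path' and
'distribution' are taken from the description above, and
visible_through_path() is a hypothetical helper.

```c
#include <stddef.h>

/* Simplified, hypothetical stand-ins for the planner structs. */
typedef struct Distribution
{
    char        type;
} Distribution;

typedef struct Path
{
    Distribution *distribution;     /* field present in every Path */
} Path;

typedef struct SubqueryScanPath
{
    Path          path;             /* embedded base struct */
    Distribution *distribution;     /* the redundant field being removed */
} SubqueryScanPath;

/* Returns 1 if code working with a plain (Path *) sees a distribution. */
int
visible_through_path(SubqueryScanPath *pathnode)
{
    Path       *generic = (Path *) pathnode;

    return generic->distribution != NULL;
}
```

With both fields present, setting pathnode->distribution leaves the
embedded path.distribution untouched, so code that only sees the
(Path *) pointer never notices the assignment.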
Tomas Vondra [Sun, 22 Jan 2017 17:57:40 +0000 (18:57 +0100)]
add a redistribution missing in create_distinct_paths
A redistribution was missing in the loop handling paths already
sorted for the DISTINCT case. So add it there.
Tomas Vondra [Sun, 22 Jan 2017 17:52:20 +0000 (18:52 +0100)]
remove the comment about need for Sort in create_distinct_paths
The explicit Sort is not needed, because RemoteSubplan will take
care of that (using the built-in SimpleSort, if needed).
Tomas Vondra [Sun, 22 Jan 2017 17:36:24 +0000 (18:36 +0100)]
fix updatable_views.out to expect base_tbl to be replicated
Tomas Vondra [Sun, 22 Jan 2017 17:29:09 +0000 (18:29 +0100)]
fix width-related plan differences in xl_plan_pushdown tests
The plans only differ in width, and the new width values seem to
match the output. This was likely due to plan pushdown happening
in create_plan, after the planning is done.
Tomas Vondra [Sun, 22 Jan 2017 16:53:39 +0000 (17:53 +0100)]
accept valid distributed plans for LIMIT queries in regression tests
This typically means changing Limit->Scan and Limit->Sort plans to
Limit->Remote->Limit->Scan and Limit->Remote->Limit->Sort.
Tomas Vondra [Sun, 22 Jan 2017 16:52:18 +0000 (17:52 +0100)]
fix regression test breakage due to minor changes in output
Typically change in error message wording, description output etc.
Tomas Vondra [Sun, 22 Jan 2017 16:49:27 +0000 (17:49 +0100)]
fix ordering of results in the tsearch2 regression suite
Tomas Vondra [Sun, 22 Jan 2017 16:47:59 +0000 (17:47 +0100)]
accept valid plan changes to parallel queries in select_parallel
This typically means additional Remote Subquery Scan somewhere in
the plan.
Tomas Vondra [Sun, 22 Jan 2017 16:45:46 +0000 (17:45 +0100)]
fix tests broken by the unification of role names
A bunch of regression suites were broken by upstream changes to role
names used for testing.
Tomas Vondra [Sun, 22 Jan 2017 16:43:19 +0000 (17:43 +0100)]
add pgxc_prepared_xact results to prepared_xacts expected output
The SQL file was calling pgxc_prepared_xact(), but the results were
not included in the expected file.
Tomas Vondra [Sun, 22 Jan 2017 16:40:59 +0000 (17:40 +0100)]
fix failures in 'privileges' test suite, related to large objects
Postgres-XL currently does not support large objects, so accept the
'not supported' errors in the output.
Tomas Vondra [Sun, 22 Jan 2017 16:35:34 +0000 (17:35 +0100)]
resolve failures in 'gist' test suite by making them deterministic
As described in e39c4afcfa, the results were platform dependent.
Tomas Vondra [Sun, 22 Jan 2017 16:31:06 +0000 (17:31 +0100)]
resolve failures in the 'foreign_key' test suite
The temporary table t2 needs to be distributed on (b) because of
the foreign key. Remove unnecessary foreign_key_1.out.
Tomas Vondra [Sun, 22 Jan 2017 16:26:34 +0000 (17:26 +0100)]
resolve failures in the 'rules' test suite
Accept the less fancy output in ruleutils tests, and remove the
rules_2.out expected output.
Tomas Vondra [Sun, 22 Jan 2017 16:22:43 +0000 (17:22 +0100)]
fix cluster regression test by properly distributing the clstr_tst
Without the DISTRIBUTE BY clause, the table was distributed by the
first column, resulting in a failure due to the foreign key.
Tomas Vondra [Sun, 22 Jan 2017 16:19:11 +0000 (17:19 +0100)]
restore expected output for groupingsets test suite from upstream
Since 44fd89e21 (and a few additional commits), XL should generate
correct paths for grouping sets instead of rejecting them as an
unsupported feature. So restore the expected output from upstream,
as XL should produce the same results.
Tomas Vondra [Sun, 22 Jan 2017 16:14:12 +0000 (17:14 +0100)]
fix foreign_data regression tests, remove foreign_data_1.out
As XL does not support FDW, the resolution was mostly about adding
the 'does not exist' errors and narrower table listings as expected.
Also remove the foreign_data_1.out, mostly for the same reasons as
arrays_1.out (unnecessary and stale).
Tomas Vondra [Sun, 22 Jan 2017 16:03:34 +0000 (17:03 +0100)]
minor fixes in 'arrays' regression tests
The ORDER BY is not needed, because arr_tbl is replicated and so the
output is as stable as on regular PostgreSQL.
Remove arrays_1.out, as it's stale and unnecessary (arrays.out is
used anyway).
Tomas Vondra [Fri, 20 Jan 2017 23:55:23 +0000 (00:55 +0100)]
fix output of json_agg(), remove unused json_agg_* prototypes etc.
json_agg_transfn() was relying on json_agg_collectfn() to generate
the initial '[' for the output, but that only worked with the old
XL-specific implementation. With the new implementation adopted from
upstream, this led to output like this:
{"x":1},{"x":2}]
Fixed by removing the ifdef in json_agg_transfn(). While doing that,
I've noticed the json_agg_collectfn() is still in json.c, although
it's not needed anymore, and similarly for json_agg_in/out prototypes.
Tomas Vondra [Fri, 20 Jan 2017 23:34:46 +0000 (00:34 +0100)]
fix 'ERROR: unrecognized token' failures in _readWindowAgg()
The _outWindowAgg() and _readWindowAgg() were inconsistent, resulting
in failures in stringToNode(). In particular _outWindowAgg() was
producing output with tokens :partOperations and :ordOperations, while
_readWindowAgg() expected :partOperators and :ordOperators.
Perhaps more importantly, _readWindowAgg() used the READ_OID_ARRAY macro
to parse :ordOperators, but the macro ignores the portable_output flag.
So parsing the string failed.
Fixed mostly by reverting to the XL 9.5 code.
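The label mismatch can be illustrated with a small sketch. The
reader_finds_label() helper and the serialized strings used with it are
hypothetical simplifications; the real readfuncs.c code tokenizes the
input rather than searching it.

```c
#include <string.h>

/*
 * Hypothetical, simplified reader check: returns 1 if the serialized
 * node contains the label the reader expects, 0 otherwise.
 */
int
reader_finds_label(const char *serialized, const char *expected_label)
{
    return strstr(serialized, expected_label) != NULL;
}
```

A node written with :partOperations never satisfies a reader looking for
:partOperators, which is the kind of inconsistency that surfaces as
'unrecognized token' in stringToNode().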
Tomas Vondra [Fri, 20 Jan 2017 23:18:41 +0000 (00:18 +0100)]
enforce sorting when RemoteSubplan feeds data to GroupAggregate
When the second step of distributed aggregation is executed using
GroupAggregate, the RemoteSubplan needs to produce sorted output.
We might insert Sort node above the RemoteSubplan, but we don't
do that as we want to push as much work as possible to the remote side.
Therefore the remote part of the plan needs to produce sorted
output, and RemoteSubplan needs to perform merge-sort. For the
code generating remote parallel paths, this was already done,
because the code injects an explicit sort on top of the Gather
when the second phase is GroupAggregate.
But for the special XL plans, using partial aggregates without
the Gather, this was not happening, so when the partial path was
executed using HashAggregate, the RemoteSubplan did not know
it needs to enforce sorted output.
Fixed by adding an explicit Sort node as the last step of the
remote path.
Tomas Vondra [Fri, 20 Jan 2017 22:46:23 +0000 (23:46 +0100)]
display sort keys for 'Remote Subquery Scan' nodes in EXPLAIN VERBOSE
This makes analysis of query plans a bit easier, as it shows which
remote plans may be sorted automatically (using merge-sort built
into the remote subplan). Only shows this for VERBOSE output though,
because in most cases it's obvious how the data is sorted.
Tomas Vondra [Fri, 20 Jan 2017 15:02:17 +0000 (16:02 +0100)]
fix the segfault crashes in 6de274a02
In short, create_remotescan_plan() was a bit confused when adding
the explicit Sort node: it passed the remote subplan node to
prepare_sort_from_pathkeys(), while it should have passed the
subplan instead.
That resulted in building remote plans like this:
{REMOTESUBPLAN :startup_cost 188.23 :total_cost 243.52 ...
:targetlist ({TARGETENTRY :expr {VAR ...}}
{TARGETENTRY :expr {VAR ...}}
{TARGETENTRY :expr {OPEXPR ...}})
:lefttree {SORT :startup_cost 88.23 :total_cost 91.41 ...
:targetlist ({TARGETENTRY :expr {VAR ...}}
{TARGETENTRY :expr {VAR ...}})
:lefttree {APPEND ... }
...}
:distributionType H
:distributionKey 3
:distributionNodes (i 0 1)
:distributionRestrict (i 0 1)
:nodeList (i 0 1)
:execOnAll true
:sort {SIMPLESORT ...}
:cursor p_1_532c_2d :unique 0
}
Notice the Sort has only two targetlist entries, but the distribution
key is 3, which then triggers the segfault in PortalStart when
accessing a non-existent target entry.
Fixed by making create_remotescan_plan sane again, and also fix
parameters passed to redistribute_path in set_joinpath_distribution,
although only in block wrapped in NOT_USED.
Also add a simple Assert() into PortalStart() to make detection of
such issues easier, instead of having to investigate segfaults.
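The kind of sanity check meant here can be sketched as follows. The exact
condition used in PortalStart() is not shown in this message, so
distribution_key_is_valid() is a hypothetical helper; AttrNumber and
InvalidAttrNumber are redefined locally to keep the sketch self-contained.

```c
/* Local stand-ins so that the sketch compiles on its own. */
typedef short AttrNumber;
#define InvalidAttrNumber 0

/*
 * Returns 1 when the distribution key is either unset or refers to an
 * attribute that exists in a tuple descriptor with 'natts' columns.
 */
int
distribution_key_is_valid(AttrNumber distributionKey, int natts)
{
    return distributionKey == InvalidAttrNumber ||
           (distributionKey >= 1 && distributionKey <= natts);
}
```

For the plan above (distribution key 3, two target entries) the check
fails, so an assertion built on it would fire before the out-of-bounds
access happens.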
Tomas Vondra [Fri, 20 Jan 2017 03:18:10 +0000 (04:18 +0100)]
WIP: make Sort push-down (through RemoteSubplan) work correctly again
After moving the path redistribution and various bits to pathnode.c,
the explicit Sort nodes needed by Merge Joins with unsorted paths
were not properly pushed to the remote node (but on top of it).
Fixed by a bit of hacking set_joinpath_distribution(). Admittedly,
set_joinpath_distribution() is a mighty beast that would benefit
from a bit of refactoring.
This fixes most of the EXPLAIN regressions, although a bunch of
plans are still slightly different. But that seems to be either
because we managed to find a different (and slightly cheaper)
plan, or due to some other bug (likely in distribution handling).
FIXME This however seems to break portals/transactions regression
tests, resulting in strange crashes in PortalStart due to access
beyond the end of tupDesc->attrs:
#0 PortalStart (portal=0x124ddf0, params=0x0, eflags=0, snapshot=0x0) at pquery.c:791
791 InvalidOid :
Missing separate debuginfos, use: dnf debuginfo-install glibc-2.22-18.fc23.x86_64
(gdb) bt
#0 PortalStart (portal=0x124ddf0, params=0x0, eflags=0, snapshot=0x0) at pquery.c:791
#1 0x00000000008720e5 in exec_bind_message (input_message=0x7ffec76f1790) at postgres.c:2217
#2 0x000000000087566f in PostgresMain (argc=1, argv=0x122f768, dbname=0x122f408 "regression", username=0x122f3e8 "user") at postgres.c:4788
#3 0x00000000007e7d63 in BackendRun (port=0x1255ae0) at postmaster.c:4496
#4 0x00000000007e74c7 in BackendStartup (port=0x1255ae0) at postmaster.c:4170
#5 0x00000000007e3887 in ServerLoop () at postmaster.c:1806
#6 0x00000000007e2ebc in PostmasterMain (argc=9, argv=0x122e390) at postmaster.c:1414
#7 0x00000000007103a0 in main (argc=9, argv=0x122e390) at main.c:228
(gdb) list
786
787 /*
788 * Set up locator if result distribution is requested
789 */
790 keytype = queryDesc->plannedstmt->distributionKey == InvalidAttrNumber ?
791 InvalidOid :
792 queryDesc->tupDesc->attrs[queryDesc->plannedstmt->distributionKey-1]->atttypid;
793 locator = createLocator(
794 queryDesc->plannedstmt->distributionType,
795 RELATION_ACCESS_INSERT,
(gdb) print queryDesc->plannedstmt->distributionKey
$1 = 3
(gdb) print queryDesc->tupDesc->natts
$2 = 2
In transaction.sql, this seems to be triggered by the 'revalidate_bug'
test case starting at line 378.
Tomas Vondra [Fri, 20 Jan 2017 03:11:06 +0000 (04:11 +0100)]
fix parsing of grpColIdx in _readGroup()
Parsing of :grpColIdx got broken during the merge conflict resolution.
Note: I don't see the bug, though. Reverting to code from XL 9.5 fixed
the issue (failing to find '}' at the end of the node) for me, but
maybe that's a fluke. So perhaps look at this again.
Tomas Vondra [Fri, 20 Jan 2017 01:34:39 +0000 (02:34 +0100)]
comment about a place adding unnecessary top-level RemoteSubplans
Tomas Vondra [Fri, 20 Jan 2017 01:15:02 +0000 (02:15 +0100)]
make sure explicit Sorts in Merge Append are pushed to remote nodes
When redistributing subpaths of Merge Append, an explicit sort may be
required. In that case we want to push it to the remote node instead
of performing it locally.
This was originally done in create_remotescan_plan(), i.e. after all
the planning was done. After the upper-planner pathification that's
not very elegant, so this patch moves this logic to pathnode.c.
The design follows what create_merge_append_path was already doing,
we only do the costing in redistribute_path() and the Sort node is
still constructed within create_plan().
Tomas Vondra [Fri, 20 Jan 2017 01:00:13 +0000 (02:00 +0100)]
add _readSimpleSort() back to readfuncs.c (probably removed during merge)
Tomas Vondra [Thu, 19 Jan 2017 21:04:06 +0000 (22:04 +0100)]
make output of 'select' test suite stable by adding ORDER BY
In XL the order of rows may change depending on timing of responses
from remote nodes, unless the query has an explicit ORDER BY (on
PostgreSQL the regression tests often rely on reading the data from
disk in a single process).
Tomas Vondra [Thu, 19 Jan 2017 20:51:05 +0000 (21:51 +0100)]
remove 'missing redistribute_path' error from recurse_set_operations
I came to the conclusion that we probably don't need to redistribute
the path at this place. We're producing a generic subquery path with
particular distribution - if a path with different distribution
is needed, it should be responsibility of the caller to do that.
The assumption is that there might be multiple possible plans, each
requiring a path with a different distribution.
Tomas Vondra [Thu, 19 Jan 2017 16:18:20 +0000 (17:18 +0100)]
obvious plan changes in inherit, aggregates, select and select_distinct
Update expected regression results in case of obvious plan changes,
as for example converting 2-phase aggregation to Final/Partial,
or wrapping the whole query into Remote Fast Query.
Tomas Vondra [Thu, 19 Jan 2017 14:47:01 +0000 (15:47 +0100)]
resolve most differences in insert_conflict regression test
The plan changes are either due to parallel query, or due to adding
a RemoteSubplan node at the top of the plan.
Tomas Vondra [Thu, 19 Jan 2017 14:30:06 +0000 (15:30 +0100)]
resolve most differences in create_index test suite
Most of the differences were trivial consequence of using partial
aggregates, instead of the XL-specific 2-phase aggregate. The
main difference is that EXPLAIN now prefixes the Aggregate
nodes with either "Finalize" or "Partial" (while before it used
just Aggregate).
There's one remaining plan change, but that also switches from
plain Index Scan to Bitmap Index Scan. That probably deserves
a bit more attention and further investigation.
Tomas Vondra [Thu, 19 Jan 2017 13:44:02 +0000 (14:44 +0100)]
resolve failures in box regression tests (missing SP-GiST part)
The expected output was missing the new section, testing SP-GiST
on a box column. Make sure the table with test data is replicated
to get stable output.
Tomas Vondra [Thu, 19 Jan 2017 13:41:17 +0000 (14:41 +0100)]
resolve failures in timestamp/timestamptz regression tests
Update the SQL and expected output for XL, and remove unnecessary
whitespace (mostly at the end of line) to make diffing easier.
Tomas Vondra [Thu, 19 Jan 2017 13:08:37 +0000 (14:08 +0100)]
use text argument instead of cstring in pg_msgmodule_set()
opr_sanity checks that functions with cstring arguments are either
type input or conversion functions. Tweaking the function to accept
text seems like a better idea than relaxing opr_sanity tests.
Tomas Vondra [Thu, 19 Jan 2017 12:18:45 +0000 (13:18 +0100)]
mark pg_msgmodule_disable_all() as PROVOLATILE_IMMUTABLE
The pg_proc entry had provolatile=0, which is not a legal value,
so it was triggering an error in opr_sanity regression. Mark the
function as PROVOLATILE_IMMUTABLE.
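For reference, the legal provolatile values can be sketched like this.
The three constants mirror the ones in pg_proc.h but are redefined
locally so that the snippet stands alone, and provolatile_is_legal() is a
hypothetical helper, not the actual opr_sanity query.

```c
/* Local copies of the volatility codes, to keep the sketch standalone. */
#define PROVOLATILE_IMMUTABLE   'i'
#define PROVOLATILE_STABLE      's'
#define PROVOLATILE_VOLATILE    'v'

/* Returns 1 if the value is one of the three legal volatility codes. */
int
provolatile_is_legal(char provolatile)
{
    return provolatile == PROVOLATILE_IMMUTABLE ||
           provolatile == PROVOLATILE_STABLE ||
           provolatile == PROVOLATILE_VOLATILE;
}
```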
Tomas Vondra [Thu, 19 Jan 2017 09:40:01 +0000 (10:40 +0100)]
skip RenameSequenceGTM() for SET SCHEMA with the same schema
Commit bc4996e6 in upstream changed how CheckSetNamespace() handles
moving to the same namespace, e.g. when executing
ALTER SEQUENCE s1.s SET SCHEMA s1;
On PostgreSQL 9.5 this fails with an error, but since bc4996e6 the
command is silently ignored (which was already the case for other
object types - ALTER EXTENSION etc).
XL however relied on the ERROR interrupting the program flow before
the call to RenameSequenceGTM(), so after bc4996e6 it was executed
anyway.
GTM however does not expect unnecessary sequence renames, and treats
this as an error in seq_add_seqinfo.
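The intended control flow can be sketched roughly as below. All names in
the snippet are hypothetical stand-ins: the real code calls
RenameSequenceGTM(), which talks to GTM, while the stub here only counts
invocations.

```c
typedef unsigned int Oid;

/* Counts calls to the stand-in for RenameSequenceGTM(); illustration only. */
int         gtm_rename_calls = 0;

/* Hypothetical stub replacing the real GTM call. */
void
rename_sequence_gtm_stub(void)
{
    gtm_rename_calls++;
}

/*
 * Sketch of the flow after the fix: when the sequence already lives in
 * the target schema the command is a no-op, so GTM is not contacted.
 */
void
alter_sequence_set_schema(Oid oldNspOid, Oid newNspOid)
{
    if (oldNspOid == newNspOid)
        return;                 /* same schema: skip the GTM rename */

    rename_sequence_gtm_stub();
}
```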
Tomas Vondra [Thu, 19 Jan 2017 07:56:07 +0000 (08:56 +0100)]
add name of the originating node to application_name
Until now the application_name was set to 'pgxc' for all connections.
Set it to 'pgxc:NODE_NAME' instead, to make it easier to understand
data from pg_stat_activity.
Tomas Vondra [Wed, 18 Jan 2017 15:56:12 +0000 (16:56 +0100)]
fix reading of :mergeCollations in _readMergeJoin()
Fixes error observed in many regression tests
ERROR: did not find '}' at end of input node
Tomas Vondra [Wed, 18 Jan 2017 15:29:15 +0000 (16:29 +0100)]
properly decide when Grouping Sets do not require redistribution
When adding a GroupingSetsPath, an explicit redistribution is not
needed either when distribution=NULL or for replicated tables.
This resolves the crash in 'groupingsets' regression set, due to
triggering this assert in make_remotesubplan():
Assert(!equal(resultDistribution, execDistribution))
Tomas Vondra [Wed, 18 Jan 2017 14:59:29 +0000 (15:59 +0100)]
do not use Append with redistributed childrels as a partial path
Currently the subpaths are redistributed in create_append_path,
but even though that sets path.parallel_safe=false, it's too late
for set_append_rel_pathlist() to realize that.
It simply grabs the first path from childrel->partial_pathlist and
builds an AppendPath on top of that, assuming the resulting Append
is also parallel_safe.
But if the child relations had to be redistributed by adding a
RemoteSubplan on each of them, then the Append is parallel unsafe
and the attempt to add Gather on top of that fails because of
tripping on Assert(subpath->parallel_safe).
Make sure the result of create_append_path is also parallel safe
before adding it as a partial path.
I'd be surprised if there were no other places relying on the same
assumption. Ultimately, we probably want to build the redistributed
paths before calling create_append_path().
Tomas Vondra [Wed, 18 Jan 2017 11:32:42 +0000 (12:32 +0100)]
remove an extra lappend(subroots, subroot) from inheritance_planner
This resolves Assert() failure in create_modifytable_path(), causing
failures in regression tests (inherit, updatable_views, alter_table,
returning).
Tomas Vondra [Tue, 17 Jan 2017 20:06:37 +0000 (21:06 +0100)]
resolve crashes in queries involving merge-sort in a RemoteSubplan
Tomas Vondra [Mon, 16 Jan 2017 18:18:05 +0000 (19:18 +0100)]
temporarily merge all tuplesort changes from REL9_6_STABLE
Tomas Vondra [Mon, 16 Jan 2017 00:09:22 +0000 (01:09 +0100)]
add T_GroupingFunc to pgxc_shippability_walker
Planning grouping sets queries with grouping() function failed with
ERROR: XX000: unrecognized node type: 307
LOCATION: pgxc_shippability_walker, pgxcship.c:1199
STATEMENT: select grouping(a, b), count(*) from t1
group by grouping sets ((a), (b));
We simply inspect arguments of the grouping() function, although
that probably is not necessary - it can only reference grouping
expressions, which are checked as part of other nodes.
Tomas Vondra [Sun, 15 Jan 2017 23:07:47 +0000 (00:07 +0100)]
fix grouping sets planning by removing XL changes from make_sort()
Commit 44fd89e213 enabled grouping sets, but planning of some queries
failed in fix_upper_expr. This was happening when the grouping sets
need multiple sort orders - in that case create_groupingsets_plan
builds additional Sort nodes (using make_sort_from_groupcols), but
the XL code in make_sort() got confused by this and ended up with
a RemoteSubplan with an empty targetlist.
Removing the XL optimization from make_sort() resolves the planning
issue, producing plans like this:
                 QUERY PLAN
---------------------------------------------
 GroupAggregate
   Group Key: a
   Sort Key: b
     Group Key: b
   ->  Remote Subquery Scan on all (dn1,dn2)
         ->  Sort
               Sort Key: a
               ->  Seq Scan on t1
(8 rows)
Which seems correct.
Tomas Vondra [Sun, 15 Jan 2017 03:53:51 +0000 (04:53 +0100)]
set pathtarget and parallel_aware/safe flags in redistribute_path
This fixes segfault in 'boolean' regression test suite.
Tomas Vondra [Sun, 15 Jan 2017 03:07:11 +0000 (04:07 +0100)]
if the whole grouping can be pushed down, don't construct XL paths
When the whole aggregate can be pushed to remote nodes, don't bother
constructing distributed 2-phase aggregate paths, because we can't
possibly beat the fully pushed-down path.
Tomas Vondra [Sun, 15 Jan 2017 02:59:49 +0000 (03:59 +0100)]
specify correct transtype for json_agg_transfn/json_agg_finalfn
Mistake in merge conflict resolution, which removed the custom XL
partial aggregate types, but failed to fix the aggregate itself.
Tomas Vondra [Sun, 15 Jan 2017 02:52:13 +0000 (03:52 +0100)]
fix 'cache lookup failed for function 0' errors during initdb
Commit 0e882bd2e02 broke planning for aggregates not supporting partial
aggregation (e.g. array_agg, which lacks the serialize, deserialize and
combine functions). The code attempted to initialize the costing info
anyway, which failed when attempting to fetch info about those functions
from syscache.
Introduce a new flag 'try_distributed_aggregation' akin to the existing
one for simple parallel case. This is a good idea anyway, because it'll
allow us to disable distributed paths if we happen to find out the whole
aggregate can be pushed down.
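The kind of check such a flag might guard can be sketched as follows. This is only an illustration, not the actual Postgres-XL code: the AggSupport struct, its fields, and can_try_distributed_aggregation() are hypothetical names, and 0 stands in for an invalid function OID. The rule that serialize/deserialize functions matter only for internal transition states is an assumption modeled on how PostgreSQL 9.6 treats partial aggregation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for per-aggregate catalog info; 0 means "no such function". */
typedef struct AggSupport
{
    unsigned int combinefn;
    unsigned int serialfn;
    unsigned int deserialfn;
    bool         transtype_is_internal; /* state must be (de)serialized to travel */
} AggSupport;

/*
 * Return true only if every aggregate can be split into partial and final
 * phases. Deciding this up front avoids later lookups of functions that
 * do not exist (OID 0).
 */
bool
can_try_distributed_aggregation(const AggSupport *aggs, size_t naggs)
{
    assert(aggs != NULL || naggs == 0);

    for (size_t i = 0; i < naggs; i++)
    {
        /* No combine function: partial states cannot be merged at all. */
        if (aggs[i].combinefn == 0)
            return false;

        /* Internal states need both serialize and deserialize functions. */
        if (aggs[i].transtype_is_internal &&
            (aggs[i].serialfn == 0 || aggs[i].deserialfn == 0))
            return false;
    }
    return true;
}
```

With a flag computed this way, the costing code for distributed paths would only run when the result is true, and the same flag could be cleared when the whole aggregate is pushed down.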
Tomas Vondra [Sat, 14 Jan 2017 23:22:32 +0000 (00:22 +0100)]
mark RemoteSubplan paths with parallel_safe=false and parallel_aware=false
We don't want to run parallel queries on the coordinators, particularly
when those queries query data nodes.
Tomas Vondra [Sat, 14 Jan 2017 23:04:06 +0000 (00:04 +0100)]
generate distributed grouping paths with a combine phase
Generate grouping paths with an extra aggregate step before the
remote subplan, looking like this:
    Finalize GroupAggregate
      -> Remote Subquery Scan
        -> Combine Aggregate
          -> Gather
            -> Partial GroupAggregate
              -> ...
The idea is that the combine phase may significantly reduce the
number of rows transferred to the coordinators (and processed on
them), possibly to as little as 1/N, where N is the number of workers.
These plans do not seem to actually get chosen at this point, most likely
because of incorrect costing that makes them look more expensive
than the paths without the combine step.
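The three phases can be illustrated with a small sketch for avg(). This is an assumption-laden toy, not the executor code: AvgState, avg_partial(), avg_combine() and avg_finalize() are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical transition state for avg(): running sum and row count. */
typedef struct AvgState
{
    double sum;
    long   count;
} AvgState;

/* Partial phase: each parallel worker folds its own rows into one state. */
AvgState
avg_partial(const double *rows, size_t nrows)
{
    AvgState st = {0.0, 0};

    for (size_t i = 0; i < nrows; i++)
    {
        st.sum += rows[i];
        st.count++;
    }
    return st;
}

/*
 * Combine phase: merge the per-worker states on the data node, so a single
 * state (rather than one per worker) is sent to the coordinator.
 */
AvgState
avg_combine(const AvgState *states, size_t nstates)
{
    AvgState out = {0.0, 0};

    for (size_t i = 0; i < nstates; i++)
    {
        out.sum += states[i].sum;
        out.count += states[i].count;
    }
    return out;
}

/* Finalize phase: the coordinator merges per-node states and computes avg. */
double
avg_finalize(const AvgState *states, size_t nstates)
{
    AvgState total = avg_combine(states, nstates);

    assert(total.count > 0);
    return total.sum / (double) total.count;
}
```

In this sketch a data node with N workers ships one combined state per group instead of N partial states, which is where the 1/N reduction would come from.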
Tomas Vondra [Sat, 14 Jan 2017 19:51:55 +0000 (20:51 +0100)]
comment about generating grouping paths with extra combine phase
Tomas Vondra [Sat, 14 Jan 2017 18:56:35 +0000 (19:56 +0100)]
comment about cardinality estimates for 2-phase distributed aggregates
Tomas Vondra [Sat, 14 Jan 2017 16:24:19 +0000 (17:24 +0100)]
WIP: allow GROUPING SETS, ROLLUP and CUBE (without push-down)
This enables grouping sets paths by removing the elog(ERROR, ...) from
transformGroupClause(). The paths are currently considered not
eligible for push-down, irrespective of the distribution.
This could be improved by enabling pushdown if all the grouping sets
are compatible with the distribution (as if each grouping set was
a separate GROUP BY clause). So for example assuming a table 't'
distributed by column 'a', we could push down these grouping sets
GROUP BY GROUPING SETS ((a), (a, b), (a, c))
but not
GROUP BY GROUPING SETS ((b), (a, b), (a, c))
because the first grouping set does not contain the distribution key.
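The proposed rule (every grouping set must contain the distribution key) could be checked roughly as below. This is a minimal sketch under stated assumptions: grouping sets are represented as NULL-terminated arrays of column names, and GroupingSet, set_contains() and grouping_sets_pushdown_ok() are hypothetical names, not planner functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical representation: a NULL-terminated list of column names. */
typedef const char *const *GroupingSet;

/* Does this grouping set mention the given column? */
static bool
set_contains(GroupingSet set, const char *col)
{
    for (size_t i = 0; set[i] != NULL; i++)
    {
        if (strcmp(set[i], col) == 0)
            return true;
    }
    return false;
}

/*
 * Push-down is allowed only if each grouping set, treated as if it were a
 * separate GROUP BY clause, contains the distribution key.
 */
bool
grouping_sets_pushdown_ok(const GroupingSet *sets, size_t nsets,
                          const char *distkey)
{
    assert(distkey != NULL);

    for (size_t i = 0; i < nsets; i++)
    {
        if (!set_contains(sets[i], distkey))
            return false;
    }
    return true;
}
```

For a table distributed by 'a', the sets ((a), (a, b), (a, c)) pass this check, while ((b), (a, b), (a, c)) fail on the first set.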
The grouping sets require sorted paths, so we get paths like this:
SELECT a, b, COUNT(c) FROM t GROUP BY GROUPING SETS ((a), (a, b));
                  QUERY PLAN
---------------------------------------------
 GroupAggregate
   Group Key: a, b
   Group Key: a
   ->  Remote Subquery Scan on all (dn1,dn2)
         ->  Sort
               Sort Key: a, b
               ->  Seq Scan on t2
(7 rows)
There's however a bug somewhere (likely in fix_upper_expr or elsewhere
in setrefs.c), resulting in failures like this:
SELECT a, b, COUNT(c) FROM t GROUP BY GROUPING SETS ((a), (b));
ERROR: variable not found in subplan target list
The only difference between the queries is removal of the distribution
key from the second grouping set.
Tomas Vondra [Sat, 14 Jan 2017 04:57:22 +0000 (05:57 +0100)]
generate distributed 2-phase grouping paths
Tomas Vondra [Sat, 14 Jan 2017 02:19:55 +0000 (03:19 +0100)]
fix output/read plans in _outAggref(), _readAggref() and _readAgg()
PostgreSQL 9.6 added new fields to Aggref for partial aggregation,
and the functions were not reading them correctly.
Tomas Vondra [Sat, 14 Jan 2017 02:17:31 +0000 (03:17 +0100)]
make sure the distribution is propagated through grouping paths
When creating grouping paths (agg, group, gather, ...) the create_*_path
functions need to inherit the distribution from the subpath. Otherwise
we lose the information about the distribution and fail to create the
remote subplans at the top of the plan.
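The inheritance described here can be pictured with a toy model. This is a sketch only: Distribution, SketchPath, make_upper_path() and needs_remote_subplan() are hypothetical stand-ins, and treating a NULL distribution as "not distributed" is an assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical description of how a relation is spread over data nodes. */
typedef struct Distribution
{
    int dist_column;
    int nnodes;
} Distribution;

/* Hypothetical, heavily simplified path node. */
typedef struct SketchPath
{
    struct SketchPath  *subpath;
    const Distribution *distribution; /* NULL: assumed to mean "not distributed" */
} SketchPath;

/*
 * Build an upper path (agg, group, gather, ...) on top of subpath. The key
 * step is copying the distribution; leaving it NULL would hide the fact
 * that the data still lives on the data nodes.
 */
SketchPath *
make_upper_path(SketchPath *subpath)
{
    SketchPath *path;

    assert(subpath != NULL);

    path = malloc(sizeof(SketchPath));
    if (path == NULL)
        return NULL;

    path->subpath = subpath;
    path->distribution = subpath->distribution;
    return path;
}

/* A remote subplan is needed on top only while the path is distributed. */
bool
needs_remote_subplan(const SketchPath *path)
{
    return path != NULL && path->distribution != NULL;
}
```

If make_upper_path() skipped the copy, needs_remote_subplan() would report false for the top path and no remote subplan would be added, which mirrors the failure mode this commit addresses.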
Tomas Vondra [Fri, 13 Jan 2017 23:37:07 +0000 (00:37 +0100)]
make grouping paths work again, without XL optimizations
Commit d31927431b26af0c14f7a2abe6f2ee0af33f7b61 resolved some of the
conflicts in planner.c, caused by the pathification, by removing some
of the XL code. While this made the code compilable, it effectively
broke the aggregation on XL. This patch makes grouping paths work
again, in the sense that the generated paths should be correct and
return correct results.
The planner however does not generate advanced XL paths yet, in
particular it does not generate paths with 2-phase distributed
aggregates like this one:
    Aggregate (2nd phase: combine and finalize)
      -> Remote Subquery
        -> Aggregate (1st phase: partial)
Thanks to the parallel aggregate, implemented in 9.6, the planner
can however generate plans like this:
    Aggregate (2nd phase: combine and finalize)
      -> Remote Subquery
        -> Gather
          -> Aggregate (1st phase: partial)
Tomas Vondra [Fri, 13 Jan 2017 07:33:30 +0000 (08:33 +0100)]
fix tuplesort_begin_merge() broken by rework of tuplesort in 9.6
The tuplesort got reworked quite a bit in 9.6, and this function
was subtly broken by the merge. In particular:
* 'movetup' was not set to a valid function
* beginmerge() was always called with (finalMergeBatch=true)
* multiple arrays were left unallocated
Together, these issues were causing segfaults when execRemote used
the merge sort.
Tomas Vondra [Fri, 13 Jan 2017 04:22:32 +0000 (05:22 +0100)]
introduce adjust_path_distribution(), adding RemoteSubplan as needed
This moves a huge chunk of code into a function, mostly for convenience.
Tomas Vondra [Fri, 13 Jan 2017 04:19:12 +0000 (05:19 +0100)]
rework grouping_distribution_match() to accept list of clauses
Accepting (Query*) is not sufficient, because we need to know which
of the clause lists to inspect (DISTINCT, GROUP BY, ...).
Note: Perhaps we should use PathKeys instead of those clause lists?
Tomas Vondra [Fri, 13 Jan 2017 04:15:31 +0000 (05:15 +0100)]
initialize state->getlen field in tuplesort_begin_cluster()
Not sure if this is an issue, but all other tuplesort_begin methods
do that, so presumably this one should do that too.
Tomas Vondra [Fri, 13 Jan 2017 03:59:50 +0000 (04:59 +0100)]
reintroduce XL logic into _readSort() to make Sort pushdown work
During merge, the XL portable_input infrastructure got removed from
_readSort(), resulting in errors like this
ERROR: did not find '}' at end of input node
when running queries with Sort push-down. Fixed by reintroducing
the code from the 9.5 branch, although this is hardly the last bug
in the node functions.
Tomas Vondra [Thu, 12 Jan 2017 19:03:51 +0000 (20:03 +0100)]
grouping_distribution_match() and equal_distributions() comments
Add proper comments for the functions, used in planner.c.
Tomas Vondra [Thu, 12 Jan 2017 19:01:25 +0000 (20:01 +0100)]
remove grouping_distribution(), obsoleted by the pathification
The function was injecting Plan node (RemoteSubplan), but that
is incompatible with the pathified planner approach. The new
code uses grouping_distribution_match(), which only checks that
the grouping and distributions are compatible.
Tomas Vondra [Thu, 12 Jan 2017 11:25:38 +0000 (12:25 +0100)]
remove the extra & from start_postmaster() command in pg_ctl
The start_postmaster() function in pg_ctl constructed
snprintf(cmd, MAXPGPATH, "exec \"%s\" %s%s < \"%s\" 2>&1 &",
exec_path, pgdata_opt, post_opts, DEVNULL);
which resulted in an extra fork thanks to the extra & at the end. Due to
this the 'pmpid' and 'pm_pid' in test_postmaster_connection() did not
match, resulting in failures like this:
pg_ctl: pg_ctl: could not start server