Pavan Deolasee [Wed, 19 Sep 2018 06:38:44 +0000 (12:08 +0530)]
Correct error message to use word "distributed" instead of "partitioned".
Per report by Pallavi Sontakke
Pallavi Sontakke [Fri, 14 Sep 2018 07:18:32 +0000 (12:48 +0530)]
Add test for Set Returning Functions
Fixes #204
Pavan Deolasee [Tue, 18 Sep 2018 10:31:43 +0000 (16:01 +0530)]
Ensure consistent output in the 'select_views' test case
We added ORDER BY clause to the query, but that returns different ordering
depending on the LC_COLLATE setting on a given machine. So enforce the correct
ordering by specifying the desired collation in the query itself.
Pavan Deolasee [Mon, 17 Sep 2018 13:44:39 +0000 (19:14 +0530)]
Correct date in the release notes
Pavan Deolasee [Mon, 17 Sep 2018 09:56:58 +0000 (15:26 +0530)]
Restamp Postgres-XL 10r1beta1 correctly this time.
Pavan Deolasee [Mon, 17 Sep 2018 09:56:32 +0000 (15:26 +0530)]
Correct some typos in the release notes
Pavan Deolasee [Fri, 14 Sep 2018 08:12:06 +0000 (13:42 +0530)]
Stamp Postgres-XL 10r1beta1
Pavan Deolasee [Mon, 17 Sep 2018 04:09:26 +0000 (09:39 +0530)]
Make COPYRIGHT changes
Pavan Deolasee [Fri, 14 Sep 2018 07:53:44 +0000 (13:23 +0530)]
Some updates to the release notes draft
Pavan Deolasee [Fri, 14 Sep 2018 05:51:02 +0000 (11:21 +0530)]
Fix couple of compiler warnings
Pavan Deolasee [Fri, 14 Sep 2018 05:29:48 +0000 (10:59 +0530)]
Fix some more compiler warnings
Pavan Deolasee [Fri, 14 Sep 2018 04:47:59 +0000 (10:17 +0530)]
Remove some unused variables, fix compiler warnings
Pavan Deolasee [Fri, 14 Sep 2018 04:43:54 +0000 (10:13 +0530)]
Fix a few compiler warnings
Pavan Deolasee [Thu, 13 Sep 2018 10:12:14 +0000 (15:42 +0530)]
Write draft release notes for Postgres-XL 10r1beta1
Pavan Deolasee [Thu, 13 Sep 2018 10:11:34 +0000 (15:41 +0530)]
Adding missing documentation for pgxl_remote_fetch_size
Pavan Deolasee [Wed, 12 Sep 2018 09:15:55 +0000 (14:45 +0530)]
Initialise a variable correctly.
This was leading to unexpected/unexplained crashes in the cluster monitor
process. Per report by Hengbing
Pavan Deolasee [Wed, 12 Sep 2018 08:55:57 +0000 (14:25 +0530)]
Disable FQS for explicit cursor queries.
The FQS mechanism is not currently handling cursors well as seen by failures in
the 'combocid' test case. The regular planner handles it well though. So for
now disable FQS for such queries.
Also remove the restriction that TID scan cannot be performed by non-FQS
queries. Regression does not throw up any errors. So remove the restriction
unless we see a counter example.
Tomas Vondra commented that this works fine in XL 9.5, so most likely we might
be failing to send a proper snapshot to the datanodes. Further investigations
required.
Pavan Deolasee [Wed, 12 Sep 2018 08:54:43 +0000 (14:24 +0530)]
Accept regression diffs in 'xml' test case
They are related to our lack of support for subtransactions (exception blocks)
and adding Remote Subquery Scan plan nodes.
Amruta Deolasee [Tue, 11 Sep 2018 06:02:56 +0000 (06:02 +0000)]
Test ALTER TABLE.. ADD PRIMARY KEY
Ensure that this sets columns NOT NULL on all child tables on all nodes
Amruta Deolasee [Mon, 10 Sep 2018 06:44:12 +0000 (06:44 +0000)]
test ALTER TYPE.. RENAME VALUE
Pavan Deolasee [Tue, 11 Sep 2018 09:21:09 +0000 (14:51 +0530)]
Run ANALYZE (COORDINATOR) on remote coordinators iff running outside a txn
block
We'd seen some distributed deadlocks when ANALYZE (COORDINATOR) is run on
remote coordinators, in parallel with VACUUM FULL on catalog tables. The
investigations so far indicate that the auxiliary datanode connections from
the other coordinators can lead to a distributed deadlock as the connections
from the originating coordinator still have the transaction open.
We now run the remote analyze only when we are not running inside a user
transaction block. This allows us to commit the first transaction and then
start a new transaction to run ANALYZE (COORDINATOR) on the remote
coordinators, thus avoiding the distributed deadlock.
Per report from Pallavi Sontakke and lots of analysis by me, which may still
not be sufficient.
Pavan Deolasee [Tue, 11 Sep 2018 05:55:09 +0000 (11:25 +0530)]
Remove a problem test from 'inherit' test case
There is a problem with reading an anonymous record type, resulting in an error
such as "ERROR: input of anonymous composite types is not implemented".
This is an existing issue affecting even XL 9.5, but was masked so far by lack
of testing and also because we inadvertently accepted wrong output in XL 9.5.
For now, move the case to xl_known_bugs and open an issue to track this
potential long standing bug.
Pavan Deolasee [Fri, 7 Sep 2018 07:51:56 +0000 (13:21 +0530)]
Move failing tests from 'with' to 'xl_known_bugs'
There are two remaining failures in the 'with' test case. Both of these bugs
exist in XL 9.5 too and hence moving them to the xl_known_bugs test case. We
have issues open for these bugs.
Pavan Deolasee [Fri, 7 Sep 2018 06:12:38 +0000 (11:42 +0530)]
Adjust 'limit' test case to avoid regression failures.
There are two known problems in the test case.
1. When nextval() is pushed down to the remote nodes, the coordinator session
won't have seen a nextval() invocation and hence a subsequent currval() will
throw an ERROR or return a stale value.
2. When nextval() is pushed down to the remote nodes and a set of rows are
fetched (either ordered or otherwise), the results may be somewhat inconsistent
because each node will run nextval() concurrently and hence the ordering will
change depending on who runs when.
For now, change the queries to force nextval() evaluation on the coordinator.
Also move the original test cases to xl_known_bugs test case and we also have
an internal issue open to track this problem.
Pavan Deolasee [Fri, 7 Sep 2018 06:08:51 +0000 (11:38 +0530)]
Fetch sn_xcnt just once to ensure consistent msg
We're seeing some reports where fetching a snapshot from the GTM hangs forever on
the client side, waiting for data which never arrives. One theory is that the
snapshot->sn_xcnt value changes while sending snapshot from the GTM, thus
causing a mismatch between what the server sends and what the client expects.
We fixed a similar problem in
1078b079d5476e3447bd5268b317eacb4c455f5d, but maybe
it's not complete. Put in this experimental patch (which can't make things
any worse for sure) while we also investigate other bugs in that area.
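The "read the count once" idea can be sketched as below. This is only an illustration and not the GTM code: snapshot_t, its fixed-size sn_xip array, serialize_snapshot() and the [count][xids] wire layout are assumptions made for the example; only the field name sn_xcnt comes from the message above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_XIDS 64

/* Hypothetical stand-in for a shared snapshot structure. */
typedef struct {
    volatile uint32_t sn_xcnt;            /* may change concurrently */
    uint32_t          sn_xip[SKETCH_MAX_XIDS];
} snapshot_t;

/*
 * Serialize the snapshot as [count][count * xid].
 *
 * The count is copied into a local variable exactly once. Both the header
 * and the payload length are derived from that local copy, so the receiver
 * always gets as many XIDs as the header promises, even if sn_xcnt changes
 * while the message is being built.
 *
 * Returns the number of bytes written, or 0 if the buffer is too small.
 */
size_t serialize_snapshot(const snapshot_t *snap, unsigned char *buf, size_t buflen)
{
    uint32_t xcnt;
    size_t   need;

    assert(snap != NULL && buf != NULL);

    xcnt = snap->sn_xcnt;                 /* the single read */
    if (xcnt > SKETCH_MAX_XIDS)
        xcnt = SKETCH_MAX_XIDS;

    need = sizeof(xcnt) + (size_t) xcnt * sizeof(uint32_t);
    if (need > buflen)
        return 0;

    memcpy(buf, &xcnt, sizeof(xcnt));
    memcpy(buf + sizeof(xcnt), snap->sn_xip, (size_t) xcnt * sizeof(uint32_t));
    return need;
}
```

A sender that instead re-reads sn_xcnt for the payload loop could write a header that disagrees with the payload, which is the kind of mismatch the theory above describes.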
Pavan Deolasee [Thu, 6 Sep 2018 08:12:48 +0000 (13:42 +0530)]
Fix problems associated with globalXmin tracking by ClusterMonitor
The very first report by the cluster monitor may be discarded by the GTM if the
reporting xmin has fallen far behind GTM's view. This leads to the globalXmin
value remaining Invalid in the shared memory state, as tracked by the
ClusterMonitor. ClusterMonitor process usually naps for CLUSTER_MONITOR_NAPTIME
(default 5s) between two successive reports. But skip that nap during the
bootup process and report the xmin a bit more aggressively. This should in all
likelihood set the globalXmin correctly, before the regular backends start
processing.
The other major problem with the current code was that when the globalXmin
tracked in the shared memory state is Invalid, the callers were using
FirstNormalXid as the globalXmin. This could be disastrous especially when XID
counter has wrapped around. We could accidentally remove visible rows by using
a wrong value of globalXmin. We now fix that by computing the globalXmin using
the local state (just like we would have computed globalXmin in vanilla PG).
This should ensure that we never use a wrong or a newer value for globalXmin
than what is allowed.
Accept regression diff in txid test case resulting from the fix. The new
expected output actually matches with what upstream produces.
Per report by Hengbing and investigations/fix by me.
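Why the FirstNormalXid fallback is dangerous after wraparound can be shown with a small sketch. It assumes the usual modulo-2^32 comparison of normal XIDs (the same idea as PostgreSQL's TransactionIdPrecedes) and uses 3 for the first normal XID as in PostgreSQL; xid_precedes() and removable() are hypothetical names for this example, not functions from the code base.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FIRST_NORMAL_XID ((uint32_t) 3)

/*
 * Circular comparison of two normal XIDs: a precedes b when (a - b),
 * interpreted as a signed 32-bit value, is negative.
 */
bool xid_precedes(uint32_t a, uint32_t b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * A row version deleted by transaction xmax may be removed only when
 * xmax precedes the horizon (the globalXmin in use).
 */
bool removable(uint32_t xmax, uint32_t horizon)
{
    assert(xmax >= FIRST_NORMAL_XID && horizon >= FIRST_NORMAL_XID);
    return xid_precedes(xmax, horizon);
}
```

With a true horizon of 4000000000, a row deleted by XID 4000000100 is not removable. If the horizon is replaced by FIRST_NORMAL_XID, the circular comparison treats 4000000100 as being in the past, so the same row is reported as removable. That is the accidental removal of visible rows described above.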
Pavan Deolasee [Wed, 5 Sep 2018 07:56:02 +0000 (13:26 +0530)]
Use correct format specifier for transaction ids
Pavan Deolasee [Wed, 5 Sep 2018 07:14:39 +0000 (12:44 +0530)]
Remove duplicate (and unused) declaration of XidStatus
Pavan Deolasee [Fri, 31 Aug 2018 09:42:42 +0000 (15:12 +0530)]
Force restart cluster monitor process if the GTM restarts
If GTM loses node registration information and returns
GTM_ERRCODE_NODE_NOT_REGISTERED to the coordinator/datanode, restart the
cluster monitor process (by simply exiting with exit code 0). This would ensure
that the cluster monitor re-registers with the GTM and starts cleanly.
Per report by Virendra Kumar
Pavan Deolasee [Wed, 29 Aug 2018 05:58:38 +0000 (11:28 +0530)]
Ensure pgxc_node_str returns the result in correct format
Per report from Krzysztof Nienartowicz, this broke the pgxc_clean command on
XL10.
Pavan Deolasee [Thu, 23 Aug 2018 06:20:23 +0000 (11:50 +0530)]
Allow minimum value of pgxl_remote_fetch_size to zero
This is an experimental work. By setting pgxl_remote_fetch_size to 0, user can
fetch all rows at once from the remote side (instead of fetching a small number
at a time, thus giving the node a chance to produce more rows as the previous set is
consumed). While fetching all rows at once is not very useful, it allows us to
circumvent PostgreSQL's limitation of not supporting parallel queries unless a
Portal is run once and to the end.
We do see hangs in regression while running with PGOPTIONS set to "-c
force_parallel_mode=on -c pgxl_remote_fetch_size=0", so there are issues that
need to be addressed. Also, it doesn't seem quite possible for users to
dynamically set pgxl_remote_fetch_size to enforce parallel query. So this is an
experimental feature that we don't expect users to heavily use, just yet.
Pavan Deolasee [Thu, 23 Aug 2018 05:24:37 +0000 (10:54 +0530)]
Ensure parallelModeNeeded flag is sent down to the remote node.
This still does not solve the problem that the datanodes don't make use of
parallel queries during distributed execution (they work OK if queries are
FQSed). That's because PostgreSQL does not support parallel queries when a
portal is used to fetch only a part of the result set. We need separate patches
to either fix or work-around that.
Pavan Deolasee [Wed, 22 Aug 2018 06:15:42 +0000 (11:45 +0530)]
Accept some more regression diffs in inherit.out
We changed one test case to use HASH distribution so that a primary key can be
defined. Another diff turned out to be a change from the upstream.
Pavan Deolasee [Tue, 21 Aug 2018 11:53:18 +0000 (17:23 +0530)]
Fix an assertion failure
Pavan Deolasee [Tue, 21 Aug 2018 09:54:13 +0000 (15:24 +0530)]
Ensure that RemoteSubplan is marked parallel unsafe.
We can't support multiple backends running RemoteSubplan in parallel. So block
that. In fact, we'd already done so by setting parallel_safe to false while
creating the path, but there is another code path which builds the plan node
directly and we'd missed that.
Pavan Deolasee [Mon, 20 Aug 2018 10:47:31 +0000 (16:17 +0530)]
Fix an oversight in
32025718755c4bbf100563fdc88e96225fc1b250
In passing, add test cases and fix other issues around fetching table sizes
from remote nodes. For example, temp handling and names requiring quoting were
broken for a long time too.
Per report by Virendra Kumar and further tests by me.
Pavan Deolasee [Mon, 20 Aug 2018 10:43:20 +0000 (16:13 +0530)]
Do not pushdown aggregates with SRFs in the targetlist
Pavan Deolasee [Fri, 17 Aug 2018 10:09:38 +0000 (15:39 +0530)]
Some minor changes to tsrf test case
We don't support SRF in VALUES clause yet. So make some minor adjustments to
the test case.
Pavan Deolasee [Fri, 17 Aug 2018 09:53:33 +0000 (15:23 +0530)]
Accept regression diffs in the sequence test case
These diffs are on account of differences in the way we handle WAL logging
and sequences on the coordinator. There is scope for improvement here from
usability perspective, but these are not regressions from the past behaviour.
Pavan Deolasee [Thu, 16 Aug 2018 12:01:53 +0000 (17:31 +0530)]
Make improvements to sequence handling
We were prematurely checking for maximum and minimum values, thus throwing an
error when the maximum/minimum was still farther by one count. Fix that.
Improve the GTM error message by including the sequence name and the max/min
value reached. This though slightly changes the end user error message because
we also include database and schema in the sequence name at the GTM.
Make some changes to the way sequence is WAL logged. This is still not entirely
correct since we log every time a value is fetched from the GTM. But then WAL
logging is not strictly required in XL because sequence values are managed at
the GTM. In the old code, we were not WAL logging at all (though the code
existed and it was a bit confusing)
The sequence tuple is still not maintained correctly at the coordinator because
it may not know about the sequence values fetched and consumed by the
datanodes. But that's an existing problem and we should look at that
separately.
Pavan Deolasee [Tue, 14 Aug 2018 08:59:53 +0000 (14:29 +0530)]
Ensure table name is schema qualified while running remote ANALYZE
While sending down ANALYZE (COORDINATOR) command to the remote coordinator, we
must ensure that the table name is properly schema qualified. Also add a few
tests to confirm that this works as expected in various scenarios.
Per report by Virendra Kumar.
Pavan Deolasee [Thu, 9 Aug 2018 08:25:05 +0000 (13:55 +0530)]
Automatically trigger ANALYZE (COORDINATOR) on remote coordinators
One of the long standing problems in multi-coordinator setup is that when a
table is either manually ANALYZEd or auto-analyzed on a coordinator, the other
coordinators don't update their planner statistics automatically. This is even
a bigger problem with auto-analyze because a coordinator which is not involved
in any DMLs, may not be even aware about the changes to the table and hence it
will not pick up the table for auto-analyze, thus often generating very poor
query plans.
We now fix that by automatically running ANALYZE (COORDINATOR) command on the
remote coordinators when a table is either manually or automatically analyzed
on one coordinator. ANALYZE (COORDINATOR) does not force an ANALYZE on the
datanodes, but only rebuilds coordinator side stats using the current stats
available on the datanodes.
One problem with running ANALYZE on the remote coordinators in auto-analyze
process is that if one of the coordinators is down or not reachable, then it
will fail. This seems a bit too harsh because the worst that can happen is that
the unreachable coordinator will be left with stale stats. But that seems
better than letting auto-analyze fail on a running coordinator. We address this
by introducing a new mechanism by which a coordinator can execute a
command/query only on the currently available remote coordinators. We consult
the health-map to decide which coordinators are currently reachable. Since the
health-map itself can be stale for some short duration, there is a risk that
auto-analyze may still fail. But it shouldn't fail forever because the node
will be skipped next time.
Pavan Deolasee [Thu, 9 Aug 2018 08:22:28 +0000 (13:52 +0530)]
Fetch minmxid from remote nodes and compute local value
This was a TODO for quite some time now. Just like we fetch relfrozenxid from
the remote nodes and compute a value at the coordinator, we now do the same for
multi-xid too.
Pavan Deolasee [Thu, 9 Aug 2018 08:21:11 +0000 (13:51 +0530)]
Remove some dead/commented code
Pavan Deolasee [Thu, 9 Aug 2018 07:33:58 +0000 (13:03 +0530)]
Correctly track XID obtained by autovac as one from the GTM
We were observing that a transaction started by the autovac process was left
open on the GTM, thus holding back xmin. This must have been a regression after
recent changes to track autocommit and regular transactions. We now correctly
track and close such transactions on the GTM.
Pavan Deolasee [Wed, 8 Aug 2018 05:17:21 +0000 (10:47 +0530)]
Add a test to do sanity check at the end
Right now it only contains a test to check replicated tables and confirm that
all nodes have the same number of rows. More to be added later.
Pavan Deolasee [Fri, 3 Aug 2018 08:58:54 +0000 (14:28 +0530)]
This reverts commit
85cd900fef99dab000bba7e2b6be541a03a48706.
I accidentally pushed a local change to the remote repo. So revert a revert.
Looks so stupid :-(
Pavan Deolasee [Fri, 3 Aug 2018 08:39:12 +0000 (14:09 +0530)]
Use correct path for tablespaces while creating a basebackup
In XL, we embed the nodename in the tablespace subdir name to ensure that
non-conflicting paths are created when multiple coordinators/datanodes are
running on the same server. The code to handle tablespace mapping in basebackup
was missing this support.
Per report and patch by Wanglin.
Pavan Deolasee [Thu, 2 Aug 2018 07:34:37 +0000 (13:04 +0530)]
Revert "Correct select the GTM proxy for a new node being added"
This reverts commit
edbe6703bf8d8749a807762135842b6a06309383.
Pavan Deolasee [Wed, 1 Aug 2018 05:17:21 +0000 (10:47 +0530)]
Fix couple of compile time warnings
Pavan Deolasee [Wed, 1 Aug 2018 05:17:06 +0000 (10:47 +0530)]
Move pgxc_clean to src/bin
Pavan Deolasee [Wed, 1 Aug 2018 05:16:55 +0000 (10:46 +0530)]
Move pgxc_ctl to src/bin
pgxc_ctl is now widely used for cluster management and hence it makes every
sense to have it installed by default. In passing, also fix several
problems/kludges in the Makefile
Pavan Deolasee [Tue, 31 Jul 2018 06:00:01 +0000 (11:30 +0530)]
Ensure that bad protocol ERROR message is sent to the frontend
When bad protocol messages are received by the GTM proxy, let
the client know about the error messages.
Pavan Deolasee [Tue, 31 Jul 2018 05:58:21 +0000 (11:28 +0530)]
Correct select the GTM proxy for a new node being added
This fixes an oversight in array index lookup. We should have been using
0-based indexes but were instead using 1-based index.
Pavan Deolasee [Mon, 30 Jul 2018 08:47:20 +0000 (14:17 +0530)]
Ensure partition child tables inherit distribution properties correctly
While in restore mode, which we use to load schema when a new node is added to
the cluster, the partition child tables should correctly inherit the
distribution properties from the parent table. This support was lacking, thus
leading to incorrect handling of such tables.
Per report by Virendra Kumar.
Pavan Deolasee [Fri, 27 Jul 2018 12:19:38 +0000 (17:49 +0530)]
Do not dump TO NODE clause for partition or child table
We missed this in the commit
c168cc8d58c6e0d9710ef0aba1b846b7174e0a79. So deal
with it now.
Pavan Deolasee [Fri, 27 Jul 2018 07:51:51 +0000 (13:21 +0530)]
Ensure qualified name for dumping sequence value
Without that the sequence won't be found correctly.
Pavan Deolasee [Fri, 27 Jul 2018 06:59:13 +0000 (12:29 +0530)]
Do not dump DISTRIBUTED BY for partition and inherited table
Child tables inherit the distribution property from the parent table. Even
more, XL doesn't support a syntax of the form PARTITION OF .. DISTRIBUTED BY
and doesn't allow child tables to have a distribution property different than
the parent. So attaching this clause to the partition table does not make any
sense.
Per report from Virendra Kumar.
Pavan Deolasee [Thu, 19 Jul 2018 09:31:07 +0000 (15:01 +0530)]
Teach pgxc_exec_sizefunc() to use pg_my_temp_schema() to get temp schema
Similar to what we did in
e688c0c23c962d425b82fdfad014bace4207af1d, we must not
rely on the temporary namespace on the coordinator since it may change on the
remote nodes. Instead we use the pg_my_temp_schema() function to find the
currently active temporary schema on the remote node.
Pavan Deolasee [Wed, 18 Jul 2018 05:04:24 +0000 (10:34 +0530)]
Make some adjustments to stats test case and accept regression diffs
We don't support SAVEPOINTs. So adjust a test case by removing the SAVEPOINT
and accept the difference caused by the other. We also don't support updating
distribution key column, so add another column to the table and update that
column instead. The resulting stats output and other regression diffs look
sane.
Pavan Deolasee [Tue, 17 Jul 2018 11:01:57 +0000 (16:31 +0530)]
Fix an oversight in
47e01d7befddbe6
Looks like we forgot to update expected output file at one place for matview
test case.
Pavan Deolasee [Tue, 17 Jul 2018 04:56:50 +0000 (10:26 +0530)]
Teach pgxc_ctl to use the new --waldir option of pg_basebackup
PG 10 replaced --xlogdir with --waldir, but we forgot to update pgxc_ctl to use
the new syntax. This patch fixes that oversight.
Per report and analysis by Virendra Kumar and patch by Mark Wong.
Pavan Deolasee [Tue, 17 Jul 2018 04:47:40 +0000 (10:17 +0530)]
Fix handling of REFRESH MATERIALIZED VIEW CONCURRENTLY
We create a coordinator-only LOCAL temporary table for REFRESH MATERIALIZED
VIEW CONCURRENTLY. Since this table does not exist on the remote nodes, we must
not use explicit "ANALYZE <temptable>". Instead, just analyze it locally like
we were doing at other places.
Restore the matview test case to use REFRESH MATERIALIZED VIEW CONCURRENTLY now
that the underlying bug is fixed.
Pavan Deolasee [Mon, 16 Jul 2018 11:00:57 +0000 (16:30 +0530)]
Accept regression diffs in collate test case
The differences are similar to XL 9.5 and not regressions. The root cause for
the difference is that in XL one CREATE TABLE AS SELECT runs without error,
whereas it throws an ERROR in vanilla Postgres. So the number of dependent
objects dropped at the end changes.
The way CREATE TABLE AS SELECT is implemented in XL, we split it into two steps
of CREATE TABLE and INSERT INTO. CREATE TABLE simply derives the type
information from the SELECT query, and hence unlike Postgres, it doesn't fail.
Pavan Deolasee [Mon, 16 Jul 2018 10:28:41 +0000 (15:58 +0530)]
Accept failures in privileges test case.
These failures exist in XL 9.5 too and are not related to PG 10 merge or any
regression caused by it. These issues need to be investigated though in more
detail.
Pavan Deolasee [Mon, 16 Jul 2018 08:29:05 +0000 (13:59 +0530)]
Accept regression diff in select_parallel test case
The error context message received from the remote datanode is not displayed.
This is a known limitation in XL currently.
Pavan Deolasee [Mon, 16 Jul 2018 07:45:15 +0000 (13:15 +0530)]
Give a different treatment to brin test case.
Since function invocations are not automatically sent down to the remote node,
brin_summarize_range() doesn't report the correct result. Instead we now use
EXECUTE DIRECT mechanism to execute the function on the remote node and get the
result back. The expected output is adjusted accordingly. Also move
xc_create_function test case at the beginning to ensure the function to query
remote node is available for this test case.
Pavan Deolasee [Fri, 13 Jul 2018 09:55:31 +0000 (15:25 +0530)]
Use a different approach in identity test case for consistent output
Pavan Deolasee [Fri, 13 Jul 2018 09:48:09 +0000 (15:18 +0530)]
Accept regression diff in xml test case.
We don't yet support exception blocks in a procedure, which causes the error.
Pavan Deolasee [Fri, 13 Jul 2018 09:45:53 +0000 (15:15 +0530)]
Accept regression diff in plpgsql test case.
The preceding CREATE TRIGGER affects the DELETE query's behaviour, thus
generating a different output. Accept that.
Pavan Deolasee [Fri, 13 Jul 2018 09:31:47 +0000 (15:01 +0530)]
Accept regression diffs in select_views test case
The output matches with the one obtained on PG 10 after adding the relevant
ORDER BY clause to the query.
Pavan Deolasee [Fri, 13 Jul 2018 08:35:36 +0000 (14:05 +0530)]
Accept regression diff in aggregates test case.
It simply adds a Remote Subquery Scan in the plan.
Pavan Deolasee [Fri, 13 Jul 2018 08:12:25 +0000 (13:42 +0530)]
Accept regression diffs in the triggers test case
The diffs are caused by known limitations such as:
- triggers not yet supported in XL
- DMLs are not supported in subqueries
As a side effect of triggers not working, output of some queries change. After
analysis, accept those changes too.
Pavan Deolasee [Tue, 10 Jul 2018 16:12:16 +0000 (21:42 +0530)]
Improve locking semantics in GTM and GTM Proxy
While GTM allows long jump in case of errors, we were not careful to release
locks currently held by the executing thread. That could lead to threads
leaving a critical section still holding a lock and thus causing deadlocks.
We now properly track currently held locks in the thread-specific information
and release those locks in case of an error. Same is done for mutex locks as
well, though there is only one that gets used.
This change required using a malloc-ed memory for thread-specific info. While
due care has been taken to free the structure, we should keep an eye on it for
any possible memory leaks.
In passing also improve handling of bad-protocol startup messages which may
have caused deadlock and resource starvation.
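The pattern of tracking held locks per thread and releasing them on the error path can be sketched as follows. Everything here is a simplified assumption for illustration: toy_lock stands in for a real rwlock or mutex, thread_info for the thread-specific structure, and raise_error() for the long jump taken on errors. It is not the GTM implementation.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>

#define MAX_HELD 8

typedef struct { bool held; } toy_lock;     /* stand-in for a real lock */

typedef struct {
    jmp_buf   err_jmp;                      /* where errors jump back to */
    toy_lock *held[MAX_HELD];               /* locks this thread holds */
    int       nheld;
} thread_info;

/* Acquire a lock and remember it in the thread-specific list. */
void lock_acquire(thread_info *ti, toy_lock *l)
{
    assert(ti->nheld < MAX_HELD && !l->held);
    l->held = true;
    ti->held[ti->nheld++] = l;
}

/* Normal release: drop the lock and forget it. */
void lock_release(thread_info *ti, toy_lock *l)
{
    for (int i = 0; i < ti->nheld; i++) {
        if (ti->held[i] == l) {
            ti->held[i] = ti->held[--ti->nheld];
            l->held = false;
            return;
        }
    }
}

/* Error path: release everything the thread still holds. */
void release_all_locks(thread_info *ti)
{
    while (ti->nheld > 0)
        ti->held[--ti->nheld]->held = false;
}

/* Report an error by jumping out of the critical section. */
void raise_error(thread_info *ti)
{
    longjmp(ti->err_jmp, 1);
}

/* Run work(); returns true if it ended in an error. */
bool run_guarded(thread_info *ti, void (*work)(thread_info *))
{
    if (setjmp(ti->err_jmp) != 0) {
        release_all_locks(ti);              /* nothing is left locked */
        return true;
    }
    work(ti);
    return false;
}

/* Demo: take a lock, then fail before reaching the release call. */
toy_lock demo_lock;

void failing_work(thread_info *ti)
{
    lock_acquire(ti, &demo_lock);
    raise_error(ti);
    lock_release(ti, &demo_lock);           /* never reached */
}
```

Calling run_guarded() with failing_work() returns true and leaves demo_lock released. Without the tracking list, the jump would skip lock_release() and the lock would stay held, which is the deadlock scenario described above.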
Pavan Deolasee [Tue, 10 Jul 2018 12:23:18 +0000 (17:53 +0530)]
Fix a compiler warning introduced in the previous commit
Pavan Deolasee [Tue, 10 Jul 2018 09:10:56 +0000 (14:40 +0530)]
Ensure that typename is schema qualified while sending row description
A row description message contains the type information for the attributes in
the column. But if the type does not exist in the search_path then the
coordinator fails to parse the typename back to the type. So the datanode must
send the schema name along with the type name.
Per report and test case by Hengbing Wang @ Microfun.
Added a new test file and a few test cases to cover this area.
Pavan Deolasee [Mon, 18 Jun 2018 09:16:08 +0000 (14:46 +0530)]
Ensure pooler process follows consistent model for SIGQUIT handling
We'd occasionally seen that the pooler process fails to respond to SIGQUIT and
gets stuck in a non recoverable state. Code inspection reveals that we're not
following the model followed by rest of the background worker processes in
handling SIGQUIT. So get that fixed, with the hope that this will fix the
problem case.
Pavan Deolasee [Mon, 18 Jun 2018 09:14:08 +0000 (14:44 +0530)]
Properly quote typename before calling parseTypeString
Without this, parseTypeString() might throw an error or resolve to a wrong type
in case the type name requires quoting.
Per report by Hengbing Wang
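The kind of quoting needed before a type name is parsed back can be sketched as below. This is a simplified illustration and not the patch itself: quote_ident_sketch() is a hypothetical helper that quotes any name containing characters other than lowercase letters, digits and underscores, and doubles embedded double quotes. Unlike PostgreSQL's own quote_identifier(), it does not check for reserved keywords.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True when the name cannot be used as a bare, unquoted identifier. */
static bool needs_quoting(const char *name)
{
    if (!((name[0] >= 'a' && name[0] <= 'z') || name[0] == '_'))
        return true;                        /* also covers the empty name */

    for (const char *p = name; *p; p++) {
        bool ok = (*p >= 'a' && *p <= 'z') ||
                  (*p >= '0' && *p <= '9') ||
                  *p == '_';
        if (!ok)
            return true;
    }
    return false;
}

/*
 * Write a safely quoted form of name into out.
 * Returns false if out is too small for the worst case.
 */
bool quote_ident_sketch(const char *name, char *out, size_t outlen)
{
    size_t n = 0;
    bool   quote;

    assert(name != NULL && out != NULL);
    quote = needs_quoting(name);

    /* Worst case: every character doubled, two quotes, and a NUL. */
    if (outlen < 2 * strlen(name) + 3)
        return false;

    if (quote)
        out[n++] = '"';
    for (const char *p = name; *p; p++) {
        if (*p == '"')
            out[n++] = '"';                 /* double embedded quotes */
        out[n++] = *p;
    }
    if (quote)
        out[n++] = '"';
    out[n] = '\0';
    return true;
}
```

A mixed-case name such as MyType is emitted with surrounding double quotes, so a parser that folds unquoted identifiers to lowercase cannot resolve it to a different type.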
Pavan Deolasee [Mon, 21 May 2018 06:41:40 +0000 (12:11 +0530)]
Remove some accidentally added elog(LOG) messages
Pavan Deolasee [Fri, 18 May 2018 09:30:36 +0000 (15:00 +0530)]
Fix broken implementation of recovery to barrier.
Per report from Hengbing, the current implementation of PITR recovery to a
BARRIER failed to correctly stop at the given recovery_target_barrier. It seems
there are two bugs here. 1) we failed to write the XLOG record correctly and 2)
we also failed to mark the end-of-recovery upon seeing the XLOG record during
the recovery.
Fix both these problems and also fix pg_xlogdump in passing to ensure we can
dump the BARRIER XLOG records correctly.
Pavan Deolasee [Fri, 18 May 2018 06:16:17 +0000 (11:46 +0530)]
Fix a long standing bug in vacuum/analyze of temp tables
The system may, and very likely will, choose different namespaces for temporary
tables on different nodes. So it was erroneous to explicitly add the coordinator
side namespace to the queries constructed for fetching stats from the remote
nodes. A regression test was non-deterministically failing for this reason for
a long time, but only now we could fully understand the problem and fix it. We now use
pg_my_temp_schema() to derive the current temporary schema used by the remote
node instead of hardcoding that in the query using coordinator side
information.
Pavan Deolasee [Tue, 15 May 2018 12:25:35 +0000 (17:55 +0530)]
Accept regression diffs in join test case
The plans now look the same as vanilla PG except for additional Remote Fast
Query Execution nodes
Pavan Deolasee [Mon, 21 May 2018 06:21:42 +0000 (11:51 +0530)]
Accept regression diffs in plpgsql test case
The new output looks correct and has been fixed because of our work to get
transaction handling correct.
Pavan Deolasee [Mon, 21 May 2018 06:19:18 +0000 (11:49 +0530)]
Add expected changes to plpgsql.out missed in
0f65a7193da4b6b0a35b6446b4c904a9f5ac9bf6
Pavan Deolasee [Tue, 15 May 2018 10:30:12 +0000 (16:00 +0530)]
Accept regression diff.
We no longer see "DROP INDEX CONCURRENTLY cannot run inside a transaction
block" if the index does not exist and we're running DROP IF EXISTS
command
Pavan Deolasee [Fri, 18 May 2018 11:40:16 +0000 (17:10 +0530)]
Fix post-cherry-pick problems.
Pavan Deolasee [Mon, 9 Apr 2018 10:42:54 +0000 (16:12 +0530)]
Track clearly whether to run a remote transaction in autocommit or a block
Chi Gao and Hengbing Wang reported certain issues around transaction handling
and demonstrated via xlogdump how certain transactions were getting marked
committed/aborted repeatedly on a datanode. When an already committed
transaction is attempted to be aborted again, it results in a PANIC. Upon
investigation, this uncovered a very serious yet long standing bug in
transaction handling.
If the client is running in autocommit mode, we try to avoid starting a
transaction block on the datanode side if only one datanode is going to be
involved in the transaction. This is an optimisation to speed up short queries
touching only a single node. But when the query rewriter transforms a single
statement into multiple statements, we would still (and incorrectly) run each
statement in an autocommit mode on the datanode. This can cause inconsistencies
when one statement commits but the next statement aborts. And it may also lead
to the PANIC situations if we continue to use the same global transaction
identifier for the statements.
This can also happen when the user invokes a user-defined function. If the
function has multiple statements, each statement will run in an autocommit
mode, if it's FQSed, thus again creating inconsistency if a following statement
in the function fails.
We now have a more elaborate mechanism to tackle autocommit and transaction
block needs. The special casing for force_autocommit is now removed, thus
making it more predictable. We also have specific conditions to check to ensure
that we don't mixup autocommit and transaction block for the same global xid.
Finally, if a query rewriter transforms a single statement into multiple
statements, we run those statements in a transaction block. Together these
changes should help us fix the problems.
Pavan Deolasee [Tue, 17 Apr 2018 10:19:45 +0000 (15:49 +0530)]
Accept regression diffs in select_parallel
These are simple additions of RemoteSubquery Scan nodes to the EXPLAIN plans
Pavan Deolasee [Tue, 17 Apr 2018 10:06:22 +0000 (15:36 +0530)]
Accept expected output differences for EXPLAIN
After the merge with 10.3, we see reduction in the redundant columns in the
targetlists for certain nodes. While we are yet to fully understand the
underlying change that's causing this, the changes look quite benign and in
fact correct. So accept the changes.
Pavan Deolasee [Tue, 17 Apr 2018 09:16:25 +0000 (14:46 +0530)]
Accept regression diffs in alter_table test
Just addition of RemoteSubquery scan nodes.
Pavan Deolasee [Tue, 17 Apr 2018 09:05:06 +0000 (14:35 +0530)]
Accept regression diffs in foreign_data
We don't yet support FDWs in XL and the diffs are artifacts of that
Pavan Deolasee [Tue, 17 Apr 2018 08:27:31 +0000 (13:57 +0530)]
Accept some trivial expected output changes
This is about displaying table distribution and adding RemoteSubquery scans
Pavan Deolasee [Tue, 17 Apr 2018 06:51:39 +0000 (12:21 +0530)]
Merge tag 'REL_10_3'
Pavan Deolasee [Mon, 9 Apr 2018 11:35:13 +0000 (17:05 +0530)]
Do not send the new protocol message to non-XL client.
The new message 'W' to report waited-for XIDs must not be sent to a non-XL
client since it's not capable of handling that and might just cause unpleasant
problems. In fact, we should change 'W' to something else since standard libpq
understands that message and hangs forever expecting more data. With a new
protocol message, it would have failed, thus providing a more user-friendly
error. But postponing that for now since we should think through implications
of protocol change carefully before doing that.
Tom Lane [Mon, 26 Feb 2018 22:10:47 +0000 (17:10 -0500)]
Stamp 10.3.
Tom Lane [Mon, 26 Feb 2018 17:22:39 +0000 (12:22 -0500)]
Schema-qualify references in test_ddl_deparse test script.
This omission seems to be what is causing buildfarm failures on crake.
Security: CVE-2018-1058
Tom Lane [Mon, 26 Feb 2018 17:14:05 +0000 (12:14 -0500)]
Last-minute updates for release notes.
Security: CVE-2018-1058
Noah Misch [Mon, 26 Feb 2018 15:39:44 +0000 (07:39 -0800)]
Document security implications of search_path and the public schema.
The ability to create like-named objects in different schemas opens up
the potential for users to change the behavior of other users' queries,
maliciously or accidentally. When you connect to a PostgreSQL server,
you should remove from your search_path any schema for which a user
other than yourself or superusers holds the CREATE privilege. If you do
not, other users holding CREATE privilege can redefine the behavior of
your commands, causing them to perform arbitrary SQL statements under
your identity. "SET search_path = ..." and "SELECT
pg_catalog.set_config(...)" are not vulnerable to such hijacking, so one
can use either as the first command of a session. As special
exceptions, the following client applications behave as documented
regardless of search_path settings and schema privileges: clusterdb
createdb createlang createuser dropdb droplang dropuser ecpg (not
programs it generates) initdb oid2name pg_archivecleanup pg_basebackup
pg_config pg_controldata pg_ctl pg_dump pg_dumpall pg_isready
pg_receivewal pg_recvlogical pg_resetwal pg_restore pg_rewind pg_standby
pg_test_fsync pg_test_timing pg_upgrade pg_waldump reindexdb vacuumdb
vacuumlo. Not included are core client programs that run user-specified
SQL commands, namely psql and pgbench. PostgreSQL encourages non-core
client applications to do likewise.
Document this in the context of libpq connections, psql connections,
dblink connections, ECPG connections, extension packaging, and schema
usage patterns. The principal defense for applications is "SELECT
pg_catalog.set_config('search_path', '', false)", and the principal
defense for databases is "REVOKE CREATE ON SCHEMA public FROM PUBLIC".
Either one is sufficient to prevent attack. After a REVOKE, consider
auditing the public schema for objects named like pg_catalog objects.
Authors of SECURITY DEFINER functions use some of the same defenses, and
the CREATE FUNCTION reference page already covered them thoroughly.
This is a good opportunity to audit SECURITY DEFINER functions for
robust security practice.
Back-patch to 9.3 (all supported versions).
Reviewed by Michael Paquier and Jonathan S. Katz. Reported by Arseniy
Sharoglazov.
Security: CVE-2018-1058
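The search_path rule stated in the commit above (drop any schema in which someone other than yourself or a superuser holds CREATE) can be sketched as a small filter. This is an illustrative Python model only: the function name, the create_holders mapping, and the use of the string "PUBLIC" to stand for the PUBLIC pseudo-role are assumptions of the example, and a real check would read the privileges from the server.

```python
from typing import Dict, Iterable, List, Set


def safe_search_path(search_path: Iterable[str],
                     create_holders: Dict[str, Set[str]],
                     me: str,
                     superusers: Set[str]) -> List[str]:
    """Keep only schemas where nobody but `me` or a superuser holds CREATE.

    Schemas missing from `create_holders` are dropped, because their
    privileges are unknown (a conservative choice made for this sketch).
    """
    trusted = superusers | {me}
    safe = []
    for schema in search_path:
        holders = create_holders.get(schema)
        if holders is not None and holders <= trusted:
            safe.append(schema)
    return safe


# Hypothetical privilege data: in the default configuration every role
# can create objects in "public".
holders = {"public": {"PUBLIC"}, "app": {"alice"}}
print(safe_search_path(["app", "public"], holders, "alice", {"postgres"}))
```

With this data the "public" schema is filtered out for alice, which matches the effect of either defense named in the commit message.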
Noah Misch [Mon, 26 Feb 2018 15:39:44 +0000 (07:39 -0800)]
Empty search_path in Autovacuum and non-psql/pgbench clients.
This makes the client programs behave as documented regardless of the
connect-time search_path and regardless of user-created objects. Today,
a malicious user with CREATE permission on a search_path schema can take
control of certain of these clients' queries and invoke arbitrary SQL
functions under the client identity, often a superuser. This is
exploitable in the default configuration, where all users have CREATE
privilege on schema "public".
This changes behavior of user-defined code stored in the database, like
pg_index.indexprs and pg_extension_config_dump(). If they reach code
bearing unqualified names, "does not exist" or "no schema has been
selected to create in" errors might appear. Users may fix such errors
by schema-qualifying affected names. After upgrading, consider watching
server logs for these errors.
The --table arguments of src/bin/scripts clients have been lax; for
example, "vacuumdb -Zt pg_am\;CHECKPOINT" performed a checkpoint. That
now fails, but for now, "vacuumdb -Zt 'pg_am(amname);CHECKPOINT'" still
performs a checkpoint.
Back-patch to 9.3 (all supported versions).
Reviewed by Tom Lane, though this fix strategy was not his first choice.
Reported by Arseniy Sharoglazov.
Security: CVE-2018-1058
Tom Lane [Mon, 26 Feb 2018 15:18:22 +0000 (10:18 -0500)]
Avoid using unsafe search_path settings during dump and restore.
Historically, pg_dump has "set search_path = foo, pg_catalog" when
dumping an object in schema "foo", and has also caused that setting
to be used while restoring the object. This is problematic because
functions and operators in schema "foo" could capture references meant
to refer to pg_catalog entries, both in the queries issued by pg_dump
and those issued during the subsequent restore run. That could
result in dump/restore misbehavior, or in privilege escalation if a
nefarious user installs trojan-horse functions or operators.
This patch changes pg_dump so that it does not change the search_path
dynamically. The emitted restore script sets the search_path to what
was used at dump time, and then leaves it alone thereafter. Created
objects are placed in the correct schema, regardless of the active
search_path, by dint of schema-qualifying their names in the CREATE
commands, as well as in subsequent ALTER and ALTER-like commands.
Since this change requires a change in the behavior of pg_restore
when processing an archive file made according to this new convention,
bump the archive file version number; old versions of pg_restore will
therefore refuse to process files made with new versions of pg_dump.
Security: CVE-2018-1058
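The capture problem described in the commit above can be modelled with a toy resolver. This Python sketch is a deliberate simplification: it resolves by name only and ignores argument types, and the schema contents and function names are hypothetical. It relies on one documented PostgreSQL rule: pg_catalog is searched first unless search_path names it explicitly.

```python
from typing import Dict, List, Optional, Set


def effective_path(search_path: List[str]) -> List[str]:
    """pg_catalog is implicitly searched first unless the path lists it."""
    if "pg_catalog" in search_path:
        return list(search_path)
    return ["pg_catalog"] + list(search_path)


def resolve(name: str, search_path: List[str],
            schemas: Dict[str, Set[str]]) -> Optional[str]:
    """Return the first schema on the effective path that defines `name`."""
    for schema in effective_path(search_path):
        if name in schemas.get(schema, set()):
            return schema
    return None


# Hypothetical catalog: foo.lower is a trojan-horse copy of pg_catalog.lower.
schemas = {"pg_catalog": {"lower"}, "foo": {"lower"}}

print(resolve("lower", ["foo", "pg_catalog"], schemas))  # old pg_dump setting
print(resolve("lower", ["foo"], schemas))                # pg_catalog searched first
```

Because real resolution also weighs argument types, a better-matching overload later in the path can still win, which is why the model should be read as a sketch of the ordering issue only.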