postgres-xl.git
9 years agoFix a memory leak in GTM proxy
Pavan Deolasee [Fri, 6 May 2016 12:36:37 +0000 (18:06 +0530)]
Fix a memory leak in GTM proxy

When two lists are concatenated, we might leak the header of the second list since
only the list cells are concatenated. We must be careful not to free the list if
list_concat returned the to-be-concatenated list as-is.
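
A rough illustration of the leak and the guard, using a hypothetical minimal
list type (List, Cell, list_append_int, list_concat_cells and
concat_and_free_header are made-up names for this sketch, not the GTM list
API). An empty list is represented as NULL here:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical minimal list: a header that points at a chain of cells. */
typedef struct Cell { int value; struct Cell *next; } Cell;
typedef struct List { int length; Cell *head; Cell *tail; } List;

/* Allocate or abort; keeps the sketch free of error-handling noise. */
static void *xmalloc(size_t size)
{
    void *p = malloc(size);
    if (p == NULL)
        abort();
    return p;
}

/* Append one value, creating the header on first use (NULL means empty). */
static List *list_append_int(List *list, int value)
{
    Cell *cell = xmalloc(sizeof(Cell));
    cell->value = value;
    cell->next = NULL;
    if (list == NULL) {
        list = xmalloc(sizeof(List));
        list->length = 0;
        list->head = list->tail = NULL;
    }
    if (list->tail != NULL)
        list->tail->next = cell;
    else
        list->head = cell;
    list->tail = cell;
    list->length++;
    return list;
}

/* Move the cells of b onto a. Returns b itself when a is empty. */
static List *list_concat_cells(List *a, List *b)
{
    if (a == NULL)
        return b;
    if (b == NULL)
        return a;
    assert(a->tail != NULL);
    a->tail->next = b->head;
    a->tail = b->tail;
    a->length += b->length;
    return a;
}

/* Concatenate, then release b's header unless b became the result. */
static List *concat_and_free_header(List *a, List *b)
{
    List *result = list_concat_cells(a, b);
    if (b != NULL && result != b)
        free(b);    /* header only; the cells now belong to result */
    return result;
}
```

Freeing b unconditionally would be wrong in the case where a is empty,
because the caller would then hold a pointer to a freed header.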

9 years agoHonour shared queue refcount while deciding to remove it from hash table when
Pavan Deolasee [Thu, 5 May 2016 10:02:08 +0000 (15:32 +0530)]
Honour shared queue refcount while deciding to remove it from hash table when
producer unbinds

It's possible that another execution of the portal may start just while we are
unbinding. Hence premature removal of the shared queue should be avoided.
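
The idea can be sketched as follows. SharedQueueEntry, sq_acquire and
sq_producer_unbind are hypothetical names for this illustration, and the
locking that real shared-memory code needs is left out:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a shared queue entry in the hash table. */
typedef struct SharedQueueEntry {
    int  refcount;   /* number of outstanding acquirers */
    bool in_hash;    /* still present in the hash table */
} SharedQueueEntry;

/* Another execution of the portal takes a reference. */
static void sq_acquire(SharedQueueEntry *sq)
{
    sq->refcount++;
}

/*
 * Producer unbind: drop our reference and remove the entry only when
 * nobody else holds one. Returns true if the entry was removed.
 */
static bool sq_producer_unbind(SharedQueueEntry *sq)
{
    assert(sq->refcount > 0);
    sq->refcount--;
    if (sq->refcount == 0) {
        sq->in_hash = false;
        return true;
    }
    return false;
}
```

With two references outstanding, the first unbind leaves the entry in place
and only the second one removes it.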

9 years agoMake minimum values of shared_queues and shared_queue_size GUC parameters
Pavan Deolasee [Thu, 5 May 2016 09:55:39 +0000 (15:25 +0530)]
Make minimum values of shared_queues and shared_queue_size GUC parameters
dependent on other settings

shared_queue_size is dependent on the number of datanodes in the cluster since
each datanode may attach itself as a consumer of the shared queue. So the
shared_queue_size now signifies per-datanode value and the actual value used
will be (max_datanodes * shared_queue_size). Existing users should modify their
settings after taking this into consideration.

Similarly, shared_queues highly depends on the number of concurrent queries. We
now conservatively set this to at least 1/4th of max_connections or user
specified value, whichever is higher.
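
The arithmetic described above, as a small sketch. The function names are
hypothetical, the units of shared_queue_size are not stated here, and the use
of integer division for "1/4th" is an assumption:

```c
#include <assert.h>

/* Actual queue size used: the per-datanode value times max_datanodes. */
static int effective_shared_queue_size(int max_datanodes, int shared_queue_size)
{
    assert(max_datanodes > 0 && shared_queue_size > 0);
    return max_datanodes * shared_queue_size;
}

/* At least a quarter of max_connections, or the user value if higher. */
static int effective_shared_queues(int user_value, int max_connections)
{
    int floor_value = max_connections / 4;   /* assumed to round down */
    return user_value > floor_value ? user_value : floor_value;
}
```

For example, with max_datanodes = 16 and shared_queue_size = 64 the value
used becomes 1024, which is why existing settings should be revisited.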

9 years agoAdd a ref count mechanism to deal with situations where a Shared Queue is
Pavan Deolasee [Thu, 5 May 2016 05:32:04 +0000 (11:02 +0530)]
Add a ref count mechanism to deal with situations where a Shared Queue is
acquired but never bound by any of the nodes, thus causing leakage

To be honest, this area requires further work. The way things are currently
set up, the producer and consumers all bind to a shared queue, but only the
producer eventually unbinds. The implementation has logic to wait for consumers
before destroying a shared queue. While this is okay, more clearly defined entry
and exit points are required for the producer and consumers.

The code also today relies on timeouts to handle the case where a consumer
never binds to a shared queue, thus causing large delays. These delays are more
prominent for very short queries.

9 years agoCorrect shared memory size calculation for Shared Queue hashtable.
Pavan Deolasee [Thu, 5 May 2016 03:48:34 +0000 (09:18 +0530)]
Correct shared memory size calculation for Shared Queue hashtable.

9 years agoExtend CLog, Subtrans log and CommitTsLog appropriately when an XID is received
Pavan Deolasee [Wed, 4 May 2016 18:36:26 +0000 (00:06 +0530)]
Extend CLog, Subtrans log and CommitTsLog appropriately when an XID is received
from a remote node

9 years agoFix a nasty bug that was zeroing out clog and subtrans pages, thus causing
Pavan Deolasee [Wed, 4 May 2016 12:17:02 +0000 (17:47 +0530)]
Fix a nasty bug that was zeroing out clog and subtrans pages, thus causing
various sorts of data corruptions.

The bug dates back to the XC days, but probably became prominent in XL because
of certain recent changes. In XC/XL, a node may not see all the XIDs and hence
clog/subtrans log must be extended whenever a new XID crosses the previously
seen page boundary. We do this by comparing the pageno where the new XID maps
with the latest_page_no as stored in the shared SLRU data structure. But to
handle XID wrap-arounds, we added a check for difference in number of pages to
be less than CLOG_WRAP_CHECK_DELTA, which was incorrectly defined as
(2^30 / CLOG_XACTS_PER_PAGE). Note that "^" is the bitwise XOR operator in C and
hence this evaluated to a very small number, 28, thus causing incorrect
zeroing of pages if ExtendCLOG is called with an XID which is older than what
28 clog pages can hold. All such transactions would suddenly be marked as
aborted, resulting in removal of perfectly valid tuples.

This patch fixes the mess by just relying on built-in routines for checking
XID wrap-arounds.
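
To make the pitfall concrete, here is a small sketch. xor_of,
two_to_the_thirty and xid_precedes are hypothetical helpers. The comparison
uses the same modulo-2^32 idea as PostgreSQL's TransactionIdPrecedes, but
ignores special XIDs, and which built-in routine the patch actually calls is
not stated above:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* "^" is bitwise XOR, so xor_of(2, 30) is 28, not two to the thirtieth. */
static int xor_of(int a, int b)
{
    return a ^ b;
}

/* What was presumably intended: a shift yields 1073741824. */
static int32_t two_to_the_thirty(void)
{
    return INT32_C(1) << 30;
}

/*
 * Wraparound-aware ordering of 32-bit XIDs: a precedes b when the signed
 * difference is negative. The cast of an out-of-range value to int32_t is
 * implementation-defined in C, but behaves as two's complement on
 * mainstream compilers.
 */
static bool xid_precedes(uint32_t a, uint32_t b)
{
    return (int32_t) (a - b) < 0;
}
```

With this comparison, an XID just below 2^32 is treated as older than a small
XID issued after the counter wrapped, with no hand-written page-delta check.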

I also found another issue while working on this. We must not only zero the
page at hand, but also all intermediate pages because we won't get this opportunity
later if an intermediate XID is seen.

In our test setup, this seems to help some of the recent reports of data
corruption, including "missing attributes" errors.

9 years agoIt may happen that we try to read the status of a transaction
Mason Sharp [Mon, 2 May 2016 18:44:53 +0000 (14:44 -0400)]
It may happen that we try to read the status of a transaction
in clog before the page has been committed.

There are places in the code that try to extend clog,
but here we simply just do one retry if it looks like
we failed to read the desired page.

9 years agoUpdate release notes and also correct product name to Postgres-XL 9.5r1
Pavan Deolasee [Fri, 15 Apr 2016 04:40:35 +0000 (10:10 +0530)]
Update release notes and also correct product name to Postgres-XL 9.5r1

9 years agoFix yet another memory leak in the shared queue producer path.
Pavan Deolasee [Thu, 14 Apr 2016 09:19:35 +0000 (14:49 +0530)]
Fix yet another memory leak in the shared queue producer path.

9 years agoFix another memory leak in executor.
Pavan Deolasee [Thu, 14 Apr 2016 09:00:05 +0000 (14:30 +0530)]
Fix another memory leak in executor.

9 years agoPlug a memory leak that might help OOM situations in ALTER TABLE .. ADD NODE
Pavan Deolasee [Thu, 14 Apr 2016 07:15:18 +0000 (12:45 +0530)]
Plug a memory leak that might help OOM situations in ALTER TABLE .. ADD NODE
case

Report by Florian Iragne

9 years agoTest no longer uses 'start' command for gtm slave
Pallavi Sontakke [Wed, 13 Apr 2016 09:36:51 +0000 (15:06 +0530)]
Test no longer uses 'start' command for gtm slave

'pgxc_ctl start' command is no longer needed to start
gtm slave, with recent code changes.

9 years agoDon't use special marker "none" while updating max_wal_senders in
Pavan Deolasee [Wed, 13 Apr 2016 06:29:56 +0000 (11:59 +0530)]
Don't use special marker "none" while updating max_wal_senders in
postgresql.conf via pgxc_ctl.

Instead use "0" if the variable is not set or set to "none"

9 years agoMake 'help add' more explanatory
Pallavi Sontakke [Wed, 13 Apr 2016 05:38:46 +0000 (11:08 +0530)]
Make 'help add' more explanatory

Help user to supply 'slave_name' in
'pgxc_ctl add gtm slave', different from others
where original node name is expected.

Fixes #85

9 years agoAvoid removing directories for some pgxc_ctl calls, just as an added protection
Pavan Deolasee [Tue, 12 Apr 2016 15:43:22 +0000 (21:13 +0530)]
Avoid removing directories for some pgxc_ctl calls, just as an added protection
if user makes a mistake

9 years agoCheck for 'status' and not return value of waitpid() function
Pavan Deolasee [Tue, 12 Apr 2016 12:53:52 +0000 (18:23 +0530)]
Check for 'status' and not return value of waitpid() function
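
For context: waitpid() returns the pid of the child (or -1), while the
child's exit code has to be decoded from 'status'. A minimal POSIX sketch,
with hypothetical helper names and no claim to match the pgxc_ctl code:

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits at once with `code`. */
static pid_t spawn_exiting_child(int code)
{
    pid_t pid;

    assert(code >= 0 && code <= 255);
    pid = fork();
    if (pid == 0)
        _exit(code);
    return pid;              /* -1 if fork failed */
}

/*
 * Returns the child's exit code, or -1 if it could not be waited for or
 * did not exit normally (for example, killed by a signal).
 */
static int wait_for_exit_code(pid_t pid)
{
    int   status = 0;
    pid_t rc;

    if (pid < 0)
        return -1;
    do {
        rc = waitpid(pid, &status, 0);
    } while (rc == -1 && errno == EINTR);
    if (rc != pid)
        return -1;           /* rc only says which child, not how it ended */
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    return -1;
}
```

Treating the return value of waitpid() as the exit code would report the
child's pid on success, which is never zero.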

9 years agoSuppress the message hinting to start coordinator/datanode/gtm server at the
Pavan Deolasee [Tue, 12 Apr 2016 11:03:51 +0000 (16:33 +0530)]
Suppress the message hinting to start coordinator/datanode/gtm server at the
end of initdb/initgtm when the commands are run via pgxc_ctl

This can be confusing to the user. We use an environment variable
PGXC_CTL_SILENT to silence the message instead of adding a new option.

9 years agoAdd check against accidental start of GTM with an XID lower than what is
Pavan Deolasee [Tue, 12 Apr 2016 10:45:49 +0000 (16:15 +0530)]
Add check against accidental start of GTM with an XID lower than what is
saved in its control file.

User must now explicitly specify -f option to forcefully start GTM with the
given value. This should protect users from incorrect usage of the -x option
(like we saw in a recent bug report)

9 years agoFix a typo in the log message during datanode failover
Pavan Deolasee [Tue, 12 Apr 2016 10:12:04 +0000 (15:42 +0530)]
Fix a typo in the log message during datanode failover

9 years agoReduce log level for a message during initdb
Pavan Deolasee [Tue, 12 Apr 2016 10:09:46 +0000 (15:39 +0530)]
Reduce log level for a message during initdb

9 years agoAdd an alternate expected file for aggregates test on sunos
Pavan Deolasee [Mon, 11 Apr 2016 05:49:06 +0000 (11:19 +0530)]
Add an alternate expected file for aggregates test on sunos

Patch by Patrick Sodré

9 years agoMake changes and bug fixes to let compilation and regression run on smartos
Pavan Deolasee [Mon, 11 Apr 2016 05:14:59 +0000 (10:44 +0530)]
Make changes and bug fixes to let compilation and regression run on smartos

We don't yet officially support the platform, given very little testing done so
far on this platform. But we don't stop others from doing it either. So
committing these changes upstream.

Reports, investigation and patches by Patrick Sodré.

9 years agoDo not add a spurious ';' when not cleaning WAL directory for a datanode
Pavan Deolasee [Sun, 10 Apr 2016 04:46:27 +0000 (10:16 +0530)]
Do not add a spurious ';' when not cleaning WAL directory for a datanode

9 years agoTest: Change command to start GTM standby.
Pallavi Sontakke [Thu, 7 Apr 2016 10:35:09 +0000 (16:05 +0530)]
Test: Change command to start GTM standby.

Use temporary PGXC_CTL_HOME for test.

9 years agoAdd test for GTM standby
Pallavi Sontakke [Wed, 6 Apr 2016 06:43:43 +0000 (12:13 +0530)]
Add test for GTM standby

9 years agoModify tests
Pallavi Sontakke [Fri, 1 Apr 2016 11:28:38 +0000 (16:58 +0530)]
Modify tests

Remove cluster-cleanup at start.
Extract PGXC_CTL_HOME from ENV.

9 years agoExtend the array for various slave variables to match the size of the master
Pavan Deolasee [Fri, 1 Apr 2016 06:26:57 +0000 (11:56 +0530)]
Extend the array for various slave variables to match the size of the master
array.

This fixes the problem when a slave for only one master datanode or coordinator
is added, as demonstrated by the tap tests

9 years agoextendVar should reset val_used only when newSize is greater than the
Pavan Deolasee [Fri, 1 Apr 2016 06:25:43 +0000 (11:55 +0530)]
extendVar should reset val_used only when newSize is greater than the
current value of val_used

9 years agoAccept -m option to pgxc_ctl, but let "stop" command handle the rest
Pavan Deolasee [Fri, 1 Apr 2016 06:24:59 +0000 (11:54 +0530)]
Accept -m option to pgxc_ctl, but let "stop" command handle the rest

9 years agoCorrect a comment added to pgxc_ctl.conf upon coordinator master addition
Pavan Deolasee [Thu, 31 Mar 2016 14:18:22 +0000 (19:48 +0530)]
Correct a comment added to pgxc_ctl.conf upon coordinator master addition

9 years agoAdd test for pgxc_ctl minimal config
Pallavi Sontakke [Wed, 30 Mar 2016 07:04:11 +0000 (12:34 +0530)]
Add test for pgxc_ctl minimal config

Add some more cleanup to TAP tests.

9 years agoAvoid premature line truncation in the auto generated INSTALL file
Pavan Deolasee [Tue, 29 Mar 2016 09:22:58 +0000 (14:52 +0530)]
Avoid premature line truncation in the auto generated INSTALL file

9 years agoImprove draft release notes for upcoming beta2 release
Pavan Deolasee [Tue, 29 Mar 2016 07:16:20 +0000 (12:46 +0530)]
Improve draft release notes for upcoming beta2 release

9 years agoIn the installation guide, use datanode names that are consistent with what we
Pavan Deolasee [Tue, 29 Mar 2016 07:12:39 +0000 (12:42 +0530)]
In the installation guide, use datanode names that are consistent with what we
use in regression tests, for sanity.

9 years agoAdd missing steps to create information about the coordinator node on the
Pavan Deolasee [Tue, 29 Mar 2016 07:00:44 +0000 (12:30 +0530)]
Add missing steps to create information about the coordinator node on the
datanodes in installation guide

9 years agoDraft release notes which includes bug fixes and improvements since r1beta1 release
Pavan Deolasee [Mon, 28 Mar 2016 12:35:47 +0000 (18:05 +0530)]
Draft release notes which includes bug fixes and improvements since r1beta1 release

9 years agoAdd TAP test for pgxc_ctl
Pallavi Sontakke [Mon, 28 Mar 2016 13:13:41 +0000 (18:43 +0530)]
Add TAP test for pgxc_ctl

Test add/remove nodes and replicas

9 years agoCorrect URL to Postgres-XL online release notes
Pavan Deolasee [Mon, 28 Mar 2016 12:09:28 +0000 (17:39 +0530)]
Correct URL to Postgres-XL online release notes

9 years agoRemove a reference to sourceforge project page now that we don't use it anymore
Pavan Deolasee [Mon, 28 Mar 2016 12:04:24 +0000 (17:34 +0530)]
Remove a reference to sourceforge project page now that we don't use it anymore

9 years agoCorrect Copyright years
Pavan Deolasee [Mon, 28 Mar 2016 12:04:04 +0000 (17:34 +0530)]
Correct Copyright years

9 years agoCorrectly use Postgres-XL instead of PostgreSQL for reporting "make" status
Pavan Deolasee [Mon, 28 Mar 2016 11:46:33 +0000 (17:16 +0530)]
Correctly use Postgres-XL instead of PostgreSQL for reporting "make" status

9 years agoCorrectly specify HASH_BLOBS while using nodeOid as a key for pooler hash
Pavan Deolasee [Mon, 28 Mar 2016 11:34:53 +0000 (17:04 +0530)]
Correctly specify HASH_BLOBS while using nodeOid as a key for pooler hash
tables.

Without this, we were incorrectly using the default string copy/compare
functions, thus later breaking things.
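
Why string functions break on binary keys can be shown with plain libc
calls. This sketch does not use the PostgreSQL hash table API, and the key
bytes are made up for the example. A 4-byte key such as an Oid may contain
zero bytes, and string comparison stops at the first one:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define KEY_SIZE 4   /* size of a 4-byte binary key such as an Oid */

/* String-style comparison: stops at the first NUL byte in the key. */
static bool keys_equal_as_string(const unsigned char *a, const unsigned char *b)
{
    return strncmp((const char *) a, (const char *) b, KEY_SIZE) == 0;
}

/* Blob-style comparison: always looks at all KEY_SIZE bytes. */
static bool keys_equal_as_blob(const unsigned char *a, const unsigned char *b)
{
    return memcmp(a, b, KEY_SIZE) == 0;
}
```

Two keys with the bytes {1, 0, 1, 0} and {1, 0, 2, 0} compare equal as
strings but differ as blobs, so distinct nodes could end up sharing one hash
entry.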

9 years agoUse a non-zero default value for max_wal_senders on coordinator and datanode
Pavan Deolasee [Mon, 28 Mar 2016 08:29:04 +0000 (13:59 +0530)]
Use a non-zero default value for max_wal_senders on coordinator and datanode
master

9 years agopqsignal.c now comes from src/port and that must be used to create a symlink
Pavan Deolasee [Fri, 25 Mar 2016 11:38:50 +0000 (17:08 +0530)]
pqsignal.c now comes from src/port and that must be used to create a symlink
while building initgtm.

9 years agoDo not turn on hot_standby in coordinator/datanode slaves since it's not supported.
Pavan Deolasee [Fri, 25 Mar 2016 11:23:31 +0000 (16:53 +0530)]
Do not turn on hot_standby in coordinator/datanode slaves since it's not supported.

We'd earlier turned that on so that PQping() can check status of standbys. But
that clearly creates bigger trouble and standbys may just stop working. So add
a new mechanism to ping slave nodes by using pg_ctl

9 years agoCheck if gtm/gtm_proxy directory has a .pid file before trying to stop the
Pavan Deolasee [Tue, 22 Mar 2016 07:52:41 +0000 (13:22 +0530)]
Check if gtm/gtm_proxy directory has a .pid file before trying to stop the
server.

Before starting or initialising a new GTM/GTM proxy, we first try to stop the
running server. But if the server is not running, which is the case most often,
it will show an error. This avoids those unnecessary error messages.

9 years agoCorrect example in the tutorial.
Pavan Deolasee [Tue, 22 Mar 2016 07:06:19 +0000 (12:36 +0530)]
Correct example in the tutorial.

We don't support SRF in VALUES clause. They must be used via subqueries.
Report by Ernst-Georg Schmid

9 years agoRemove an obsolete file
Pavan Deolasee [Tue, 15 Mar 2016 13:28:32 +0000 (18:58 +0530)]
Remove an obsolete file

9 years agoRename RelationLocInfo->nodeList to RelationLocInfo->rl_nodeList to avoid using
Pavan Deolasee [Tue, 15 Mar 2016 13:15:21 +0000 (18:45 +0530)]
Rename RelationLocInfo->nodeList to RelationLocInfo->rl_nodeList to avoid using
such a common name for a very important structure member

9 years agoRe-add incorrectly removed call to consume txn_count during compiler warning
Pavan Deolasee [Tue, 15 Mar 2016 06:30:52 +0000 (12:00 +0530)]
Re-add incorrectly removed call to consume txn_count during compiler warning
cleanups

9 years agoAdd support for json_agg() pushdown
Pavan Deolasee [Tue, 15 Mar 2016 03:09:58 +0000 (08:39 +0530)]
Add support for json_agg() pushdown

This patch adds a collection function for json_agg() aggregate. Also use a
specific json_agg_state type for the internal agg state so that corresponding
in/out functions can be specified for transition values to be passed around
from one node to another

9 years agoCache argument type information in json(b) aggregate functions.
Andrew Dunstan [Fri, 18 Sep 2015 18:39:39 +0000 (14:39 -0400)]
Cache argument type information in json(b) aggregate functions.

These functions have been looking up type info for every row they
process. Instead of doing that we only look them up the first time
through and stash the information in the aggregate state object.

Affects json_agg, json_object_agg, jsonb_agg and jsonb_object_agg.

There is plenty more work to do in making these more efficient,
especially the jsonb functions, but this is a virtually cost free
improvement that can be done right away.

Backpatch to 9.5 where the jsonb variants were introduced.

9 years agoFix a compiler warning about mixing of code and declarations
Pavan Deolasee [Mon, 14 Mar 2016 11:43:26 +0000 (17:13 +0530)]
Fix a compiler warning about mixing of code and declarations

9 years agoAggregates with ORDER BY clause cannot be shipped to the datanode.
Pavan Deolasee [Mon, 14 Mar 2016 11:38:05 +0000 (17:08 +0530)]
Aggregates with ORDER BY clause cannot be shipped to the datanode.

A query such as "SELECT sum(x ORDER BY x) FROM tab" must not be shipped to the
remote side since the transition function must receive tuples in the specified
order. While it does not make much sense in this example, there could be other
aggregates, such as json_agg, where ordering could matter

9 years agoFix several compiler warnings
Pavan Deolasee [Fri, 11 Mar 2016 08:48:50 +0000 (14:18 +0530)]
Fix several compiler warnings

9 years agoDo not compare unsigned integer for "< 0"
Pavan Deolasee [Fri, 11 Mar 2016 06:44:07 +0000 (12:14 +0530)]
Do not compare unsigned integer for "< 0"

9 years agoExplicitly cast pthread_t to int for logging purposes
Pavan Deolasee [Fri, 11 Mar 2016 06:43:10 +0000 (12:13 +0530)]
Explicitly cast pthread_t to int for logging purposes

9 years agofix missing prototypes (and 'implicit declaration' warning)
Tomas Vondra [Tue, 1 Mar 2016 03:45:56 +0000 (04:45 +0100)]
fix missing prototypes (and 'implicit declaration' warning)

9 years agoremove functions that are not used (or defined)
Tomas Vondra [Tue, 1 Mar 2016 03:26:17 +0000 (04:26 +0100)]
remove functions that are not used (or defined)

9 years agoadd missing declarations of timeval/rusage structs
Tomas Vondra [Tue, 1 Mar 2016 03:11:38 +0000 (04:11 +0100)]
add missing declarations of timeval/rusage structs

9 years agofix declarations that discard 'const' modifier from pointers
Tomas Vondra [Tue, 1 Mar 2016 03:05:53 +0000 (04:05 +0100)]
fix declarations that discard 'const' modifier from pointers

9 years agoget rid of GTMGetFirstClientIdentifier (unused)
Tomas Vondra [Tue, 1 Mar 2016 02:48:06 +0000 (03:48 +0100)]
get rid of GTMGetFirstClientIdentifier (unused)

function not used or even defined in a header file

9 years agofix a few violations of ISO C90 (mixed code/declarations)
Tomas Vondra [Tue, 1 Mar 2016 02:36:41 +0000 (03:36 +0100)]
fix a few violations of ISO C90 (mixed code/declarations)

Interestingly ';;' confuses the compiler enough to emit this warning.

9 years agoeliminate variables that are not used at all
Tomas Vondra [Tue, 1 Mar 2016 02:29:42 +0000 (03:29 +0100)]
eliminate variables that are not used at all

9 years agoeliminate variables that are only set (but not used)
Tomas Vondra [Tue, 1 Mar 2016 02:23:41 +0000 (03:23 +0100)]
eliminate variables that are only set (but not used)

9 years agofix missing ExceptionalCondition prototype / return type
Tomas Vondra [Tue, 1 Mar 2016 02:03:59 +0000 (03:03 +0100)]
fix missing ExceptionalCondition prototype / return type

During compilation, there's like a zillion warnings about missing
prototype of ExceptionalCondition. Of course, in regular postgres
this is defined in postgres.h like this:

    void ExceptionalCondition(...)

but in XL apparently some places use Assert without wanting to
include the whole postgres.h (not sure why). So there's a copy of
the function in src/gtm/common/assert.c, but there's no prototype
in src/include/gtm/assert.h, thus the complaints.

Adding the prototype to the header file however reveals another
problem, as the function in src/gtm/common/assert.c is defined
like this

    int ExceptionalCondition(...)

with a rather wonky explanation about TrapMacro(). So this would
fail to compile when a file ends up including both header files, like for
example src/gtm/client/gtm_client.c. (Fun fact: gtm_client.c does
not really need the include at all.)

Therefore the best solution at this point seems to be to simply
change the return type in assert.c to void (and get rid of the
rather suspicious explanation above the function), and add the
prototype into src/include/gtm/assert.h. This way the prototype
matches the one from postgres.h, there's no conflict and the
warnings disappear.

In the long term however, the right solution seems to be simply
removing the redundancy by dropping the gtm copy of the function.

9 years agoAdd support for pushdown of Append and MergeAppend nodes.
Pavan Deolasee [Thu, 10 Mar 2016 11:13:32 +0000 (16:43 +0530)]
Add support for pushdown of Append and MergeAppend nodes.

While dealing with Append and MergeAppend pathnodes, we shouldn't be looking at
the Varno in the "distribution" information because each append subpath comes
from a different relation. So we devise a mechanism to compare distribution
strategies without comparing the Varnos.

Expected outputs of many test cases are also updated because Append and
MergeAppend plans are now pushed down to the datanodes when possible.

"misc" test case exhibited certain failures because of incorrect evaluation of
a "volatile" function on the datanode. This turned out to be an old bug which
should be fixed separately. There were existing failures, masked by incorrect
acceptance of the test output. All such sqls are now disabled from "misc" and
copied to xl_known_bugs. Once the bug related to the volatile functions is
fixed, we would enable those sqls again

9 years agoSend down SYNC message to a failed remote session that was running extended
Pavan Deolasee [Wed, 9 Mar 2016 12:18:23 +0000 (17:48 +0530)]
Send down SYNC message to a failed remote session that was running extended
query protocol.

While running extended query protocol, a backend that has thrown an error will
keep ignoring all messages until it sees a SYNC message. We now carefully track
the messages that we are sending to the remote node and remember if we must
send a SYNC message even before sending a ROLLBACK command.

While the regression was running fine even without this patch, this issue was
noticed as part of some other work and hence fixed

9 years agoSet log_line_prefix for regression run to collect more information by default
Pavan Deolasee [Wed, 9 Mar 2016 11:34:43 +0000 (17:04 +0530)]
Set log_line_prefix for regression run to collect more information by default

9 years agoAdd test cases to enable/disable certain modules using the new logging
Pavan Deolasee [Wed, 9 Mar 2016 11:32:50 +0000 (17:02 +0530)]
Add test cases to enable/disable certain modules using the new logging
infrastructure

They are not added to either serial or parallel schedules and just serve as
examples right now

9 years agoAdd support for process-level control for overriding log levels.
Pavan Deolasee [Fri, 4 Mar 2016 06:51:18 +0000 (12:21 +0530)]
Add support for process-level control for overriding log levels.

This patch changes the behaviour of pg_msgmodule_set/change() functions. These
functions now only change the log levels for various messages, but the actual
logging won't start until one of the following enable() functions is called.

This patch adds a few more functions:

- pg_msgmodule_enable(pid) - the given pid will start logging as per the
  current settings for various msgs.

- pg_msgmodule_disable(pid) - the given pid will stop logging and use the
  compile time settings

- pg_msgmodule_enable_all(persistent) - all current processes will start
  logging as per the current setting. If "persistent" is set to true then all
  new processes will also log as per the setting

- pg_msgmodule_disable_all() - all current and future processes will stop
  logging and only use compile time settings.

9 years agoHonour client's request for binary data transfer.
Pavan Deolasee [Thu, 3 Mar 2016 11:40:41 +0000 (17:10 +0530)]
Honour client's request for binary data transfer.

When coordinator gets data from the datanode, it always gets it in TEXT mode.
But if the client has requested binary transfer of the data, then it must not
forward the data received from the datanode as it is. Rather it must send each
column in the desired format.

While this should fix the JDBC or libpq issue with binary data transfer, we
should really see if the coordinator to datanode communication can also use
binary mode for performance reasons. But that's a separate patch.

9 years agoAvoid repeated palloc for query strings while handling multi-statement SQLs
Pavan Deolasee [Thu, 3 Mar 2016 09:31:06 +0000 (15:01 +0530)]
Avoid repeated palloc for query strings while handling multi-statement SQLs

We now only pass pointers until we have the complete query string. At that
point, we allocate only the required bytes and copy the query string

9 years agoCollect and return query substrings corresponding to each SQL statement
Pavan Deolasee [Thu, 3 Mar 2016 05:35:30 +0000 (11:05 +0530)]
Collect and return query substrings corresponding to each SQL statement
while parsing a multi-statement query separated by ';'

raw_parser() returns a list of parsetrees after parsing a multi-statement SQL
query, where each parsetree corresponds to one SQL statement. It does not have
any mechanism to return the source text of the SQL statement. In Postgres-XL,
we send out the query text as it is to remote datanodes and coordinators while
dealing with utility statements. Not having access to individual SQL statement
is a problem because we end up sending the same text again and again, leading
to various issues.

This patch adds some rudimentary mechanism to return a list of query strings
along with the list of parsetrees.
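
A much simplified illustration of the idea, not the parser change itself.
StmtSlice, split_statements and slice_dup are hypothetical names, and the
split is deliberately naive: it ignores quoting, comments and dollar-quoted
strings, which the real parser has to respect. It records only pointers and
lengths, and copies a statement with a single allocation once its extent is
known:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One statement's text, as a view into the original query string. */
typedef struct StmtSlice {
    const char *start;
    size_t      len;
} StmtSlice;

/* Record up to max_out non-empty ';'-separated slices; returns the count. */
static int split_statements(const char *sql, StmtSlice *out, int max_out)
{
    int         n = 0;
    const char *start = sql;
    const char *p;

    for (p = sql; ; p++) {
        if (*p == ';' || *p == '\0') {
            if (p > start && n < max_out) {
                out[n].start = start;
                out[n].len = (size_t) (p - start);
                n++;
            }
            if (*p == '\0')
                break;
            start = p + 1;
        }
    }
    return n;
}

/* Copy one slice using exactly len + 1 bytes. The caller frees the result. */
static char *slice_dup(const StmtSlice *s)
{
    char *copy = malloc(s->len + 1);

    if (copy == NULL)
        return NULL;
    memcpy(copy, s->start, s->len);
    copy[s->len] = '\0';
    return copy;
}
```

Each utility statement can then be forwarded with its own text instead of the
whole multi-statement string.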

9 years agoEnable 'random' test case to match with the parallel_schedule
Pavan Deolasee [Thu, 3 Mar 2016 04:06:11 +0000 (09:36 +0530)]
Enable 'random' test case to match with the parallel_schedule

9 years agoSave global_xmin in the GTM control file and use that when it's restarted
Pavan Deolasee [Wed, 2 Mar 2016 06:18:30 +0000 (11:48 +0530)]
Save global_xmin in the GTM control file and use that when it's restarted

The control file now also has a version identifier so that we can read and
interpret older versions while keeping flexibility to change the format

9 years agoInclude the string terminator in GID size calculation
Pavan Deolasee [Tue, 1 Mar 2016 13:27:23 +0000 (18:57 +0530)]
Include the string terminator in GID size calculation

9 years agoReport state to the GTM no more frequently than defined by
Pavan Deolasee [Tue, 1 Mar 2016 11:53:05 +0000 (17:23 +0530)]
Report state to the GTM no more frequently than defined by
CLUSTER_MONITOR_NAPTIME

Otherwise we risk flooding the GTM with messages in an infinite loop,
if the GTM is reporting back errors
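
A sketch of such throttling. ReportState and should_report are hypothetical
names, and REPORT_NAPTIME_SECS with its value of 5 is only a placeholder for
CLUSTER_MONITOR_NAPTIME, whose real value is not given here. The attempt time
is recorded whether or not the GTM answers with an error, so an error cannot
trigger an immediate retry:

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

#define REPORT_NAPTIME_SECS 5   /* placeholder for CLUSTER_MONITOR_NAPTIME */

typedef struct ReportState {
    time_t last_report;     /* time of the last attempt */
    bool   reported_once;   /* false until the first attempt */
} ReportState;

/* Returns true if a report may be sent now, and records the attempt. */
static bool should_report(ReportState *st, time_t now)
{
    if (st->reported_once && now - st->last_report < REPORT_NAPTIME_SECS)
        return false;
    st->last_report = now;
    st->reported_once = true;
    return true;
}
```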

9 years agoUse -P preprocessor option to inhibit generation of linemarkers in the
Pavan Deolasee [Tue, 1 Mar 2016 10:16:02 +0000 (15:46 +0530)]
Use -P preprocessor option to inhibit generation of linemarkers in the
output from the preprocessor

This ensures that grepping for the elog messages works correctly on various
compilers (tested for gcc and clang)

9 years agoFix bugs around handling of params passed to the datanodes.
Pavan Deolasee [Tue, 1 Mar 2016 05:57:20 +0000 (11:27 +0530)]
Fix bugs around handling of params passed to the datanodes.

This fixes problems reported in plpgsql function calls since
fast-query-shipping was added. But fixing the problem also highlighted other
bugs in the protocol handling. For example, we would set a datanode connection
state to be IDLE upon receiving a CommandComplete. But for simple query
protocol, we should really be waiting for ReadyForQuery before changing the
state. This patch has related fixes.
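
The state rule for the simple query protocol can be sketched like this.
ConnState and next_state_simple_query are hypothetical names, not the
datanode connection structures. The message type bytes are those of the
PostgreSQL wire protocol, 'C' for CommandComplete and 'Z' for ReadyForQuery:

```c
#include <assert.h>

/* Hypothetical, reduced connection states. */
typedef enum ConnState { CONN_IDLE, CONN_QUERY } ConnState;

/*
 * Simple query protocol: CommandComplete ('C') may be followed by more
 * results, so only ReadyForQuery ('Z') moves the connection to idle.
 */
static ConnState next_state_simple_query(ConnState current, char msgtype)
{
    switch (msgtype) {
        case 'Z':
            return CONN_IDLE;
        case 'C':   /* more results, or 'Z', still to come */
        default:
            return current;
    }
}
```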

plpgsql test case's expected output is also fixed now that the underlying bug
has been taken care of

9 years agoFix docs and Makefile so that "make dist" works correctly
Pavan Deolasee [Fri, 19 Feb 2016 10:45:48 +0000 (16:15 +0530)]
Fix docs and Makefile so that "make dist" works correctly

9 years agoHandle case correctly when collection function is not defined.
Pavan Deolasee [Fri, 19 Feb 2016 09:50:31 +0000 (15:20 +0530)]
Handle case correctly when collection function is not defined.

Also, make sure some other missed out responses are handled correctly during
receiving responses

9 years agoCorrect stale links in the README file
Pavan Deolasee [Thu, 18 Feb 2016 15:47:22 +0000 (21:17 +0530)]
Correct stale links in the README file

9 years agoUpdate release notes
Pavan Deolasee [Thu, 18 Feb 2016 14:22:33 +0000 (19:52 +0530)]
Update release notes

9 years agoMake some adjustments to the way documentation headers are printed
Pavan Deolasee [Thu, 18 Feb 2016 12:19:09 +0000 (17:49 +0530)]
Make some adjustments to the way documentation headers are printed

9 years agoInstall docs under "doc" directory instead of "doc-xc"
Pavan Deolasee [Thu, 18 Feb 2016 11:56:49 +0000 (17:26 +0530)]
Install docs under "doc" directory instead of "doc-xc"

9 years agoIncrease the timeout for waiting to end query to 1s from existing 20ms.
Pavan Deolasee [Thu, 18 Feb 2016 10:13:35 +0000 (15:43 +0530)]
Increase the timeout for waiting to end query to 1s from existing 20ms.

We see that this timeout sometimes expires on a loaded machine, leading to other
errors. We need to handle that situation better, but for now increase the
timeout to 1s to reduce the chances of this happening.

9 years agoThere was a missing commit from when the repo was forked,
Mason Sharp [Thu, 18 Feb 2016 07:40:55 +0000 (23:40 -0800)]
There was a missing commit from when the repo was forked,
applying to the new repo.

Original commit from the sourceforge repo:

    commit e61639b864e83b6b45d11b737ec3c3d67aeb4b56
    Author: Mason Sharp <[email protected]>
    Date:   Sun Jul 26 17:54:08 2015 -0700

        Changed license from the Mozilla Public License
        to the PostgreSQL License

9 years agoFix a few compiler warnings.
Pavan Deolasee [Wed, 17 Feb 2016 12:37:21 +0000 (18:07 +0530)]
Fix a few compiler warnings.

9 years agoRemove known bugs' tests from schedule.
Pallavi Sontakke [Wed, 17 Feb 2016 10:58:08 +0000 (16:28 +0530)]
Remove known bugs' tests from schedule.

9 years agoHandle a race condition between portal close 'C' message and new request for
Pavan Deolasee [Wed, 17 Feb 2016 07:28:31 +0000 (12:58 +0530)]
Handle a race condition between portal close 'C' message and new request for
running the portal, as part of the next step of query execution

A producer will unbind and remove the SharedQ once all consumers are done
with reading pending data. It does not wait for consumers to actually send
close 'C' message. If the next step of the execution now recreates the SharedQ
with the same name (because it's the same RemoteSubplan being re-executed), and
if the close message arrives after that, but before the new producer gets a
chance to bind to the SharedQ, we will end up marking future consumers of the
new SharedQ as 'DONE'. The SharedQueueAcquire then incorrectly assumes that
this is a stale Q belonging to earlier execution and gets in an infinite wait.

Also do not try indefinitely for the old producer to unbind and remove a stale
queue. Any further bugs in this area will cause infinite loops. Instead try for
a fixed number of times and then let the query fail.
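
The bounded wait can be sketched as follows. MAX_UNBIND_RETRIES, its value
of 10, and both function names are hypothetical, and the real code would
sleep or wait between attempts:

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_UNBIND_RETRIES 10   /* hypothetical limit */

typedef bool (*queue_gone_fn)(void *arg);

/*
 * Poll until the stale queue is gone, but give up after a fixed number of
 * attempts so that a bug elsewhere cannot turn into an infinite loop.
 * Returns false when the caller should fail the query.
 */
static bool wait_for_stale_queue_removal(queue_gone_fn is_gone, void *arg)
{
    int attempt;

    for (attempt = 0; attempt < MAX_UNBIND_RETRIES; attempt++) {
        if (is_gone(arg))
            return true;
        /* real code would sleep or wait on a latch here */
    }
    return false;
}

/* Demonstration probe: reports "gone" once the counter reaches zero. */
static bool gone_after_countdown(void *arg)
{
    int *remaining = arg;

    if (*remaining == 0)
        return true;
    (*remaining)--;
    return false;
}
```

A queue that disappears after a few polls is handled normally, while one
that never disappears makes the query fail instead of hanging.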

9 years agoGroup all uncategorised log messages to a default module-id (with number set to
Pavan Deolasee [Wed, 17 Feb 2016 04:52:05 +0000 (10:22 +0530)]
Group all uncategorised log messages to a default module-id (with number set to
255 right now)

9 years agoCorrect a typo
Pavan Deolasee [Wed, 17 Feb 2016 04:51:22 +0000 (10:21 +0530)]
Correct a typo

9 years agoAdd a missing value for "%s" placeholder in an elog message
Pavan Deolasee [Tue, 16 Feb 2016 18:34:41 +0000 (00:04 +0530)]
Add a missing value for "%s" placeholder in an elog message

9 years agoUse "expr" instead of "bc" for arithmetic since "bc" may not be available on
Pavan Deolasee [Mon, 15 Feb 2016 18:28:58 +0000 (23:58 +0530)]
Use "expr" instead of "bc" for arithmetic since "bc" may not be available on
all platforms

9 years agoFix build failure when configured with --with-openssl
Pavan Deolasee [Mon, 15 Feb 2016 18:23:22 +0000 (23:53 +0530)]
Fix build failure when configured with --with-openssl

9 years agoHandle errors while PREPARing a transaction gracefully.
Pavan Deolasee [Mon, 15 Feb 2016 12:56:13 +0000 (18:26 +0530)]
Handle errors while PREPARing a transaction gracefully.

If an error occurs after PREPARE TRANSACTION command is sent, we don't know if
the command is successful or not. So the coordinator will go ahead and abort
the transaction. But the node which failed to run PREPARE TRANSACTION may not
even be reachable from the coordinator. So we don't try to roll back the transaction
on such nodes.

9 years agoMake sure to write to the GTM control file only after paths are set up
Pavan Deolasee [Mon, 15 Feb 2016 12:50:44 +0000 (18:20 +0530)]
Make sure to write to the GTM control file only after paths are set up
correctly

9 years agoExtend commit, subtrans and commit_ts logs appropriately when new xids are
Pavan Deolasee [Mon, 15 Feb 2016 10:47:11 +0000 (16:17 +0530)]
Extend commit, subtrans and commit_ts logs appropriately when new xids are
received from the GTM.

After our recent work on cluster monitor, GTM will now send back
latestCompletedXid and RecentGlobalXmin which we record in a shared memory
area. But if the node has not asked for XIDs for some time, the SLRU
maintaining these logs may fall much behind. So we must keep these log files
up to date with the XIDs generated by the GTM

This should also fix the spotting of the following log message as reported
by Mason Sharp

LOG: could not truncate directory "pg_subtrans": apparent wraparound