From: Pavan Deolasee
Date: Fri, 22 Jan 2016 14:06:38 +0000 (+0530)
Subject: Further improvements to release notes
X-Git-Tag: XL9_5_R1BETA1~74
X-Git-Url: http://git.postgresql.org/gitweb/static/gitweb.js?a=commitdiff_plain;h=a54ecab86f76805265221c9c6f378f617aab344d;p=postgres-xl.git

Further improvements to release notes
---

diff --git a/doc/src/sgml/release-xl-9.5.sgml b/doc/src/sgml/release-xl-9.5.sgml
index 40c52e4ad4..da7d75e3cd 100644
--- a/doc/src/sgml/release-xl-9.5.sgml
+++ b/doc/src/sgml/release-xl-9.5.sgml
@@ -13,9 +13,9 @@ Overview
- This major release of Postgres-XL is coming
-almost 2 years after Postgres-XL 9.2 was released. The
-earlier Postgres-XL release was based on
+ This major release of Postgres-XL comes
+almost 2 years after Postgres-XL 9.2 was released,
+which was based on
 PostgreSQL 9.2. This release includes most of the new features added in PostgreSQL 9.3, 9.4 and 9.5 releases. This also includes almost all significant performance enhancements
@@ -70,23 +70,30 @@ short list of such enhancements, but all other enhancements also apply, unless otherwise stated.
- Major Enhancements from PostgreSQL 9.5
+ Major Enhancements from PostgreSQL 9.6
+
- Allow INSERTs
- that would generate constraint conflicts to be turned into
- UPDATEs or ignored
+ Substantial improvement in 2PC performance by avoiding creation of state
+files when it is not necessary.
-
+
+
+
+ Major Enhancements from PostgreSQL 9.5
+
- Add row-level security control
+ Allow INSERTs
+ that would generate constraint conflicts to be turned into
+ UPDATEs or ignored
+
 Create mechanisms for tracking
@@ -139,13 +146,6 @@ otherwise stated.
-
-
- Allow materialized views
- to be refreshed without blocking concurrent reads
-
-
-
 Add support for logical decoding
@@ -222,6 +222,20 @@ otherwise stated.
 ROLLUP
+
+
+ Add row-level security control
+
+
+
+
+
+ Allow materialized views
+ to be refreshed without blocking concurrent reads
+
+
+
+
 Add materialized
@@ -262,318 +276,369 @@ otherwise stated. release.
-
-
-
- Analyze all regression failures and make necessary changes to the
-expected output or the test cases.
-
-
-
-
- Persistent connections are not supported between datanodes.
-
-
- Configuration parameter persistent_datanode_connections
-is ignored on the datanodes. So connections between datanodes are returned back
-to the connection pool at the end of the transaction. A WARNING will be shown
-when this parameter is set on the datanode side and the value will be ignored.
-
-
-
-
- Change GID format to include all participant nodes.
-
-
- Every implicit 2PC GID now includes node_id of every
-participating node in the 2PC transaction. This refers to the
- element of pgxc_node
- .node_id value.
-
-
-
-
- WAL log only the actual GID used to prepare a 2PC transaction, not the
- maximum bytes reserved for GID.
-
-
- This change considerably reduces the WAL space required while preparing a
-transaction and shows significant performance improvement.
-
-
-
-
-
- Support Greenplum syntax for specifying distribution strategy for a table.
-
-
- Postgres-XL now supports additional syntax for specifying distribution
-strategy. This syntax is compatible with Greenplum. See
- for more details.
-
-
-
-
- Support Redshift syntax for specifying distribution strategy for a table.
-
-
- Postgres-XL now supports additional syntax for specifying distribution
-strategy. This syntax is compatible with Redshift. See
- for more details.
-
-
-
-
- Improve pgxc_ctl so that it checks if a directory is empty before it can be
- used as data directory for a new datanode/coordinator.
-
-
-
-
- Replicated tables are now highly-available for read-access.
-
-
- Every node now maintains a healthmap about all other nodes in the cluster.
If
-a node goes down or is unreachable, the healthmap is updated. Queries that read
-from replicated tables are then sent to a healthy node. Unhealthy nodes are
-periodically pinged and their status is updated when they come back online.
-
-
-
-
- "make check" now automatically sets up a 2-coordinator, 2-datanode cluster
-and runs parallel regression schedule.
-
-
-
-
- Significantly improve performance for queries that can be fully executed on a
-single node by shipping them to the node.
-
-
- Postgres-XC has Fast Query Shipping (FQS) feature to fully ship queries that can be safely executed on the remote
-datanodes, without any finalisation at the coordinator. The same feature has
-now been ported to Postgres-XL. This improves
-performance for extremely short queries which are now directly planned and
-executed on the datanodes.
-
-
-
-
- Print EXPLAIN plans, as created by the datanodes, for queries that are fully
-shipped to the datanodes.
-
-
-
-
- Bump up default value for sequence_range to 1000.
-
-
- The earlier default for this GUC was 1. But performance of INSERT is observed to be extremely poor
-when sequences are incremented by 1. So the default value of this GUC is now bumped up to
-1000. This can create holes in the sequence assignment.
-Applications that do not want holes in sequence values should set this GUC to
-1.
-
-
-
-
- Force commit ordering at the GTM for transactions that have followed a
- specific logical ordering at the datanode/coordinators.
-
-
-
-
- Add a developer GUC "enable_datanode_row_triggers" to allow ROW TRIGGERS to be
- executed on the datanodes.
-
-
- This is a developer GUC and it must be used with caution since the feature
-is not fully supported yet. When this GUC is turned on,
-ROW TRIGGERS can be defined on tables. Such triggers will only be executed on
-the datanodes and they must be written in a way such that they don't need
-access to cluster-wide data.
This feature is not well tested and users are
-advised to do thorough testing before using this feature.
-
-
-
-
- Add a developer-GUC 'global_snapshot_source' to allow users to
- override the way snapshots are computed.
-
-
- This is a developer GUC and it must be used with caution since its usage can
-lead to inconsistent or wrong query results. In
-Postgres-XL, snapshots are normally computed at the
-GTM so that a globally consistent view of the cluster can be obtained. But
-sometimes applications may want to read using a slightly stale snapshot that is
-computed on the coordinator, so that an extra round trip to the GTM is avoided.
-While such snapshots can improve performance, they may give a wrong result,
-especially when more than one coordinator is running.
-
-
-
-
- Allow on-demand assignment of XIDs.
-
-
-
-
- Add a Cluster Monitor process which periodically reports local state to the
- GTM for computation of a consistent global state.
-
-
-
-
- Compute RecentGlobalXmin on each node separately and send it to the GTM
- periodically. GTM then computes a cluster-wide RecentGlobalXmin and passes it
- back to the nodes.
-
-
-
-
- Support recursive queries for replicated tables.
-
-
-
-
- Wait for the socket to become ready to receive more data before attempting to
- write again.
-
-
-
-
- Cancel queries on remote connections upon transaction abort.
-
-
-
-
- Add two new GUCs, log_gtm_stats and
-log_remotesubplan_stats to
- collect runtime information about GTM communication stats and remote subplan
- stats.
-
-
-
-
- Use poll() instead of select() to check if one or more file descriptors are
- ready for read/write.
-
-
-
-
- Fix aggregate handling for BIGINT/INT8 datatype for platforms with
- support for 128-bit integers.
-
-
-
-
- Add support to receive error HINTs from remote nodes and send them back to the
- client along with the error message.
-
-
-
-
- Add necessary machinery to support TABLESAMPLE clause.
-
-
-
-
- Add 'C' and 'R' to log_line_prefix.
-
-
- This helps us print remote coordinator ID and PID of the remote coordinator
- process. This is useful for debugging
-
-
-
-
- Add support for materialized views on the coordinator.
-
-
-
-
- Add "force" option to pgxc_ctl init command to forcefully remove datanode,
- coordinator or gtm directory.
-
-
-
-
- Add a new "minimal" option to "prepare" command of pgxc_ctl.
-
-
-
-
- Do not read prototype config file when dealing with user specified conf
- file.
-
-
-
-
- Set the size of pending connections on a pooler socket to some respectable
- high limit.
-
-
-
-
- Allow DMLs inside a plpgsql procedure on the coordinator.
-
-
-
-
- Add support for GTM to backup data at BARRIER command.
-
-
-
-
- Disable internal subtransactions.
-
-
-
-
- Add support for pg_stat_statements.
-
-
-
-
- Add xc_maintenance_mode GUC which is useful for resolving in-doubt
- prepared transactions.
-
-
-
-
- Add support for gtmSlaveName in pgxc_ctl.conf.
-
-
-
-
- Add "help" command to pgxc_ctl.
-
-
-
-
- Improve "pgxc_ctl configure" command so that datanodes are also properly
- configured.
-
-
-
-
- Add ability to specify extra server configuration and pg_hba configuration while adding a new
- datanode master or slave.
-
-
-
-
- Add support to save history of pgxc_ctl commands.
-
-
-
-
- Add ability to specify datanode slave ports and datanode slave pooler ports
- separately.
-
-
-
+
+ Performance
+
+
+
+ WAL log only the actual GID used to prepare a 2PC transaction, not the
+ maximum bytes reserved for GID.
+
+
+ This change considerably reduces the WAL space required while preparing a
+ transaction and shows significant performance improvement.
+
+
+
+
+ Significantly improve performance for queries that can be fully executed on a
+ single node by shipping them to the node.
+
+
+ Postgres-XC has Fast Query Shipping (FQS) feature to fully ship queries that can be safely executed on the remote
+ datanodes, without any finalisation at the coordinator.
The same feature has
+ now been ported to Postgres-XL. This improves
+ performance for extremely short queries which are now directly planned and
+ executed on the datanodes.
+
+
+
+
+ Bump up default value for sequence_range to 1000.
+
+
+ The earlier default for this GUC was 1. But performance of INSERT is observed to be extremely poor
+ when sequences are incremented by 1. So the default value of this GUC is now bumped up to
+ 1000. This can create holes in the sequence assignment.
+ Applications that do not want holes in sequence values should set this GUC to
+ 1.
+
+
+
+
+ Add a developer-GUC 'global_snapshot_source' to allow users to
+ override the way snapshots are computed.
+
+
+ This is a developer GUC and it must be used with caution since its usage can
+ lead to inconsistent or wrong query results. In
+ Postgres-XL, snapshots are normally computed at the
+ GTM so that a globally consistent view of the cluster can be obtained. But
+ sometimes applications may want to read using a slightly stale snapshot that is
+ computed on the coordinator, so that an extra round trip to the GTM is avoided.
+ While such snapshots can improve performance, they may give a wrong result,
+ especially when more than one coordinator is running.
+
+
+
+
+ Compute RecentGlobalXmin on each node separately and send it to the GTM
+ periodically. GTM then computes a cluster-wide RecentGlobalXmin and passes it
+ back to the nodes.
+
+
+
+
+ Wait for the socket to become ready to receive more data before attempting to
+ write again.
+
+
+ When a client is pumping data at a much higher speed than what the network or
+ the remote nodes can handle, the coordinator used to keep buffering all the
+ incoming data, eventually running out of memory. We now deal with this problem
+ in a much better way.
+
+
+
+
+ Use poll() instead of select() to check if one or more file descriptors are
+ ready for read/write.
+
+
+ The select() system call is not well equipped to handle a large number of file
+ descriptors. In fact, it has an upper limit of 1024, which
+ is not enough in a large distributed system such as
+ Postgres-XL. So we now use poll() for checking which
+ sockets are ready for send/recv.
+
+
+
+
+ Fix aggregate handling for BIGINT/INT8 datatype for platforms with
+ support for 128-bit integers.
+
+
+
+
+ Significant reduction in XID consumption.
+
+
+ In the older release of Postgres-XL, every
+transaction would consume an XID, irrespective of whether it did any write activity to
+the database. PostgreSQL fixed this problem a few
+years back by using Virtual XIDs. This release of
+Postgres-XL solves this problem to a great extent by
+completely avoiding XID assignment for SELECT queries and
+only assigning them when they are really required.
+
+
+
+
+
+ Additional Features
+
+
+
+ Support Greenplum syntax for specifying distribution strategy for a table.
+
+
+ Postgres-XL now supports additional syntax for specifying distribution
+ strategy. This syntax is compatible with Greenplum. See
+ for more details.
+
+
+
+
+ Support Redshift syntax for specifying distribution strategy for a table.
+
+
+ Postgres-XL now supports additional syntax for specifying distribution
+ strategy. This syntax is compatible with Redshift. See
+ for more details.
+
+
+
+
+ Add xc_maintenance_mode GUC which is useful for resolving in-doubt
+ prepared transactions.
+
+
+
+
+ Add support for pg_stat_statements.
+
+
+
+
+ Allow DMLs inside a plpgsql procedure on the coordinator.
+
+
+
+
+ Add necessary machinery to support TABLESAMPLE clause.
+
+
+
+
+ Add support for materialized views on the coordinator.
+
+
+
+
+ Add 'C' and 'R' to log_line_prefix.
+
+
+ This helps us print remote coordinator ID and PID of the remote coordinator
+ process, which is useful for debugging.
+
+
+
+
+ Add support to receive error HINTs from remote nodes and send them back to the
+ client along with the error message.
+
+
+
+
+ Add two new GUCs, log_gtm_stats and
+ log_remotesubplan_stats to
+ collect runtime information about GTM communication stats and remote subplan
+ stats.
+
+
+
+
+ Support recursive queries on replicated tables.
+
+
+
+
+ Add a developer GUC "enable_datanode_row_triggers" to allow ROW TRIGGERS to be
+ executed on the datanodes.
+
+
+ This is a developer GUC and it must be used with caution since the feature
+ is not fully supported yet. When this GUC is turned on,
+ ROW TRIGGERS can be defined on tables. Such triggers will only be executed on
+ the datanodes and they must be written in a way such that they don't need
+ access to cluster-wide data. This feature is not well tested and users are
+ advised to do thorough testing before using this feature.
+
+
+
+
+
+ Improvements to pgxc_ctl
+
+
+
+ Add support for gtmSlaveName in pgxc_ctl.conf.
+
+
+
+
+ Add "help" command to pgxc_ctl.
+
+
+
+
+ Improve "pgxc_ctl configure" command so that datanodes are also properly
+ configured.
+
+
+
+
+ Add ability to specify extra server configuration and pg_hba configuration while adding a new
+ datanode master or slave.
+
+
+
+
+ Add support to save history of pgxc_ctl commands.
+
+
+
+
+ Add ability to specify datanode slave ports and datanode slave pooler ports
+ separately.
+
+
+
+
+ Add ability to specify separate XLOG directory while setting up a datanode
+ or a datanode slave using pgxc_ctl.
+
+
+
+
+ Add a new "minimal" option to "prepare" command of pgxc_ctl.
+
+
+ This new option can be used to create a sample pgxc_ctl.conf file to setup
+ a Postgres-XL cluster on the local machine, using
+ non-conflicting data directories and ports. This is very useful for quick
+ testing.
+
+
+
+
+ Improve pgxc_ctl so that it checks if a directory is empty before it can be
+ used as data directory for a new datanode/coordinator.
+
+
+
+
+ Add "force" option to pgxc_ctl init command to forcefully remove datanode,
+ coordinator or gtm directory.
+
+
+
+
+
+ Misc Improvements
+
+
+
+ Analyze all regression failures and make necessary changes to the
+ expected output or the test cases.
+
+
+
+
+ Persistent connections are not supported between datanodes.
+
+
+ Configuration parameter persistent_datanode_connections
+ is ignored on the datanodes. So connections between datanodes are returned back
+ to the connection pool at the end of the transaction. A WARNING will be shown
+ when this parameter is set on the datanode side and the value will be ignored.
+
+
+
+
+ Change GID format to include all participant nodes.
+
+
+ Every implicit 2PC GID now includes node_id of every
+ participating node in the 2PC transaction. This refers to the
+ element of pgxc_node
+ .node_id value.
+
+
+
+
+
+ Replicated tables are now highly-available for read-access.
+
+
+ Every node now maintains a healthmap about all other nodes in the cluster. If
+ a node goes down or is unreachable, the healthmap is updated. Queries that read
+ from replicated tables are then sent to a healthy node. Unhealthy nodes are
+ periodically pinged and their status is updated when they come back online.
+
+
+
+
+ "make check" now automatically sets up a 2-coordinator, 2-datanode cluster
+ and runs parallel regression schedule.
+
+
+
+
+ Print EXPLAIN plans, as created by the datanodes, for queries that are fully
+ shipped to the datanodes.
+
+
+
+
+ Force commit ordering at the GTM for transactions that have followed a
+ specific logical ordering at the datanode/coordinators.
+
+
+
+
+ Add a Cluster Monitor process which periodically reports local state to the
+ GTM for computation of a consistent global state.
+
+
+
+
+ Cancel queries on remote connections upon transaction abort.
+
+
+ When a transaction aborts or when the user cancels a query, we now correctly
+ send down the query cancellation to all the remote nodes and cancel the query
+ on every node in the cluster.
+
+
+
+
+ Set the size of pending connections on a pooler socket to some respectable
+ high limit.
+
+
+
+
+ Add support for GTM to backup data at BARRIER command.
+
+
+
+
+ Disable internal subtransactions.
+
+
+
+
 Important Bug Fixes