A Comprehensive Guide To Oracle Partitioning With Samples
The research presented focuses on the partitioning strategy options available in each
current Oracle version, in particular Oracle 11g. The study concentrates on implementing
the new partitioning options available in 11g while enhancing performance tuning for
the physical model and the applications involved. Therefore, the basic strategies
(range, hash, and list), the composite partitioning strategies including all valid
combinations of basic strategies, and Partition Extensions such as Reference and Interval
partitioning strategies are covered. The research covers both middle-to-large-size and
VLDB databases, with significant implications for consolidation, systems integration, high
availability, and virtualization support. Topics covered emphasize subsequent
performance tuning and specific application usage such as IOT partitioning and
composite partitioning choices.
1. Introduction
Degraded performance on a table whose segments have exceeded a good number of
gigabytes, or even terabytes, is probably the main consideration for partitioning
that table. Indeed, such a table will require special segment access control, and thus,
special index architecture and administration. Ultimately, the usage of multiple block size
tablespaces and buffer caches becomes a database technology that greatly enhances a
partitioning approach.
In principle, tables (including materialized views), indexes, and Index-Organized Tables
(IOTs) are the objects that can be the target of any valid partitioning strategy.
Partitioning strategies fall primarily into Basic, Composite, and Partition Extension categories.
2. Partitioning Strategies
Basic Partitioning
This strategy involves one of the following options: partition by Range (establishes
ranges within the domain used as partitioning key), List (provides a list of values
matching one partition in the partitioning key domain and a default partition for those not
matched), or Hash (which transforms the partitioning key value and maps it to a given
partition).
Composite Partitioning
The following combinations of basic partitioning result in valid composite partitioning
strategies, namely:
Range-Range
Range-List
Range-Hash
List-List
List-Range
List-Hash
Interval Partitioning
The Interval Partitioning strategy is fundamentally a special implementation of Range
partitioning that maps a partitioning key, primarily of DATE or TIMESTAMP data type, to a
fixed interval specified with the INTERVAL keyword, which is used as a partition range
marker. The functions NUMTOYMINTERVAL and NUMTODSINTERVAL are commonly used
to express the interval. Interval partitioning can occur as a single-level strategy or as the
top level of a composite option in combination with all other options, namely, Range,
Hash, and List.
Reference Partitioning
This strategy normally uses the referential integrity constraint between two tables: the
key in the detail table is used to attain partitioning on the referenced key, which points to
a candidate primary key in another partitioned table, the master table. The referential
integrity constraint must be enabled and enforced.
Virtual Column-Based Partitioning
This option permits partitioning a table on a virtual column, which is usually the
outcome of a mathematical operation on two or more actual columns of the same table.
This option extends every basic partitioning strategy.
3. Object Partitioning
3.1 Tables
Tables support every Basic and Composite strategy and all Partition Extensions,
constrained only by the applicable SQL DDL rules.
3.2 Indexes
Indexes can have local or global partitions. Locally partitioned indexes can be prefixed
(if they are partitioned on a left prefix of the index key) or nonprefixed otherwise. A local
index has a one-to-one correspondence with the partitions of the underlying table, and
can reside in the same or in a different tablespace, which could even be of a different
block size, a strategy that often enhances database performance. Index partitioning
supports the Basic partitioning strategies in general.
3.3 Index-Organized Tables
Like tables, Index-Organized Tables support all Basic partitioning strategies, and in
general the partitioning key must be a subset of the primary key.
4. Creating Partitions
It is exceptionally useful to gain experience creating partitions under every possible
strategy, namely:
This sample code creates a table with four partitions and enables row movement:
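(A minimal sketch; the table name, columns, and quarterly boundaries are illustrative assumptions, not the original sample.)

CREATE TABLE auto_rentals (
  rental_id    NUMBER,
  rental_date  DATE,
  amount_paid  NUMBER(10,2)
)
PARTITION BY RANGE (rental_date) (
  PARTITION q1_2009 VALUES LESS THAN (TO_DATE('01-04-2009','DD-MM-YYYY')),
  PARTITION q2_2009 VALUES LESS THAN (TO_DATE('01-07-2009','DD-MM-YYYY')),
  PARTITION q3_2009 VALUES LESS THAN (TO_DATE('01-10-2009','DD-MM-YYYY')),
  PARTITION q4_2009 VALUES LESS THAN (TO_DATE('01-01-2010','DD-MM-YYYY'))
)
ENABLE ROW MOVEMENT;  -- lets rows migrate when an update changes their partition key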
The INTERVAL clause of the CREATE TABLE statement sets interval partitioning
for the table. At least one range partition must be specified using the PARTITION clause.
The range partitioning key value determines the high value of the range partitions
(transition point) and the database automatically creates interval partitions for data
beyond that transition point.
For each interval partition, the lower boundary is the non-inclusive upper
boundary of the previous range or interval partition.
The partitioning key can only be a single column name from the table and it must
be of NUMBER or DATE type.
The optional STORE IN clause lets you specify one or more tablespaces.
The following sample code sets four partitions with varying widths. It also specifies that
above the transition point of January 1, 2009, partitions are created with a width of one
month.
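(A minimal sketch under those assumptions; the table and column names are hypothetical, while the partition names match the discussion below.)

CREATE TABLE rental_history (
  rental_id    NUMBER,
  rental_date  DATE,
  amount_paid  NUMBER(10,2)
)
PARTITION BY RANGE (rental_date)
INTERVAL (NUMTOYMINTERVAL(1,'MONTH'))   -- monthly partitions beyond the transition point
STORE IN (ts1, ts2)                     -- optional; hypothetical tablespaces
(
  PARTITION pca VALUES LESS THAN (TO_DATE('01-01-2007','DD-MM-YYYY')),  -- varying widths
  PARTITION pcb VALUES LESS THAN (TO_DATE('01-01-2008','DD-MM-YYYY')),
  PARTITION pcc VALUES LESS THAN (TO_DATE('01-07-2008','DD-MM-YYYY')),
  PARTITION pcd VALUES LESS THAN (TO_DATE('01-01-2009','DD-MM-YYYY'))   -- transition point
);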
The high bound of partition pcd establishes the transition point. pcd and all partitions
below it (pca, pcb, and pcc) are in the range section, while all partitions above it
fall into the interval section.
Hash-partitioned tables map the insertion location for any row via a hashing algorithm
that determines the appropriate partition, and thus tablespace, for each partitioning key
value. The following example illustrates a typical case:
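(A minimal sketch; table, columns, and tablespace names are hypothetical.)

CREATE TABLE customers (
  cust_id    NUMBER,
  cust_name  VARCHAR2(60)
)
PARTITION BY HASH (cust_id)
PARTITIONS 4
STORE IN (ts1, ts2, ts3, ts4);   -- optional; round-robins partitions across tablespaces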
The PARTITION BY HASH clause of the CREATE TABLE statement identifies that the table
is to be hash-partitioned. The PARTITIONS clause can then be used to specify the
number of partitions to create, and optionally, the tablespaces to store them in.
Otherwise, PARTITION clauses can be used to name the individual partitions and their
tablespaces. The only attribute that needs to be specified for hash partitions is
TABLESPACE. All of the hash partitions of a table must share the same segment
attributes (except TABLESPACE), which are inherited from the table level.
A PARTITION BY LIST clause is used in the CREATE TABLE statement to create a table
partitioned by list, by specifying lists of literal values, that is, the discrete values of the
single-column partitioning key that qualify rows for each partition. In fact, there is no
sense of order among list partitions.
The DEFAULT keyword is used to describe the value list for a partition that will
accommodate rows that do not map into any of the other partitions.
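(A minimal sketch; the table and the territory codes are hypothetical.)

CREATE TABLE car_rentals (
  rental_id  NUMBER,
  territory  VARCHAR2(2)
)
PARTITION BY LIST (territory) (
  PARTITION p_east  VALUES ('NY','CT','MA'),
  PARTITION p_west  VALUES ('CA','OR','WA'),
  PARTITION p_other VALUES (DEFAULT)   -- catches rows matching no other value list
);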
Optional subclauses of a PARTITION clause can specify physical and other attributes
specific to a partition segment. If not overridden at the partition level, partitions inherit
the attributes of their parent table.
The PARTITION BY REFERENCE clause is used with the CREATE TABLE statement,
specifying the name of a referential constraint, which becomes the partitioning
referential constraint used as the basis for reference partitioning in the table. The
referential integrity constraint must be enabled and enforced.
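(A minimal sketch of a master/detail pair; all names are hypothetical.)

CREATE TABLE rentals (
  rental_id   NUMBER PRIMARY KEY,
  rental_date DATE
)
PARTITION BY RANGE (rental_date) (
  PARTITION p2008 VALUES LESS THAN (TO_DATE('01-01-2009','DD-MM-YYYY')),
  PARTITION p2009 VALUES LESS THAN (TO_DATE('01-01-2010','DD-MM-YYYY'))
);

CREATE TABLE rental_items (
  item_id   NUMBER,
  rental_id NUMBER NOT NULL,           -- must be NOT NULL for reference partitioning
  CONSTRAINT fk_rental_items
    FOREIGN KEY (rental_id) REFERENCES rentals (rental_id)
)
PARTITION BY REFERENCE (fk_rental_items);  -- child inherits the parent's partitioning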
It is possible to set object-level default attributes, and optionally specify partition
descriptors that override the object-level defaults on a per-partition basis.
When providing partition descriptors, the number of partitions described should match
the number of partitions or subpartitions in the referenced table, i.e., the table will have
one partition for each subpartition of its parent when the parent table is composite;
otherwise the table will have one partition for each partition of its parent.
The partitions of a reference-partitioned table can be named, inheriting their name from
the respective partition in the parent table, unless this inherited name conflicts with one
of the explicit names given. In this scenario, the partition will have a system-generated
name.
Similarly, the database also ensures that the index is maintained automatically when
maintenance operations are performed on the underlying table. This sample code creates
a local index on the SCHOOL_DIRECTORY table:
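(A minimal sketch, assuming SCHOOL_DIRECTORY is hash partitioned on a hypothetical school_id column.)

CREATE INDEX school_directory_ix
  ON school_directory (school_id)
  LOCAL;   -- one index partition per table partition; names and tablespaces inherited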
Naturally, it is possible to optionally name the hash partitions and tablespaces into which
the local index partitions are to be stored, otherwise, the database uses the name of the
corresponding base partition as the index partition name, and stores the index partition
in the same tablespace as the table partition.
The following sample code illustrates the creation of a global hash-partitioned index.
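(A minimal sketch; the table and column names are hypothetical.)

CREATE INDEX cust_history_gix
  ON rental_history (cust_id)
  GLOBAL PARTITION BY HASH (cust_id)
  PARTITIONS 4;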
Hash-partitioned global indexes can also limit the impact of index skew on monotonically
increasing column values. Queries involving equality and IN predicates on the index
partitioning key can efficiently use hash-partitioned global indexes.
Range-Hash partitioned tables are probably the most common type among the
composite partitioning strategies. In general, to create a composite partitioned table, use
the PARTITION BY [RANGE | LIST] clause of a CREATE TABLE statement. Next, you
specify a SUBPARTITION BY [RANGE | LIST | HASH] clause that follows similar syntax and
rules as the PARTITION BY clause. Individual PARTITION and SUBPARTITION clauses, and
optionally a SUBPARTITION TEMPLATE clause, then describe the partitions and
subpartitions. In fact, it is important to consider the following issues, namely:
Range-List partitioned tables are subject to range rules at the first partitioning level and
to list rules at the second, subpartitioning level, accordingly.
This example shows the CAR_RENTALS table that is list partitioned by territory and
subpartitioned using hash by customer identifier.
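(A minimal sketch; column names, territory codes, and the subpartition count are assumptions.)

CREATE TABLE car_rentals (
  rental_id  NUMBER,
  cust_id    NUMBER,
  territory  VARCHAR2(2)
)
PARTITION BY LIST (territory)
SUBPARTITION BY HASH (cust_id) SUBPARTITIONS 4 (
  PARTITION p_east VALUES ('NY','CT','MA'),
  PARTITION p_west VALUES ('CA','OR','WA')
);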
The following sample code shows a car_rentals table that is list partitioned by territory
and subpartitioned by range using the rental amount paid. Note that row movement is
enabled.
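(A minimal sketch; the amount boundaries and territory codes are assumptions.)

CREATE TABLE car_rentals (
  rental_id   NUMBER,
  territory   VARCHAR2(2),
  amount_paid NUMBER(10,2)
)
PARTITION BY LIST (territory)
SUBPARTITION BY RANGE (amount_paid) (
  PARTITION p_east VALUES ('NY','CT','MA') (
    SUBPARTITION p_east_low  VALUES LESS THAN (500),
    SUBPARTITION p_east_high VALUES LESS THAN (MAXVALUE)
  ),
  PARTITION p_west VALUES ('CA','OR','WA') (
    SUBPARTITION p_west_low  VALUES LESS THAN (500),
    SUBPARTITION p_west_high VALUES LESS THAN (MAXVALUE)
  )
)
ENABLE ROW MOVEMENT;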
The following sample code illustrates how to use a subpartition template to create a
composite Range-Hash Partition Table.
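(A minimal sketch; all names, including the tablespaces in the template, are hypothetical.)

CREATE TABLE rentals_history (
  rental_id   NUMBER,
  cust_id     NUMBER,
  rental_date DATE
)
PARTITION BY RANGE (rental_date)
SUBPARTITION BY HASH (cust_id)
SUBPARTITION TEMPLATE (              -- applied to every partition
  SUBPARTITION sp1 TABLESPACE ts1,
  SUBPARTITION sp2 TABLESPACE ts2,
  SUBPARTITION sp3 TABLESPACE ts3,
  SUBPARTITION sp4 TABLESPACE ts4
) (
  PARTITION p2008 VALUES LESS THAN (TO_DATE('01-01-2009','DD-MM-YYYY')),
  PARTITION p2009 VALUES LESS THAN (TO_DATE('01-01-2010','DD-MM-YYYY'))
);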
The following sample code illustrates the use of a multicolumn partitioning approach for
the table BI_AUTO_RENTALS_SUMMARY. This example shows a multicolumn range-
partitioned table that stores the actual DATE information in three separate columns
(year, month, and day) with quarterly partition granularity. The purpose of this type of
partitioning is to avoid dispersion of data within the range and to balance the physical
usage of each partition's entries for performance tuning optimization purposes.
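(A minimal sketch; the column names and the 2009 boundaries are assumptions.)

CREATE TABLE bi_auto_rentals_summary (
  rental_year  NUMBER(4),
  rental_month NUMBER(2),
  rental_day   NUMBER(2),
  total_amount NUMBER(12,2)
)
PARTITION BY RANGE (rental_year, rental_month, rental_day) (
  PARTITION q1_2009 VALUES LESS THAN (2009, 4, 1),   -- quarterly granularity
  PARTITION q2_2009 VALUES LESS THAN (2009, 7, 1),
  PARTITION q3_2009 VALUES LESS THAN (2009, 10, 1),
  PARTITION q4_2009 VALUES LESS THAN (2010, 1, 1)
);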
In the context of partitioning, a virtual column can be used as any regular column.
All partition methods are supported when using virtual columns, including interval
partitioning and all different combinations of composite partitioning.
There is no support for calls to a PL/SQL function on the virtual column used as the
partitioning column.
The next sample code shows the DIRECT_MARKETING table partitioned by range-range
using a virtual column for the subpartitioning key. The virtual column calculates the
difference between the historic average sales and the forecasted potential sales. As a
rule, at least one partition must be specified.
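(A minimal sketch; the virtual column name sales_gap, the range boundaries, and the remaining columns are assumptions.)

CREATE TABLE direct_marketing (
  campaign_id    NUMBER,
  campaign_date  DATE,
  hist_avg_sales NUMBER(12,2),
  sales_forecast NUMBER(12,2),
  sales_gap AS (sales_forecast - hist_avg_sales)   -- virtual column
)
PARTITION BY RANGE (campaign_date)
SUBPARTITION BY RANGE (sales_gap)
SUBPARTITION TEMPLATE (
  SUBPARTITION sp_shrinking VALUES LESS THAN (0),
  SUBPARTITION sp_growing   VALUES LESS THAN (MAXVALUE)
) (
  PARTITION p2009 VALUES LESS THAN (TO_DATE('01-01-2010','DD-MM-YYYY'))
);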
Note that hist_avg_sales and sales_forecast are two real columns of NUMBER type in the
DIRECT_MARKETING table, which are used in the definition of the virtual column.
The following sample code creates a list-partitioned table with both compressed and
uncompressed partitions. The compression attribute for the table and all other partitions
is inherited from the tablespace level.
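(A minimal sketch; table name and value lists are hypothetical.)

CREATE TABLE rentals_archive (
  rental_id NUMBER,
  territory VARCHAR2(2)
)
PARTITION BY LIST (territory) (
  PARTITION p_east  VALUES ('NY','CT','MA') COMPRESS,     -- explicitly compressed
  PARTITION p_west  VALUES ('CA','OR','WA') NOCOMPRESS,   -- explicitly uncompressed
  PARTITION p_other VALUES (DEFAULT)                      -- inherits table/tablespace setting
);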
This sample code creates a local partitioned index with all partitions except the most
recent one compressed:
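(A minimal sketch, assuming a range-partitioned rentals_history table with yearly partitions p2007 through p2009.)

CREATE INDEX rentals_hist_ix
  ON rentals_history (rental_date)
  COMPRESS LOCAL (
    PARTITION p2007,               -- inherits COMPRESS from the index level
    PARTITION p2008,
    PARTITION p2009 NOCOMPRESS     -- most recent partition left uncompressed
  );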
Another option for partitioning index-organized tables is to use the hash method. In the
following example, the FUTURE_MKTG_CAMPAINGS index-organized table is partitioned
by the hash method.
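(A minimal sketch; the columns and partition count are assumptions, and the table name is kept as it appears in the text.)

CREATE TABLE future_mktg_campaings (
  campaign_id   NUMBER,
  campaign_name VARCHAR2(60),
  CONSTRAINT future_mktg_pk PRIMARY KEY (campaign_id)
)
ORGANIZATION INDEX
PARTITION BY HASH (campaign_id)   -- partitioning key is a subset of the primary key
PARTITIONS 4;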
The other option for partitioning index-organized tables is to use the list method.
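(A minimal sketch, with hypothetical names; again the partitioning key must be a subset of the primary key.)

CREATE TABLE mktg_campaigns_by_region (
  region      VARCHAR2(2),
  campaign_id NUMBER,
  CONSTRAINT mktg_region_pk PRIMARY KEY (region, campaign_id)
)
ORGANIZATION INDEX
PARTITION BY LIST (region) (
  PARTITION p_amer VALUES ('US','CA','MX'),
  PARTITION p_emea VALUES ('UK','DE','FR')
);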
[1] Removing the encryption option will allow for the table to be created.
When using multiple block size databases, i.e., those using more than one tablespace
block size with an associated buffer cache for each, there are many congruent features
that can be taken advantage of for higher performance optimization.
Exhibit 2. Using Multiple Block Sizes to Optimize Index Performance.
Creating indexes in a tablespace with a larger block size will increase performance in
DSS and in most OLTP scenarios.
The following sample code creates the CREDENTIAL_TABLES table in the 8K block size
tablespaces T1, T2, T3, and T4, and its local indexes in the 16K block size tablespaces
T18, T20, T22, and T24, each cached accordingly.
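(A minimal sketch; columns and range boundaries are assumptions. It presumes T1..T4 were created with BLOCKSIZE 8K, T18, T20, T22, and T24 with BLOCKSIZE 16K, and that DB_16K_CACHE_SIZE is set so the 16K buffer cache exists.)

CREATE TABLE credential_tables (
  credential_id NUMBER,
  issued_date   DATE
)
PARTITION BY RANGE (issued_date) (
  PARTITION p1 VALUES LESS THAN (TO_DATE('01-01-2007','DD-MM-YYYY')) TABLESPACE t1,
  PARTITION p2 VALUES LESS THAN (TO_DATE('01-01-2008','DD-MM-YYYY')) TABLESPACE t2,
  PARTITION p3 VALUES LESS THAN (TO_DATE('01-01-2009','DD-MM-YYYY')) TABLESPACE t3,
  PARTITION p4 VALUES LESS THAN (MAXVALUE)                           TABLESPACE t4
);

CREATE INDEX credential_tables_ix
  ON credential_tables (issued_date)
  LOCAL (
    PARTITION p1 TABLESPACE t18,    -- index partitions placed in the 16K tablespaces
    PARTITION p2 TABLESPACE t20,
    PARTITION p3 TABLESPACE t22,
    PARTITION p4 TABLESPACE t24
  );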
The following exhibit illustrates the list of data dictionary views that handle partitioning-
related metadata.
Exhibit 5. Maintenance Operations of Index Partitions.
Exhibit 7. Coalescing Operation Reserved for the Hash Partitioning Strategy.
Exhibit 9. Truncating a Partition and Resetting the High Water Mark by Dropping the
Storage for Exchange Preparedness.
The next exhibit shows the query and result set showing that the merged partition is set
in the same tablespace as one of the merged partitions. In contrast, the previous
example shows that the system-named partition was set in a default tablespace, since
neither the target partition name nor the tablespace name was provided.
7. Manageability
DBAs can use Oracle Enterprise Manager Database and Grid Control to create, maintain,
and verify accuracy of SQL, using the Schema tag and then selecting the desired
partitioning options on the relevant object, namely, tables, indexes, and index-organized
tables. There is extensive support to use standard, unstructured, and user-defined
datatypes.
8. Performance
The following are valid partitioning features that improve availability and manageability
and optimize performance, namely:
8.1 Partition Pruning
Partition pruning greatly optimizes time and resources when retrieving data from
disk, thus improving query performance.
Partition pruning affects the statistics of the objects involved and therefore also
the execution plan of the statement.
Oracle Database prunes partitions when using range, LIKE, equality, and IN-list
predicates on the range or list partitioning columns, and when using equality and IN-list
predicates on the hash partitioning columns (see the sketch after this list).
When using composite partitioned objects, Oracle can prune at both levels using
the relevant predicates.
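(A hedged illustration, reusing the earlier rental_history sketch: a range predicate on the partitioning key lets the optimizer skip every partition outside the requested window.)

SELECT SUM(amount_paid)
FROM   rental_history
WHERE  rental_date >= TO_DATE('01-01-2009','DD-MM-YYYY')
AND    rental_date <  TO_DATE('01-04-2009','DD-MM-YYYY');
-- Only the partitions covering Q1 2009 are scanned; the PSTART/PSTOP columns
-- of the execution plan confirm the pruning.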
8.2 Partition-Wise Joins
The following points summarize the most important aspects of partition-wise joins:
Partition-wise joins minimize query response time by reducing the amount of data
exchanged among parallel execution servers when joins execute in parallel, thus
reducing response time and improving the use of both CPU and memory resources.
Oracle Database can perform partial partition-wise joins only in parallel.
Unlike full partition-wise joins, partial partition-wise joins require partitioning only one
table on the join key.
The partitioned table is referred to as the reference table. The other table may or
may not be partitioned. Partial partition-wise joins are more common than full partition-
wise joins.
To execute a partial partition-wise join, the database dynamically repartitions the
other table based on the partitioning of the reference table. Then, the execution
becomes similar to a full partition-wise join.
In Oracle Real Application Clusters (RAC) environments, partition-wise joins also
avoid or at least limit the data traffic over the interconnect, which is the key to achieving
good scalability for massive join operations.
The performance advantage that partial partition-wise joins have over joins in
non-partitioned tables is that the reference table is not moved during the join operation.
The parallel joins between non-partitioned tables require both input tables to be
redistributed on the join key. This redistribution operation involves exchanging rows
between parallel execution servers.
8.3 Full Partition-Wise Joins
A full partition-wise join divides a large join into smaller joins between pairs of
partitions from the two joined tables. To use this feature, you must equipartition both
tables on their join keys, or use reference partitioning. For example, consider a large join
between the DIRECT_MARKETING table and the CUSTOMERS table on cust_id (a sketch
appears after the recommendations below).
To avoid remote I/O, both matching partitions should have affinity to the same
node.
Partition pairs should be spread over all nodes to use all CPU resources available
and avoid bottlenecks.
Nodes can host multiple pairs when there are more pairs than nodes, e.g., for an
8-node system and 16 partition pairs, each node receives two pairs.
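(A hedged sketch, not the paper's original example: both tables are hash partitioned on cust_id with the same partition count, so the join decomposes into independent joins between matching partition pairs.)

CREATE TABLE customers_pw (
  cust_id   NUMBER,
  cust_name VARCHAR2(60)
) PARTITION BY HASH (cust_id) PARTITIONS 16;

CREATE TABLE direct_marketing_pw (
  cust_id        NUMBER,
  sales_forecast NUMBER(12,2)
) PARTITION BY HASH (cust_id) PARTITIONS 16;

-- Each parallel execution server joins one pair of matching partitions.
SELECT /*+ PARALLEL(c) PARALLEL(d) */
       c.cust_name, SUM(d.sales_forecast)
FROM   customers_pw c JOIN direct_marketing_pw d ON c.cust_id = d.cust_id
GROUP  BY c.cust_name;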
8.3.1 Full Partition-Wise Joins: Composite - Single-Level
This method is a variation of the single-level - single-level method. In this scenario, one
table (typically the larger table) is composite partitioned on two dimensions, using the
join columns as the subpartition key.
8.4 Partitioned Indexes
The rules for partitioning indexes are similar to those for tables. An index cannot be
partitioned, however, if it is a cluster index or if it is defined on a clustered table.
8.4.1 Local Partitioned Indexes
In a local index, all keys in a particular index partition refer only to rows stored in a
single underlying table partition. A local index is created by specifying the LOCAL
attribute. Other important aspects of local partitioned indexes are as follows:
Oracle constructs the local index so that it is equipartitioned with the underlying
table.
Oracle also maintains the index partitioning automatically when partitions in the
underlying table are added, dropped, merged, or split, or when hash partitions or
subpartitions are added or coalesced, ensuring that the index remains equipartitioned
with the table.
A local index can be created UNIQUE if the partitioning columns form a subset of
the index columns. This restriction guarantees that rows with identical index keys always
map into the same partition, where uniqueness violations can be detected.
Only one index partition needs to be rebuilt when a maintenance operation other
than SPLIT PARTITION or ADD PARTITION is performed on an underlying table partition.
The duration of a partition maintenance operation is proportional to partition size.
Local indexes support partition independence.
Local indexes support smooth roll-out of old data and roll-in of new data in
historical tables.
Oracle can take advantage of the fact that a local index is equipartitioned with the
underlying table to generate improved query access plans.
Local indexes simplify the task of tablespace incomplete recovery. In order to
recover a partition or subpartition of a table to a point in time, the corresponding index
entries must be recovered to the same point in time.
See Oracle Database PL/SQL Packages and Types Reference for a description of the
DBMS_PCLXUTIL package.
8.4.1.1 Local Prefixed Indexes
A local index is prefixed if it is partitioned on a left prefix of the index columns.
8.4.1.2 Local Nonprefixed Indexes
A local index is nonprefixed if it is not partitioned on a left prefix of the index columns.
Therefore, it is not possible to have a unique local nonprefixed index unless the
partitioning key is a subset of the index key.
8.4.2 Global Partitioned Indexes
In a global partitioned index, the keys in a particular index partition may refer to rows
stored in more than one underlying table partition or subpartition. The following features
are worth highlighting:
A global index can be range or hash partitioned, though it can be defined on any
type of partitioned table.
A global index is created by specifying the GLOBAL attribute.
Index partitions can be merged or split as necessary.
Normally, a global index is not equipartitioned with the underlying table, and usually
nothing prevents it from being so; Oracle, however, does not take advantage of such
equipartitioning. An index that must be equipartitioned with the underlying table should
be created as LOCAL.
A global partitioned index contains a single B-tree with entries for all rows in all
partitions. Each index partition may contain keys that refer to many different partitions
or subpartitions in the table.
The highest partition of a global index must have a partition bound all of whose
values are MAXVALUE.
The following are the distinctive features differentiating global indexes with respect to prefixing, namely:
A global partitioned index is prefixed if it is partitioned on a left prefix of the index
columns.
Global prefixed partitioned indexes can be unique or nonunique.
Nonpartitioned indexes are treated as global prefixed nonpartitioned indexes.
The most practical aspects of global partitioned indexes are emphasized in the following
statements:
Global partitioned indexes are harder to manage than local indexes.
When the data in an underlying table partition is moved or removed (SPLIT, MOVE,
DROP, or TRUNCATE), all partitions of a global index are affected. Thus, global indexes
do not support partition independence.
When an underlying table partition or subpartition is recovered to a point in time,
all corresponding entries in a global index must be recovered to the same point in time.
Because these entries may be scattered across all partitions or subpartitions of the
index, mixed in with entries for other partitions or subpartitions that are not being
recovered, there is no way to accomplish this except by re-creating the entire global
index.
8.5 Guidelines for Partitioning Indexes
When deciding how to partition indexes on a table, consider the mix of
applications that need to access the table. There is normally a trade-off between
performance on the one hand and availability and manageability on the other.
The most important guidelines for OLTP and DSS applications, respectively, are listed
below:
8.5.1 For OLTP applications
Global indexes and local prefixed indexes provide improved performance over
local non-prefixed indexes because they minimize the number of index partition probes.
Local indexes support more availability when there are partition or subpartition
maintenance operations on the table.
Local non-prefixed indexes are very useful for historical databases.
8.5.2 For DSS applications
Local non-prefixed indexes can improve performance because many index
partitions can be scanned in parallel by range queries on the index key.
For historical tables, indexes should be local if possible. This limits the impact of
regularly scheduled drop partition operations.
Unique indexes on columns other than the partitioning columns must be global
because unique local non-prefixed indexes whose key does not contain the partitioning
key are not supported.
8.6 Tuning and Mixing objects in Multiple Block Size Database Models
The option to set index in larger block size tablespaces and caches has proven to greatly
improve performance in Decision Support Systems (DSS), as introduced in the authors
paper OMBDB, Oracle Multiple Block Size Databases: An Innovative Paradigm for
Datawarehousing Architectures, as verified by several field researchers and leading
consultants. This option has also performed quite well in many Online Transaction
Processing (OLTP) Systems.
When using partitioning and compression options together, it is possible to use compress
in most cases, and the clause COMPRESS FOR ALL OPERATIONS can be specified at table
creation time. Partitions can be individually set as compressed or not compressed, as
needed.
Likewise, when using table compression on partitioned tables with bitmap indexes, the
DBA needs to do the following before introducing the compression attribute for the first
time:
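(A minimal sketch of the documented sequence, mark the bitmap indexes unusable, introduce the compression attribute, then rebuild; the table, index, and partition names are hypothetical.)

ALTER INDEX rentals_territory_bix UNUSABLE;                      -- 1. mark (or drop) all bitmap indexes
ALTER TABLE rentals_history MODIFY DEFAULT ATTRIBUTES COMPRESS;  -- 2. introduce the compression attribute
ALTER TABLE rentals_history MOVE PARTITION p2008 COMPRESS;       --    compress existing data if desired
ALTER INDEX rentals_territory_bix REBUILD PARTITION p2008;       -- 3. rebuild the affected index partitions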
10.1 When to Use Range or Interval Partitioning
Range partitioning is a convenient method for partitioning historical data. Other
reasons include:
The boundaries of range partitions define the ordering of the partitions in the
tables or indexes.
Interval partitioning is an extension to range partitioning in which, beyond a point
in time, partitions are defined by an interval. Interval partitions are automatically
created when the data is inserted into the partition.
Range or interval partitioning is often used to organize data by time intervals on a
column of type DATE.
For instance, consider keeping the past 48 months' worth of data online; range
partitioning simplifies this process. To add data for a new month, the DBA loads it into a
separate table, cleans it, indexes it, and then adds it to the range-partitioned table using
the EXCHANGE PARTITION statement, all while the original table remains online. After
adding the new partition, the DBA can drop the trailing month with the DROP PARTITION
statement, as sketched below.
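(A hedged sketch of the rolling-window cycle; all names and dates are illustrative.)

ALTER TABLE rentals_history
  ADD PARTITION p_2009_01 VALUES LESS THAN (TO_DATE('01-02-2009','DD-MM-YYYY'));

ALTER TABLE rentals_history
  EXCHANGE PARTITION p_2009_01 WITH TABLE rentals_stage  -- pre-loaded, pre-indexed staging table
  INCLUDING INDEXES WITHOUT VALIDATION;

ALTER TABLE rentals_history
  DROP PARTITION p_2005_01 UPDATE GLOBAL INDEXES;        -- drop the trailing month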
10.2 When to Use Hash Partitioning
There are scenarios where it is not obvious in which partition data should reside,
although the partitioning key can be identified. With hash partitioning, a row is placed
into a partition based on the result of passing the partitioning key through a hashing
algorithm.
The next guidelines should be followed carefully:
When using this approach, data is randomly distributed across the partitions
rather than grouped together.
Hence, this is a great approach for some data, but may not be an effective way to
manage historical data.
Partition pruning is limited to equality predicates.
Hash partitioning also supports partition-wise joins, parallel DML and parallel index
access.
Hash-partitioning is beneficial when the DBA needs to enable partial or full
parallel partition-wise joins with very likely equi-sized partitions or distribute data evenly
among the nodes of an MPP platform using RAC, thus minimizing interconnect traffic
when processing internode parallel statements.
10.3 When to Use List Partitioning
It is recommended to use list partitioning when you want to specifically map rows to
partitions based on discrete values.
It can benefit from parallel backup and recovery of a single table (manageability
perspective).
The DBA can split up backups of your tables and you can decide to store data differently
based on identification by a partitioning key.
10.4 When to Use Composite Partitioning
The subpartitions may have properties that differ from the properties of the table
or from the partition to which the subpartitions belong.
Composite range-hash partitioning is particularly common for tables that store history,
are very large as a result, and are frequently joined with other large tables. The
following are relevant issues, namely:
Composite list-hash partitioning is applicable to large tables that are usually accessed on
one dimension but, because of their size, still need to take advantage of parallel full or
partial partition-wise joins.
Composite list-list partitioning is helpful for large tables that are often accessed on
different dimensions. The DBA can explicitly map rows to partitions on those dimensions
on the basis of discrete values.
Composite list-range partitioning is advantageous for large tables that are accessed on
different dimensions. For the most commonly used dimension, the DBA can explicitly
map rows to partitions on discrete values. In general, list-range partitioning is likely to
be used for tables that use range values within a list partition; in contrast range-list
partitioning is mostly used for discrete list values within a range partition. Besides, list-
range partitioning is less likely to be used to store historical data, although equivalent
scenarios all work. Range-list partitioning can be implemented using interval-list
partitioning, while list-range partitioning does not support interval partitioning.
10.5 When to Use Interval Partitioning
Interval partitioning can be used for every table that is range partitioned and uses fixed
intervals for new partitions. The database automatically creates interval partitions as
data for that partition is loaded. Until this happens, the interval partition exists but no
segment is created for the partition.
The benefit of interval partitioning is that there is no need to create range partitions
explicitly. Therefore, a DBA could consider using interval partitioning unless there is a
need to create range partitions with different intervals, or a need to specify partition
attributes when creating range partitions. When upgrading an application, it is
recommended to use range partitioning or composite range-[range | hash | list]
partitioning, accordingly.
10.6 When to Use Virtual Column-Based Partitioning
Virtual column-based partitioning enables partitioning on an expression, which may use
data from other columns and perform calculations with those columns.
11. Oracle Partitioning for Information Lifecycle Management (ILM)
The Oracle Database Partitioning option provides a uniquely ideal platform for
implementing an ILM solution, offering:
11.2 Fine-grained
View data at a very fine-grained level as well as group related data together, whereas
storage devices only see bytes and blocks.
11.3 Low-Cost
Immutability
Privacy
Auditing
Expiration
12. Oracle Partitioning for Datawarehousing
Datawarehouses often require techniques both for managing large tables and providing
good query optimization.
Oracle Partitioning is beneficial in attaining the following Datawarehousing goals,
namely:
12.1 Scalability
12.2 Performance
Besides, partition pruning greatly reduces the amount of data retrieved from disk and
shortens processing time, thus improving query performance and optimizing resource
utilization.
The optimizer utilizes a wide variety of predicates for pruning. The three predicate types,
equality, range, and IN-list, are the most commonly used cases of partition pruning.
12.3 Manageability
Manageability is greatly improved through the use of Oracle Enterprise Manager
Database and Grid Control and the use of the supplied packages.
The underlying storage for a materialized view is a table structure, and therefore
partitioning materialized views is quite similar to partitioning tables. When the database
rewrites a query to run against a materialized view, the rewritten query can take
advantage of the same performance features as queries running directly against tables.
Similarly, a rewritten query may eliminate materialized view partitions, and it can take
advantage of partition-wise joins when joins back to tables or to other materialized views
are necessary.
The next sample code illustrates how to effectively create a compressed materialized
view partitioned by hash, which uses an aggregation on period_code.
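(A minimal sketch; the materialized view name, the source table, and its columns are assumptions, while period_code is taken from the text.)

CREATE MATERIALIZED VIEW rentals_by_period_mv
  COMPRESS
  PARTITION BY HASH (period_code) PARTITIONS 4
  BUILD IMMEDIATE
  ENABLE QUERY REWRITE
AS
SELECT period_code, SUM(amount_paid) AS total_paid
FROM   rentals_history
GROUP  BY period_code;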
Partitions can be added using Partition Exchange Load (PEL). When using PEL, a
separate table identical in structure to a single partition is created, including the same
indexes and constraints, if any.
Likewise, in order to manually keep materialized views up to date, the init.ora parameter
QUERY_REWRITE_INTEGRITY must be set to either TRUSTED or STALE_TOLERATED.
When using materialized views and base tables with comparable partitioning strategies,
then PEL can be an extremely powerful way to keep materialized views up-to-date
manually. Here is how PEL can work:
Create tables to enable PEL against the tables and materialized views
Load data into the tables, build the indexes, and implement any constraints
Update the base tables using PEL
Update the materialized views using PEL.
Besides, there is a need for a DBA to execute ALTER MATERIALIZED VIEW CONSIDER
FRESH for every materialized view updated using this strategy.
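For example (reusing the materialized view name from the earlier sketch):

ALTER MATERIALIZED VIEW rentals_by_period_mv CONSIDER FRESH;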
Likewise, partitioning also effectively addresses OLTP features and characteristics,
namely:
The following scenarios imply special considerations for partition placement, namely:
Using Bigfile Tablespaces (since one actual big file is used)
Customization (since other Oracle options are used in conjunction with partitioning)
Oracle Exadata (a new VLDB Oracle database infrastructure with superfast
performance capabilities)
The next exhibits illustrate some sample code involving large objects (LOBs) and user-
defined object datatypes, as shown:
Exhibit 31. Partitioning Support for Nested Tables using Object User-Defined Datatypes
When in doubt refer to sample code, forum discussions, and case studies.
This is more important when volumes are based on a SAME (Stripe And Mirror
Everything) approach, i.e., RAID 0+1.
15.3 Constraints
As previously stated, there is no support for LONG and LONG RAW data types on
any Oracle partitioned object or any partitioning strategy discussed.
A VARRAY of XML data types cannot be set in a partitioned table (via an SQL DDL
statement.)
Certain datatypes have size and storage constraints such as LOBs or large
VARCHAR2 definitions.
Upon completion of this research study, the author has arrived at the following
conclusions based on his experience in the field: