Chapt 23

The document discusses very large databases (VLDBs) and how to implement them using Oracle8 database features like object partitioning and large object types. It describes how partitioning allows large database objects like tables and indexes to be split into smaller, more manageable partitions. This reduces database downtime for maintenance and recovery from failures. Partitioning also improves query performance and disk access. The document provides guidelines for determining if a table is a candidate for partitioning and examples of using the CREATE TABLE statement to partition a table by range on a column.

Uploaded by

Arsalan Ahmed
Copyright
© Attribution Non-Commercial (BY-NC)
Very Large Databases

Very large databases are a rapidly growing trend in the enterprise world. From gigabyte databases to terabyte monsters, today's RDBMS must be able to store and manage very large databases. This chapter discusses how to architect and implement these monsters through the use of Oracle8 database Object partitioning and large Object types.

Chapter 23

In This Chapter

- Introduction to very large databases (VLDB)
- Table partitioning
- Index partitioning
- Large Object types (BLOB, CLOB, NCLOB, and BFILE)
- The DBMS_LOB package

Very Large Databases


Information is the interface to the world and beyond. With the explosive use of the World Wide Web and the Internet as the highway for information retrieval, there is no shortage of information in today's society. The data, information, and knowledge lifecycle continues to live and grow within the veins of an enterprise. The database stores the data that forms the foundation of information and knowledge. Over the decades, the monetary value of data has grown exponentially. Data is the fuel in every company, large or small. It is continuously cross-fertilized and reused in its many incarnations. Concepts such as data warehouses and data marts are now a reality. The amount of current and historical data required to support a decision constantly grows. Enterprises store more data than ever before. The increase in the amount of stored data, coupled with the falling cost of disk storage hardware and raw CPU power, has driven the evolution of the very large database (VLDB). In the past, a VLDB was measured in gigabytes, whereas now the term seems more appropriate for a database measured in terabytes and even petabytes.

VLDBs can be categorized into two types:
- Online Transaction Processing (OLTP) databases
- Data warehouses

Online Transaction Processing databases support a large concurrent User population. The database transactions usually follow the simple Create, Read, Update, and Delete (CRUD) matrix. Data warehouses, on the other hand, have an infrastructure that supports both OLTP and Decision Support Systems (DSS). The segregation of data within the data warehouse makes this type of database unique. A data warehouse is architected to store all enterprise data requirements, whether historical or current. The element of time splits the OLTP and DSS portions of the database.

Oracle8 has been substantially improved to support today's VLDBs. The following Oracle8 database areas have been enhanced to store, manage, and support VLDBs:
- Database Object partitioning
- Transaction queuing
- Networking through Net8
- Parallel Server
- Enterprise Manager
- Oracle8 datatypes
- Backup and recovery

This chapter primarily focuses on database Object partitioning and Oracle8 datatypes in the context of VLDBs. You can easily locate the other subject areas in their respective chapters in this book.

Database Object Partitioning


A VLDB often contains a few large database Objects (such as Tables and Indexes), each spanning many gigabytes to a few terabytes. A database is rarely identified as a VLDB based solely upon the number of database Objects it contains. Oracle8 partitioning addresses the implementation of VLDBs by allowing large database Objects to be split into smaller, more manageable units. These smaller units, or partitions, can only be created for Tables and Indexes containing structured data.

Figure 23-1 illustrates the concept of partitioning.

Figure 23-1: The concept of partitioning (a very large table divided into Partitions A through E)

Advantages of using Oracle8 partitions


Using partitions to address VLDB implementation has the following advantages:

Reducing database downtime for maintenance


If you split a database Object into smaller, more manageable partitions, you can perform maintenance on an individual partition rather than on the whole database Object. Also, if you place each partition in its own Tablespace, you can take that Tablespace offline for maintenance while the rest of the database remains operational.

Reducing database downtime due to failure


If the database experiences media (disk) failure, the database must be recovered. Recovery involves restoring the respective database files, using the SQL RECOVER command to recover the lost data, and finally bringing the respective datafiles and Tablespaces online. Most VLDBs are mission critical, and database failures waste valuable data processing time. Partitioning a Table into smaller partitions reduces downtime for VLDBs if the respective partitions are placed on separate Tablespaces. In this way, only the affected Tablespaces or datafiles need to be taken offline, restored from a backup, and recovered. Smaller partitions reduce recovery time: the recovery time is proportional to the size of the affected partition rather than to the size of the entire Table, which could be huge.
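As a rough sketch of the recovery sequence described above, the following statements assume a damaged partition whose datafile belongs to a Tablespace named OM3 (a name borrowed from the examples later in this chapter), a database running in ARCHIVELOG mode, and a DBA session. The restore itself happens at the operating system level and appears only as a comment.

```sql
-- Take only the damaged tablespace offline; partitions stored in
-- other tablespaces remain available to users.
ALTER TABLESPACE OM3 OFFLINE IMMEDIATE;

-- (Restore the datafiles of OM3 from a backup at the OS level here.)

-- Apply redo to bring the restored datafiles up to date.
RECOVER TABLESPACE OM3;

-- Make the recovered partition available again.
ALTER TABLESPACE OM3 ONLINE;
```

Because only one Tablespace is involved, the amount of data to restore and recover is bounded by the partition stored there, not by the whole Table.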

Query performance
Because partitions can be created based on ranges of certain Columns, queries that may require a full Table scan can focus on a particular partition or a range of partitions. This focus drastically reduces the search time necessary to satisfy the data requirements of a SQL query.

Disk access
Partitions allow a flexible physical implementation of a database Object. Partitions can be created on separate Tablespaces, which you can, in turn, place on their own physical disks. Giving each disk a dedicated disk controller permits faster searches on partitions. The database design can specify that frequently accessed database Objects have their own dedicated disks.

Partition transparency
In today's complex application development environment, developers do not want physical Constraints on data access. Partition implementation is transparent to end Users. Neither Users nor developers need to be aware of the physical implementation of the database Objects; that capability exists for query and I/O optimization.

Creating Table partitions


Architecting VLDBs is a complex task. The database architect must be aware of the fundamental concepts of Oracle8 partitioning. You must forecast which Tables are potential candidates for partitioning, because not every Table can be partitioned. The following guidelines provide a validation framework for deciding whether a logical Table is a partition candidate:
- A Table existing within an Oracle8 cluster cannot be partitioned.
- A Table cannot be partitioned if it contains unstructured data. This guideline implies the Table cannot contain the following datatypes:
  - LOBs
  - LONG RAW
  - Object types
- An Index-Organized Table cannot be partitioned.

Note

An Index-Organized Table stores data sorted on the Primary Key.

Use the CREATE TABLE command with the PARTITION BY RANGE clause to create a partitioned Table. This clause specifies the Columns whose Key values range-partition the Table. You can use as many as sixteen Columns to specify the partitioning Key.

For example, the following CREATE TABLE statement creates a STUDENTS Table with five partitions. Each partition's description specifies a partition name and physical attributes for the partition.
CREATE TABLE STUDENTS (
    STUDENT_ID          INTEGER       NOT NULL,
    STUDENT_FIRST_NAME  VARCHAR2(25)  NULL,
    STUDENT_LAST_NAME   VARCHAR2(25)  NULL,
    STUDENT_DEPT_ID     INTEGER       NULL,
    STUDENT_ADDRESS     VARCHAR2(50)  NULL,
    STUDENT_CITY        VARCHAR2(25)  NULL,
    STUDENT_STATE       VARCHAR2(15)  NULL,
    STUDENT_ZIP         VARCHAR2(10)  NULL,
    STUDENT_BL_STATUS   CHAR          NULL,
    CONSTRAINT DEPT_ID_PK PRIMARY KEY (STUDENT_DEPT_ID)
)
PARTITION BY RANGE (STUDENT_DEPT_ID)
( PARTITION DEPT_ID_1 VALUES LESS THAN (100)      TABLESPACE OM1,
  PARTITION DEPT_ID_2 VALUES LESS THAN (250)      TABLESPACE OM2,
  PARTITION DEPT_ID_3 VALUES LESS THAN (500)      TABLESPACE OM3,
  PARTITION DEPT_ID_4 VALUES LESS THAN (750)      TABLESPACE OM4,
  PARTITION DEPT_ID_5 VALUES LESS THAN (MAXVALUE) TABLESPACE OM5);

Figure 23-2 illustrates the partitioned STUDENTS Table.

Figure 23-2: The STUDENTS Table after range-partitioning (partitions DEPT_ID_1 through DEPT_ID_5)

Note

Each partition can have its own storage characteristics, allowing Tables to span multiple Tablespaces.

The STUDENTS Table is partitioned based on the STUDENT_DEPT_ID Column values. Five partitions are created with the following names and range values:
DEPT_ID_1    Range 0 to 99
DEPT_ID_2    Range 100 to 249
DEPT_ID_3    Range 250 to 499
DEPT_ID_4    Range 500 to 749
DEPT_ID_5    Range 750 to MAXVALUE

The MAXVALUE keyword allocates any rows that do not fit into the other partition range criteria to the last partition. The MAXVALUE keyword is usually reserved for the very last partition. The STUDENTS Table can also be partitioned using the Oracle8 Schema Manager. The Oracle8 Schema Manager provides a visual interface to create, maintain, split, and drop the partitions of a Table. Figure 23-3 illustrates the STUDENTS Table partitions in the Oracle8 Schema Manager.
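To make the mapping of rows to ranges concrete, the following sketch inserts three rows and then queries on the partitioning Column. The student IDs and names are made-up sample data, and the statements assume the STUDENTS Table was created exactly as in the CREATE TABLE statement shown earlier.

```sql
-- 99 is below the first bound (100), so this row lands in DEPT_ID_1.
INSERT INTO STUDENTS (STUDENT_ID, STUDENT_FIRST_NAME, STUDENT_DEPT_ID)
VALUES (1, 'Amy', 99);

-- 100 is not less than 100, so this row lands in DEPT_ID_2.
INSERT INTO STUDENTS (STUDENT_ID, STUDENT_FIRST_NAME, STUDENT_DEPT_ID)
VALUES (2, 'Sampson', 100);

-- 9000 exceeds every explicit bound, so the MAXVALUE partition
-- (DEPT_ID_5) receives the row.
INSERT INTO STUDENTS (STUDENT_ID, STUDENT_FIRST_NAME, STUDENT_DEPT_ID)
VALUES (3, 'Einstein', 9000);

-- A predicate on the partitioning column lets Oracle8 restrict the
-- search to the partition covering 100 through 249 (DEPT_ID_2).
SELECT STUDENT_ID, STUDENT_FIRST_NAME
FROM   STUDENTS
WHERE  STUDENT_DEPT_ID >= 100
AND    STUDENT_DEPT_ID < 250;
```

The second insert shows that each bound is exclusive: a Key equal to a partition's bound belongs to the next partition.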

Table partition attributes


Oracle8 implements horizontal partitioning using range-partitioning. Range-partitioning maps rows to partitions based on the range of the partition Column values. Implement range-partitioning by using the following clauses in the CREATE TABLE statement:
PARTITION BY RANGE ( Column_list )
VALUES LESS THAN ( Value_list )

The PARTITION BY RANGE clause


The PARTITION BY RANGE clause defines the Columns used to partition the Table. The Columns in the Column_list are also known as the partitioning Columns. The following statement specifies this clause:
PARTITION BY RANGE (STUDENT_DEPT_ID)

Figure 23-3: Partitions through the Oracle8 Schema Manager

The VALUES LESS THAN clause


The VALUES LESS THAN clause for a Table partition specifies the noninclusive upper bound for the partitioning Columns. The lower bound of a partition is implied by the upper bound of the previous partition, so each partition must specify a bound greater than that of the partition before it. For example, the partition DEPT_ID_2 has an upper bound Key value of 250, which is greater than the upper bound value for the previous partition DEPT_ID_1:
PARTITION DEPT_ID_1 VALUES LESS THAN (100) TABLESPACE OM1,
PARTITION DEPT_ID_2 VALUES LESS THAN (250) TABLESPACE OM2
Note

If the partitioning Column contains null Key values, Oracle8 sorts them greater than all other partition Key values, but less than MAXVALUE. Thus, when partitioning a Table on Columns that contain null values, the last partition's bound should specify the MAXVALUE keyword. Otherwise, Oracle8 will not be able to place the respective null rows in any of the defined partition ranges and an error will occur for that transaction.

Partitioning Tables using multiple Columns


If more than one Column is defined for the partition, Oracle8 treats the partition Key values as vectors and compares them Column by Column, from left to right, to decide in which partition a row will be placed. For example, the following statement defines a multicolumn partition for the STUDENTS Table:
PARTITION BY RANGE (STUDENT_ID, STUDENT_DEPT_ID)
( PARTITION DEPT_ID_1 VALUES LESS THAN (1000, 100)          TABLESPACE OM1,
  PARTITION DEPT_ID_2 VALUES LESS THAN (2000, 250)          TABLESPACE OM2,
  PARTITION DEPT_ID_3 VALUES LESS THAN (3000, 500)          TABLESPACE OM3,
  PARTITION DEPT_ID_4 VALUES LESS THAN (4000, 750)          TABLESPACE OM4,
  PARTITION DEPT_ID_5 VALUES LESS THAN (MAXVALUE, MAXVALUE) TABLESPACE OM5);

If a row is inserted with the Keys (2500, 150), the row is inserted into the DEPT_ID_3 partition. The first Column decides the comparison: 2500 is greater than 2000, so the Key is not less than (2000, 250), and 2500 is less than 3000, so the Key is less than (3000, 500). The second Column is consulted only when the first Column values are equal. Multicolumn partitions are best used when rows need to be distributed evenly across the defined partitions.
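The vector comparison can be checked with a few sample rows. This is only a sketch: it assumes the STUDENTS Table was created with the two-Column partitioning clause shown above, that the Table is otherwise empty, and that the sample Key values do not collide with existing data.

```sql
-- (2500, 150): 2500 is not less than 1000 or 2000, which rules out
-- DEPT_ID_1 and DEPT_ID_2; 2500 < 3000 places the row in DEPT_ID_3.
-- The second value, 150, is never compared.
INSERT INTO STUDENTS (STUDENT_ID, STUDENT_DEPT_ID) VALUES (2500, 150);

-- (2000, 249): the first values tie with the bound (2000, 250), so the
-- second column decides; 249 < 250 places the row in DEPT_ID_2.
INSERT INTO STUDENTS (STUDENT_ID, STUDENT_DEPT_ID) VALUES (2000, 249);

-- (2000, 250): equal to the bound of DEPT_ID_2, hence not less than it;
-- the row moves on to DEPT_ID_3.
INSERT INTO STUDENTS (STUDENT_ID, STUDENT_DEPT_ID) VALUES (2000, 250);

-- Lists the rows stored in DEPT_ID_3: (2500, 150) and (2000, 250).
SELECT STUDENT_ID, STUDENT_DEPT_ID
FROM   STUDENTS PARTITION (DEPT_ID_3);
```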

Partition names
You must identify every partition by a name. Each partition's name must be unique with respect to the other partitions defined for the same parent Table or Index. This unique naming scheme enables you to refer to the partitions directly in data manipulation, Import, Export, and maintenance operations. For example, the following SELECT statement only returns rows that exist in the DEPT_ID_1 partition.
SELECT STUDENT_FIRST_NAME, STUDENT_DEPT_ID
FROM AMY.STUDENTS PARTITION (DEPT_ID_1);

STUDENT_FIRST_NAME        STUDENT_DEPT_ID
------------------------- ---------------
Amy                                    50
Sampson                                25
Einstein                               68
Cortney                                99
Patrick                                54
Chandi                                 10

6 rows selected.

Equi-partitioning
The equi-partition of Tables involves creating the same number of partitions, based on the same partition Columns and bounds, for two or more related Tables. This also applies to a Table and its respective Indexes. For example, if the STUDENTS Table is partitioned using STUDENT_ID and STUDENT_DEPT_ID, and its respective Index also uses the same partition Columns, the two are equi-partitioned. The equi-partition of Tables has the following advantages:
- Oracle8 improves the partitioned Table's execution plan in complex JOIN and SORT operations.
- Media recovery time is reduced because dependent Tables or Indexes can be recovered to the same point in time.
Note

You can achieve Table and Index equi-partitioning using local Indexes.

Creating Index partitions


Two types of range-partitioned Indexes can be created:
- Local Indexes
- Global Indexes

Both types of Indexes abide by the following Index partitioning rules:
- A partitioned Index cannot be applied on cluster Tables.
- A bitmap Index on a partitioned Table must be a local Index.
- Partitioned and nonpartitioned Indexes can be applied to partitioned and nonpartitioned Tables.
Note

Bitmap Indexes on nonpartitioned Tables cannot be range-partitioned.

Local Indexes
Local Indexes contain partition Keys that map only to rows stored in a single named partition. If a local Index is created for the STUDENTS Table, five Index partitions are created, one for each Table partition. Create a local Index by issuing the CREATE INDEX command with the LOCAL attribute. For example, the following CREATE INDEX statement creates a local Index called DEPT_IDX on the STUDENTS Table:
CREATE INDEX DEPT_IDX ON STUDENTS (STUDENT_DEPT_ID)
LOCAL
( PARTITION DEPT_ID_1 TABLESPACE OM1,
  PARTITION DEPT_ID_2 TABLESPACE OM2,
  PARTITION DEPT_ID_3 TABLESPACE OM3,
  PARTITION DEPT_ID_4 TABLESPACE OM4,
  PARTITION DEPT_ID_5 TABLESPACE OM5);

Figure 23-4 illustrates the local Index DEPT_IDX.

Figure 23-4: The local Index DEPT_IDX (each partition of the range-partitioned STUDENTS Table, DEPT_ID_1 through DEPT_ID_5, has its own local Index partition)

Every local Index is equi-partitioned with respect to the underlying Table. Thus, the Index is created using the same partition Columns and range-partition Keys as the underlying partitioned Table. These creation elements ensure the Index has the same partition bounds as the Table. Local Indexes have the following advantages:
- Only one local Index partition is affected if a single partition undergoes maintenance.
- Partition independence is supported.
- Only local Indexes can support individual partition Import and Export routines.
- Oracle8 ensures better query access plans.
- Incomplete recovery is simplified because the partition and the respective local Index can be recovered to the same point in time.
- Local Indexes can be rebuilt individually.
- Bitmap Indexes are supported only as local Indexes.

There are two types of local Indexes:
- Local prefixed Indexes
- Local nonprefixed Indexes

Local prefixed Indexes


A local prefixed Index uses the partition Columns as the leading Columns of its Index Key. For example, the STUDENTS Index DEPT_IDX is a local prefixed Index.

Local nonprefixed Indexes


A local nonprefixed Index does not use the partition Columns as the leading Columns of its Index Key. For example, if the STUDENTS Table had a local Index based on the STUDENT_ZIP Column, the Index would be a local nonprefixed Index. Local nonprefixed Indexes are very useful in historical Tables because they provide fast access to data required for reporting and other DSS requirements.
Note

A nonprefixed Index scan is more expensive than a prefixed Index scan.
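The difference between the two kinds of local Index comes down to the Column list. The following sketch assumes the STUDENTS Table partitioned on STUDENT_DEPT_ID as shown earlier; the Index names STUDENT_ZIP_IDX and DEPT_ZIP_IDX are hypothetical, chosen only for illustration.

```sql
-- Local nonprefixed: the index key (STUDENT_ZIP) does not start with
-- the partitioning column. A lookup by zip code alone may have to
-- probe every index partition.
CREATE INDEX STUDENT_ZIP_IDX ON STUDENTS (STUDENT_ZIP) LOCAL;

-- Local prefixed: the partitioning column STUDENT_DEPT_ID leads the
-- index key, so a lookup that supplies the department can go straight
-- to one index partition.
CREATE INDEX DEPT_ZIP_IDX ON STUDENTS (STUDENT_DEPT_ID, STUDENT_ZIP) LOCAL;
```

With no partition list after LOCAL, each Index partition simply follows its Table partition; an explicit list, as in the DEPT_IDX example, is needed only to control names or Tablespaces.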

Global Indexes
A global Index contains Keys that refer to more than one partition of an underlying Table. A global Index is created by using the GLOBAL attribute of the CREATE INDEX command. In general, all nonpartitioned Indexes are treated as global prefixed Indexes. For example, the DEPT_IDX Index on the STUDENTS Table can be created as a global Index with the following CREATE INDEX statement:
CREATE INDEX DEPT_IDX ON STUDENTS (STUDENT_DEPT_ID)
GLOBAL PARTITION BY RANGE (STUDENT_DEPT_ID)
( PARTITION DEPT_ID_1 VALUES LESS THAN (100)      TABLESPACE OM1,
  PARTITION DEPT_ID_2 VALUES LESS THAN (250)      TABLESPACE OM2,
  PARTITION DEPT_ID_3 VALUES LESS THAN (500)      TABLESPACE OM3,
  PARTITION DEPT_ID_4 VALUES LESS THAN (750)      TABLESPACE OM4,
  PARTITION DEPT_ID_5 VALUES LESS THAN (MAXVALUE) TABLESPACE OM5);

Figure 23-5 illustrates the global Index DEPT_IDX.

Figure 23-5: The global Index DEPT_IDX (a single global Index spans all five partitions, DEPT_ID_1 through DEPT_ID_5, of the range-partitioned STUDENTS Table)

Global Index partitions can be created with different partition bounds than the underlying Table's partitions. But even if you define the bounds identically to the underlying Table, Oracle8 does not treat global Indexes as equi-partitioned Indexes. As a result, global Indexes require more maintenance than local Indexes.
Note

When using global Indexes, use the MAXVALUE keyword for the highest bound on the last partition. This keyword ensures the Index accounts for all values in the underlying Table.

Maintaining global Indexes on a very large Table is time consuming for the following reasons:
- If an underlying partition is removed or moved, the global Index is immediately affected because it spans all partitions of the underlying Table. You need to rebuild the global Index, which takes time proportional to the size of the Table, not the size of the partition directly affected by the operation.
- If a partition needs to be recovered to a point in time, the partition and the global Index need to be resynchronized to the same time. Thus, the global Index will need to be rebuilt.

Guidelines for partitioning Indexes


Implementing the correct type of Index on a very large Table affects the performance of the respective DML commands against the Table as well as any necessary maintenance operations. The following points provide a guide in selecting the proper Index for a large Table:
- In general, global Indexes and local prefixed Indexes provide the best performance.
- If the Tables are historical, query performance can be improved by using local nonprefixed Indexes.
- Separate Tablespaces should be used for Indexes.
- If the Table requires regular maintenance, use local Indexes. If the partition resides on its own Tablespace, you can still access the rest of the underlying Table while the partition is taken offline for maintenance.
- The downtime for a partition is proportional to the size of the partition, not to the size of the underlying Table.

Managing partitioned Tables and Indexes


A DBA can manage partitioned Tables and Indexes using the ALTER TABLE and ALTER INDEX commands. The following ALTER commands manage and maintain partitioned Tables and Indexes, respectively:

ALTER TABLE
MODIFY PARTITION      Modifies the physical attributes of the partition.
RENAME PARTITION      Renames a partition to another name.
MOVE PARTITION        Moves a partition to another segment.
ADD PARTITION         Adds a new partition to the Table. It is added after the highest partition.
DROP PARTITION        Removes a partition and its data from the Table.
TRUNCATE PARTITION    Efficiently removes all rows in a partition.
SPLIT PARTITION       Creates two new partitions from an existing partition.
EXCHANGE PARTITION    Converts a partition into a nonpartitioned Table and also converts a nonpartitioned Table into a partition (the conversion can include Indexes).

ALTER INDEX
REBUILD PARTITION     Rebuilds one partition of an Index.
DROP PARTITION        Removes a partition from a global Index.
SPLIT PARTITION       Splits a global Index partition into two partitions.
UNUSABLE              Marks the Index as unusable.
RENAME PARTITION      Renames an Index partition.
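A few of these commands might be combined as in the following sketch. It assumes the STUDENTS Table and the local Index DEPT_IDX from the earlier examples; the Tablespace OM6 and the new partition names are hypothetical. Operations that move or remove rows can leave the affected Index partitions, and any global Index such as the one behind a Primary Key, unusable until they are rebuilt.

```sql
-- STUDENTS already ends with a MAXVALUE partition, so a new range is
-- carved out by splitting DEPT_ID_5 at 1000 instead of adding a
-- partition after it.
ALTER TABLE STUDENTS SPLIT PARTITION DEPT_ID_5 AT (1000)
    INTO (PARTITION DEPT_ID_5A, PARTITION DEPT_ID_6);

-- Relocate one partition to a different tablespace.
ALTER TABLE STUDENTS MOVE PARTITION DEPT_ID_2 TABLESPACE OM6;

-- Rebuild the matching local index partition after the move.
ALTER INDEX DEPT_IDX REBUILD PARTITION DEPT_ID_2;

-- Give a partition a more descriptive name.
ALTER TABLE STUDENTS RENAME PARTITION DEPT_ID_4 TO DEPT_ID_500_749;
```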

Further partition information


Use the following views to access information on the implemented partitions in a database:
- DBA_IND_PARTITIONS
- DBA_TAB_PARTITIONS
- DBA_PART_COL_STATISTICS
- USER_TAB_PARTITIONS
- USER_PART_COL_STATISTICS
- USER_IND_PARTITIONS
- ALL_TAB_PARTITIONS
- ALL_IND_PARTITIONS
- ALL_PART_COL_STATISTICS
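As an illustration of how these views can be queried, the following sketch lists the partitions of the STUDENTS Table and of the DEPT_IDX Index from the earlier examples. It assumes both Objects exist in the current User's schema.

```sql
-- Partition names, their upper bounds, and where each one is stored.
SELECT PARTITION_NAME, PARTITION_POSITION, HIGH_VALUE, TABLESPACE_NAME
FROM   USER_TAB_PARTITIONS
WHERE  TABLE_NAME = 'STUDENTS'
ORDER  BY PARTITION_POSITION;

-- Status of each index partition; an UNUSABLE status indicates a
-- partition that needs to be rebuilt.
SELECT PARTITION_NAME, STATUS, TABLESPACE_NAME
FROM   USER_IND_PARTITIONS
WHERE  INDEX_NAME = 'DEPT_IDX'
ORDER  BY PARTITION_POSITION;
```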

Large Object Datatypes


Today's businesses must store more unstructured data than ever before. Databases must be prepared to meet these business demands by catering to both large character-based data and the following types of data:
- Video
- Movies
- Graphic images
- Sound wave files

Oracle8 has improved the storage and manipulation of these large Objects. Oracle7 could only handle these large Objects through the LONG and LONG RAW Column types, each limited to 2GB in length.

Oracle8 introduces the new Large Object Column types, or LOBs. You can use LOBs to store up to 4GB of data internally. Oracle8 also provides better storage and piecewise access mechanisms than Oracle7.

LOB datatypes
The two categories of LOB are differentiated by whether the datatype is stored internally or externally in respect to the Oracle8 database. Unless otherwise specified, only the locators for these LOBs are stored in their respective Table Columns.
Note

Locators point to where the LOB data is actually stored, either in the database or on the operating system disks.

Internal LOBs
The following internal LOB types in Tables store up to 4GB of data:
- BLOB (Unstructured binary data)
- CLOB (Single-byte character data)
- NCLOB (Fixed-length multibyte national character data)

External LOBs
The BFILE is the only external LOB type supported by Oracle8. A BFILE stores its data outside the database, as an operating system file. Unlike the internal LOBs, the BFILE type is read-only. Oracle8 also provides no transaction control or integrity checks on BFILE types. BFILEs can also store up to 4GB of data.

Creating a Table with LOBs


A Table can be created using LOBs. Use the CREATE TABLE command and define your Columns with the respective LOB datatypes. If you define LOBs for a Table, you can specify additional storage requirements for the internal LOB Columns in the CREATE TABLE statement. For example, the following statement creates a new STUDENT Table.
CREATE TABLE STUDENT (
    STUDENT_ID          INTEGER       NOT NULL,
    STUDENT_FIRST_NAME  VARCHAR2(25)  NULL,
    STUDENT_LAST_NAME   VARCHAR2(25)  NULL,
    STUDENT_PICTURE     BLOB          DEFAULT EMPTY_BLOB(),
    STUDENT_DEPT_ID     INTEGER       NULL,
    STUDENT_ADDRESS     VARCHAR2(50)  NULL,
    STUDENT_CITY        VARCHAR2(25)  NULL,
    STUDENT_STATE       VARCHAR2(15)  NULL,
    STUDENT_ZIP         VARCHAR2(10)  NULL,
    STUDENT_HISTORY     CLOB          DEFAULT EMPTY_CLOB(),
    STUDENT_REPORT      BFILE         NULL,
    CONSTRAINT STUDENT_PK PRIMARY KEY (STUDENT_ID)
)
STORAGE (INITIAL 10K NEXT 10K PCTINCREASE 30)
LOB (STUDENT_PICTURE) STORE AS
    (TABLESPACE STUDENT_PIC_ts
     STORAGE (INITIAL 100M NEXT 100M PCTINCREASE 50))
LOB (STUDENT_HISTORY) STORE AS
    (TABLESPACE STUDENT_HIST_ts
     STORAGE (INITIAL 200K NEXT 100K PCTINCREASE 50));

Figure 23-6 illustrates LOB storage in the STUDENT Table.

Figure 23-6: The LOB storage in the STUDENT Table (inside the Oracle8 database, the STUDENT Table, the STUDENT_PIC data, and the STUDENT_HIST data occupy separate Tablespaces; STUDENT_REPORT is an OS file in the operating system file system)

The STUDENT Table contains the following LOB Columns:
- STUDENT_PICTURE (BLOB)
- STUDENT_HISTORY (CLOB)
- STUDENT_REPORT (BFILE)

Each LOB except the BFILE is assigned a separate Tablespace with its own storage parameters. These parameters are specified by the following clauses:
LOB (STUDENT_PICTURE) STORE AS
    (TABLESPACE STUDENT_PIC_ts
     STORAGE (INITIAL 100M NEXT 100M PCTINCREASE 50))
LOB (STUDENT_HISTORY) STORE AS
    (TABLESPACE STUDENT_HIST_ts
     STORAGE (INITIAL 200K NEXT 100K PCTINCREASE 50))

The EMPTY_BLOB() and EMPTY_CLOB() functions in the Column definitions initialize the respective internal LOBs to zero length and provide a default locator value.
STUDENT_PICTURE    BLOB    DEFAULT EMPTY_BLOB(),
STUDENT_HISTORY    CLOB    DEFAULT EMPTY_CLOB(),
Note

As this book goes to press, Tables that contain LOBs can only be created using SQL. The Oracle8 Schema Manager does not support LOBs as Column types when creating a Table.

LOB locators
A LOB Column must contain a LOB locator value before data can be written to that LOB. A locator acts as a pointer to where the LOB is actually stored. This applies to both internal and external LOB types. In other words, the LOB Column must not be null when you use it to update or insert LOB data.
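The locator requirement can be illustrated with a short PL/SQL block. This is a sketch only: it assumes the STUDENT Table from the preceding section and an existing row with STUDENT_ID 1003, and the text being written is made up. DBMS_LOB.WRITE writes at the given offset, so it overwrites any text already stored at the start of the LOB.

```sql
DECLARE
    STU_HISTORY  CLOB;
    NOTE_TEXT    VARCHAR2(100) := 'Enrolled in the honors program.';
BEGIN
    -- Give the column a locator (a zero-length LOB) if it is still null.
    UPDATE STUDENT
    SET    STUDENT_HISTORY = EMPTY_CLOB()
    WHERE  STUDENT_ID = 1003
    AND    STUDENT_HISTORY IS NULL;

    -- Lock the row and fetch the locator before writing through it.
    SELECT STUDENT_HISTORY INTO STU_HISTORY
    FROM   STUDENT
    WHERE  STUDENT_ID = 1003
    FOR UPDATE;

    -- Write the text through the locator, starting at offset 1.
    DBMS_LOB.WRITE(STU_HISTORY, LENGTH(NOTE_TEXT), 1, NOTE_TEXT);

    COMMIT;
END;
/
```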

Storage settings for LOBs


The STUDENT Table was created using Tablespace and storage specifications for the BLOB and CLOB internal LOBs. BFILEs are stored outside the database and thus cannot have Tablespace or storage specifications specified for their respective Columns. When storage and Tablespace information is specified for a LOB Column, the data is stored outside the Table and is known as an out-of-line LOB. If the LOB is stored out-of-line, only a locator value for the respective LOB is stored in the Table, which prevents the perhaps-large LOB from being read every time the Table is scanned in a query.
Tip

Contention is reduced on a Tablespace if the data and the LOBs are separated into different Tablespaces.

You can use the following storage options when creating LOBs:
- PCTVERSION
- CACHE
- NOCACHE
- LOGGING
- NOLOGGING
- CHUNK
- ENABLE STORAGE IN ROW
- DISABLE STORAGE IN ROW

PCTVERSION
The PCTVERSION option determines the percentage of LOB storage that old versions of LOB data pages may occupy before Oracle8 reuses those pages. In general, if the LOB will be consistently read-only, PCTVERSION should be set to five percent or lower. If the LOB is going to experience frequent consistent-read and update operations, PCTVERSION should be set to approximately thirty percent.

CACHE / NOCACHE
These options specify whether to cache the LOB values in memory after use. Caching the LOB increases the access speed to locate and manipulate the LOB. Be sure to specify the CACHE option if you plan to access the LOB frequently; otherwise, use NOCACHE.

LOGGING/NOLOGGING
The LOGGING option generates a record of all operations on the LOB in the online redo log files. The NOLOGGING option does not record any LOB operations in the online redo log files. Use the LOGGING option if the LOB contains critical data, because media recovery of the LOB cannot occur if it is set to NOLOGGING.

CHUNK
The CHUNK option specifies the number of LOB data blocks read at one time. For example, if you specify CHUNK 8, and the data block size is 2K, a 16K page will be read each time the LOB is accessed.
Note

The INITIAL and NEXT parameters must always be larger than the CHUNK value.

ENABLE/DISABLE STORAGE IN ROW


This option stores the LOB inline or out-of-line. If you use the ENABLE STORAGE IN ROW option, the LOB is stored in the Table row. Only 4000 bytes can be stored in the row. If a LOB exceeds 4000 bytes upon insertion with this option specified, Oracle8 immediately moves the LOB out-of-line. The DISABLE STORAGE IN ROW option stores the LOB out-of-line. In general, the LOB is accessed faster if contained in the Table row. This location can affect the performance of other queries that use the Table, however, because the LOB may have to be scanned and read.
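Several of these options can appear together in one LOB storage clause. The following sketch uses a hypothetical STUDENT_DOCS Table and reuses the STUDENT_HIST_ts Tablespace name from the earlier example; the option values are illustrative choices for a rarely updated, infrequently read CLOB, not recommendations.

```sql
CREATE TABLE STUDENT_DOCS (
    STUDENT_ID  INTEGER  NOT NULL,
    TRANSCRIPT  CLOB     DEFAULT EMPTY_CLOB(),
    CONSTRAINT STUDENT_DOCS_PK PRIMARY KEY (STUDENT_ID)
)
LOB (TRANSCRIPT) STORE AS
    (TABLESPACE STUDENT_HIST_ts
     DISABLE STORAGE IN ROW   -- keep only the locator in the table row
     PCTVERSION 5             -- mostly read-only, few old versions needed
     NOCACHE                  -- not read often enough to justify caching
     LOGGING);                -- keep the LOB recoverable from redo logs
```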

Initializing internal LOBs


All internal LOBs must be initialized prior to manipulation. A LOB can be initialized in two ways:
- Inserting a null into the LOB
- Setting an empty value for the LOB

Note

An empty value in a LOB contains a locator and has zero length.

The following example initializes the STUDENT Table's LOBs to null:
INSERT INTO STUDENT
    (STUDENT_ID, STUDENT_FIRST_NAME, STUDENT_LAST_NAME, STUDENT_PICTURE,
     STUDENT_DEPT_ID, STUDENT_ADDRESS, STUDENT_CITY, STUDENT_STATE,
     STUDENT_ZIP, STUDENT_HISTORY, STUDENT_REPORT)
VALUES
    (1003, 'AMY', 'CHANDI', NULL,
     1, '55 Sutton Place #922', 'Austin', 'TX',
     '78587', NULL, NULL);

The STUDENT_PICTURE, STUDENT_HISTORY, and STUDENT_REPORT LOBs are all set to null. The following example initializes the respective LOBs in the STUDENT Table to empty:
INSERT INTO STUDENT
    (STUDENT_ID, STUDENT_FIRST_NAME, STUDENT_LAST_NAME, STUDENT_PICTURE,
     STUDENT_DEPT_ID, STUDENT_ADDRESS, STUDENT_CITY, STUDENT_STATE,
     STUDENT_ZIP, STUDENT_HISTORY, STUDENT_REPORT)
VALUES
    (1003, 'AMY', 'CHANDI', EMPTY_BLOB(),
     1, '55 Sutton Place #922', 'Austin', 'TX',
     '78587', EMPTY_CLOB(), NULL);

The EMPTY_BLOB() function initializes BLOB datatypes. The EMPTY_CLOB() function initializes CLOB and NCLOB datatypes.

Using BFILEs
BFILEs, as stated before, are externally stored LOBs. Typically, a BFILE is a file stored in the operating system file system, on a CD-ROM, or on any other external media device connected to the Oracle8 server, including network drives.

Before you can use BFILEs within a Table, a DIRECTORY Object needs to be created. The DIRECTORY Object is used as an alias to the physical operating system directory that will contain the BFILE Object. Use the CREATE DIRECTORY command to create DIRECTORY Objects. For example, the following statement associates an alias AMY_ALIAS with the path D:\ORANT\AMY_FILES:
CREATE DIRECTORY AMY_ALIAS AS 'D:\ORANT\AMY_FILES';

Next, grant READ access to AMY to access the directory:


GRANT READ ON DIRECTORY AMY_ALIAS TO AMY;

The Oracle8 DBA must execute the preceding two commands regarding the DIRECTORY Object. Use the BFILENAME function to insert a value into the Table's BFILE Columns. This function maps the BFILE Column to the physical file. For example, the following statement inserts a reference to the file AMY_REPORT.DOC into the STUDENT Table:
INSERT INTO STUDENT
    (STUDENT_ID, STUDENT_FIRST_NAME, STUDENT_LAST_NAME, STUDENT_PICTURE,
     STUDENT_DEPT_ID, STUDENT_ADDRESS, STUDENT_CITY, STUDENT_STATE,
     STUDENT_ZIP, STUDENT_HISTORY, STUDENT_REPORT)
VALUES
    (1003, 'AMY', 'CHANDI', EMPTY_BLOB(),
     1, '55 Sutton Place #922', 'Austin', 'TX',
     '78587', EMPTY_CLOB(), BFILENAME('AMY_ALIAS', 'AMY_REPORT.DOC'));
Note

Close all open BFILEs after use by calling DBMS_LOB.FILECLOSE. The number of BFILEs that a session can have open at any given time is limited by the SESSION_MAX_OPEN_FILES parameter in the INIT.ORA file. The default value for this parameter is ten.
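
The open-and-close discipline described in the note can be sketched as follows. This is only a minimal sketch: it assumes exactly one STUDENT row with STUDENT_ID 1003 exists and that its STUDENT_REPORT Column was set with BFILENAME as shown above. The variable name REPORT_LOC is hypothetical.

```sql
DECLARE
  REPORT_LOC BFILE;  -- hypothetical variable that holds the BFILE locator
BEGIN
  -- Retrieve the BFILE locator from the row (assumes exactly one match)
  SELECT STUDENT_REPORT INTO REPORT_LOC
    FROM STUDENT
   WHERE STUDENT_ID = 1003;

  -- FILEEXISTS returns 1 when the physical file is present
  IF DBMS_LOB.FILEEXISTS(REPORT_LOC) = 1 THEN
    DBMS_LOB.FILEOPEN(REPORT_LOC, DBMS_LOB.FILE_READONLY);
    -- ... read from the BFILE here ...
    DBMS_LOB.FILECLOSE(REPORT_LOC);  -- release the open-file slot
  END IF;
END;
```

Closing the file in the same block that opened it keeps the session well under the open-file limit.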

Manipulating LOBs
The following three methods access LOBs in a Table:
• Using the Oracle8 API
• Using the DBMS_LOB Package
• Using the Oracle Call Interface (OCI)

The remaining part of this chapter focuses on manipulating LOBs using the DBMS_LOB Package.


Selecting a LOB from a Table


Before you can use a LOB in any capacity, you must retrieve its locator value. For example, the following PL/SQL retrieves the STUDENT_PICTURE locator:
DECLARE
  STUDENT_PIC BLOB;
BEGIN
  SELECT STUDENT_PICTURE INTO STUDENT_PIC
    FROM STUDENT
   WHERE STUDENT_ID = 7266;
END;

Copying LOBs from other LOBs


The internal LOB types can be copied from one row or Table into another by using subqueries. This operation actually copies the LOB values and the locator into the other Table or row. For LOBs to be inline (as opposed to out-of-line), they still must meet the four thousand byte maximum size Constraint. For example, the following statement copies the STUDENT_PICTURE from one row to another:
INSERT INTO STUDENT (STUDENT_ID, STUDENT_PICTURE) (SELECT STUDENT_ID, STUDENT_PICTURE FROM STUDENT WHERE STUDENT_ID = 1254);

Reading LOBs
The DBMS_LOB.READ procedure in the DBMS_LOB Package is best used for all LOB read operations. For example, the following code reads up to the first 2,000 characters of the STUDENT_HISTORY from the STUDENT Table for a student with an id of 1056:
DECLARE
  STU_HISTORY CLOB;
  HISTORY_LEN BINARY_INTEGER := 2000;  -- amount to read
  OFFSET      INTEGER := 1;            -- start at the beginning of the LOB
  BUFFER_VAR  VARCHAR2(2000);          -- CLOB data is read into a character buffer
BEGIN
  SELECT STUDENT_HISTORY INTO STU_HISTORY
    FROM STUDENT
   WHERE STUDENT_ID = 1056;
  DBMS_LOB.READ(STU_HISTORY, HISTORY_LEN, OFFSET, BUFFER_VAR);
END;

Updating a LOB
LOBs can also be updated using subqueries. The following example updates a student's picture from another row in the same Table. The LOB locator is copied into the row with the LOB value stored out-of-line.


UPDATE STUDENT
   SET STUDENT_PICTURE = (SELECT STUDENT_PICTURE FROM STUDENT WHERE STUDENT_ID = 7266)
 WHERE STUDENT_ID = 1056;
Note

Lock a LOB's row with a SQL SELECT ... FOR UPDATE statement before updating the LOB.

Deleting the row


Use the DELETE command to delete rows containing LOB Columns. The following example deletes the student with an id of 1056:
DELETE FROM STUDENT WHERE STUDENT_ID = 1056;
Note

If the LOB is a BFILE type, the physical file is not deleted by the DELETE operation. The physical file must be manually deleted from the operating system.

Using the DBMS_LOB Package


The DBMS_LOB Package contains routines that access BLOBs, CLOBs, NCLOBs, and BFILEs. Each routine requires a LOB locator as input to manipulate or even read a LOB value. Thus, you first need a SELECT statement to read the locator for a LOB value into a PL/SQL variable. You can then use this variable as input to the DBMS_LOB routines. The following lists categorize DBMS_LOB routines:

Routines that modify BLOBs, CLOBs, and NCLOBs

APPEND()         Appends the contents of the source LOB to a destination LOB.
COPY()           Copies all or part of the source LOB to a destination LOB.
ERASE()          Erases all or part of a LOB.
LOADFROMFILE()   Loads a BFILE's data into an internal LOB.
TRIM()           Trims a LOB value to the specified shorter length.
WRITE()          Writes data to a LOB from a specified offset position.

Routines that read or examine BLOBs, CLOBs, and NCLOBs

GETLENGTH()      Retrieves the length of the LOB.
INSTR()          Returns the matching position of the nth occurrence of a specified pattern in the LOB.
READ()           Reads data from the LOB starting at a specified offset position.
SUBSTR()         Returns part of the LOB value starting at a specified offset position.
COMPARE()        Compares two LOBs of the same type.

Read-only routines for BFILEs


FILECLOSE()      Closes the BFILE.
FILECLOSEALL()   Closes all previously opened BFILEs.
FILEEXISTS()     Verifies if a specific BFILE exists on the operating system file system.
FILEGETNAME()    Retrieves the directory alias and file name for a BFILE.
FILEISOPEN()     Verifies if a BFILE is open.
FILEOPEN()       Opens a BFILE.

DBMS_LOB exceptions
The DBMS_LOB Package raises exceptions that can be trapped using the EXCEPTION keyword in a PL/SQL block. The following exceptions can be raised by the DBMS_LOB Package:

INVALID_ARGVAL   Occurs if the DBMS_LOB routine arguments are either null or out of range.
ACCESS_ERROR     Occurs if a read or write operation by a DBMS_LOB routine exceeds the LOB size bounds.
NO_DATA_FOUND    Occurs if no data is found in the LOB.
VALUE_ERROR      Occurs if the DBMS_LOB routine accepts an invalid input argument.
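
Trapping these exceptions might look like the following. This is only a minimal sketch: it assumes a STUDENT row with STUDENT_ID 1056 exists and that the database uses a single-byte character set, so 100 characters fit in the buffer. The variable names and messages are hypothetical.

```sql
DECLARE
  STU_HISTORY CLOB;
  READ_LEN    BINARY_INTEGER := 100;  -- amount to read
  BUFFER_VAR  VARCHAR2(100);
BEGIN
  SELECT STUDENT_HISTORY INTO STU_HISTORY
    FROM STUDENT
   WHERE STUDENT_ID = 1056;

  DBMS_LOB.READ(STU_HISTORY, READ_LEN, 1, BUFFER_VAR);
EXCEPTION
  WHEN DBMS_LOB.INVALID_ARGVAL THEN
    DBMS_OUTPUT.PUT_LINE('Argument is null or out of range.');
  WHEN DBMS_LOB.ACCESS_ERROR THEN
    DBMS_OUTPUT.PUT_LINE('Operation exceeds the LOB size bounds.');
  WHEN NO_DATA_FOUND THEN
    -- Raised by the SELECT when no row matches, or by READ when the LOB holds no data
    DBMS_OUTPUT.PUT_LINE('No data found.');
  WHEN VALUE_ERROR THEN
    DBMS_OUTPUT.PUT_LINE('Invalid input argument.');
END;
```

INVALID_ARGVAL and ACCESS_ERROR are declared in the DBMS_LOB Package and therefore need the package prefix; NO_DATA_FOUND and VALUE_ERROR are standard PL/SQL exceptions.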

DBMS_LOB Package routine syntax


The following sections are a syntax guide for the preceding DBMS_LOB Package routines:

APPEND()
This routine appends the source LOB to the destination LOB and takes the following arguments:

dest_lob         Identifies the LOB to be appended.
src_lob          Identifies the LOB to be read and appended to dest_lob.

The syntax for the APPEND() routine follows:


DBMS_LOB.APPEND(dest_lob, src_lob);
Note

The dest_lob row must be locked using the SELECT ... FOR UPDATE statement.

COMPARE()
This routine compares two LOBs of the same datatype. This routine returns zero if the two LOBs match exactly; otherwise, a non-zero integer is returned. This routine takes the following arguments:
lob_1            Locator of the first LOB to compare.
lob_2            Locator of the second LOB to compare.
num_bytes        Number of bytes to compare from lob_1 and lob_2.
offset_1         Offset of lob_1.
offset_2         Offset of lob_2.

The syntax for the COMPARE() routine follows:


DBMS_LOB.COMPARE(lob_1, lob_2, num_bytes, offset_1, offset_2);
Note

If a BFILE LOB is being compared, the BFILE must be successfully opened using the DBMS_LOB.FILEOPEN() routine. Afterwards, the BFILE must be closed using the DBMS_LOB.FILECLOSE() routine.
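
As an illustration of the zero and non-zero return values, the following sketch compares the pictures of two students. It assumes rows with STUDENT_ID 7266 and 1056 exist; the variable names are hypothetical, and DBMS_LOB.LOBMAXSIZE is used as the amount so that the comparison runs over the whole of both LOBs.

```sql
DECLARE
  PIC_A      BLOB;
  PIC_B      BLOB;
  CMP_RESULT INTEGER;
BEGIN
  SELECT STUDENT_PICTURE INTO PIC_A FROM STUDENT WHERE STUDENT_ID = 7266;
  SELECT STUDENT_PICTURE INTO PIC_B FROM STUDENT WHERE STUDENT_ID = 1056;

  -- Compare both LOBs from their first byte
  CMP_RESULT := DBMS_LOB.COMPARE(PIC_A, PIC_B, DBMS_LOB.LOBMAXSIZE, 1, 1);

  IF CMP_RESULT = 0 THEN
    DBMS_OUTPUT.PUT_LINE('The pictures match exactly.');
  ELSE
    DBMS_OUTPUT.PUT_LINE('The pictures differ.');
  END IF;
END;
```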

COPY()
This routine copies all or part of a source LOB to a destination LOB and takes the following arguments:
dest_lob         Locator of the destination LOB.
src_lob          Locator of the source LOB to be copied to the dest_lob.
num_bytes        Number of bytes to copy.
dest_offset      Specifies the offset of where to copy into the dest_lob.
src_offset       Specifies the offset of the source LOB from which to initiate copying.

The syntax for the COPY() routine follows:


DBMS_LOB.COPY(dest_lob, src_lob, num_bytes, dest_offset, src_offset);
Note

The dest_lob row must be locked using the SELECT ... FOR UPDATE statement.

ERASE()
This routine erases all or part of a LOB. The number of bytes actually erased is returned through the num_bytes argument. The routine takes the following arguments:
lob_1            Locator of the LOB to be erased.
num_bytes        Number of bytes to erase.
lob_offset       Specifies the offset in the LOB at which to start erasing.

The syntax for the ERASE() routine follows:


DBMS_LOB.ERASE(lob_1, num_bytes, lob_offset);

LOADFROMFILE()
This routine copies all or part of an external LOB to an internal LOB and takes the following arguments:
dest_lob         Locator of the destination LOB.
src_lob          Locator of the external source LOB (BFILE) to be loaded to the dest_lob.
num_bytes        Number of bytes to copy from the BFILE.
dest_offset      Specifies the offset of where to copy into the dest_lob.
src_offset       Specifies the offset of the source LOB from which to initiate loading.

The syntax for the LOADFROMFILE() routine follows:


DBMS_LOB.LOADFROMFILE(dest_lob, src_lob, num_bytes, dest_offset, src_offset);
Note

A BFILE LOB must be successfully opened using the DBMS_LOB.FILEOPEN() routine. Afterwards, you must close the BFILE using the DBMS_LOB.FILECLOSE() routine.
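
The open, load, and close sequence described in the note might look like this. This is only a sketch under several assumptions: the file name AMY_PICTURE.JPG is hypothetical, the file is not empty, the AMY_ALIAS directory from earlier in the chapter exists, and exactly one STUDENT row with STUDENT_ID 1003 has its STUDENT_PICTURE initialized with EMPTY_BLOB().

```sql
DECLARE
  SRC_FILE BFILE := BFILENAME('AMY_ALIAS', 'AMY_PICTURE.JPG');  -- hypothetical file
  DEST_PIC BLOB;
BEGIN
  -- Lock the row so that the internal LOB can be modified
  SELECT STUDENT_PICTURE INTO DEST_PIC
    FROM STUDENT
   WHERE STUDENT_ID = 1003
     FOR UPDATE;

  DBMS_LOB.FILEOPEN(SRC_FILE, DBMS_LOB.FILE_READONLY);

  -- Load the whole file, starting at the first byte of both LOBs
  DBMS_LOB.LOADFROMFILE(DEST_PIC, SRC_FILE, DBMS_LOB.GETLENGTH(SRC_FILE), 1, 1);

  DBMS_LOB.FILECLOSE(SRC_FILE);
  COMMIT;
END;
```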


TRIM()
This routine trims the value of the internal LOB to a specified length and takes the following arguments:
lob_trim         Locator of the LOB to be trimmed.
num_bytes        Length, in bytes, to which the LOB is trimmed.

The syntax for the TRIM() routine follows:


DBMS_LOB.TRIM(lob_trim, num_bytes);

WRITE()
This routine writes an amount of data into a LOB from the offset position to a specified length. Any data already contained in that space is overwritten. The data must be written from a buffer variable that can be defined in a PL/SQL block. This routine takes the following arguments:
lob_write        Locator of the LOB to be written.
num_bytes        Number of bytes to write into the LOB.
offset           Specifies the offset of where to start the write operation.
buffer           Specifies the input buffer for the write operation.

The syntax for the WRITE() routine follows:

DBMS_LOB.WRITE(lob_write, num_bytes, offset, buffer);
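
A write from a PL/SQL buffer variable might look like the following. This is only a sketch: it assumes exactly one STUDENT row with STUDENT_ID 1003 exists and that its STUDENT_HISTORY was initialized with EMPTY_CLOB(). The variable names and the text being written are hypothetical.

```sql
DECLARE
  STU_HISTORY CLOB;
  NOTE_TEXT   VARCHAR2(100) := 'Transferred to department 1.';  -- hypothetical text
BEGIN
  -- Lock the row before modifying the LOB
  SELECT STUDENT_HISTORY INTO STU_HISTORY
    FROM STUDENT
   WHERE STUDENT_ID = 1003
     FOR UPDATE;

  -- Write the buffer starting at offset 1; any data already there is overwritten
  DBMS_LOB.WRITE(STU_HISTORY, LENGTH(NOTE_TEXT), 1, NOTE_TEXT);
  COMMIT;
END;
```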

GETLENGTH()
This routine returns the length of a LOB and takes the following argument:
lob_1            Locator of a LOB.

The syntax for the GETLENGTH() routine follows:


DBMS_LOB.GETLENGTH(lob_1);
Note

The length of an empty LOB is zero.

INSTR()
This routine searches a LOB for a specified pattern of data. The return value identifies the starting position of the pattern. This routine takes the following arguments:

lob_1            Locator of the LOB to be matched with a pattern.
pattern          Specifies the pattern that needs to be located in lob_1.
offset           Specifies the offset of where to start the pattern matching operation.
occurrence_no    Specifies the occurrence number to find.

The syntax for the INSTR() routine follows:


DBMS_LOB.INSTR(lob_1, pattern, offset, occurrence_no);
Note

The pattern and the LOB value must be from the same character set.

READ()
This routine reads a specified amount of a LOB into a buffer. You can then use the buffer to perform write operations into other LOB records. The number of bytes actually read is returned through the num_bytes argument. The routine takes the following arguments:
lob_1            Locator of the LOB to read.
num_bytes        Number of bytes to read from lob_1.
offset           Specifies the offset of where to start the read operation.
buffer           Specifies the buffer to store the output of the read operation.

The syntax for the READ() routine follows:

DBMS_LOB.READ(lob_1, num_bytes, offset, buffer);
Note

Use the NO_DATA_FOUND exception clause with the DBMS_LOB.READ() routine.
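
One common way to apply this note is to read a LOB in fixed-size pieces and treat NO_DATA_FOUND as the end-of-LOB signal. The following is only a sketch: it assumes a STUDENT row with STUDENT_ID 1056 exists and a single-byte character set, so 2,000 characters fit in the buffer. The variable names are hypothetical.

```sql
DECLARE
  STU_HISTORY CLOB;
  CHUNK_LEN   BINARY_INTEGER;
  READ_POS    INTEGER := 1;
  BUFFER_VAR  VARCHAR2(2000);
BEGIN
  SELECT STUDENT_HISTORY INTO STU_HISTORY
    FROM STUDENT
   WHERE STUDENT_ID = 1056;

  LOOP
    CHUNK_LEN := 2000;  -- request one buffer; READ resets this to the amount actually read
    DBMS_LOB.READ(STU_HISTORY, CHUNK_LEN, READ_POS, BUFFER_VAR);
    -- ... process BUFFER_VAR here ...
    READ_POS := READ_POS + CHUNK_LEN;  -- advance past the piece just read
  END LOOP;
EXCEPTION
  WHEN NO_DATA_FOUND THEN
    NULL;  -- the read position moved past the end of the LOB, so reading is done
END;
```

Note that the same handler also fires if the SELECT finds no row, so a production block may want to separate the two cases.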

SUBSTR()
This routine extracts a specified amount of data from a LOB and returns it to the calling application. This routine takes the following arguments:
lob_1            Locator of the LOB from which to extract data.
num_bytes        Number of bytes to extract from lob_1.
offset           Specifies the offset of where to start the extract operation.

The syntax for the SUBSTR() routine follows:


DBMS_LOB.SUBSTR(lob_1, num_bytes, offset);
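
INSTR() and SUBSTR() are often used together: the first locates a pattern and the second extracts the text that starts there. The following is only a sketch, assuming a STUDENT row with STUDENT_ID 1056 exists and a single-byte character set. The search pattern and variable names are hypothetical.

```sql
DECLARE
  STU_HISTORY CLOB;
  MATCH_POS   INTEGER;
  EXCERPT     VARCHAR2(80);
BEGIN
  SELECT STUDENT_HISTORY INTO STU_HISTORY
    FROM STUDENT
   WHERE STUDENT_ID = 1056;

  -- Find the first occurrence of the pattern, searching from offset 1
  MATCH_POS := DBMS_LOB.INSTR(STU_HISTORY, 'Austin', 1, 1);

  -- INSTR returns 0 when the pattern is not found
  IF MATCH_POS > 0 THEN
    EXCERPT := DBMS_LOB.SUBSTR(STU_HISTORY, 80, MATCH_POS);
    DBMS_OUTPUT.PUT_LINE(EXCERPT);
  END IF;
END;
```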


FILECLOSE()
This routine closes a successfully opened external LOB (BFILE) file and takes the following argument:
bfile_lob        Locator of the BFILE to close.

The syntax for the FILECLOSE() routine follows:


DBMS_LOB.FILECLOSE(bfile_lob);

FILECLOSEALL()
This routine closes all open BFILEs in a specific session. The syntax for the FILECLOSEALL() routine follows:

DBMS_LOB.FILECLOSEALL;
Note

This routine does not require arguments.

FILEEXISTS()
This routine verifies that a BFILE locator actually points to a valid file on the operating system's file system. This routine takes the following argument:
bfile_lob        Locator of the BFILE to verify.

The syntax for the FILEEXISTS() routine follows:


DBMS_LOB.FILEEXISTS(bfile_lob);

FILEGETNAME()
This routine returns the directory alias and physical file name of a specific BFILE locator. It does not validate whether the physical file exists. This routine takes the following arguments:
bfile_lob        Locator of the BFILE LOB.
dir_alias        Variable to hold the directory alias.
file_name        Variable to hold the physical file name.

The syntax for the FILEGETNAME() routine follows:

DBMS_LOB.FILEGETNAME(bfile_lob, dir_alias, file_name);


FILEISOPEN()
This routine verifies whether a specific BFILE is open and takes the following argument:
bfile_lob        Locator of the BFILE LOB.

The syntax for the FILEISOPEN() routine follows:


DBMS_LOB.FILEISOPEN(bfile_lob);

FILEOPEN()
This routine opens a BFILE for read-only access. No other operations are permitted on the BFILE because it is stored externally with respect to the Oracle8 database. This routine takes the following arguments:
bfile_lob        Locator of the BFILE LOB.
open_mode        Specifies the mode to open the BFILE.

The syntax for the FILEOPEN() routine follows:


DBMS_LOB.FILEOPEN(bfile_lob, open_mode);
Note

Currently, the open_mode can only be set to DBMS_LOB.FILE_READONLY. Oracle may extend the BFILE manipulation capabilities at a later time.

Summary
With the lowering of hardware prices and the escalation of data use, the database has become the central repository for all data in an enterprise. As a result, the database must store larger and more complex data, such as video, movies, graphic images, and sound (wav) files. In addition, the storage requirements of a typical enterprise have grown exponentially. The term VLDB refers to databases that range from gigabytes to a terabyte in size. This chapter has addressed the design, implementation, and management of VLDBs through the use of Object Partitioning and Large Objects (LOBs). The following key points summarize this chapter:
• Commonly, a database contains only a few extremely large tables, and these qualify the database as a VLDB. These objects can be partitioned into smaller and more manageable units.
• Partitioned units are transparent to Users.


• The PARTITION BY RANGE clause used within a CREATE TABLE statement creates a partitioned table.
• The PARTITION BY RANGE clause permits the horizontal partitioning of data in a custom or equi-partitioned fashion.
• The following Indexes can be created for partitioned tables: local prefixed, local non-prefixed, and global.
• Partitioning a very large Table has the following advantages: reduction of database downtime, increased query performance, and better disk access.
• LOB datatypes can be stored internally or externally with respect to the database.
• BLOB, CLOB, and NCLOB values of up to 4,000 bytes can be stored inline within the Table. Beyond that size they are stored out-of-line with respect to the Table and only referenced from within the Table.
• BFILE datatypes are stored externally as an operating system file.
• The DBMS_LOB Package can be used to manipulate LOB datatype Columns in a Table.
