DB2 - IBM’s Relational DBMS
CTS-PAC Version 1.1 1
Session 1
Topics to be covered in this session
• Introduction to databases - covers their advantages and
the types of databases (time : 30 min)
• Relational database concepts - covers Terminology,
ER model , Normalisation, An Introduction to
Database objects, CODD’s Relational Rules, An
Introduction to SQL.
Introduction to Databases
What is Data ?
‘A representation of facts or instruction in a form
suitable for communication’ - IBM Dictionary
What is a Database ?
‘Is a repository for stored data’ - C.J.Date
What is a database system ?
An integrated and shared repository for stored data or
collection of stored operational data used by
application systems of some particular enterprise.
Or
‘Nothing more than a computer-based record keeping
system’.
Advantages of DBMS over File Management Systems
• Reduced data redundancy
• Multiple views
• Shared data
• Data independence (logical/physical)
• Data dictionary
• Search versatility
• Cost effective
• Security & Control
• Recovery restart & Backup
• Concurrency
TYPES OF DATABASES (or Models)
• Hierarchical Model
• Network Model
• Relational Model
• Object-Oriented Model
• HIERARCHICAL
• Top down structure resembling an upside-down
tree
• Parent child relationship
• First logical database model
• Available in legacy systems on Mainframe
computers
• Example - IMS
• NETWORK
• Does not distinguish between parent and child. Any
record type can be associated with any number of
arbitrary record types
• Enhanced to overcome limitations of the Hierarchical
model but in reality, there is minimal difference due
to frequent enhancements
• RELATIONAL
• Data is stored in tables, in the form of rows and columns.
• Examples - DB2, Oracle, Sybase, Ingres etc
• OBJECT-ORIENTED MODEL
• Data attributes and methods that operate on those
attributes are encapsulated in structures called
objects
RELATIONAL DB CONCEPTS
Relational Properties
• Why Relational ? - Relation is a mathematical
term for a table - Hence Relational database ‘is
perceived’ by the users as a set of tables.
• All data values are atomic.
• Entries in columns are from the same domain
• Sequence of rows (T-B) is insignificant
• Each row is unique
• Sequence of columns (L-R) is insignificant
Relational Concepts (or Terminology)
• Relation : A table or File
• Tuple : Row contains an entry for each attribute
• Attributes : Columns or the characteristics that
define the entity
• Domain : A range of values (or Pool)
• Entity : Some object about which we wish to store
information
• Null : Represents an unknown value
• Atomic : Smallest unit of data; the individual data
value
• Candidate key : Some attribute (or a set of
attributes) that may uniquely identify each
row (tuple) in the relation (table). The term applies only
until a choice is made; the candidates then become the
primary key and the alternate keys.
• Primary key : The candidate key that is chosen
to uniquely identify each row.
• Alternate key : The remaining candidate keys that
were not chosen as primary key
• Foreign key : An attribute of one relation that is
the primary key of another relation.
Entity Relationship Model
• E-R model is a logical representation of data for a
business area
• Represented as entities, relationship between entities
and attributes of both relationships and entities
• E-R models are outputs of analysis phase i.e they are
conceptual data models expressed in the form of an E-
R diagram
Normalisation (1NF - 5NF)
• It is done to bring the design of a database to a
standardized form
• 1NF : All entities must have a unique identifier, or key,
that can be composed of one or more attributes. All
attributes must be atomic and non repeating.
• 2NF : Partial functional dependencies removed - all
attributes that are not a part of the key must depend on
the entire key for that entity.
• 3NF : Transitive dependencies removed - attributes that
are not a part of the key must not depend on any non-
key attribute.
• 4NF : Multi valued dependencies removed
• 5NF : Remaining anomalies removed
Types of Integrity
• Entity Integrity : Rule states that no column that is
part of a primary key can have a null value
• Referential Integrity : Rule states that every foreign
key in the first table must either match a primary key
value in the second table or must be wholly null
• Domain Integrity : Integrity of information allowed in
column
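The three integrity rules above can be tried out in a small sketch. It uses Python's built-in sqlite3 module as a stand-in for DB2, so the syntax and error behaviour are SQLite's, not DB2's; the table and column names and the sample rows are hypothetical. Domain integrity is approximated here with a CHECK constraint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.execute(
    "CREATE TABLE customer ("
    " cust_no TEXT NOT NULL PRIMARY KEY,"  # entity integrity: key column may not be null
    " cust_name TEXT)"
)
conn.execute(
    "CREATE TABLE orders ("
    " order_no TEXT NOT NULL PRIMARY KEY,"
    " cust_no TEXT REFERENCES customer(cust_no),"  # referential integrity
    " qty INTEGER CHECK (qty > 0))"                # domain integrity (assumed rule)
)
conn.execute("INSERT INTO customer VALUES ('C1', 'Asha')")


def rejected(sql):
    """Return True if the DBMS refuses the statement on integrity grounds."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# Null in a primary key column: refused.
entity_violation = rejected("INSERT INTO customer VALUES (NULL, 'NoKey')")
# Foreign key that matches no primary key value: refused.
referential_violation = rejected("INSERT INTO orders VALUES ('O1', 'C9', 1)")
# Foreign key that is wholly null: accepted.
null_foreign_key_ok = not rejected("INSERT INTO orders VALUES ('O2', NULL, 1)")
# Value outside the column's allowed domain: refused.
domain_violation = rejected("INSERT INTO orders VALUES ('O3', 'C1', 0)")

print(entity_violation, referential_violation, null_foreign_key_ok, domain_violation)
```

The key columns are declared TEXT on purpose: in SQLite an INTEGER PRIMARY KEY column silently generates a value when null is inserted, which would hide the entity integrity check.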
Example of a Relational Structure
CUSTOMER Places ORDERS
ORDERS Has PRODUCTS
The above relations can be interpreted as
follows :
• A Customer can place any number of orders (one-to-many)
• Each order relates to only one customer (many-to-one)
• One order can contain many products and one product
can be a part of many orders (many-to-many)
• In the above example Customer, Order & Product are
called ENTITIES.
• An Entity may transform into table(s).
• The unique identity for information stored in an
ENTITY is called a PRIMARY KEY. Eg. Customer-
No uniquely identifies each customer
A table essentially consists of
• Attributes, which define the characteristics of the
table
• Primary key, which uniquely identifies each row of
data stored in a table
• Secondary & Foreign Keys/indexes
Table Definition :
Table ‘Customer’ -
Attributes - Customer-No, Cust-name,
Cust-location, Cust-Id, Order-no...
Primary Key - Customer-No
Secondary Key - Cust-Id
Foreign-Key - Order-no
• The Relationships transform into Foreign Keys. For eg.
Customer is related to Orders thru ‘Order-No’ which is
the Foreign-key in Customer and Primary key in Order.
So basically the relationship ‘Places’ is thru the Order-
No.
• As per entity integrity, the Primary Key, Order-No,
of the table ‘Orders’ can never be Null, while it
can be so in the table ‘Customer’.
• Tables exist in Tablespaces. A tablespace can contain
one or more tables
• Apart from the Primary Key, a table can have many
secondary keys/indexes, which exist in Indexspaces.
• These tablespaces and indexspaces together exist in a
Database
• To do transformations as described above we need a
tool that will provide a way of creating the tables,
manipulate the data present in these, create
relationships,indexes,tablespace, indexspace and so on.
DB2 provides SQL which performs these functions.
The next part briefly deals with SQL and its functions.
A detailed explanation will be taken up later.
CODD’S RELATIONAL RULES
• 1. All information in a relational database is
represented explicitly at the logical level and in exactly
one way - by values in tables
• 2. Each and every datum (atomic value) in a relational
database is guaranteed to be logically accessible by
resorting to a combination of table name, primary key
value, and column name
• 3. Null values are supported for representing missing
information in a systematic way irrespective of the
datatype.
• 4. The database description is represented at the logical
level in the same way as ordinary data, so that
authorised users can apply the same relational language
to its interrogation as they apply to the regular data.
• 5. A relational system may support several languages
and various modes of terminal use. However there
must be one language whose statements can express all
of the following items: (1) data definitions (2) view
definitions (3) data manipulation (interactive and by
program) (4) integrity constraints (5) authorisation (6)
transaction boundaries (begin, commit, rollback)
• 6. All views that are theoretically updatable are also
updatable by the system
• 7. The capability of handling a base relation or a
derived relation (view) as a single operand applies not
only to the retrieval of data but also to the
insertion, update and deletion of data
• 8. Application programs and terminal activities remain
logically unimpaired whenever any changes are made
in either storage representations or access methods
• 9. Application programs and terminal activities remain
logically unimpaired when information-preserving
changes of any kind that theoretically permit
unimpairment are made to the base tables.
• 10. Integrity constraints specific to a particular
relational database must be definable in the relational
data sublanguage and storable in the catalog, not in the
application programs.
• 11. The data manipulation sublanguage of a relational
DBMS must enable application programs and inquiries
to remain logically the same whether and whenever
data are physically centralized or distributed.
• 12. If a relational system has a low-level(single-
record-at-a-time)language, that low level cannot be
used to subvert or bypass the integrity rules and
constraints expressed in the higher-level relational
language(multiple-records-at-a-time)
An introduction to SQL
SQL or Structured Query Language is
• A Powerful language that performs the functions of
data manipulation(DML), data definition(DDL) and
data control or data authorization(DAL/DCL).
• A non-procedural language - it acts on a set of
data at a time, and the user need not know
how to retrieve the data. A single SQL statement can
do the work of more than one procedure.
• Very flexible
SQL - Features
• What you want and not how to get it
• Unlike COBOL or 4GL’s, SQL is coded without
data-navigational instructions.The optimal access
paths are determined by the DBMS. This is
advantageous because the database knows better
how it has stored data than the user.
• Set level processing & multiple row processing
The following are the Operations that can be
performed by a SQL on the database tables :
• Select
• Project
• Union
• Intersection
• Difference
• Cartesian Product
• Join
• Divide
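Most of the operations listed above map directly onto SQL clauses. The sketch below shows one possible mapping, run through Python's sqlite3 module as a stand-in for DB2; the tables r and s and their rows are hypothetical. Divide has no direct SQL keyword and is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE r (id INTEGER, name TEXT);
    CREATE TABLE s (id INTEGER, name TEXT);
    INSERT INTO r VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO s VALUES (2, 'b'), (3, 'c'), (4, 'd');
""")


def q(sql):
    """Run a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()


selection = q("SELECT * FROM r WHERE id > 1 ORDER BY id")          # subset of rows
projection = q("SELECT name FROM r ORDER BY name")                 # subset of columns
union = q("SELECT * FROM r UNION SELECT * FROM s ORDER BY id")     # rows in r or s
intersection = q("SELECT * FROM r INTERSECT SELECT * FROM s ORDER BY id")  # rows in both
difference = q("SELECT * FROM r EXCEPT SELECT * FROM s ORDER BY id")       # rows only in r
product = q("SELECT * FROM r CROSS JOIN s")                        # every r row with every s row
join = q("SELECT r.id, r.name FROM r JOIN s ON r.id = s.id ORDER BY r.id")

print(len(product), union, difference)
```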
Session 2
Topics to be covered in this session
• SQL - this is to be dealt here because all other data
objects manipulation, creation and use, involve SQL’s.
• DB2 objects - Database, Tablespaces & Indexspaces -
creation & use, and other terminologies associated with
databases.
Topics dealt with, in SQL
• Definition and Types
• usage of SQL’s with examples, scalar and column
functions
• Subqueries and Multiple queries, DMLs
• Static & Dynamic SQLs
Structured Query Language - SQL
• Standard query language for RDBMS
• Non procedural lang : Programmer specifies what data
is needed but not how to retrieve it
• Used also to define data structures, control access to
the data and delete occurrences of data
• Uses set-level processing
SQL - Types - based on the functionality
• Data Definition Language (DDL) - CREATE, ALTER,
DROP
• Data Manipulation Language (DML) - DELETE,
INSERT, SELECT, UPDATE
• Data Control Language (DCL) - GRANT, REVOKE
SQL - Types
• Production SQL or Ad-Hoc SQL
• Embedded SQL or Stand-alone SQL
• Static or Dynamic SQL
SQL - Selection & Projection
• Selection retrieves a specified subset of rows from a table
• Projection operation retrieves a specified subset of
columns (but all rows) from the table
Eg : Select Cust_no, Cust_name from Customer;
The WHERE clause defines the Predicates for the SQL
operation.
A WHERE clause can have multiple conditions
combined using AND & OR.
Select distinct, select in range :
Select Cust_no, Cust_name, Cust_addr from Customer
where Cust_no BETWEEN 10000 AND 20000;
Select Cust_no, Cust_name, Cust_addr from Customer
where Cust_no NOT BETWEEN 1000 AND 2000;
Select Cust_no, Cust_name, Cust_addr from Customer
where Cust_no IN (1000, 2000);
Select Cust_no, Cust_name, Cust_addr from Customer
where Cust_id like/not like '425%'
Note :- ‘_’ matches a single char ; ‘%’ matches a string of chars
Escape ‘\’ - declares ‘\’ as the escape char; when it precedes ‘_’ or ‘%’
it overrides their special meaning
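The range, list and pattern predicates above can be compared side by side. This is a minimal sketch using Python's sqlite3 module in place of DB2; the sample rows and the helper cust_nos are hypothetical, and pattern matching details (for example case sensitivity) may differ between SQLite and DB2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cust_no INTEGER, cust_id TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [(500, "425A"), (1000, "425_B"), (1500, "4250"), (2000, "999"), (15000, "42%5")],
)


def cust_nos(where):
    """Return the cust_no values of rows satisfying the given WHERE predicate."""
    sql = f"SELECT cust_no FROM customer WHERE {where} ORDER BY cust_no"
    return [row[0] for row in conn.execute(sql)]


between = cust_nos("cust_no BETWEEN 1000 AND 2000")          # both end values included
not_between = cust_nos("cust_no NOT BETWEEN 1000 AND 2000")
in_list = cust_nos("cust_no IN (1000, 2000)")                # only the listed values
like_prefix = cust_nos("cust_id LIKE '425%'")                # '425' then any string
like_single = cust_nos("cust_id LIKE '425_'")                # '425' then exactly one char
# With ESCAPE, '\_' stands for a literal underscore instead of "any single char".
like_escaped = cust_nos(r"cust_id LIKE '425\_%' ESCAPE '\'")

print(between, in_list, like_prefix, like_single, like_escaped)
```

Note that BETWEEN includes both end values, while IN (1000, 2000) matches only those two values and not the range between them.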
NULL : To check for null the syntax is ‘IS NULL’ or ‘IS
NOT NULL’.
Select Cust_no, Cust_name, Order_no from Customer
where Order_no IS NULL;
However, if order-no is compared using any other predicate
(eg. ‘=’ or ‘<>’), rows with null values are always evaluated
as a ‘Not True’ condition in a Query.
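The behaviour of nulls in predicates can be checked with a short sketch. It uses Python's sqlite3 module as a stand-in for DB2, with hypothetical sample rows; Python's None plays the role of null.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cust_no INTEGER, order_no INTEGER)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [(1, 10), (2, None), (3, 20)],  # customer 2 has a null order_no
)


def cust_nos(where):
    """Return the cust_no values of rows satisfying the given WHERE predicate."""
    sql = f"SELECT cust_no FROM customer WHERE {where} ORDER BY cust_no"
    return [row[0] for row in conn.execute(sql)]


is_null = cust_nos("order_no IS NULL")          # the correct null test
is_not_null = cust_nos("order_no IS NOT NULL")
equals_null = cust_nos("order_no = NULL")       # never true, even for the null row
not_equal_10 = cust_nos("order_no <> 10")       # the null row is not returned either

print(is_null, is_not_null, equals_null, not_equal_10)
```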
Order by and Group by clauses :
• Order by sorts retrieved data in the specified order;
it is applied to the rows that satisfy the WHERE clause
• Group by operator causes the table represented by the
FROM clause to be rearranged into groups, such that
within one group all rows have the same value for the
Group by column (not physically in the database). The
Select clause is applied to the grouped data and not to
the original table.
Here ‘HAVING’ is used to eliminate groups, just like
WHERE is used for rows.
Example :-
Select Order_No, SUM(No_Prodts)
From ORDERS
Group by Order_No
Having AVG(No_Prodts) < 10
Order by Order_No ;
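The example can be run against a few sample rows to see which groups HAVING removes. This sketch uses Python's sqlite3 module instead of DB2; the table is named orders with underscore column names (assumptions made so the statement parses), and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_no INTEGER, no_prodts INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 5), (1, 7), (2, 20), (2, 30), (3, 9)],
)

# Group totals before HAVING is applied: one row per order_no.
all_groups = conn.execute(
    "SELECT order_no, SUM(no_prodts) FROM orders GROUP BY order_no ORDER BY order_no"
).fetchall()

# HAVING drops whole groups, here those whose average is 10 or more.
small_orders = conn.execute(
    "SELECT order_no, SUM(no_prodts) FROM orders "
    "GROUP BY order_no HAVING AVG(no_prodts) < 10 ORDER BY order_no"
).fetchall()

print(all_groups)
print(small_orders)
```

Order 2 has an average of 25 and is eliminated as a group, while its individual rows were never filtered by a WHERE clause.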
Functions
• Types are two :
• Column Function
• Scalar Function
Column Functions
• Compute from a group of rows aggregate value for a
specified column(s)
• AVG, COUNT, MAX, MIN, SUM
• Rules for column Functions - Refer Handout
Scalar Functions
• Are applied to a column or expression and operate on a
single value.
• CHAR, DATE, DAY(S), DECIMAL, DIGITS,
FLOAT, HEX, HOUR, INTEGER, LENGTH,
MICROSECOND, MINUTE, MONTH, SECOND,
SUBSTR, TIME, TIMESTAMP, VALUE,
VARGRAPHIC, YEAR
• Rules for Scalar Functions - Refer handout
Complex SQL’s
• A SQL statement is termed complex when the data that is
to be retrieved comes from more than one table
• SQL provides two ways of coding a complex SQL
• Subqueries and
• Joins
Subqueries
• Nested select statements
• specified using the IN(or NOT IN) predicate, equality
or non-equality predicate(‘=‘ or ‘<>‘) and comparative
operator(<, <=, >, >=)
• When using the equality, non-equality or comparative
operators, the inner query should return only a single
value
• Select Cust_No, Cust_Name
From CUSTOMER Where Order_No IN
( Select Order_No From ORDERS
Where No_Prdts < 5);
• Select Cust_No, Cust_addr
From CUSTOMER Where Order_No =
( Select Order_No From ORDERS
Where No_Prdts = 5);
• Nested select statements give the user the flexibility
for querying multiple tables
• A specialized form is Correlated Subquery - the nested
Select stmt refers back to the columns in previous
select stmts
• It works on Top-Bottom-Top fashion
• Noncorrelated Subquery works in Bottom-to-Top
fashion
Eg - Correlated Subquery..
• SELECT A.Cust_name, A.Cust_addr
FROM CUSTOMER A WHERE A.Order_No IN
(SELECT B.Order_No FROM CUSTOMER B
WHERE A.Cust_id = B.Cust_id)
ORDER BY A.Cust_id, A.Cust_no ;
Correlated Subquery using EXISTS clause :
SELECT Cust_No, Cust_name FROM CUSTOMER A
WHERE EXISTS (SELECT * FROM ORDERS B
WHERE B.Order_No = A.Order_No
AND B.Order_No = 5);
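The difference between a noncorrelated and a correlated subquery can be seen in a small sketch run with Python's sqlite3 module as a stand-in for DB2. The schema here is an assumption and differs from the slides: the orders table carries the customer number, and all rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (cust_no INTEGER, cust_name TEXT);
    CREATE TABLE orders (order_no INTEGER, cust_no INTEGER, no_prdts INTEGER);
    INSERT INTO customer VALUES (1, 'Asha'), (2, 'Ben'), (3, 'Chen');
    INSERT INTO orders VALUES (10, 1, 3), (11, 1, 8), (12, 3, 2);
""")

# Noncorrelated: the inner query runs once, independently of the outer
# query, and its result feeds the IN predicate (bottom-to-top).
noncorrelated = [row[0] for row in conn.execute(
    "SELECT cust_name FROM customer "
    "WHERE cust_no IN (SELECT cust_no FROM orders WHERE no_prdts < 5) "
    "ORDER BY cust_name"
)]

# Correlated: the inner query refers to c.cust_no from the outer query,
# so it is evaluated for each candidate outer row (top-bottom-top).
correlated = [row[0] for row in conn.execute(
    "SELECT cust_name FROM customer c "
    "WHERE EXISTS (SELECT 1 FROM orders o "
    "              WHERE o.cust_no = c.cust_no AND o.no_prdts > 5) "
    "ORDER BY cust_name"
)]

print(noncorrelated, correlated)
```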
Multiple levels of Subquery
SELECT Cust_no, Cust_name, Cust_addr
FROM CUSTOMER A
WHERE Order_no IN
(SELECT Order_no FROM ORDERS B
WHERE Prod_id IN
(SELECT Prod_id FROM PRODUCTS
WHERE Prod_name = 'NUTS'));
Joins
OUTER JOIN : For one or more tables being joined, both
matching and nonmatching rows are returned.
Duplicate columns may be eliminated.
Columns taken from the table with no matching row will have nulls in them.
INNER JOIN : Only matching rows are returned, so one or more of
the rows from either or both tables being joined may
not be included in the table that results from the join
operation
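The two join types can be contrasted on the same data. This is a sketch using Python's sqlite3 module rather than DB2, with hypothetical tables and rows; it shows a left outer join, which keeps nonmatching rows of the first table only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (cust_no INTEGER, cust_name TEXT);
    CREATE TABLE orders (order_no INTEGER, cust_no INTEGER);
    INSERT INTO customer VALUES (1, 'Asha'), (2, 'Ben');
    INSERT INTO orders VALUES (10, 1);
""")

# Inner join: only customers that have a matching order appear.
inner = conn.execute(
    "SELECT c.cust_name, o.order_no FROM customer c "
    "INNER JOIN orders o ON o.cust_no = c.cust_no ORDER BY c.cust_no"
).fetchall()

# Left outer join: every customer appears; order columns are null
# (None in Python) where no order matches.
left_outer = conn.execute(
    "SELECT c.cust_name, o.order_no FROM customer c "
    "LEFT OUTER JOIN orders o ON o.cust_no = c.cust_no ORDER BY c.cust_no"
).fetchall()

print(inner)
print(left_outer)
```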
DML’s
INSERT :
Eg: INSERT INTO Tablename(column1,
column2, column3 ,......)
VALUES( value1, value2, value3 ,........)
If any column is omitted in an INSERT stmt and that
column is defined NOT NULL, then the INSERT fails; if the
column is nullable it is set to null
• If the column is defined as NOT NULL WITH
DEFAULT, it is set to the default value
• Omitting the list of columns is equivalent to specifying
all columns
• SELECT - INSERT
INSERT INTO TEMP (A#, B)
SELECT A#, SUM(B) FROM TEMP1 GROUP
BY A# ;
UPDATE:
Eg: UPDATE tablename SET Columnname(s) =
scalar expression
WHERE [ condition ]
• Single or Multiple row updates
• Update with a Subquery
DELETE:
Eg: DELETE FROM Tablename WHERE
[condition ];
• Single or multiple row delete or deletion of all rows
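The three DML statements can be exercised together in a short sketch. It uses Python's sqlite3 module as a stand-in for DB2; the tables temp and temp1 follow the SELECT - INSERT example, but the column is named a rather than A#, and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE temp1 (a INTEGER, b INTEGER);
    CREATE TABLE temp (a INTEGER NOT NULL, b INTEGER);
    INSERT INTO temp1 VALUES (1, 10), (1, 20), (2, 5);
""")

# SELECT - INSERT: the result of the query becomes the inserted rows.
conn.execute("INSERT INTO temp (a, b) SELECT a, SUM(b) FROM temp1 GROUP BY a")
after_insert = conn.execute("SELECT a, b FROM temp ORDER BY a").fetchall()

# Omitting a NOT NULL column that has no default makes the INSERT fail.
try:
    conn.execute("INSERT INTO temp (b) VALUES (99)")
    omitted_not_null_failed = False
except sqlite3.IntegrityError:
    omitted_not_null_failed = True

# UPDATE and DELETE act on every row that satisfies the WHERE condition.
updated = conn.execute("UPDATE temp SET b = b + 1 WHERE a = 2").rowcount
deleted = conn.execute("DELETE FROM temp WHERE a = 1").rowcount
remaining = conn.execute("SELECT a, b FROM temp ORDER BY a").fetchall()

print(after_insert, omitted_not_null_failed, updated, deleted, remaining)
```

Leaving out the WHERE clause on UPDATE or DELETE would apply the statement to all rows of the table.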
Static SQL
• Hard-coded into an application program
• cannot be modified during the program’s execution
except for changes to the values assigned to the host
variables
• Cursors are used to access set-level data
• The general form is EXEC SQL
[SQL stmts]
END-EXEC.
Dynamic SQL
• Stmts can change throughout the program’s execution
• When the SQL is bound, the application plan or
package that is created does not contain the same info
as that for a static SQL program
• The access paths cannot be determined before
execution
SQL Guidelines :
- Refer handout
- Mullins, chapter 2
Topics dealt with, in DB2 objects
• Databases, stogroup, Tablespaces (types, creation and
modification)
• Indexspaces (creation and modification)
• some more terms associated with tablespaces
DB2 Objects
• Databases - User & system(catalog)
• A collection of logically related objects - like
Tablespaces, Indexspaces, Tables etc.
• not a physical kind of object - may occupy more
than one disk space
• A STOGROUP & BUFFERPOOL must be defined
for each database. Stogroup and user-defined VSAM
are the two storage allocations for a DB2 dataset
defn.
Stogroup
• It is a collection of direct access volumes, all of the
same device type
• The option is defined as a part of tablespace definition
• When a given space needs to be extended, storage is
acquired from the
appropriate stogroup
• In a given database, all the spaces need not have the
same stogroup
• These are, in a sense, the most physical of various
storage objects in DB2
• More than one volume can be defined in a stogroup.
DB2 keeps track of which volume was defined first &
uses that volume.
VCAT Option
• User Defined VSAM datasets have to be defined
explicitly by the AMS utility IDCAMS
• Two types of VSAM datasets are used -ESDS & LDS.
Linear Data set is more efficiently used by DB2
• Vsam datasets defined here are different from the plain
vsam datasets - can access them only thru VSAM
Media Manager
Tablespaces
• Logical address space on secondary storage to hold one
or more tables
• A ‘SPACE’ is basically an extendable collection of
pages with each page of size 4K or 32K bytes.
• It is the storage unit for recovery and reorganization
purposes
• Three Type of Tablespaces - Simple, Partitioned &
Segmented
Simple Tablespaces
• Can contain more than one stored table
• Depending on appln, storing more than one Table
might enable faster retrieval for joins using these tables
• Usually only one is preferred. This is because a single
page can contain rows from all tables defined in the
tablespace.
• LOAD with replace option deletes all data
Segmented Tablespaces
• Can contain more than one stored table, but in a
segmented space
• A ‘Segment’ consists of a logically contiguous set of n
pages
• No segment is allowed to contain records for more
than one table
• Sequential access to a particular table is more efficient
• Mass Delete is much more efficient than in any other
Tablespace
• Reorganizing the tablespace will restore every table to
its clustered order
• Lock Table on table locks only the table, not the entire
tablespace
• If a table is dropped, the space for that table can be
reclaimed with minimum reorg
Partitioned Tablespaces
• Only one table in a partitioned TS; 1 to 64
partitions/TS
• It is partitioned in accordance with value ranges for
single or a combination of columns. Hence these
column(s) cannot be updated
• Individual partitions can be independently recovered
and reorganized
• Different partitions can be stored on different storage
groups for efficient access.
Tablespace parameters to be specified for TS
creation
• Locksize - indicates the type of locking DB2 performs
for the given TS
• Page
• Table
• Tablespace
• ANY - DB2 decides the lock size, usually starting with page locks
• USING - method of storage allocations - Stogroup or
Vcat
• PCTFREE - % of space available for future inserts
• FREEPAGE - no of pages after which an empty page is
available
• Bufferpool - BP0, BP1, BP2 & BP32K
• CLOSE - Yes/No - whether the underlying VSAM
datasets are to be closed when the table is no longer in use. Max no
of datasets that can be open in DB2 at a time is 10,000
• ERASE - Yes/No - whether the physical DASD where the
TS resides is to be overwritten with binary zeros when the TS
is dropped
• NUMPARTS - For Partitioned Tablespaces
• SEGSIZE - For Segmented Tablespaces
Table Parameters for Creation
• Column Definition
• Format : CREATE TABLE TABLENAME (Column
Definitions)
• PRIMARY KEY(Columns) / FOREIGN KEY
*
• UNIQUE (Colname) (referential constraint)
• 1. LIKE Table name / View name
• 2. IN Database Tablespace Name
• Foreign Key references dbname.table on ‘relation
condition for delete’
• Table1 references table2(target) - Table2’s Primary key
is the foreign key defined in Table1
• The Condn’s are CASCADE, RESTRICT & SET
NULL (referential constraint for the foreign key
definition)
• With RESTRICT, deleting (or updating the key of) a row in the target is allowed
only if there are no matching rows in the referencing table
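The three delete rules can be compared in a small sketch. It uses Python's sqlite3 module as a stand-in for DB2; the ON DELETE syntax shown is SQLite's, and the tables parent, child_cascade, child_setnull and child_restrict are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
    CREATE TABLE parent (id INTEGER PRIMARY KEY);
    CREATE TABLE child_cascade (
        id INTEGER PRIMARY KEY,
        pid INTEGER REFERENCES parent(id) ON DELETE CASCADE);
    CREATE TABLE child_setnull (
        id INTEGER PRIMARY KEY,
        pid INTEGER REFERENCES parent(id) ON DELETE SET NULL);
    CREATE TABLE child_restrict (
        id INTEGER PRIMARY KEY,
        pid INTEGER REFERENCES parent(id) ON DELETE RESTRICT);
    INSERT INTO parent VALUES (1), (2);
    INSERT INTO child_cascade VALUES (100, 1);
    INSERT INTO child_setnull VALUES (200, 1);
    INSERT INTO child_restrict VALUES (300, 2);
""")

# Parent 1 is referenced through CASCADE and SET NULL, so the delete succeeds.
conn.execute("DELETE FROM parent WHERE id = 1")
cascade_rows = conn.execute("SELECT id, pid FROM child_cascade").fetchall()
setnull_rows = conn.execute("SELECT id, pid FROM child_setnull").fetchall()

# Parent 2 is referenced through RESTRICT, so the delete is refused.
try:
    conn.execute("DELETE FROM parent WHERE id = 2")
    restrict_blocked = False
except sqlite3.IntegrityError:
    restrict_blocked = True
parents_left = conn.execute("SELECT id FROM parent ORDER BY id").fetchall()

print(cascade_rows, setnull_rows, restrict_blocked, parents_left)
```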
Alter & Drop stmts
• ALTER : ALTER TABLE <Tablename>
ADD Column Data-type [ not null with default]
• Alter allows primary & Foreign key specifications to
be changed
• It does not support changes to width or data type of a
column or dropping a column
• DROP : DROP TABLE <Tablename>
• Similar stmts are there for INDEX.
Some general rules for RI & Table Parameters
• Avoid nulls in columns participating in
arithmetic logic or comparisons
• Primary key cols cannot be nulls
• Limit referential structures to no more than
three levels in a direction
• Use DB2’s inherent features rather than pgm coded
RI’s.
• Do not use RI’s on tables built from another RI system
• Consider using Fieldprocs or Editprocs or Validprocs
Index Parameters for Creation
• CREATE INDEX Indexname ON Tablename
(Colnames asc/desc)
• CLUSTER
• SUBPAGES
• USING STOGROUP/VCAT (the corresponding name)
• PRIQTY / SECQTY ; ERASE Yes/No
• BUFFERPOOL
• CLOSE - Yes/No
• FREEPAGE
• PCTFREE
Index Guidelines - What to do ?
1. Consider indexing on columns used in
UNION,DISTINCT,GROUP BY, ORDER BY &
WHERE clauses.
2. Limit the indexing of frequently updated columns
3. Create explicitly, a clustering index
4. Create a unique index on the primary key and
indexes on foreign keys
5. overloading of index when row length of a table to
be accessed is short
6. At least one index must be defined for a table with
more than 100 pages
7. Use Multicolumn index rather than a multi-index
(appln dependent); however the latter requires more
DASD .
8. Create indexes before loading the table.
9. Clustering reduces I/O; DB2 optimizer usually tries
to use an index on clustered column before using the
other indexes.
10. Optimize Subpages Parameter
11. Specify Indexspace freespace the same as
tablespace freespace
12. Use the DEFER option while creating the index.
RECOVER INDEX utility can then be used to populate
the index. Recover utility populates index entries
faster.
13. Use different STOGROUP’s for Tablespaces &
indexspaces
14. Create Critical indexes in a different bufferpool than
the tablespaces.
Index Guidelines - What Not to do ?
1. Avoid indexing on Variable columns
2. Limit the number of indexes on partitioned TS
3. Avoid indexes if the table is very small (< 10 pages) or
it has heavy inserts and deletes and is very small (< 20
pages) or it is accessed with a scan. Avoid defining
redundant indexes
Some more terms & concepts associated with
Tables
VIEWS:
• It is a logical derivation of a table from other
table/tables. A View does not exist in its own right.
• They provide a certain amount of logical independence
• They allow the same data to be seen by different users
in different ways
• In DB2 a view that is to accept an update must be
derived from a single base table
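That a view holds no data of its own can be seen by changing the base table and querying the view again. This sketch uses Python's sqlite3 module in place of DB2; the table, view name and rows are hypothetical. SQLite views are read-only, so updating through a view is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (cust_no INTEGER, cust_name TEXT, cust_location TEXT);
    INSERT INTO customer VALUES (1, 'Asha', 'Chennai'), (2, 'Ben', 'Pune');

    -- The view stores only its defining query, not the rows themselves.
    CREATE VIEW chennai_customers (cust_no, cust_name) AS
        SELECT cust_no, cust_name FROM customer WHERE cust_location = 'Chennai';
""")

view_sql = "SELECT cust_no, cust_name FROM chennai_customers ORDER BY cust_no"
before = conn.execute(view_sql).fetchall()

# A change to the base table is visible through the view immediately.
conn.execute("INSERT INTO customer VALUES (3, 'Dev', 'Chennai')")
after = conn.execute(view_sql).fetchall()

print(before)
print(after)
```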
Aliases and Synonyms :
Both mean ‘another name’ for the table.
However, the difference is that a synonym is private to the
user who created it. Aliases are used basically for
accessing remote tables (in distributed data
processing), whose names carry a location prefix.
Using an alias gives a shorter name.
Format:
CREATE VIEW <Viewname> (<columns>)
AS Subquery (Subquery - Select from
other Table(s))
CREATE ALIAS <Aliasname> FOR
<Tablename>
CREATE SYNONYM <Synonymname> FOR
<Tablename>
Session 3
Topic to be covered in this session
• The following topics will be covered in this session
• Application programming using DB2 - 1 day
• Data control Language, SPUFI, QMF, Appln pgming
Guidelines - 0.5 days
Application programming using DB2
• Application environments supporting DB2 :
• IMS(Batch/Online), CICS, TSO(Batch/Online)
• CAF - Call Attach Facility
• All DB2 application types can execute concurrently
• Host Language support - Cobol, PL/1, C, Fortran or
Assembly lang
Steps involved in creating a DB2 application
• Coding the application
• using Host variables
• using Embedded SQL
• using Cursors
• issue DCLGEN command
• Pre compile the program
• Compile & Link edit the program
• Bind
Host Variables
• These are variables (or rather areas of storage) defined in
the host language and referenced in SQL stmts, for example
in the predicates on a DB2 table.
• A means of moving data from and to DB2 tables
• DCLGEN produces host variables, the same as the
columns of the table
Host Variables
Can be used in
• ‘INTO’ CLAUSE OF SELECT & FETCH
STATEMENTS
• AS INPUT OF ‘SET’ CLAUSE OF UPDATE STMTS
• AS INPUT FOR THE ‘VALUES’ CLAUSE OF
INSERT STATEMENT
• IN WHERE CLAUSE OF SELECT, INSERT,
UPDATE & DELETE
• AS LITERALS IN SELECT LIST OF A SELECT
STATEMENT
Example
• SELECT Cust_No, Cust_name, Cust_addr
INTO :H-Cust-No, :H-Cust-name, :H-Cust-addr
FROM CUSTOMER
WHERE Cust_No = :H-Cust-No;
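The role of host variables can be illustrated outside COBOL with a loose analogy in Python's sqlite3 module, which is not DB2 embedded SQL. A named placeholder supplies the input value to the WHERE clause, and ordinary program variables receive the selected columns as the INTO clause would. The variable names and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cust_no INTEGER, cust_name TEXT, cust_addr TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1, "Asha", "Chennai"), (2, "Ben", "Pune")],
)

# Input "host variable": bound to the :h_cust_no placeholder at execution
# time, so the statement text itself never changes.
host_vars = {"h_cust_no": 2}
row = conn.execute(
    "SELECT cust_no, cust_name, cust_addr FROM customer WHERE cust_no = :h_cust_no",
    host_vars,
).fetchone()

# Output "host variables": program variables that receive the column values.
h_cust_no, h_cust_name, h_cust_addr = row

print(h_cust_no, h_cust_name, h_cust_addr)
```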
Embedded SQL statements
• It is like the file I/O
• Normally the embedded SQL statements contain the
host variables coded in the INTO or SELECT .... as
shown above
• they are preceded by EXEC SQL
• SELECT, INSERT, UPDATE & DELETE stmts can be
coded inline
Using Cursors
• can be likened to a pointer
• used when a large number of rows are to be selected
• can be used for modifying data using a ‘FOR UPDATE
OF’ clause
Cursors
• DECLARE : name assigned for a particular SQL stmt
• OPEN : readies the cursor for row retrieval; sometimes
builds the result table.However it does not assign
values to the host variables
• FETCH : returns data from the results table one row at
a time and assigns the value to specified host variables
• CLOSE : releases all resources used by the cursor
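The cursor life cycle above has a rough counterpart in Python's sqlite3 module, sketched here as an analogy and not as DB2 embedded SQL. The mapping of calls to DECLARE, OPEN, FETCH and CLOSE in the comments is an interpretation, and the table and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cust_no INTEGER, cust_name TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [(1, "Asha"), (2, "Ben"), (3, "Chen")],
)

cursor = conn.cursor()  # roughly DECLARE: a handle for one result set
# Roughly OPEN: the query is started, but no row has reached the program yet.
cursor.execute("SELECT cust_no, cust_name FROM customer ORDER BY cust_no")

fetched = []
while True:
    row = cursor.fetchone()  # roughly FETCH: one row at a time
    if row is None:          # no more rows (DB2 signals this with SQLCODE +100)
        break
    h_cust_no, h_cust_name = row  # values land in program variables
    fetched.append((h_cust_no, h_cust_name))

cursor.close()  # roughly CLOSE: release the cursor's resources

print(fetched)
```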
DCLGEN
• issued for a single table
• prepares the structure of the table in a COBOL
copybook
• The copybook contains a ‘SQL DECLARE TABLE’
stmt along with a working storage host variable defn
for the table
Precompile
• searches all the SQL stmts and DB2 related INCLUDE
members and comments out every SQL stmt in the
program
• the SQL stmts are replaced by a CALL to the DB2
runtime interface module, along with parameters.
• All SQL statements are extracted and put in a Database
Request Module (DBRM)
• places a timestamp in the modified source and the
DBRM so that the two are tied together. If the timestamps
do not match, a runtime error of ‘-818’ (timestamp mismatch)
is returned
• all DB2 related INCLUDE stmts must be placed
between EXEC SQL & END-EXEC keywords for the
precompiler to recognize them
Compile & Link
• modified precompiler COBOL output is compiled
• compiled source is link edited to an executable load
module
• appropriate DB2 host language interface module
should also be included in the link edit step (e.g.
DSNALI)
Bind
• A type of compiler for SQL statements
• It reads the SQL statements from the DBRM and
produces a mechanism to access data (in an efficient
manner) as directed by the SQL statements being
bound
• Checks syntax, checks for correctness of table &
column definitions against the catalog info & performs
authorization validation
Bind Types
• BIND PLAN : accepts as input one or more DBRMs
and outputs an application plan containing executable
logic representing optimized access paths to DB2 data.
• BIND PACKAGE : accepts as input a single DBRM
and produces a single package containing the
optimized access path. The PLAN in this case contains
a reference to the physical location of the package(s).
What is a Package ?
• It is a single bound DBRM with optimized access paths
• It also contains a location identifier, a collection
identifier and a package identifier
• A package can have multiple versions, each with its
own version identifier
Advantages of Package
• Reduced bind time
• can specify bind options at the programmer level
• versioning
• provides for remote data access(in version DB2 V2.3
or higher)
Data Control language
• GRANT & REVOKE
• GRANT : grants the table privileges, plan & package
privileges, collection privileges, database privileges,
use privileges and system privileges
• user with a SYSADM privilege will be responsible for
overall control of the system
• Format of GRANT :
GRANT SELECT, UPDATE(NAME,NO)
ON TABLE EMPL
TO A, B, C(or PUBLIC);
GRANT ALL ON EMPL TO PUBLIC;
GRANT EXECUTE ON PLAN PLANA TO USER;
• The table privileges allowed are SELECT, UPDATE,
DELETE, INSERT, (both base tables & views),
ALTER(Table) & (Create)INDEX(only to base tables)
• There are no specific DROP privileges; the table can be
dropped by its owner or a SYSADM
• A user having authority to grant a privilege to another
also has the authority to grant that privilege ‘WITH
GRANT OPTION’
• REVOKE : this statement revokes privileges given to a
user. The user who granted the privileges also has the
authority to REVOKE them.
• It is not possible to be column-specific when revoking
an UPDATE privilege
REVOKE SELECT ON TABLE EMPL FROM USERA;
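Because REVOKE cannot name individual columns, one way to narrow an UPDATE privilege is to revoke it completely and grant back only what is still needed. This is a sketch, assuming USERA currently holds UPDATE on the NAME and NO columns of EMPL as in the GRANT format shown earlier:

```sql
-- REVOKE UPDATE cannot list columns, so the whole
-- UPDATE privilege on EMPL is taken away from USERA...
REVOKE UPDATE ON TABLE EMPL FROM USERA;

-- ...and the one column that should stay updatable is granted again.
GRANT UPDATE(NAME) ON TABLE EMPL TO USERA;
```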
Refer to the handout for the following
• List of common SQL return codes and solutions
• JCL for compiling and binding a DB2 program
Application development guidelines
• Code modular DB2 programs and make them as small
as possible
• Use unqualified SQL statements; this makes it possible to
move a program from one environment to another (test to
production)
• Never use SELECT * in an embedded SQL program
• Use joins rather than subqueries
contd...
• Use the WHERE clause to filter out unwanted data
• Use cursors when fetching multiple rows, even though
they add overhead
• Use the FOR UPDATE OF clause for UPDATE or
DELETE with a cursor - this ensures data integrity
• Use INSERTs minimally; use the LOAD utility instead of
INSERT if the inserts are not application dependent
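Several of the guidelines above can be illustrated with a short sketch. Everything named here is an assumption made for the example: the columns EMPNO, NAME and DEPTNO on EMPL, the DEPT table with DEPTNO and LOCATION, the cursor name EMPCSR, and the COBOL-style host variables. In a real program each statement would be wrapped in EXEC SQL ... END-EXEC.

```sql
-- Name only the columns the program needs instead of SELECT *,
-- and let the WHERE clause filter the rows inside DB2.
SELECT EMPNO, NAME
  FROM EMPL
 WHERE DEPTNO = :HV-DEPTNO;

-- A join used in place of a subquery such as
--   WHERE DEPTNO IN (SELECT DEPTNO FROM DEPT WHERE LOCATION = :HV-LOC)
-- (equivalent only if DEPTNO is unique in DEPT).
SELECT E.EMPNO, E.NAME
  FROM EMPL E, DEPT D
 WHERE E.DEPTNO = D.DEPTNO
   AND D.LOCATION = :HV-LOC;

-- Cursor for a multi-row fetch whose rows will be updated in place.
DECLARE EMPCSR CURSOR FOR
  SELECT EMPNO, NAME
    FROM EMPL
   WHERE DEPTNO = :HV-DEPTNO
  FOR UPDATE OF NAME;

-- After OPEN and FETCH, change the row the cursor is positioned on.
UPDATE EMPL
   SET NAME = :HV-NAME
 WHERE CURRENT OF EMPCSR;
```

None of the table names carries a qualifier, in line with the guideline on unqualified SQL.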
QMF - Query Management Facility
• It is an MVS- and VM-based query tool
• Allows end users to enter SQL queries and produce a
variety of reports and graphs from the query results
• QMF queries can be formulated in several ways : by
direct SQL statements, by means of the relational prompted
query interface or by query-by-example (QBE). QBE is
similar to SQL in some ways but more user friendly
SPUFI - SQL Processor Using File Input
• Supports the online execution of SQL statements from
a TSO terminal
• Used by developers to check SQL statements or view
table details
• The SPUFI menu names the input file in which the SQL
statements are coded, offers options for default settings
and editing, and names the output file
Session 4
Topics to be covered in this Session
The duration of this session is 0.5 days
• DB2 Utilities
• DB2 Security
• DB2 catalog & Optimizer
• Performance tuning
DB2 System administration
• DB2 UTILITIES
• CHECK
• COPY, MERGECOPY
• RECOVER
• LOAD
• REORG, RUNSTATS
• EXPLAIN
Check
• checks the integrity of DB2 data structures
• checks the referential integrity between two tables and
also checks DB2 indexes for consistency
contd...
• Can delete invalid rows and copy them to an exception
table
• Use CHECK DATA after loading a table without
specifying the ‘ENFORCE CONSTRAINTS’ option, or
after the partial recovery of tablespaces in a referential
set
Copy
• Used to create an image copy of a complete
tablespace or of one partition of a tablespace, either as a
full image copy or as an incremental image copy
• Every successful execution of the COPY utility places at
least one row in the table SYSIBM.SYSCOPY that
indicates the status of the image copy
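The rows that COPY records can be inspected with an ordinary query. The following is a sketch only; the database name DBEMPL and tablespace name TSEMPL are hypothetical:

```sql
-- List the image copies recorded for one tablespace.
-- ICTYPE 'F' marks a full image copy, 'I' an incremental one.
SELECT DBNAME, TSNAME, DSNUM, ICTYPE, ICDATE, ICTIME, DSNAME
  FROM SYSIBM.SYSCOPY
 WHERE DBNAME = 'DBEMPL'
   AND TSNAME = 'TSEMPL'
   AND ICTYPE IN ('F', 'I')
 ORDER BY ICDATE DESC, ICTIME DESC;
```

DSNAME gives the data set that holds each copy.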
Mergecopy
• The MERGECOPY utility combines multiple
incremental image copy data sets into a new full or
incremental image copy data set
Recover
• Standard unit of recovery is a Tablespace
• Restores DB2 tablespaces and indexes to a specific
point in time
• Data can be recovered for single pages, pages that
contain I/O errors, a single partition or an entire
tablespace
• indexes are always recovered from the actual table
data, not from image copy and log data, as in the case
of tablespace recovery
Load
• Used to accomplish bulk inserts into DB2 tables
• Can replace the current data or append to it, i.e. LOAD
DATA REPLACE or LOAD DATA RESUME YES
• If a job terminates in any phase of LOAD REPLACE,
the utility has to be terminated and rerun
contd...
• If a job terminates in any phase other than
UTILINIT (which sets up and initializes the LOAD
utility), and the LOG NO option of LOAD was
specified, the tablespace must first be restored using a
full RECOVER. After the tablespace is restored, the error is
to be corrected, the utility terminated and the job rerun.
Reorg
• Reorganizes DB2 tables and indexes, thereby
improving their efficiency of access
• Reclusters data, resets free space to the amount
specified in the CREATE DDL statement, and deletes and
redefines the underlying VSAM data sets for
STOGROUP-defined objects
Runstats
• Collects statistical information for DB2 tables,
tablespaces, partitions, indexes and columns
• It can update the catalog tables with the statistics used
by the DB2 optimizer, with the statistics used for DBA
monitoring, or with all statistics that have been gathered
• It can be used for specific SQL queries without updating
the statistics currently in use
Reorg Job stream
• The total REORG schedule should include
• a RUNSTATS job or step : to record current tablespace
and index statistics in the DB2 catalog
• two COPY steps for each tablespace being reorganized :
so that the data is recoverable. The second COPY step is
required after the REORG if it was performed with the
LOG NO option
contd...
• After a REORG is run with the LOG NO option, DB2
turns on the copy pending status flag for the tablespaces
specified in the REORG.
• When the LOG NO parameter is specified, it is better to
take an image copy of the tablespace being reorganized
immediately after the REORG
• a REBIND job for all plans using tables in any of the
tablespaces being reorganized
Explain
• This feature details the access paths chosen by the
DB2 optimizer for SQL statements
• Used for performance monitoring
• When EXPLAIN is requested, the access paths that
DB2 chooses are put in coded format into the table
PLAN_TABLE, which is created in the default
database
contd...
• To EXPLAIN a single SQL statement, precede that
statement with the EXPLAIN command
EXPLAIN ALL SET QUERYNO = integer
FOR SQL statement
• The other method is to specify EXPLAIN YES on the
BIND command
• PLAN_TABLE is then queried to get the required
information.
contd...
• The information provided includes the type of access to
the particular tables used in the SQL, package or plan,
the order in which the tables are joined in a JOIN,
whether a SORT is required and so on
• Since the EXPLAIN results depend on the DB2
catalog, it is better to run RUNSTATS before running
EXPLAIN
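The single-statement form of EXPLAIN and the follow-up query can be sketched as below. The query number 100, the EMPL columns and the literal 'D01' are illustrative assumptions, and a PLAN_TABLE is assumed to exist already under the user's authorization ID:

```sql
-- Record the access path for one statement under query number 100.
EXPLAIN ALL SET QUERYNO = 100 FOR
  SELECT EMPNO, NAME
    FROM EMPL
   WHERE DEPTNO = 'D01';

-- Read back the rows EXPLAIN wrote for that query.
-- ACCESSTYPE 'I' means index access, 'R' a tablespace scan;
-- METHOD shows how each table is joined or whether a sort is done.
SELECT QUERYNO, QBLOCKNO, PLANNO, METHOD, TNAME,
       ACCESSTYPE, MATCHCOLS, ACCESSNAME, INDEXONLY
  FROM PLAN_TABLE
 WHERE QUERYNO = 100
 ORDER BY QBLOCKNO, PLANNO;
```

Ordering by QBLOCKNO and PLANNO lists the steps in the sequence DB2 performs them within each query block.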
DB2 Security
• LOCKING SERVICES :
These are provided by an MVS subsystem called the
IMS Resource Lock Manager (IRLM).
It is used to control concurrent access to DB2 data,
regardless of whether IMS is present in the system or not.
contd...
• The above is based on Transaction Processing - the
system component that provides this is ‘A
TRANSACTION MANAGER’
• COMMIT & ROLLBACK are key methods of
implementing this
Explicit locking facilities
• the SQL statement LOCK TABLE
• the ISOLATION parameter on the BIND PACKAGE
command - the two possible values are RR(‘Repeatable
Read’) & CS(‘Cursor Stability’)
contd...
• the tablespace LOCKSIZE parameter - physically DB2
locks data in terms of pages or tables or tablespaces.
This parameter is specified in ‘CREATE or ALTER
Tablespace’ option ‘LOCKSIZE’. The options are
‘Tablespace’, ‘Table’, ‘Page’ or ‘Any’
contd...
• the ACQUIRE/RELEASE parameters on the BIND
PLAN command specify when table locks (which are
implicitly acquired by DB2) are to be acquired and
released.
• Types : ACQUIRE(USE) & ACQUIRE(ALLOCATE)
• RELEASE(COMMIT) & RELEASE(DEALLOCATE)
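Two of the explicit locking facilities listed above are plain SQL statements and can be sketched briefly. The table EMPL comes from the earlier examples; the database and tablespace names DBEMPL and TSEMPL are hypothetical:

```sql
-- Explicitly lock a whole table.
-- SHARE: other programs can read the table but not change it.
LOCK TABLE EMPL IN SHARE MODE;

-- EXCLUSIVE: other programs are kept out of the table.
LOCK TABLE EMPL IN EXCLUSIVE MODE;

-- Change the unit of locking for an existing tablespace to the page.
ALTER TABLESPACE DBEMPL.TSEMPL LOCKSIZE PAGE;
```

The ISOLATION, ACQUIRE and RELEASE choices are not SQL statements; they are given as options on the BIND command.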
Session 5
Topics to be covered in this Session
The duration of this session is 0.5 days
• DB2 Catalog & Directory
Catalog Tables & the DB2 directory
• Repository for all DB2 objects - contains 43 tables
• Each table maintains data about an aspect of the DB2
environment
• The data covers information about tablespaces, tables,
indexes, privileges, utilities run on DB2 and so on
e.g. SYSIBM.SYSTABLES, SYSIBM.SYSINDEXES,
SYSIBM.SYSCOLUMNS ...
contd...
• When standard DB2 SQL is used, the DB2 catalog is
either accessed or updated. eg. When a ‘CREATE
TABLE’ stmt is issued the catalog tables
SYSIBM.SYSTABLES, SYSIBM.SYSCOLUMNS &
SYSIBM.SYSFIELDS are updated.
• However, the DB2 catalog is only semi-active. This
is because values such as the number of rows and the
physical order of the rows for a set of keys are
updated only when the RUNSTATS utility is run
• DB2 catalog is integrated - DB2 catalog and DB2
DBMS are inherently bound together
contd...
• It is nonsubvertible - DB2 catalog cannot be updated
behind DB2’s back. i.e. if a table of 10 columns is
created, it is not possible to go and change the number
of columns directly on the catalog to 15. It has to be
done using the standard SQL statements for dropping
and recreating the table
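Although the catalog cannot be updated directly, it can be read with ordinary SELECT statements. A sketch, assuming a hypothetical creator ID USERA and the EMPL table used in earlier examples:

```sql
-- Tables created by one ID, with the row and page counts
-- last stored by RUNSTATS (not maintained by everyday SQL).
SELECT NAME, DBNAME, TSNAME, CARD, NPAGES
  FROM SYSIBM.SYSTABLES
 WHERE CREATOR = 'USERA'
   AND TYPE = 'T';

-- Column definitions of one table, in column order.
SELECT NAME, COLNO, COLTYPE, LENGTH, NULLS
  FROM SYSIBM.SYSCOLUMNS
 WHERE TBCREATOR = 'USERA'
   AND TBNAME = 'EMPL'
 ORDER BY COLNO;
```

The first query shows the semi-active behaviour in practice: CARD and NPAGES reflect the table as of the last RUNSTATS, not its current contents.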
DB2 Optimizer
• Analyzes the SQL statements and determines the most
efficient way to access data - gives Physical data
independence
• It evaluates the following factors : CPU cost, I/O cost,
DB2 catalog statistics & the SQL statement
• it estimates CPU time, cost involved in applying
predicates, traversing pages and sorting
contd...
• It estimates the cost of physically retrieving and
writing the data
• The information pertaining to the state of the tables that
will be accessed by the SQL statements are provided
by the Catalog
Performance Tuning
• The performance of an application can be monitored
and enhanced at the application level as well as at the
database level
• On the application side, SQL statements can be tuned to
make them more efficient and to avoid redundancy
• It is better to structure SQL statements so that they
perform only the necessary operations
contd...
• On the database side, the major enhancements can be
done to the definitions of tables, indexes & the
distribution of tablespace and indexspace
• The application run statistics are obtained from
EXPLAIN or DB2PM monitor report
Thank U