IBM Software Group
DB2 9.5 for Linux, Unix and Windows
Fundamentals
Michele Benedetti
Certified IT Specialist
IBM Software Group Technical Sales
2008 IBM Corporation
Agenda
Introduction to DB2 9.5 for Linux, Unix and Windows
Basic Terminology
General Architecture: Instances & Databases
Accessing Remote Databases: clients, drivers and catalogs
Managing Storage
Logging, Backup & Recovery
Managing Performance
Managing Security, users & groups
Autonomics Wrap-up
High Availability
pureXML : native XML storage manager
Monitoring
Miscellaneous
DB2 9.5 Editions
Information as a Service
Processes, applications and people access the information they need with difficulty and in an uncoordinated way:
inefficiency, code duplication
information that is not unique
EIM
The correct, unique information, provided as a service on demand
Information Services
SOA compliant
Reusable
Semantically consistent
Examples:
correctness checks
format transformations
enrichment, aggregation
synchronization
federated queries
Info 2.0
Information Services
IBM Leads in Data Server Innovation
Innovation Milestones
1968: First Hierarchical Data Server
IBM designs IMS starting in 1966 for the Apollo space program
1980: First IBM Relational Data Server
IBM releases RDBMS for System/38, implementing the relational model first published by Dr. Edgar Codd
2006: First Multi-Structured Data Server
DB2 9 first to support both relational and XML structures managed by a single data server
Continuous IBM innovation
IBM Intellectual property Patented Technology Leadership
pureXML is one of the 70 patented technologies that are part of DB2 9 "Viper"
[Chart: patents per company, in thousands (axis ticks 10 to 25): IBM, Canon, NEC, Hitachi, Sony, Matsushita, Toshiba, Mitsubishi, Samsung, Motorola]
Basic Terminology
Basic Terminology: Relational Model
[Figure: a sample table with callouts for Table, Column, Row and Field]
Columns: ID, NAME, EXTENSION, MANAGER
NAME         | EXTENSION
John S       | 54213
Susan P      | 59867
Jennifer L   | 59415
Andrew J     | 55935
Michael B    | 52137
Jeremy W     | 50603
Leah E       | 58963
Basic Terminology: SQL
DB2 9.5 supports SQL for querying relational data
Also supports XQuery for querying XML data (see more later)
Supports mixed SQL/XML and XQuery/SQL queries (see more later)
4 types of SQL statements:
1. Data Definition Language (DDL) (create, drop, alter objects)
2. Data Manipulation Language (DML) (insert, update, delete, ...)
3. Data Control Language (DCL) (grant, revoke)
4. Transaction Control Language (TCL) (commit, rollback)
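The four statement categories can be tried out with a small sketch. It uses Python's built-in sqlite3 module instead of DB2 (an assumption made only so that the example runs anywhere), and the table name staff is hypothetical. SQLite has no GRANT/REVOKE, so DCL appears only as a comment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database (SQLite, not DB2)
cur = conn.cursor()

# DDL: define an object
cur.execute("CREATE TABLE staff (id INTEGER PRIMARY KEY, name TEXT, extension TEXT)")

# DML: change data
cur.execute("INSERT INTO staff (name, extension) VALUES (?, ?)", ("John S", "54213"))
cur.execute("INSERT INTO staff (name, extension) VALUES (?, ?)", ("Susan P", "59867"))
conn.commit()    # TCL: make the two inserts permanent

cur.execute("DELETE FROM staff WHERE name = ?", ("Susan P",))
conn.rollback()  # TCL: undo the uncommitted delete

# DCL is not available in SQLite; on a server database it would look like:
#   GRANT SELECT ON TABLE staff TO USER someuser

rows = cur.execute("SELECT name, extension FROM staff ORDER BY id").fetchall()
print(rows)
```

Because the delete is rolled back before being committed, both rows are still present at the end.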
General Architecture:
Instances & Databases
DB2 Instances
Instances
Stand-alone DB2
environment
All instances share the
same executable binary files
Each instance has its own
configuration
DB2 Administration Server
Special instance that
responds to requests from
the DB2 Administration
Tools and the Configuration
Assistant
DB2 Architecture
Instance = DB2 Engine (db2sysc)
Threaded model in which one main engine process exists (db2sysc) in
memory
Engine dispatchable units (EDU) exist as threads and perform work
Benefits of the threaded model:
Increased performance
Decreased memory usage
Connections = DB2 Agents (db2agent)
All database requests are performed by db2agent EDUs on behalf of an
application
The DB2 engine keeps a pool of agents available to service requests
Two major types of agents: Coordinator Agents, Subagents
DB2 9.5 Processing Model
DB2 Instances
Some sample commands for working with instances
Note that most commands here can be performed in Control Center
Command   | Description                                         | Example
db2start  | Start the default instance                          | db2start
db2stop   | Stop the current instance                           | db2stop -f
db2icrt   | Create an instance                                  | db2icrt -u db2fenc1 db2inst1
db2idrop  | Drop an instance                                    | db2idrop -f db2inst1
db2ilist  | List all instances                                  | db2ilist
db2imigr  | Migrate an instance after upgrading DB2             | db2imigr -u db2fenc1 db2inst1
db2iupdt  | Update an instance after installation of a fix pack | db2iupdt -u db2fenc1 db2inst1
DB2 Instances: Instance and Database Configuration
Viewing and Changing Instance Configuration:
Description                       | Example
View Database Manager Settings    | db2 get dbm cfg show detail
Change a Database Manager Setting | db2 update dbm cfg using health_mon off
Viewing and Changing Database Configuration:
Description                       | Example
View Database Settings            | db2 get db cfg for testdb
                                  | db2 connect to testdb
                                  | db2 get db cfg show detail
Change a Database Setting         | db2 update db cfg using logprimary 10
Memory Architecture (with background processes)
DB2 Linux/UNIX Example
[Diagram: memory areas and background processes of a DB2 application, a DB2 instance and a DB2 database. Labels in the figure:]
DB2 Instance: Monitor Heap (mon_heap_sz), Audit Buffer (audit_buf_sz), FCM Buffs (DPF) (fcm_num_buffers)
DB2 Database (database_memory): Buffer Pools (4k, 8k, 16k, 32k), Database Heap, Utility Heap (util_heap_sz), Package Cache (pckcachesz), catalog_cache_sz, logbuffsz, Lock List (locklist), Sorting (sheapthres_shr, sortheap), other memory areas
App. Global Memory: app_ctl_heap_sz (DPF)
App. Shared Memory: aslheapsz
App. Private Memory: applheapsz (non DPF), sortheap, stmtheap, stat_heap_sz, agent_stack_sz, query_heap_sz, java_heap_sz, rqrioblk, dir_cache, others
Processes: db2agent, db2ipccm (local), db2tcpcm (remote), db2sysc, db2wdog, db2resyn, db2gds, db2fmtlg, db2rebal, db2syslog, db2loggr, db2loggw, db2pclnr, db2dlock, db2pfch, others
DB2 Directory Structure
Windows Example
\DB2   Instance name
  \node0000   Partition number (note: in ESE, non-DPF databases are a single-node implementation)
    \sql00001   Database ID
      /SQLOGDIR   Default LOG directory
      /db2event   Monitor data for deadlocks
      TEMPSPACE1 table space (always created) containers
      USERSPACE1 table space (always created) containers
      SYSCATSPACE (catalog/object cross reference data) (always created) containers
      SQLDBCON, SQLDBCONF   db configuration parameter file
      db2rhist.asc   db recovery history file
      *.IN1 (contains index table data), etc.
\program files
  \sqllib
    \bin
      db2start.exe
      db2stop.exe
      db2cmd.exe, etc.
Instance Directory (Windows)
\Document and Settings\all users\application\data\IBM\DB2\DB2COPY1   Copy of the installation code (there may be more than one, at different levels)
  \DB2   Instance name
    \SQLDBDIR   Local System Database Directory / Catalog
    \SQLNODIR   Node Directory / Catalog
    DB2DIAG.LOG   Diagnostic file (text)
    *.dmp   Dump files (in case of fatal errors)
    *.trc   Trace files (in case of fatal errors)
    etc.   Other system and temp dirs
DB2 Directory Structure
Linux-Unix Example
/db2inst1   Instance name
  /node0000   Partition number (note: in ESE, non-DPF databases are a single-node implementation)
    /sql00001   Database ID
      /SQLOGDIR   Default LOG directory
      /db2event   Monitor data for deadlocks
      TEMPSPACE1 table space (always created) containers
      USERSPACE1 table space (always created) containers
      SYSCATSPACE (catalog/object cross reference data) (always created) containers
      SQLDBCON, SQLDBCONF   db configuration parameter file
      db2rhist.asc   db recovery history file
      *.IN1 (contains index table data), etc.
/opt
  /ibm
    /db2
      /V9.5
        /bin
        /instance
        /java
Instance Directory (Linux-Unix)
/home/db2inst1   Instance name = instance owner user
  /SQLLIB
    /SQLDBDIR   Local System Database Directory / Catalog
    /SQLNODIR   Node Directory / Catalog
    /cfg
    ..
    /db2dump   Instance directory for diagnostic (db2diag.log), trace and dump files
    db2profile   Profile file for user db2inst1 (contains environment settings)
    ..
Registry / Environment Variables
DB2 Registry: used to enable/disable specific functionality/behaviours
Works at Environment, Server, Instance level
Via the db2set command
Examples:
Variable        | Function
db2adminserver  | Specifies which instance runs the admin. server
db2comm         | Started communications manager
db2include      | Path to include in SQL searches
db2instance (e) | Current instance
db2instdef      | Default instance
db2owner        | Instance owning machine
db2slogon       | Enables secure logon
db2path (e)     | Directory where product is installed
db2system       | Server name id
Accessing Remote Databases:
clients, drivers and catalogs
DB2 Clients
DB2 9 API:
embedded SQL
CLI, ODBC, OLE-DB, ADO.Net
JDBC, SQLJ
[Diagram: a DB2 Client (application, embedded SQL support, drivers, DRDA Application Requester, communication layer) connects over LAN/WAN (TCP/IP, SNA), through DB2 Connect where required, to a DB2 Server (DRDA Application Server, database services)]
The DRDA protocol is used by every client to access:
DB2 UDB for LUW: does not require DB2 Connect
DB2 UDB for i-Series: requires DB2 Connect
DB2 UDB for OS/390 & z/OS: requires DB2 Connect
DB2 for VSE & VM: requires DB2 Connect
DRDA AR functionality is included in every DB2 Client:
Drivers only (different packages depending on the type)
IBM Client (contains ALL the drivers + CLI)
Run Time client (for administration)
DRDA AS functionality is included in every DB2 Server (including mainframes and i-Series)
DB2 Catalogs (directories)
On a Server:
Local System Database Directory
Lists local & remote databases accessible by this server
Local Volume Database Directory
On each volume on which a DB exists, describes/lists the available local DBs
On a Client (CLI) and on a Server accessing remote DBs:
Local System Database Directory
Lists only remote DBs accessible by the client
On both Server and Client (CLI):
Node Directory
Lists and describes remote servers/instances hosting a DB
Specifies the protocol (TCPIP), the address/hostname, and the port to be accessed (server and/or service) or the TCPIP service
E.g.: db2 catalog tcpip node db2node remote SERVER1 server 50001
      db2 catalog admin tcpip node db2das remote SERVER1
Cataloguing is also possible in Control Center
A DB to be accessed must be catalogued in the local system DB directory
This definition must eventually rely on a node/server definition
E.g.: DB2 CATALOG TCPIP NODE MYSERVER REMOTE 172.17.192.44 SERVER 5000
      DB2 CATALOG DB SAMPLE AS MYSAMPL AT NODE MYSERVER
Sample Catalog Commands
db2 catalog/uncatalog database/node
db2 list database directory
db2 list node directory
db2 list admin directory
Tools
DB2 Control Center
Centers
Commands and scripts execution
Graphical Monitor and Suggestions
Create Object Wizards
Backup Execution and Scheduling
...and many more, such as:
Visual Explain (also fed by Activity Monitor or Command Editor)
Recovery Wizard
Log Configuration Wizard
DBs and Instance cfg
Activity Monitor (performance inspection)
Task Scheduling
SQL Stmts fed by Activity Monitor
License management
Lock Inspection
Data Partitioning Wizard
Data Movement Wizards
Data Maintenance Wizard
Autobackup Wizard
IBM Data Studio
Eclipse based: other tools available as plug-ins
Perspectives
Object Editors
ER Diagramming
Snapshot data Skews
Object Properties
Stored Procedure Builder and Debugging
Data Web Services in 5 simple steps
1. Create and Test Queries or Stored Procedures
2. Create Service
3. Drag 'n' Drop Resources
4. Deploy Service
5. Test and Deliver
IBM Data Studio Administrative Console: Quick & Easy Problem Determination
Questions asked by the Administrator, and the console area that answers each:
Heatchart (Overall Health Status): Where are the most important hotspots that need my attention?
Dashboard (Adhoc Investigation): Something doesn't seem quite right. I wonder what's happening?
Alert List (Historical Investigation): What happened when I was out for lunch? ... Away for the weekend?
Recommendations (Root Cause Analysis): Guide me to the root cause and help me fix it properly; I need to know all the relevant info to make the best decision.
Managing Storage
Tablespaces
Tablespaces
Logical layer to store data, as the link between:
logical objects (tables)
physical devices (disks): containers
Three kinds:
SMS: System Managed Space
Containers = directories
Average performance
No maintenance (grows/shrinks automatically as needed)
DMS: Database Managed Space
Containers = files or raw devices
High to best performance
Requires manual intervention when out of disk space
Automatic Storage
Containers = files
High performance
Auto-resizing (grow/shrink) when needed, based on predefined policies
Manual intervention only to add NEW devices/disks when the automatic storage pool is exhausted
Tablespace Usage
Possibilities:
Regular
Large
Type of usage:
System
System Temporary
User
User Temporary
Pagesize
For all System/User table spaces (temporary or not), both Regular and Large:
Must specify a size for the data page at table space level
4, 8, 16 or 32 KB
A row must fit into a page. If not, must use BLOB
Associate ONE Buffer Pool (cache memory for I/O)
Same page size as the associated table space
Can make use of Self Tuning Memory Manager
DB2 Storage: Buffer Pools
Buffer Pool:
Area of main memory used to
cache table and index data
Can decrease time to access
data
Each database must have at
least one buffer pool
Tuning buffer pools
effortless with Self-Tuning
Memory Manager (STMM)
Tablespace Capacity
Old Tablespace Design
New Large Tablespace Design
DB2 Storage: Table Spaces
DB2 performs I/O in units of data pages. A data page has a fixed size of 4K, 8K, 16K, or 32K
Table spaces are created specifying one page size out of 4K, 8K, 16K, or 32K
Page size limits the data that can be stored in the table space. One row MUST fit in a page, except for binary data
Default page size is 4 KB
Inside the page, a row is identified by ROWID
Description                                                                         | 4K page size limit | 8K page size limit | 16K page size limit | 32K page size limit
Maximum number of columns in a table                                                | 500                | 1012               | 1012                | 1012
Maximum length of a row including all overhead (bytes)                              | 4005               | 8101               | 16 293              | 32 677
Maximum size of a table per DB partition in a regular table space (GB)              | 64                 | 128                | 256                 | 512
Maximum size of a large DMS table space, using the default 6 byte RID size (TB)     |                    |                    |                     | 16
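A minimal sketch of how the row-length limits drive the choice of page size. The limits are the byte values listed for each page size in the table; the function name smallest_page_size_kb is hypothetical and this is not a DB2 API.

```python
# Maximum row length in bytes (including overhead) per page size in KB,
# taken from the limits listed for 4K, 8K, 16K and 32K pages.
MAX_ROW_BYTES = {4: 4005, 8: 8101, 16: 16293, 32: 32677}


def smallest_page_size_kb(row_bytes):
    """Return the smallest page size (KB) whose row limit can hold row_bytes.

    Returns None when the row fits no page size, in which case part of the
    row would have to move to large object storage.
    """
    for page_kb in sorted(MAX_ROW_BYTES):
        if row_bytes <= MAX_ROW_BYTES[page_kb]:
            return page_kb
    return None


for row_bytes in (1200, 4006, 20000, 40000):
    print(row_bytes, smallest_page_size_kb(row_bytes))
```

A row of 4006 bytes is one byte too long for a 4K page, so it needs a table space with at least 8K pages.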
DB2 Storage: Table Spaces Example
Logging, Backup & Recovery
Logging
Circular vs Archive
Backup & Logging
Backup
Cold
Warm
Partial
Recovery
Restore vs Forward Recovery
DB2 Logging Overview
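[Diagram: db2agent changes pages in the Buffer Pool and writes log records to the Log Buffer. db2loggr writes the Log Buffer to the Online Active Log synchronously on COMMIT/ROLLBACK (MINCOMMIT). db2pclnr writes changed pages from the Buffer Pool to the Database Files asynchronously, when triggered (SOFTMAX, chngpgs_thresh).]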
ARIES Write-ahead Logging
ARIES (Algorithm for Recovery and Isolation Exploiting Semantics) was invented at the IBM Almaden Research Center by Dr. C. Mohan
Write-ahead logging:
Must force the log record for an update before the corresponding data page gets to disk
Guarantees Atomicity (all actions in a transaction happen, or none happen)
Must write all log records for a transaction before commit
Guarantees Durability (if a transaction commits, its effects persist)
How is it done:
Each log record has a unique Log Sequence Number (LSN)
Each data page contains a pageLSN
System keeps track of the flushed LSN
Before a page is written, ensure pageLSN <= flushed LSN
DB2 uses ARIES as the transaction recovery method, supporting fine-granularity locking and partial rollbacks
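The write-ahead rule (pageLSN <= flushed LSN before a page reaches disk) can be sketched in a few lines. This is a toy in-memory model under stated assumptions, not DB2's implementation: the class and function names are hypothetical and "flushing" only moves a counter.

```python
class WriteAheadLog:
    """Toy log: records get increasing LSNs; flushed_lsn marks what is on disk."""

    def __init__(self):
        self.records = []
        self.flushed_lsn = 0

    def append(self, description):
        lsn = len(self.records) + 1  # unique, increasing Log Sequence Number
        self.records.append((lsn, description))
        return lsn

    def flush(self, up_to_lsn):
        # Pretend to force all records up to up_to_lsn to disk.
        self.flushed_lsn = max(self.flushed_lsn, up_to_lsn)


class Page:
    def __init__(self):
        self.page_lsn = 0  # LSN of the last log record that changed this page
        self.data = None


def update(page, log, value):
    # Log the change first, then stamp the page with that record's LSN.
    page.page_lsn = log.append(f"set {value}")
    page.data = value


def write_page_to_disk(page, log):
    # Write-ahead rule: the log must reach disk before the data page does.
    if page.page_lsn > log.flushed_lsn:
        log.flush(page.page_lsn)
    assert page.page_lsn <= log.flushed_lsn
    return page.data


def commit(log):
    # Durability: all log records written so far are forced at commit.
    log.flush(len(log.records))


log, page = WriteAheadLog(), Page()
update(page, log, "a")
flushed_before_write = log.flushed_lsn
write_page_to_disk(page, log)
flushed_after_write = log.flushed_lsn
update(page, log, "b")
update(page, log, "c")
commit(log)
print(flushed_before_write, flushed_after_write, log.flushed_lsn)
```

Writing the page forces the log up to that page's pageLSN, and the commit forces every remaining record.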
DB2 Logging: Circular Logging
[Diagram: "n" PRIMARY log files reused in a ring, plus "n" SECONDARY log files]
DB2 Logging: Archival Logging
Configured via DB2 Logging Wizard (DB2 CC)
DB2 archive log file (db2uext2)
[Diagram: log files numbered 12 to 16, in one of three states]
ACTIVE: Contains information for non-committed transactions.
ONLINE ARCHIVE: Contains information for committed transactions. Stored in the ACTIVE log subdirectory.
OFFLINE ARCHIVE: Archive moved from the ACTIVE log subdirectory. (May also be on other media)
Backup
Used to protect data in case of a physical/logical failure affecting it
Basic protection, different from HA & DR concepts
Can be:
Full (entire DB is backed up)
Cold (full backup offline)
Warm (full backup online)
Partial (tablespace(s))
Incremental (only changes since the latest activity)
Delta
Incremental
Produces an asset of type file
Can be compressed
Can be put on disk or tape
Warm backup
Needs the LOG ARCHIVE method
Data is copied to disk WHILE applications are changing it
Changes are written to LOG files
Log files written WHILE backing up the DB are PART of the backup image file produced.
Partial Backup
Partial (incremental) backup types
Cumulative
backs up everything changed since the most recent full backup
Delta
backs up only data that changed since the last backup (either full or incremental)
[Diagram: a week from Sunday to Sunday (Mon, Tue, Wed, Thu, Fri, Sat in between) with a Full backup on each Sunday, comparing Cumulative Backups with Delta Backups on the days in between]
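The difference between the two types shows up at restore time. The sketch below, with a hypothetical helper images_to_restore and a made-up four-day history, lists which backup images are needed to get back to the latest backup: a cumulative strategy needs the last full image plus the newest cumulative one, a delta strategy needs the last full image plus every delta taken since.

```python
def images_to_restore(history, strategy):
    """Return the days whose backup images are needed to restore the latest state.

    history:  chronological list of (day, kind), kind being "full" or "incr".
    strategy: "cumulative" (each incremental holds all changes since the last
              full backup) or "delta" (each holds changes since the last backup).
    """
    last_full = max(i for i, (_, kind) in enumerate(history) if kind == "full")
    if strategy == "cumulative":
        needed = [history[last_full]]
        if last_full < len(history) - 1:
            needed.append(history[-1])  # only the newest cumulative image
    elif strategy == "delta":
        needed = history[last_full:]    # full image plus every delta after it
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [day for day, _ in needed]


week = [("Sun", "full"), ("Mon", "incr"), ("Tue", "incr"), ("Wed", "incr")]
print(images_to_restore(week, "cumulative"))
print(images_to_restore(week, "delta"))
```

Delta images are smaller to take but lengthen the restore chain; cumulative images grow through the week but keep the chain at two images.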
Automatic backup
Automatic backup used to be:
Scheduled backup (fixed time)
Automatic backup for DB2 is:
The possibility to tell when a backup is needed
How many times (a day, a week, a month, ...)
How many changes/log spaces
The possibility to tell when a backup can be taken
Provide a time-frame
Within which the backup must be executed
Outside of which the backup can be executed
Number of images to be kept online
The same for log sequences
Recovery
Main assets: backup Image(s) + Log Files (when required)
Main steps:
Restore (of the backup image)
Rebuild the file system(s) onto which tables were stored
Forward Recovery (Roll Forward)
Mimic the changes accumulated in the Log Files after (and maybe
during) the backup timeframe.
Forward Recovery
Roll Forward (or Forward Recovery), when possible:
Optional: when Log Archive is enabled and when restoring a Full Backup Offline
Mandatory: when Log Archive is enabled and when restoring a Full Backup Online.
Log files contain any data modified during the timeframe of the online backup execution.
Minimum: roll forward reading AT LEAST the log files written during the online backup timeframe.
Forward Recovery: how far?
Two options:
TO END OF LOGS
Mimic all the changes in ALL the log files DB2 can read
Beware of log sequences associated with the backup image
TO ISO TIME
Mimic all the changes contained in ALL the log files DB2 can read AND
up to a specified timestamp.
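Both options can be pictured as replaying log records on top of a restored image. The sketch below is a simplified model with hypothetical names (roll_forward, a key/value "database", made-up timestamps); real log records and the ROLLFORWARD command are far richer.

```python
from datetime import datetime


def roll_forward(restored_state, log_records, to_time=None):
    """Replay (timestamp, key, value) log records over a restored backup image.

    to_time=None mimics "TO END OF LOGS": every readable record is applied.
    Otherwise replay stops at the first record later than to_time.
    """
    state = dict(restored_state)  # leave the restored image untouched
    for ts, key, value in sorted(log_records, key=lambda rec: rec[0]):
        if to_time is not None and ts > to_time:
            break
        state[key] = value
    return state


backup_image = {"balance": 100}
logs = [
    (datetime.fromisoformat("2008-03-01T10:00:00"), "balance", 120),
    (datetime.fromisoformat("2008-03-01T11:00:00"), "balance", 90),
    (datetime.fromisoformat("2008-03-01T12:00:00"), "balance", 0),
]

end_of_logs = roll_forward(backup_image, logs)
point_in_time = roll_forward(
    backup_image, logs, to_time=datetime.fromisoformat("2008-03-01T11:30:00")
)
print(end_of_logs, point_in_time)
```

Rolling forward to a timestamp just before an unwanted change (here the 12:00 update) recovers the state that preceded it.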
Managing Performance
Performance: an esoteric approach?
Managing performance can be purely esoteric
In the past, a very high level of knowledge was necessary
Valuable physical database design
Valuable parameter settings
At system level
At RDBMS level
Valuable application logic design
Take it easy with DB2: Automatic Optimizer
DB2 Learning Optimizer (LEO)
Analyzes the query received
Automatically calculates the best access plan based on:
Cost-based, best-practice algorithms
Optimization classes (e.g. simple transactional accesses vs complex business intelligence ones)
Statistics automatically collected on tables and indexes
Use of materialized views, clustering of data (e.g. multidimensional) etc.
Automatically rewrites queries
No need for a highly skilled DBA (no hints required from the DBA to induce good behaviour).
Take it easy with DB2: Self Tuning Memory Manager
DATABASE
MEMORY
Sorts &
Hash Joins
Take it easy with DB2: Automatic Data Maintenance
For best performance, the Optimizer needs up-to-date statistics
Can be automatically calculated by DB2 when needed
Impact on the system is minimized (online execution, throttling and policies)
For best performance, data in tables and indexes needs to be reorganized
Can be done automatically when needed
Impact on the system is minimized (online execution, throttling and policies)
In specific cases, use of MDC avoids data reorganization
DB2 aids to manage performance
Automatic setting of specific parameters affecting performance
At instance and/or database level
Related to I/O and concurrency of workload (e.g. number of concurrent SQL statements executed, number of concurrent connections etc.)
Advisors to enhance physical db design in terms of:
Indexes needed for a specific workload
Materialized Query Tables: materialized views for complex aggregations
Multidimensional Clustering: re-define critical tables in a multidimensional (hypercube) fashion.
Analyze SQL cache
Analyze single query behaviour
DB2 advisors
DB2 Activity Monitor:
To produce reports to analyze SQL cache and feed Design Advisor
DB2 Design Advisor:
To analyze the workload (eg: coming from Activity Monitor) and
suggest:
Indexes
MQTs
MDCs
DB2 Visual Explain
To analyze single query (eg: coming from Activity Monitor) behaviour
Graphical
Specify costs for each part of the query
Determine when indexes are used and why they are not
Materialized Query Tables
Pre-aggregated data store
Cost of plan calculation shared across several queries
no MQT: Optimizer calculates the aggregation for each query
with MQT: Optimizer calculates the aggregation ONCE and reuses the plan each time
Automatic MQT use by Optimizer
In case of a partitioned DB, the MQT can be replicated on all nodes
create table m_vendite as
  (SELECT v_cid,
          sum((V_QTA * V_PREZZO)*(100 - V_SCONTO)/100) as venduto
   FROM DB2ADM.IVENDITE group by v_cid)
  data initially deferred refresh deferred
  enable query optimization maintained by system;
[Diagram: without an MQT, queries Q1, Q2, ... Qn each run the join and aggregation over the base tables; with an MQT, queries Q1, Q2, ... Qn, Qz read the MQT, which holds the result of the join and aggregation over the base tables]
Row Compression
Automatically compresses a table without DBA intervention (Lempel-Ziv)
Automatically creates a compression dictionary for the entire table
Data stays compressed in Buffer Pools and in Logs, not only on disk!
Less disk, less I/O, better performance
Query # | No compression (Min) | Row Compression (Min) | % diff
1       | 97.3                 | 73.8                  | 76%
2       | 9.8                  | 7.9                   | 81%
3       | 77.0                 | 55.6                  | 72%
4       | 93.3                 | 48.5                  | 52%
5       | 103.0                | 62.8                  | 61%
6       | 14.8                 | 9.3                   | 63%
7       | 66.7                 | 44.3                  | 66%
8       | 230.8                | 184.0                 | 80%
9       | 146.3                | 92.1                  | 63%
10      | 102.6                | 56.1                  | 55%
11      | 93.9                 | 34.0                  | 36%
12      | 105.4                | 52.4                  | 50%
13      | 78.2                 | 76.7                  | 98%
14      | 6.8                  | 5.0                   | 74%
15      | 9.8                  | 9.7                   | 99%
16      | 13.3                 | 13.4                  | 101%
17      | 37.2                 | 28.9                  | 78%
18      | 278.6                | 269.0                 | 97%
19      | 335.2                | 266.1                 | 79%
20      | 46.1                 | 31.6                  | 69%
21      | 300.2                | 166.8                 | 56%
22      | 6.0                  | 5.3                   | 88%
TS      | 2258.2               | 1597.9                | 71%
DB2 Flexible Data Partitioning for High Performance
DISTRIBUTE BY HASH
PARTITION BY RANGE
ORGANIZE BY DIMENSIONS
World's Richest Slice & Dice Capability
[Diagram: Distribute: table T1 is distributed across 3 database partitions (Node 1, Node 2, Node 3). Partition: on each node the data is partitioned by range (Jan, Feb) into table spaces TS1 and TS2. Organize: within each range the data is organized by dimensions (North, South, East, West).]
No Partitioning
CREATE TABLE my_hybrid
  (A INT, B INT, C INT, D INT )
  IN Tablespace A
[Diagram: a single block of data]
Distribute by Hash
Divide & Conquer Parallelism
CREATE TABLE my_hybrid
  (A INT, B INT, C INT, D INT )
  IN Tablespace A
  DISTRIBUTE BY HASH (A)
[Diagram: the data spread over database partitions P1, P2, P3 and P4]
Hash + Partition by Range:
Partition Elimination
CREATE TABLE my_hybrid
  (A INT, B INT, C INT, D INT )
  IN Tablespace A, Tablespace B, Tablespace C
  INDEX IN Tablespace B
  DISTRIBUTE BY HASH (A)
  PARTITION BY RANGE (B) (STARTING FROM (100) ENDING (300) EVERY (100))
[Diagram: database partitions P1, P2, P3 and P4, each split into range partitions labelled 2006 and 2005]
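A rough sketch of how hashing on A and ranging on B place a row, and of why a predicate on B lets whole range partitions be skipped. The modulo hash is a stand-in for DB2's internal hashing, the range boundaries are one plausible reading of STARTING FROM (100) ENDING (300) EVERY (100), and all function names are hypothetical.

```python
NUM_DB_PARTITIONS = 4          # P1..P4 in the slide, numbered 0..3 here
RANGE_START, RANGE_END, RANGE_STEP = 100, 300, 100


def database_partition(a):
    # Stand-in for the hash function used by DISTRIBUTE BY HASH (A).
    return a % NUM_DB_PARTITIONS


def range_partition(b):
    # Assumed ranges: 100-199 -> 0, 200-299 -> 1, 300 -> 2.
    if not RANGE_START <= b <= RANGE_END:
        raise ValueError("B is outside the defined ranges")
    return (b - RANGE_START) // RANGE_STEP


def place(row):
    """Return (database partition, range partition) for a row (A, B, C, D)."""
    a, b, _c, _d = row
    return database_partition(a), range_partition(b)


def ranges_to_scan(b_low, b_high):
    """Partition elimination: only ranges overlapping [b_low, b_high] are read."""
    low, high = max(b_low, RANGE_START), min(b_high, RANGE_END)
    if low > high:
        return []
    return list(range(range_partition(low), range_partition(high) + 1))


print(place((7, 150, 0, 0)))
print(ranges_to_scan(120, 180))
print(ranges_to_scan(150, 250))
print(ranges_to_scan(400, 500))
```

A query restricted to B between 120 and 180 touches a single range on each database partition, and the database partitions can work on their shares in parallel.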
Hash + Range + MDC
High density, High Value, Low IO Reads
CREATE TABLE my_hybrid
  (A INT, B INT, C INT, D INT )
  IN Tablespace A, Tablespace B, Tablespace C
  INDEX IN Tablespace B
  DISTRIBUTE BY HASH (A)
  PARTITION BY RANGE (B) (STARTING FROM (100) ENDING (300) EVERY (100))
  ORGANIZE BY DIMENSIONS (C,D)
[Diagram: database partitions P1, P2, P3 and P4, each split into range partitions labelled 2006 and 2005, with the data inside each range organized by dimensions]
Workload Management
Stable and predictable performance by predefining the amount of system and DB2 resources to be used by each workload
[Diagram: User Database Requests are assigned to Workload A, Workload B, Workload C, Workload D or the Default workload. Workloads map to Superclass 1 (with a Work Action Set and Subclass 1.1, Subclass 1.2, Subclass 1.3) or to the Default User Class. System Database Requests map to the Default System Class.]
Locking and Concurrency
DB2 implements a standard locking model
Row level, not page level
Implements the standard ISO isolation levels
Proprietary terminology (from the lowest to the highest level of control):
UR (Uncommitted Read)
CS (Cursor Stability)
RS (Read Stability)
RR (Repeatable Read)
See the reference documentation to understand how readers and writers must behave to guarantee the best compromise among:
Performance
Concurrency of access
Consistency of application logic
Managing Security, users & groups
Authentication
Delegated by default to the Operating System
Can be done by the Server (default) or the Client (security issues here)
Can be compliant with:
LDAP/Active Directory
Kerberos
Data Encryption
Optionally: via user exit (e.g. migrations)
DB2 Security
There are 3 types of authorization in DB2:
Authorities (Groups/Roles)
Privileges (Grant/revoke on objects like Tables, Views,
Packages,SP, Triggers..)
LBAC credentials
Explicit authorities/privileges:
Granted explicitly to the user, usually in the form of a GRANT
statement
Implicit authorities/privileges:
Granted to a group to which the user belongs, or to a role in
which the user, the group, or another role is a member
DB2 Security: Authorities
LBAC Query
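SELECT * FROM EMP WHERE SALARY >= 50000
[Diagram: the same query run with No LBAC and by users with SEC=254, SEC=100 and SEC=50, each seeing a different subset of the EMP table]
ID  | SALARY
255 | 60000
100 | 50000
50  | 70000
50  | 45000
60  | 30000
250 | 56000
102 | 82000
100 | 54000
75  | 33000
253 | 46000
90  | 83000
200 | 78000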
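A minimal sketch of the filtering effect, assuming that the ID column acts as the row's security label and that a user with SEC=n can read only rows whose label is at most n. That rule is an assumption made for illustration (the slide does not spell it out), and select_emp is a hypothetical helper, not DB2 syntax.

```python
# (label, salary) pairs copied from the EMP table in the slide.
EMP = [
    (255, 60000), (100, 50000), (50, 70000), (50, 45000),
    (60, 30000), (250, 56000), (102, 82000), (100, 54000),
    (75, 33000), (253, 46000), (90, 83000), (200, 78000),
]


def select_emp(min_salary, user_sec=None):
    """Rows with SALARY >= min_salary that the user is allowed to see.

    user_sec=None models "No LBAC". Otherwise a row is visible only when
    its label does not exceed the user's SEC value (assumed rule).
    """
    return [
        (label, salary)
        for label, salary in EMP
        if salary >= min_salary and (user_sec is None or label <= user_sec)
    ]


for sec in (None, 254, 100, 50):
    print(sec, select_emp(50000, sec))
```

The query text never changes; rows the user is not cleared for simply do not appear in the result.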
Autonomics wrap-up
LEO Optimizer and Automatic
Query Rewrite
Automatic Maintenance
Self Tuning Memory Manager
Automatic Configuration
Self Healing and Automonitoring
Automatic Storage
Automatic Backup
Automatic Log Sequence
Maintenance
High Availability
The available Options
Via scripts, by use of:
Online Split & Mirror functionality
Log file shipping and Backup + File Transfer + Restore + Roll Forward
Relying on HA software built into the OS
E.g.: MS Cluster Services
Via use of the HADR feature: two DB copies kept in sync automatically
In any of the above: 1 server is active, 1 is stand-by (hot stand-by)
Can also have Mutual Take Over (for two distinct workloads working on different DBs. One server at a time is active for each workload)
High Availability & Disaster Recovery (HADR)
Target Market
Online commercial applications
Challenge
24 x 7 Availability
Failover in seconds
Disaster recovery
Solution: HADR
Single solution handles local and remote site recovery
Ultra-fast failover
Automatic Reroute
(More later)
Value
Business continuation
Tight integration; very simple to use
[Diagram, Onsite Warm Standby: a Client and an Application Server connect to the Primary Database Server, which is linked by HADR to a Standby Database Server]
[Diagram, Offsite Disaster Recovery: HTTP & App. Servers with the Primary Database Server at one site and Standby HTTP & App. Servers with the Standby Database Server at another site (Portland, Toronto), linked by HADR]
Synchronization modes
This diagram shows when acknowledgements are received in Synchronous, Near Synchronous and Asynchronous modes
[Diagram: on the Primary, the log writer hands new logs to HADR, which send()s them over a TCP/IP socket. On the Standby, HADR receive()s the new logs. "Commit Succeeded" is reported at a different point of this path in Asynchronous, Near Synchronous and Synchronous mode.]
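The three modes differ only in how far a log record must have travelled before the commit is acknowledged. The sketch below encodes that idea. The event names are invented for illustration, and the point assigned to each mode reflects the commonly documented HADR behaviour rather than anything stated on the slide, so treat it as an assumption.

```python
# Steps a log record goes through for one commit, in order (names are illustrative).
EVENTS = [
    "written_to_primary_log",
    "sent_to_primary_tcp_layer",
    "received_in_standby_memory",
    "written_to_standby_log",
]

# Last step that must be complete before "Commit Succeeded" (assumed mapping).
ACK_AFTER = {
    "ASYNC": "sent_to_primary_tcp_layer",
    "NEARSYNC": "received_in_standby_memory",
    "SYNC": "written_to_standby_log",
}


def commit_succeeds(mode, completed_events):
    """True when every step required by the mode has completed."""
    required = EVENTS[: EVENTS.index(ACK_AFTER[mode]) + 1]
    return all(event in completed_events for event in required)


done = {"written_to_primary_log", "sent_to_primary_tcp_layer", "received_in_standby_memory"}
for mode in ("ASYNC", "NEARSYNC", "SYNC"):
    print(mode, commit_succeeds(mode, done))
```

With the log received by the standby but not yet written to its disk, the commit is already acknowledged in the two looser modes and still pending in synchronous mode. This is the trade-off between commit latency and protection against data loss.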
pureXML : native XML storage manager
pureXML : native XML storage manager
It's IBM patented proprietary technology
It's the only proprietary implementation for XML
pureXML is an IBM trademark
pureXML: The XML Data Type
SQL/XML introduces a new data
type: XML
Two XML type flavors:
Transient XML type
Can only be used internally
Needs to be converted to non-XML type before returning to client
XML type not exposed to user
Has already been supported by DB2 V8.x
XML is converted to string value using XML2CLOB or XMLSERIALIZE
Persistent XML type
Data type XML can be used for
Tables, Views, Functions, Procedures,
Values of type XML can be stored in columns
Values of type XML are visible and, e.g., can be returned in queries
pureXML: Native XML Storage
DB2 stores XML in parsed hierarchical format (~DOM)
create table dept (deptID char(8), ..., deptdoc xml);
Relational columns are stored in relational format (tables)
XML columns are stored natively
[Diagram: in DB2 Storage, a row with deptID = PR27 and deptdoc = <dept><emp></emp></dept>]
No XML parsing for query evaluation!
pureXML: Native XML Storage VS XML Digital Signatures
pureXML preserves digital signatures
Because XML is stored natively
Digitally signed XML documents can be inserted in DB2, retrieved, and the signatures
verified
Details:
Creation of a signature requires canonicalization of the doc
W3C spec: http://www.w3.org/Signature/
Signatures can be created on all or part of a document
There are many canonicalization algorithms
Most canonicalization algorithms
need the whitespace preserved
Start with the "PRESERVE WHITESPACE" option
Storing an XML Document
<dept>
  <employee id="901">
    <name>John Doe</name>
    <phone>408 555 1212</phone>
    <office>344</office>
  </employee>
  <employee id="902">
    <name>Peter Pan</name>
    <phone>408 555 9918</phone>
    <office>216</office>
  </employee>
</dept>
Parse: XML text represented as a document tree
dept
  employee
    id=901
    name: John Doe
    phone: 408-555-1212
    office: 344
  employee
    id=902
    name: Peter Pan
    phone: 408-555-9918
    office: 216
Efficient Document Tree Storage
"Compression": tag names encoded as unique integers
Reduced storage
Fast comparisons & navigation
String table (SYSIBM.SYSXMLSTRINGS):
0 | dept
4 | employee
1 | name
5 | id
2 | phone
3 | office
[Diagram: the document tree (dept, two employee nodes with id, name, phone and office) next to the same tree with every tag name replaced by its integer, e.g. 5=901, John Doe, 408-555-1212, 344 and 5=902, Peter Pan, 408-555-9918, 216]
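The string-table idea can be imitated with the standard library's XML parser. This is only a sketch: IDs are handed out in order of first appearance, so they differ from the numbers shown for SYSIBM.SYSXMLSTRINGS, and the nested-dictionary node layout is an invention for this example, not DB2's storage format.

```python
import xml.etree.ElementTree as ET

DOC = """<dept>
  <employee id="901"><name>John Doe</name><phone>408 555 1212</phone><office>344</office></employee>
  <employee id="902"><name>Peter Pan</name><phone>408 555 9918</phone><office>216</office></employee>
</dept>"""


def encode(element, string_table):
    """Return the element as a nested dict with names replaced by integers."""

    def string_id(name):
        # Reuse the existing ID or hand out the next free integer.
        return string_table.setdefault(name, len(string_table))

    return {
        "tag": string_id(element.tag),
        "attrs": {string_id(k): v for k, v in element.attrib.items()},
        "text": (element.text or "").strip(),
        "children": [encode(child, string_table) for child in element],
    }


string_table = {}
tree = encode(ET.fromstring(DOC), string_table)
print(string_table)
print(tree["children"][0]["attrs"])
```

Each distinct name is stored once, however many times it occurs in the documents, and comparing two tags becomes an integer comparison.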
Efficient Document Tree Storage and Automatic Indexing
Each node has a path:
/
/dept
/dept/employee
/dept/employee/@id
/dept/employee/name
/dept/employee/phone
/dept/employee/office
(...)
[Diagram: the document tree (dept, two employee nodes with id, name, phone and office) and its encoded form (5=901, John Doe, 408-555-1212, 344 and 5=902, Peter Pan, 408-555-9918, 216)]
pureXML: XML Node Storage Layout & Region Index
Node hierarchy of an XML doc stored on DB2 pages
Documents that don't fit on 1 page: split into pages/regions
No architectural limit for the size of XML documents
NodeIDs used to identify individual nodes (1.2.4.3)
Nodes are physically connected
Regions are logically connected
Key in the Regions index is the NodeID (NID)
E.g. NID = 1.4.1.3
Example:
Document split into 3 regions, stored on 3 pages
[Diagram: a Path index (/dept, /dept/employee, /dept/employee/@id) and a Region index whose NodeID keys (1, 1.1, 1.3, 1.4, 1.4.1, 1.4.1.3, 1.3.1.1.5.3) point to the three pages]
XML Storage
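[Diagram: the DAT Object holds the table rows (column ID with values PR27, PR28, ACC, and column DEPTDOC). The INX Object holds the Region index and the Path index (/dept, /dept/employee, /dept/employee/@id). The XDA Object holds the XML data.]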
Index on an XML Column
create table t1 (docID int, XMLDoc xml);
create index AgeIndex on t1(XMLDoc)
generate key using xmlpattern '/Person/Age' as sql varchar(10);
create index AgeUnitIndex on t1(XMLDoc)
generate key using xmlpattern '/Person/Age/@unit' as sql varchar(16);
[Diagram: Relational Table T1 (DocID = 1, XMLDoc) pointing to the XML document tree, with two indexes on the XML column]
AgeIndex, index on an XML column (/Person/Age): 17, 6322
AgeUnitIndex, index on an XML column (/Person/Age/@unit): "days", "years"
XML Document:
<?xml version="1.0"?>
<Person gender="Male">
  <Name>
    <Last>Cool</Last>
    <First>Joe</First>
  </Name>
  <Age unit="years">17</Age>
  <Age unit="days">6322</Age>
</Person>
XML Document Tree:
Person
  @gender "Male"
  Name
    Last: text() "Cool"
    First: text() "Joe"
  Age
    @unit "years"
    text() "17"
  Age
    @unit "days"
    text() "6322"
XQUERY for $i in db2-fn:xmlcolumn('T1.XMLDOC')/Person[Age='17'] return $i/Name
Created by users to improve query performance
XML pattern identifies paths and values to index
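What such an index contains can be sketched with the standard library's XML parser: for every document, the values found at the indexed path are mapped back to the document ID. The helper build_index is hypothetical, its path argument is relative to the root element (so "Age" stands for /Person/Age), and a dictionary of sets is of course not how DB2 stores an index.

```python
import xml.etree.ElementTree as ET

# docID -> XML text, using the document from the slide.
DOCS = {
    1: """<Person gender="Male">
            <Name><Last>Cool</Last><First>Joe</First></Name>
            <Age unit="years">17</Age>
            <Age unit="days">6322</Age>
          </Person>""",
}


def build_index(docs, element_path, attribute=None):
    """Map each value found at the path to the set of docIDs that contain it.

    With attribute=None the element text is indexed (like '/Person/Age').
    Otherwise the named attribute is indexed (like '/Person/Age/@unit').
    """
    index = {}
    for doc_id, text in docs.items():
        root = ET.fromstring(text)
        for node in root.findall(element_path):
            value = node.get(attribute) if attribute else node.text
            if value is not None:
                index.setdefault(value, set()).add(doc_id)
    return index


age_index = build_index(DOCS, "Age")
age_unit_index = build_index(DOCS, "Age", attribute="unit")
print(age_index)
print(age_unit_index)
print(age_index.get("17", set()))  # documents matching /Person[Age='17']
```

A predicate such as Age='17' can then be answered from the index alone, without walking every stored document.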
The Big Indexing Picture
SQL Table with XML Column: Relational Column 1, Relational Column 2, XML Column
Relational Index: index on a relational column
Index on XML Column: created by users to improve performance during queries on XML documents
XML Regions Index: logical mapping of regions in an XML document, used to retrieve the document data
XML Column Paths Index: maps paths to path ids for each XML column; a subset of paths is stored in the global Catalog Path Table
XML Storage: .XDA file
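The idea of mapping paths to path ids can be illustrated with a short Python sketch. It walks a document and assigns each distinct path a small integer. The sequential numbering and the helper names are assumptions made for illustration; the real catalog layout is not described on the slide.

```python
import xml.etree.ElementTree as ET

DOC = (
    '<dept bldg="101">'
    '<employee id="901"><name>John Doe</name><phone>408 555 1212</phone>'
    "<office>344</office></employee>"
    '<employee id="902"><name>Peter Pan</name><phone>408 555 9918</phone>'
    "<office>216</office></employee>"
    "</dept>"
)


def collect_paths(element, prefix=""):
    """Yield the path of every element and attribute in document order."""
    path = f"{prefix}/{element.tag}"
    yield path
    for name in element.attrib:
        yield f"{path}/@{name}"
    for child in element:
        yield from collect_paths(child, path)


def build_path_table(xml_text):
    """Map each distinct path to a hypothetical sequential path id."""
    table = {}
    for path in collect_paths(ET.fromstring(xml_text)):
        table.setdefault(path, len(table) + 1)
    return table


if __name__ == "__main__":
    for path, path_id in build_path_table(DOC).items():
        print(path_id, path)
```

Both employees share the same paths, so the table stays at seven entries no matter how many employees the document holds.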
Updating elements/attributes in DB2 9
Changing one element means moving the whole document between DB2 and the client:
DB2: document in parsed format -> XML serialization -> transmission to the client
DB2 Client Application: XML parsing -> change -> XML serialization -> transmission back to DB2
DB2: XML parsing -> SQL Update of the whole document
<dept bldg="101">
  <employee id="901">
    <name>John Doe</name>
    <phone>408 555 1212</phone>
    <office>344</office>
  </employee>
</dept>
Updating elements/attributes in DB2 9.5
DB2: document in parsed format -> SQL Update w/ XQuery Transform changes <office> from 344 to 209 inside the server
The client-side round trip (XML serialization, transmission, XML parsing, change, XML serialization, transmission, XML parsing) is no longer needed
Currently the first and only implementation of the new W3C update standard (released in April 2007)
Before the update:
<dept bldg="101">
  <employee id="901">
    <name>John Doe</name>
    <phone>408 555 1212</phone>
    <office>344</office>
  </employee>
</dept>
After the update:
<dept bldg="101">
  <employee id="901">
    <name>John Doe</name>
    <phone>408 555 1212</phone>
    <office>209</office>
  </employee>
</dept>
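The copy / modify / return shape of a transform-style update can be mimicked in a few lines of Python. This is only an analogy under stated assumptions: in DB2 9.5 the work happens inside the server as part of an SQL UPDATE, whereas here a hypothetical helper, transform_replace_value, works on an in-memory tree.

```python
import copy
import xml.etree.ElementTree as ET

STORED = ET.fromstring(
    '<dept bldg="101"><employee id="901"><name>John Doe</name>'
    "<phone>408 555 1212</phone><office>344</office></employee></dept>"
)


def transform_replace_value(doc, path, new_value):
    """Copy the document, replace the value of one node, return the copy."""
    new = copy.deepcopy(doc)      # copy: the input tree is left untouched
    target = new.find(path)       # locate the node to change
    if target is None:
        raise LookupError(path)
    target.text = new_value       # modify: replace the node's value
    return new                    # return: the changed copy


if __name__ == "__main__":
    updated = transform_replace_value(STORED, "employee[@id='901']/office", "209")
    print(ET.tostring(updated, encoding="unicode"))
```

Only the office value differs between the input and the result; every other node is carried over unchanged.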
Old XML-Enabled Databases: Two Main Options
(e.g. Oracle, SQL Server)
CLOB/Varchar: XML documents stored whole in a VARCHAR or CLOB column; selected elements/attributes extracted into side tables (regular tables for faster lookup)
Shredding ("Decomposition"): a shredder with a fixed mapping decomposes each XML document into regular relational tables
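A minimal Python sketch of shredding with a fixed mapping follows. It flattens the dept document used elsewhere in this deck into one row per employee. The column names and the shred_dept helper are hypothetical; any element the mapping does not name is lost, which is the main drawback of this approach.

```python
import xml.etree.ElementTree as ET

DOC = (
    '<dept bldg="101">'
    '<employee id="901"><name>John Doe</name><phone>408 555 1212</phone>'
    "<office>344</office></employee>"
    "</dept>"
)


def shred_dept(xml_text):
    """Decompose a dept document into flat rows using a fixed mapping."""
    dept = ET.fromstring(xml_text)
    rows = []
    for employee in dept.findall("employee"):
        rows.append(
            {
                "bldg": dept.get("bldg"),
                "emp_id": employee.get("id"),
                "name": employee.findtext("name"),
                "phone": employee.findtext("phone"),
                "office": employee.findtext("office"),
            }
        )
    return rows


if __name__ == "__main__":
    for row in shred_dept(DOC):
        print(row)
```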
DB2 9.5 XML features Recap
Sub-document update
Controlling index behavior
Replication
Validation triggers
Validation check constraints
Non-Unicode databases
XML Load
User-friendly publishing functions
Simpler SQL/XML functions
XSLT function
Decomp enhancements
Federation
XQuery enhancements
Omnifind
Index Design Advisor**
Compatible schema evolution
sqlquery() parameters
Base-table inlining/compression
** Available in future fixpacks
Monitoring
Tools
DB2 collects monitoring data based on specific criteria
Data exposed via non-graphical built-in tools such as:
Snapshot Monitor
Event Monitor
Data exposed via graphical interface (DB2 Control Center):
Health Center
Data exposed in reports with aggregation and trend analysis:
DB2 Performance Expert tool (Java application)
In future via the Optim Data Studio Administration Console (web)
DB2 Editions
DB2 9.5 - Data Server Editions
Servers:
DB2 for z/OS, DB2 for i5/OS
DB2 Enterprise: Linux/Windows/UNIX; no limits
DB2 Workgroup: Linux, Windows, UNIX; 4 CPUs, 16 GB RAM
DB2 Express: Linux/Windows; 2 CPUs, 4 GB RAM
DB2 Express-C: Linux/Windows; free download, 2 GB RAM
DB2 Everyplace: Linux/Windows & PDA
Server platforms:
64-bit: AIX; Windows Intel/AMD; Linux Intel/AMD, PowerPC, zSeries; Solaris, Sun IPF; HP PA-RISC, HP IPF
32-bit: Windows Intel/AMD; Linux Intel/AMD
Clients: 32-bit & 64-bit, all platforms
InfoSphere Warehouse 9.5 Editions
Editions: DW Starter, DW Intermediate, DW Advanced, DW Enterprise Base, DW Enterprise
Licence limitations (maximums):
DW Starter: 200 PVUs / 4GB database memory
DW Intermediate: 400 PVUs / 32GB total memory
DW Advanced: 1000 PVUs / 2TB user data
DW Enterprise Base: None
DW Enterprise: None
Platform Support:
DW Starter: Linux, Windows
DW Intermediate: Linux, Windows
DW Advanced: Linux only
DW Enterprise Base: Any
DW Enterprise: Any
Key Features:
DB2
SQL Warehousing Tool
DWE Admin Console
DWE Design Studio
Cubing Services & MQT
DB2 Range Partitioning & MDC
Query Patroller / Workload Mgmt
Performance Expert
Compression: Table and Backup
Alphablox DWE ABX Add-in
Alphablox connectors
Alphablox non-DWE Connectors
Intelligent Miner
Unstructured Text Analysis
Other matrix entries: Performance Optimization Feature, Storage Optimization Feature, Alphablox, N/A
Trade-ups available: DW Intermediate, DW Advanced, DW Enterprise Base, DW Enterprise Edition
DB2 9.5 Editions
DB2 Express-C
No-charge edition of DB2 available for download at
www.ibm.com/db2/express
Will use up to 2 processor cores, 2 GB memory
Available for:
Windows 32-bit (x86)
Windows 64-bit (x86_64)
Linux 32-bit (x86)
Linux 64-bit (x86_64)
Linux on POWER (iSeries, pSeries)
Includes pureXML
Fixed Term License (FTL) can be purchased for support. See
http://www.ibm.com/software/data/db2/express/support.html
Misc
DB2 Data Types
DB2 supports all standard SQL data types (e.g. CHAR, VARCHAR)
DB2 9 introduced the native XML data type, which stores XML documents directly in a table
XML columns can be queried using XQuery, allowing subsets of a document to be retrieved
DB2 9.5 introduces two new data types:
DECFLOAT: Accuracy of DECIMAL; performance of FLOAT
ARRAY: Defines an array based on one of the built-in types
CREATE TYPE phonenumbers AS DECIMAL(10,0) ARRAY[5]
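The "accuracy of DECIMAL" point for DECFLOAT can be illustrated by analogy in Python, whose decimal module also implements decimal floating-point arithmetic. The sketch below assumes a 34-digit precision to mirror DECFLOAT(34); it says nothing about DB2's performance and the helper name decimal_sum is hypothetical.

```python
from decimal import Context, Decimal

# Assumption: 34 significant digits, chosen to mirror DECFLOAT(34).
DECFLOAT34 = Context(prec=34)


def decimal_sum(values):
    """Add decimal strings using 34-digit decimal floating-point."""
    total = Decimal(0)
    for value in values:
        total = DECFLOAT34.add(total, Decimal(value))
    return total


if __name__ == "__main__":
    print("binary float :", 0.1 + 0.2 == 0.3)
    print("decimal float:", decimal_sum(["0.1", "0.2"]) == Decimal("0.3"))
```

Binary floating-point cannot represent 0.1 or 0.2 exactly, so the first comparison fails, while the decimal version compares equal.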
Database Application Development Technologies
Key Database Technologies
SQL / SQL Procedures
XML
SOA / Web Services
Developer communities
C/C++
Java (JDBC / SQLJ)
.NET (C#, VB .NET)
Open Source
PHP
Perl
Python
DB2 Sample Database
The SAMPLE database can be used for testing applications, trying
features of DB2, etc.
To create the sample database populated with both standard
relational data and XML data:
db2sampl -sql -xml
The SAMPLE database can be dropped and recreated at any time:
db2 drop database SAMPLE
Most of the sample application programs that come with DB2 use the SAMPLE database
Thank you
DB2 Storage: Table Spaces
Table spaces can be managed by the operating system, by the
database manager, or by the DB2 automatic storage feature:
System Managed Space (SMS):
Data stored in files in the file system
Access to data controlled using standard I/O functions of the OS
Space not allocated by the system until it is required
Ideal for small, personal databases; databases that grow/shrink rapidly
Low maintenance and monitoring
CREATE TABLESPACE tbsp1
MANAGED BY SYSTEM
USING ('d:\acc_tbsp', 'e:\acc_tbsp', 'f:\acc_tbsp')
DB2 Storage: Table Spaces
Database Managed Space (DMS):
Data stored in files or on raw devices
Can bypass operating system I/O functions, increasing performance
Ideal for performance-sensitive applications
Increased maintenance and monitoring
CREATE TABLESPACE tbsp1
PAGESIZE 8K
MANAGED BY DATABASE
USING (FILE 'd:\db2data\acc_tbsp' 5000, FILE 'e:\db2data\acc_tbsp' 5000)
Can also have containers automatically resized when they fill up:
CREATE TABLESPACE tbsp2
PAGESIZE 8K
MANAGED BY DATABASE
USING (FILE '/storage/dms1' 10 M) AUTORESIZE YES
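In the first DMS example each FILE container is sized as a number of pages (5000) at the table space page size (8K). The arithmetic can be sketched in Python as follows. The helper names are hypothetical, and the result is the raw container size, ignoring any space DB2 reserves for its own overhead.

```python
def container_bytes(pages, page_size_kb):
    """Raw size of one container given its page count and page size in KB."""
    return pages * page_size_kb * 1024


def tablespace_bytes(container_pages, page_size_kb):
    """Raw size of a table space as the sum of its containers."""
    return sum(container_bytes(pages, page_size_kb) for pages in container_pages)


if __name__ == "__main__":
    one = container_bytes(5000, 8)
    total = tablespace_bytes([5000, 5000], 8)
    print(f"one container: {one} bytes ({one / 1024 ** 2:.1f} MB)")
    print(f"table space  : {total} bytes ({total / 1024 ** 2:.1f} MB)")
```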
DB2 Storage: Table Spaces
Automatic Storage Table Space:
Can be used if database is enabled for automatic storage
Database manager assigns containers automatically, one on each storage path
Automatically handles resizing table spaces
Creates a DMS table space for regular/large table spaces
Creates a SMS table space for user or system temporary table spaces
CREATE DATABASE mydb AUTOMATIC STORAGE YES on c:\storpath1, c:\storpath2, c:\storpath3
CONNECT TO mydb
CREATE TABLESPACE tbsp1 MANAGED BY AUTOMATIC STORAGE