
SYBASE TRAINING

Sybase Architecture
What is Sybase Server?
• Scalable, high-performance database
– Client/server architecture
– Multithreaded server
– User connections implemented as threads
– RAM requirement per user = 50KB
– Scalability
– Multithreaded operation not possible
– Parallel processing not possible
System Databases
• master database
• model database
• sybsystemprocs
• tempdb
• sybsecurity
• sybsyntax
Master Database
• User accounts (in syslogins)
• Remote user accounts (in sysremotelogins)
• Remote servers that this server can interact with (in sysservers)
• Ongoing processes (in sysprocesses)
• Configurable environment variables (in sysconfigures)
• System error messages (in sysmessages)
• Databases on SQL Server (in sysdatabases)
Master Database
• Storage space allocated to each database (in sysusages)
• Tapes and disks mounted on the system (in sysdevices)
• Active locks (in syslocks)
• Character sets (in syscharsets) and languages (in syslanguages)
• Users who hold server-wide roles (in sysloginroles)
Model Database
• Provides a template for new databases
• Default is 2MB
• Databases cannot be smaller than model
database
• Adding user-defined data types, rules, or
defaults
• Adding users who should have access to
all databases on SQL Server
• Granting default privileges, particularly for
guest accounts
Tempdb database
• Storage area for temporary tables
and other temporary working storage
needs (for example, intermediate
results of group by and order by)
• Space shared among all users
• Default size is 2 MB
Tempdb contd
• Default size is 2MB
• Restart of the server clears tempdb
• At Server restart model is copied on
to tempdb
• Size can be altered by ALTER
DATABASE
Sybsecurity database
• Contains the audit system for SQL
Server.
Consists of :
– sysaudits table, which contains the
audit trail. All audit records are written
into sysaudits
– sysauditoptions table, which contains
rows describing the global audit options
– All other default system tables that are
derived from model
Sybsyntax database
Contains syntax help for:
• Transact-SQL commands
• System procedures
• SQL Server utilities

e.g. sp_syntax "select"
System Tables
• Track information
– Server-wide
– Database-specific
• Define database structure
• The master database contains all system tables (31)
• User databases contain a subset of the system tables (17)
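To list them yourself (a quick sketch; system tables carry type 'S' in sysobjects):

select name from sysobjects
where type = 'S'
order by name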
System tables in user databases
• sysusers
• sysobjects
• sysprocedures
• sysindexes
• sysconstraints
• sysreferences
Database Components

Database Component                          System Table(s)
Objects (table, view, default, rule,        sysobjects
stored procedure, and trigger)
Indexes                                     sysindexes
Datatypes                                   systypes
Constraints                                 sysconstraints, sysreferences
System Procedures
• An easy way to query system tables
• A system procedure is a precompiled collection of SQL statements
• Located in sybsystemprocs but can be executed from any database
System Procedures

sp_help [objname]
sp_helpdb [dbname]

sp_helpindex tabname

sp_spaceused [objname]
Allocating Space
Allocating Storage
• Device
• Database
• Allocation Unit
• Extent
• Page
Devices
• Devices are hard disk files that store
databases, transaction logs, and
backups
• One device can hold many databases
and one database can span multiple
devices
• Only SA can create devices
Creating a Device
DISK INIT
name = 'logical_name',
physname = 'physical_name',
vdevno = virtual_dev_no,
size = num_of_2K_blocks
[, VSTART = virtual_address]
DISK INIT
DISK INIT
NAME = 'hcl_dev1',
PHYSNAME = 'C:\HCL\DATA\hcl_dat',
VDEVNO = 3,
SIZE = 8192
DISK INIT
NAME = 'hcl_dev2',
PHYSNAME = 'C:\HCL\DATA\hcl_log',
VDEVNO = 4,
SIZE = 1024
DISK INIT
• Maps the specified physical disk
operating system file to a database
device name
• Lists the new device in
master..sysdevices
• Prepares the device for database
storage
vdevno
• Used to map sysdevices, sysusages and sysdatabases
• Must be less than the "number of devices" configuration parameter
• Total number of devices available is 255
Memory Allocation for
devices
• Memory allocated at Server startup
• 50KB/device
• Over configuring can be a waste of
memory
• 20 devices will take up 1 MB RAM
Devices
Configured value:
sp_configure "number of devices"

To see values of vdevno already in use:
select distinct low/16777216
from sysdevices
order by low
Info on devices
sp_helpdevice device_name
e.g. sp_helpdevice master

Which table is used for storing device information?
Managing Devices
• Setting up default device
sp_diskdefault database_device,
{default_on | default_off}

• Dropping a device
sp_dropdevice logical_name
Managing Devices
sp_diskdefault master, defaultoff
sp_diskdefault def_1, defaulton
sp_dropdevice tapedump1
Default devices
• Devices not to be used as default
devices
– Master
– Device used for sybsecurity
– Devices used for transaction logs
Dropping of devices
• A device in use cannot be dropped
• The server has to be restarted after dropping a device
• The corresponding file has to be removed at OS level
Creation of Database
And setting options
Creating databases and logs on devices

create database database_name
[on {default | database_device} [= size]
[, database_device [= size]]...]
[log on database_device [= size]
[, database_device [= size]]...]
Create Database
• Verifies that the database name specified in the
statement is unique.
• Makes sure that the database device names specified
in the statement are available.
• Finds an unused identification number for the new
database.
• Assigns space to the database on the specified
database devices and updates sysusages to reflect
these assignments.
• Inserts a row into sysdatabases.
• Makes a copy of the model database in the new
database space, thereby creating the new database's
system tables.
Create database
Create database newdb
on alpha_disk = 10,
beta_disk = 10,
delta_disk =10,
gamma_disk = 50
Create Database
create database newpubs
on default = 4

Multiple default devices can be used, e.g.
create database newdb on default = 100
could use more than one device
Transaction logs
• Every database has a write ahead log
• First the transaction is written to the
log
• It is the system table syslogs
• Essential to have a log
Log on separate device

create database newdb


on mydata = 8, newdata = 4
log on tranlog = 3
Estimating Log Size
• Amount of update activity in the
associated database
• Frequency of transaction log dumps
• Rule of thumb is 25% of the
database size
Checking Log Size
use database_name
go
dbcc checktable(syslogs)
OR
select count(*) from syslogs
Alter database and Drop database
alter database newpubs
on pubsdata1 = 2, pubsdata2 = 3
log on tranlog

drop database newpubs
Getting info about database storage
• To find the names of devices on which a database resides:
sp_helpdb database_name
• To find the space used by a database, use sp_spaceused after using the database
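For example (a sketch assuming the pubs2 sample database):

sp_helpdb pubs2
go
use pubs2
go
sp_spaceused titles
go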
sp_dboption
• Sets options for databases
• Displays a complete list of the database options when used without a parameter
• Changes a database option when used with parameters
• Options can be changed only for user databases
Database Options
• Sp_helpdb in a database shows the
options set for that database
• Only SA or dbo can change the
options
• None of the master database options
can be changed
Database Options
To use sp_dboption to change the pubs2 database to read only:
use master
go
sp_dboption pubs2, "read only", true
go
use pubs2
go
checkpoint
Creation of
Database
Objects
Sybase defined Datatypes
• Exact numeric: decimal(p,s), numeric(p,s)
• Approximate numeric: float(n)
• Character: char(n), varchar(n)
• Money: money, smallmoney
• Date and time: datetime, smalldatetime
• Binary: binary(n), varbinary(n)
• Text and image: text, image
User Defined Datatypes
• Subset of system defined datatype
• Can be used for creating datatypes that
are frequently used
Adding a datatype
sp_addtype datatypename,
phystype [(length) | (precision [, scale])]
[, "identity" | nulltype]

Example: sp_addtype tid, "char(6)", "not null"
User Defined Datatypes
• sp_help datatypename gives information about that datatype
e.g. sp_help tid gives information about the datatype tid
• sp_droptype datatypename drops the datatype
• A datatype in use cannot be dropped
Tables
• An entity is represented as a table
• 2 billion tables per database
• 250 columns per table
• Column names have to be unique within a table
Create table
create table titles
(title_id tid,
title varchar(80) not null,
type char(12),
pub_id char(4) null,
price money null,
advance money null,
royalty int null,
total_sales int null,
pubdate datetime)
Indexes
• Enforce uniqueness
• Speed up joins
• Speed up data retrieval
• Speed up ORDER BY and GROUP BY
Indexing
• Columns to consider for indexing

– Primary Key
– Columns frequently used in joins
– Columns frequently searched in ranges
– Columns retrieved in sorted order
Indexing (contd)
• Columns that should not be indexed
– Columns seldom referenced in query
– Columns that contain few unique
values
– Columns defined with text, image, or
bit datatypes
– When Update performance has a
higher priority than SELECT
performance
Creating An Index

create [unique] [clustered | nonclustered]
index index_name
on [[database.]owner.]table_name
(column_name [, column_name]...)
[with fillfactor = pct]
[on segment_name]
Types and Characteristics of
Indexes
• Types of Indexes
– Clustered
– Nonclustered
Clustered Indexes
• Physical order = indexed order
• Leaf level = actual data pages of the table
• Only one clustered index per table
• Requires free space of about 1.2 times the table size during creation
• Should be created on the PK or on column(s) searched for ranges of values
Nonclustered Indexes
• Physical order is not the same as
index order
• The leaf level contains pointers to
the rows on the data pages
• Pointers add a level between index
and data
• 249 nonclustered indexes per table
can be created
Fill factor
• A low fillfactor leaves free space on index pages
• Not maintained by Sybase; it applies only at index creation time
• Has to be re-established by dropping and recreating the index
• A fillfactor of 0 means data and leaf pages are completely filled and nonleaf pages are filled to 75%
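For example, to leave roughly a quarter of each index page free at creation time (a sketch assuming the titles table from pubs2):

create nonclustered index price_idx
on titles (price)
with fillfactor = 75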
Creating and
Using
Segments
Segments
• Subsets of database devices
• Can be used in create table and create index commands
• Every database can have up to 32 segments
System defined segments
• system
• logsegment
• default
Creating Segments
• Initialize the physical device with
disk init
• Make the database device available
to the database by using the on
clause to create database or alter
database
• sp_addsegment segname, dbname,
devname
Example
This statement creates the segment seg_mydisk1 on the database device mydisk1:

sp_addsegment seg_mydisk1, mydata, mydisk1
Creating Objects on
Segments
create table table_name (col_name
datatype …)
[on segment_name]
create [ clustered | nonclustered ]
index index_name
on table_name(col_name)
[on segment_name]
Creating objects on segments
1. Start by using the master database.
2. Initialize the physical disks.
3. Allocate the new database devices to a database.
4. Use the database.
5. Create new segments that each point to one of the new devices.
6. Reduce the scope of the default and system segments so that they do not point to the new devices.
7. Create the objects, giving the new segment names.
Commands to create objects on segments
• use master
• disk init to create devices
• alter database to add devices
• sp_addsegment to create segments
• sp_dropsegment to drop default and system segments from the new devices
• create table / create index to create objects
Reducing the scope of log and data segment
• sp_dropsegment "default", mydata, mydisk1
• sp_dropsegment system, mydata, mydisk1
Dropping a segment
sp_dropsegment segname, dbname
drops the segment from the specified database
Getting info on segments
sp_helpsegment – info about all segments in the database
sp_helpsegment "default"
sp_helpsegment seg1
sp_helpdb dbname – all segments for that database
sp_help tablename – segments used by a table
sp_helpindex tablename – segments used by indexes
Clustered Indexes
• Table and Clustered Index on the
same segment
• If you have placed a table on a
segment, and you need to create a
clustered index, be sure to use the
on segment_name clause, or the
table will migrate to the default
segment.
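A minimal sketch (the segment, table and index names are hypothetical):

create table mytab (id int not null, name varchar(30) null)
on seg_mydisk1
create clustered index mytab_idx
on mytab (id)
on seg_mydisk1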
Object Placement
• Log on a separate device
• Spread large, heavily used tables across devices
• Tables and nonclustered indexes on separate devices
• tempdb on a separate device
Problems due to data
storage
• Single-user performance satisfactory, but
response time increases as no. of processes
increase
• Query performance degrades as system
table activity increases
• Maintenance activities seem to take a long
time
• Stored procedures seem to slow down as
they create temporary tables
• Insert performance is poor on heavily used
tables
How Indexes affect
performance
• Avoid table scans when accessing
data
• Target specific data pages for point
queries
• Avoid data pages completely when
an index covers a query
• Use ordered data to avoid sorts
Index Requirements
• Only one clustered index per table,
since the data for a clustered index is
ordered by index key
• You can create a maximum of 249
nonclustered indexes per table
• A key can be made up of as many as
31 columns. The maximum number
of bytes per index key is 600
Choosing Indexes
• What indexes are associated currently with a given table?
• What are the most important processes that make use of the
table?
• What is the ratio of select operations to data modifications
performed on the table?
• Has a clustered index been created for the table?
• Can the clustered index be replaced by a nonclustered index?
• Do any of the indexes cover one or more of the critical
queries?
• Is a composite index required to enforce the uniqueness of a
compound primary key?
• What indexes can be defined as unique?
• What are the major sorting requirements?
• Do some queries use descending ordering of result sets?
• Do the indexes support joins and referential integrity checks?
Logical Keys and Index Keys
• Logical keys define the relationships between tables
• Logical keys may not be the best columns to index
• Create indexes on columns that support the joins, search arguments and ordering requirements in queries
Clustered Indexes
• Clustered Indexes provide very good
performance for range queries
• In high transaction environment do
not create clustered index on a
steadily increasing value such as
IDENTITY column
Index Usage Criteria
• An index will be used if either of the following is true:
– The query contains a column in a valid search argument
– The query contains a column that matches at least the first column of the index
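For example, assuming a hypothetical composite index on authors (au_lname, au_fname):

select * from authors where au_lname = 'Smith'
-- can use the index: the SARG matches the leading column
select * from authors where au_fname = 'John'
-- cannot use the index: the leading column is not referenced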
Index Covering
• A mechanism for using the leaf level of a nonclustered index the way the data pages of a clustered index would work
• Index covering occurs when all columns referenced in the query are contained in the index itself
• The leaf level of the index has all the required data
Index Covering
• As the leaf index rows are much smaller than data rows, a nonclustered index that covers a query is faster than a clustered index
e.g. An index on royalty and price will cover the following query:
select royalty from titles
where price between $10 and $20
Index Covering
• Provides performance benefits for
queries containing aggregates
• Typically first column of the index
has to be used in where clause for
the index to be used but aggregates
don’t have to satisfy this condition
Index Covering
• select avg(price) from titles can use an index on price and scan all leaf pages
• select count(*) from titles where price > $7.95 will use an index on price to find the first leaf row where price > $7.95 and then just scan to the end of the index, counting the number of leaf rows
Index Covering
• Select count(*) from titles also is
satisfied by nonclustered index as
the number of rows in any index will
be the same as the total no. of rows
in the table
• Sybase Optimiser uses the
nonclustered index with the smallest
row size
Composite Indexes vs
Multiple Indexes
• At times better to have many narrow
indexes than have large composite
indexes
• More indexes give the optimiser
more alternatives to look at in
deriving the optimal plan
• If the first column is not used in the
where clause , the index will not be
used
Composite Indexes vs
Multiple Indexes
• Flip side of multiple indexes is the
overhead to maintain many indexes
• All queries must be examined in the
database and indexes should be
designed accordingly
Stored
Procedure
Optimization
Stored Procedures

SQL Query:
Parse → Validate Names → Check Protection → Optimize → Compile → Execute

Stored Procedure Call:
Locate Procedure → Check Protection → Substitute Parameters → Execute
Stored Procedures
The main performance gain is the capability of the Sybase server to save the optimised query plan generated by the first execution of a stored procedure in the procedure cache and to reuse it for subsequent executions.
Execution of SP
FIRST Execution –
• Locate SP on disk and load into
cache
• Substitute parameter values
• Develop optimisation plan
• Compile optimisation plan
• Execute from cache
Execution of SP
Subsequent Executions
• Locate SP in cache
• Substitute parameter values
• Execute from cache
Stored Procedures
• The advantage of the cost-based optimizer is its capability of generating the optimal query plan based on the search criteria supplied
• For certain types of queries (e.g. range queries) the optimiser may at times generate different plans
Stored Procedures
• In situations where parameter values
can be different for every execution,
use CREATE PROCEDURE WITH
RECOMPILE option
• In case a particular execution has to
use a different plan use EXECUTE
with RECOMPILE
Stored Procedures
• If an index used by a stored
procedure is dropped, Sybase
detects it and recompiles the
procedure
• Adding additional indexes or running
UPDATE STATISTICS does not cause
automatic recompilation
Stored Procedures
• Updating statistics has to be followed by sp_recompile <table name> to generate a new query plan
• Addition of an index also has to be followed by sp_recompile
Stored Procedures
create proc get_order_data (@flag tinyint, @value smallint)
as
if @flag = 1
select * from orders where price = @value
else
select * from orders where qty = @value

should be converted to ...
Stored Procedures
create proc get_orders_by_price (@price smallint)
as
select * from orders
where price = @price

create proc get_orders_by_qty (@qty smallint)
as
select * from orders where qty = @qty
Stored Procedures
A separate procedure to call the appropriate procedure depending on the value of the flag:

create proc get_order_data (@flag tinyint, @value smallint)
as
if @flag = 1
exec get_orders_by_price @value
else
exec get_orders_by_qty @value
Triggers

Why Triggers
• Cascading Actions are not available
with DRI i.e. when Primary Key is
Updated or Deleted, Corresponding
Foreign Keys do not get
automatically changed
• Maintaining duplicate data
• Keeping derived columns current

Special Tables for Triggers
• Inserted and Deleted
• Are available only to triggers
• Have the same structure as the
trigger table
• Can be joined to other tables in the
database

INSERT Trigger
An insert adds the new rows to the trigger table and makes them available in the inserted table.
INSERT Trigger
e.g. CREATE TRIGGER loan_ins
ON loan FOR INSERT
AS
UPDATE copy
SET on_loan = 'y'
FROM copy, inserted
WHERE copy.isbn = inserted.isbn
AND copy.cop_no = inserted.cop_no

DELETE Trigger
A delete removes rows from the trigger table and makes them available in the deleted table.
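A delete trigger mirroring the earlier insert example might look like this (a sketch assuming the same loan/copy schema):

CREATE TRIGGER loan_del
ON loan FOR DELETE
AS
UPDATE copy
SET on_loan = 'n'
FROM copy, deleted
WHERE copy.isbn = deleted.isbn
AND copy.cop_no = deleted.cop_no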
UPDATE Trigger
An update is treated as a delete followed by an insert: the old rows are available in the deleted table and the new rows in the inserted table.
UPDATE Trigger
CREATE TRIGGER mem_upd
ON member
FOR UPDATE
AS
IF UPDATE(member_no)
BEGIN
RAISERROR ('Transaction cannot be processed. **** Member cannot be updated.', 10, 1)
ROLLBACK TRANSACTION
END

Transaction control in
Triggers
• Rollback transaction in a trigger rolls
back the entire transaction
• Rollback to a savepoint name rolls
back to the savepoint
• Rollback trigger rolls back the data
modification that fired the trigger
and any statements in the trigger
that are part of the transaction
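For example, a sketch of rollback trigger (the table, condition and error number are hypothetical):

create trigger ord_del
on orders
for delete
as
if (select count(*) from deleted) > 1
begin
rollback trigger with raiserror 30002
"Batch deletes from orders are not allowed."
end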
Trigger Considerations
• Overhead is very low
• Inserted and Deleted tables are in
memory
• Location of other tables referenced by the
trigger determines the amount of time
required
• INSERT,DELETE or UPDATE in the trigger
is a part of the transaction
• Nested triggers are set to true by default
• Self recursion of triggers does not happen
unless set
Cursors
Benefits of Cursors
• Allow a program to take action on each row of a query result set rather than on the entire set of rows
• Provide the ability to delete or update a row in a table based on cursor position

Cursors
A cursor consists of the following parts
Cursor result set : set of rows
resulting from execution of the
associated select statement
Cursor position : a pointer to one row
within the cursor result set

Cursor scope
• Session – Starts when a client logs
into SQL Server and ends on log out
• SP – Starts when SP begins
execution and ends when SP
completes execution
• Trigger – Starts when trigger begins
execution and ends when it
completes execution
Cursors
• Declare – declares the cursor for a select statement. Checks SQL syntax
• Open – executes the query and creates the result set. Positions the cursor before the first row of the result set
• Fetch – fetches the row to which the cursor points
• Close – closes the result set, but the compiled query plan remains in memory
• Deallocate – drops the query plan from memory
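Putting the steps together (a sketch assuming the titles table from the pubs2 sample database):

declare title_curs cursor
for select title_id, price from titles
where price > $10
go
open title_curs
fetch title_curs
close title_curs
deallocate cursor title_curs
go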

Resource requirements of Cursors
• Memory is allocated at the time of declaration of the cursor
• On open, an intent table lock is acquired
• When a row is fetched, a page lock is acquired
Read Only Cursors
• Read-only mode uses shared page locks. It is in effect if you specify for read only or if the cursor's select statement uses distinct, group by, union, or aggregate functions
Updatable Cursors
• Update mode uses update page locks. It is in effect if:
– You specify for update
– The select statement does not include distinct, group by, union, a subquery, or aggregate functions
Index requirements
• For read-only cursors any index can be used
• The same query result will be obtained as for an ordinary select statement
Index Requirements
• Updatable cursors require unique
indexes on the tables whose columns
are being updated
Performance issues with cursors
• Cursor performance issues are:
– Locking at the page and table level
– Network resources
– Overhead of processing instructions
• Use cursors only if necessary
Optimizing Tips for cursors
• Optimize cursor selects using the cursor, not ad hoc queries.
• Use union or union all instead of or clauses or in lists.
• Declare the cursor's intent.
• Specify column names in the for update clause.
• Use the shared keyword for tables.

Query
Optimisation
Goals and steps of Query
Optimisation
• Goals – Minimise logical and physical
page accesses
• Steps –
1) Parse and normalise the query
validating syntax and object references
2) Optimise the query and generate plan
3) Compile query plan
4) Execute the query and return result to
user
Optimisation Step
• Phase 1 – Query Analysis
– Find the SARG
– Find the ORs
– Find the joins
• Phase 2 – Index Selection
– Choose the best index for each SARG
– Choose the best method for ORs
– Choose best indexes for any join clause
– Choose best index to use for each table
• Phase 3 – Join Order Selection
SARG
• SARG - Enables the optimiser to limit the
rows searched to satisfy a query
• SARG is matched with an index
• SARG is a where clause comparing a
column to a constant
Col operator const_expr
Valid operators are =,>,<,>=,<=
!= or <> cannot be used to match a value
against an index unless query is covered
Search arguments
• Valid SARGs
flag = 7
salary > 10000
city = ‘Pune’ and state = ‘MH’
• Invalid SARGs
lname=fname
ytd/months > 1000
ytd/12 = 1000
flag !=0
SARG conversion by Sybase
• BETWEEN becomes >= and <=
price between $10 and $20 becomes
price >= $10 and price <= $20
• LIKE becomes >= and <
e.g. au_lname like "Sm%" becomes:
au_lname >= "Sm" and au_lname < "Sn"
LIKE will be converted into a SARG only if the first character is a constant
OR clauses
The format of an OR clause is:
SARG or SARG [or ...]
with all columns involved in the OR belonging to the same table
• column IN (const1, const2, ...) is also treated as an OR
Examples of OR:
where au_lname = 'X' OR au_fname = 'Y'
where (type = 'ty' and price > $25) OR pub_id = '1'
where au_lname in ('A', 'B', 'C')
The OR strategy
• OR clause handled either by table scan or
by using OR strategy
• OR strategy used only when indexes occur
on both columns
• OR strategy breaks query into two parts.
Each query is executed and row ids are
kept into a work table. Duplicates are
removed and qualifying rows are retrieved
from the work table
• OR strategy used only if total I/O is less
than the I/Os required for table scan
Join Clause
A join clause is a where clause in the
format
Table1.Column Operator
Table2.Column
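For example (using the pubs2 tables): where titles.pub_id = publishers.pub_id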
Index Selection
• All SARGs, OR clauses and join
clauses are matched with available
indexes and I/O costs are estimated
• Index I/O costs are compared with
each other and against the cost of
table scan to determine the least
expensive access path
• If no useful index is found, a table
scan must be performed
ORDER BY, GROUP BY and DISTINCT clauses
• The optimiser determines whether work tables are needed by these clauses
• GROUP BY always needs a work table
• DISTINCT – if a unique index exists on the table and all columns in the unique index are included in the result set, a work table is not created
• ORDER BY – in case of a clustered index on the column used in the ORDER BY, no work table is created
Potential Optimiser problems and solutions
• Make sure statistics are up to date (run update statistics)
• Check SARGs
• Check stored procedures for current parameters
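For example (assuming the titles table):

update statistics titles
go
sp_recompile titles
go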
Guidelines for creating SARGs
• Avoid functions, arithmetic operations, and other expressions on the column side of search clauses.
• Avoid incompatible datatypes.
• Use the leading column of a composite index. The optimization of secondary keys provides less performance.
• Use all the search arguments you can to give the optimizer as much as possible to work with.
Adding SARG to help optimiser
1. select au_lname, title from titles t, titleauthor ta, authors a
where t.title_id = ta.title_id and a.au_id = ta.au_id
and t.title_id = "T81002"

2. select au_lname, title from titles t, titleauthor ta, authors a
where t.title_id = ta.title_id and a.au_id = ta.au_id
and ta.title_id = "T81002"

3. select au_lname, title from titles t, titleauthor ta, authors a
where t.title_id = ta.title_id and a.au_id = ta.au_id
and t.title_id = "T81002" and ta.title_id = "T81002"
Joins
The process of creating the result set for a join is to nest the tables and to scan the inner tables repeatedly for each qualifying row in the outer table.
Choice of inner and outer tables
• The outer table is usually the one that has:
– the smallest number of qualifying rows, and/or
– the largest number of reads required to locate rows.
• The inner table usually has:
– the largest number of qualifying rows, and/or
– the smallest number of reads required to locate rows.

For example, when you join a large, unindexed table to a smaller table with indexes on the join key, the optimizer chooses:
• The large table as the outer table. It will only have to read this large table once.
• The indexed table as the inner table. Each time it needs to access the inner table, it will take only a few reads to find rows.
Combining max and min
aggregates
• When used separately min and max on indexed
columns use special processing if there is no
where clause
• Min aggregates retrieve the first value on the root
page of the index, performing a single read to
find the value
• Max aggregates follow the last entry on the last
page at each index level until they reach the leaf
level. For a clustered index, the number of reads
required is the height of the index tree plus one
read for the data page. For a nonclustered index,
the number of reads required is the height of the
index tree.
Update Operation
• Direct updates
• Deferred updates
• Direct updates are faster and are performed whenever possible
Direct Updates
Sybase performs direct updates in a single pass, as follows:
• Locates the affected index and data rows
• Writes the log records for the changes to the transaction log
• Makes the changes to the data pages and any affected index pages
Deferred Updates
The steps involved in deferred updates are:
• Locate the affected data rows, writing the log records for deferred delete and insert of the data pages as rows are located
• Read the log records for the transaction. Perform the deletes on the data pages and delete any affected index rows
• At the end of the operation, re-read the log, and make all inserts on the data pages and insert any affected index rows
Guidelines to avoid deferred updates
• Create at least one unique index on the table to encourage more direct updates
• If null values are not used, use not null in the table definition
• Use the char datatype instead of varchar wherever possible
T-SQL Performance Tips
Greater Than Query
This query, with an index on int_col:
select * from table where int_col > 3
uses the index to find the first value where int_col equals 3, and then scans forward to find the first value greater than 3. If there are many rows where int_col equals 3, the server has to scan many pages to find the first row where int_col is greater than 3.

A more efficient way to write this query is:
select * from table where int_col >= 4
Not Exists Test
• In subqueries and if statements, exists and in perform faster than not exists and not in when the values in the where clause are not indexed. For exists and in, SQL Server can return TRUE as soon as a single row matches. For the negated expressions, it must examine all values to determine that there are no matches.
Not Exists Test
A not exists test:

if not exists (select * from table where...)
begin
/* Statement Group 1 */
end
else
begin
/* Statement Group 2 */
end
Not Exists Test
can be rewritten as:

if exists (select * from table where...)
begin
/* Statement Group 2 */
end
else
begin
/* Statement Group 1 */
end
Variable vs Parameter values in where clause
• The optimizer knows the value of a parameter to a stored procedure at compile time, but it cannot predict the value of a declared variable. Providing the optimizer with the values of search arguments in the where clause of a query can help the optimizer make better choices. Often, the solution is to split up stored procedures:
• Set the values of variables in the first procedure.
• Call the second procedure and pass those variables as parameters to it.
Variable vs Parameter
values in where clause
For example, the optimizer cannot optimize the final
select in the following procedure, because it cannot
know the value of @x until execution time:
create procedure p
as
declare @x int
select @x = col
from tab where ...
select * from tab2
where indexed_col = @x
Variable vs Parameter values in where clause
The following example shows the procedure split into two:
create procedure base_proc
as
declare @x int
select @x = col
from tab where ...
exec select_proc @x

create procedure select_proc @x int
as
select *
from tab2
where col2 = @x
Count vs Exists
• Do not use the count aggregate in a subquery to do an existence check:
select *
from tab
where 0 < (select count(*) from tab2 where ...)
• Instead, use exists (or in):
select *
from tab
where exists (select * from tab2 where ...)
Count vs Exists
• When you use count, SQL Server does not know that you are doing an existence check. It counts all matching values.
• When you use exists, SQL Server knows you are doing an existence check. When it finds the first matching value, it returns TRUE and stops looking.
• The same applies to using count instead of in or any.
Aggregates
• SQL Server uses special optimizations for the max and min aggregates when there is an index on the aggregated column
• For min, it reads the first value on the root page of the index
• For max, it goes directly to the end of the index to find the last row
Aggregates
min and max optimizations are not applied if:
• The expression inside the max or min is anything
but a column. Compare max(numeric_col*2) and
max(numeric_col )*2, where numeric_col has a
nonclustered index. The second uses max
optimization; the first performs a scan of the
nonclustered index.
• The column inside the max or min is not the first
column of an index. For nonclustered indexes, it
can perform a scan on the leaf level of the index;
for clustered indexes, it must perform the table
scan.
• There is another aggregate in the query
• There is a group by clause
Aggregates
Do not write two aggregates together, e.g.
select max(price), min(price) from titles
results in a full scan of titles, even if there is an index on price.

Rewriting the query as:
select max(price) from titles
select min(price) from titles
uses the index for both queries.
Joins and datatypes
• When joining between two columns
of different datatypes, one of the
columns must be converted to the
type of the other
• Column whose type is lower in the
hierarchy is converted
• Index cannot be used for converted
column
Joins and datatypes
select * from small_table, large_table
where small_table.float_column = large_table.int_column

• In this case, SQL Server converts the integer column to float, because int is lower in the hierarchy than float. It cannot use an index on large_table.int_column, although it can use an index on small_table.float_column
Null vs not null character columns
• char null is really stored as varchar, so joining char not null with char null involves a conversion
• It is best to have the same datatypes for frequently joined columns, including acceptance of nulls
• Can be implemented using user-defined datatypes
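For example (a sketch; the datatype name is hypothetical):

sp_addtype keytype, "char(6)", "not null"

Declaring every frequently joined key column as keytype guarantees that the datatypes and null status match on both sides of the join.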
Forcing the Conversion to the other side of the join
• If a join between different datatypes is unavoidable, and it hurts performance, you can force the conversion to the other side of the join
Forcing the Conversion to the other side of the join
In the following query, varchar_column must be converted to char, so no index on varchar_column can be used, and huge_table must be scanned:

select *
from small_table, huge_table
where small_table.char_col = huge_table.varchar_col
Forcing the conversion to
the other side of the join
• Performance would be improved if the index on
huge_table could be used. Using the convert
function on the varchar column of the small table
allows the index on the large table to be used
while the small table is table scanned:

select *
from small_table, huge_table
where convert(varchar(50),small_table.char_col) =
huge_table.varchar_col
Parameters and Datatypes
• The query optimizer can use the values of parameters to stored procedures to help determine costs.
• If a parameter is not of the same type as the column in the where clause to which it is being compared, SQL Server has to convert the parameter.
• The optimizer cannot use the value of a converted parameter.
• Make sure that parameters are of the same type as the columns they are compared to.
Parameters and Datatypes
e.g.
create proc p @x varchar(30)
as
select * from tab
where char_column = @x

may get a less optimal query plan than:

create proc p @x char(30)
as
select *
from tab
where char_column = @x
Commands to see how the optimiser is working
• set showplan on – used to see the execution plan for a particular query
• set noexec on – generally used with set showplan on. It is a toggle and should be set to off after the plan has been studied
• set statistics time on – used to see where time is spent
• dbcc traceon(302) and dbcc traceon(310) – used to see which indexes are chosen
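A typical session (a sketch; the query shown is illustrative):

set showplan on
set noexec on
go
select * from titles where price > $15
go
set noexec off
go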
