09 Query Eval

The document discusses how a database management system evaluates queries. It describes the information stored in a database catalog, different access paths for tables, algorithms for relational algebra operations like selection and join, and factors that influence the best query evaluation plan.

Query Evaluation

UVic C SC 370

Dr. Daniel M. German

Department of Computer Science

July 9, 2003 Version: 1.1.0

9–1 Query Evaluation (1.1.0) CSC 370 [email protected]

Overview

✥ What kind of info does the DBMS store in its catalog?

✥ How does the DBMS answer a query?
✥ What algorithms are used to perform a relational algebra
operation?
✥ What are evaluation plans and how are they represented?
✥ Why do we want to get the best evaluation plan?


Assumptions
✥ We will use the following schema:
Sailors(sid: integer, sname: string, rating: integer, age: real)
Reserves(sid: integer, bid: integer, day: dates, rname: string)

✥ A page is 4 KB long
✥ Each Reserves tuple is 40 bytes long (100 tuples/page) and the
table spans 1000 pages
✥ Each Sailors tuple is 50 bytes long (80 tuples/page) and the
table spans 500 pages.


System Catalog (System Tables)

✥ Each database contains tables about the data contained in it.

✦ Table: Name, attributes, indexes, integrity constraints
✦ Index: Name, structure, search key
✦ View: Name and definition
✥ And also about the DBMS itself:
✦ Size of buffer pool, page size...


System Catalog...

✥ Also statistics:
✦ Cardinality: Number of tuples in R: NTuples(R)
✦ Size: Number of pages in R: NPages(R)
✦ Index Cardinality: Number of distinct key values for index I:
NKeys(I)
✦ Index Size: Number of pages in index (for B-tree, number of
leaf pages): INPages(I)
✦ Index Height: The number of non-leaf levels in an index I:
IHeight(I)
✦ Index Range: Maximum and minimum values for the key of
an index I: ILow(I) and IMax(I)


Operator Evaluation

✥ Each relational operator has several alternative algorithms that
implement it
✥ For many operators, none is universally better
✥ Several factors influence which algorithm performs best
✦ Size of tables
✦ Buffer pool
✦ Buffer replacement policy


Three Common Techniques

✥ Indexing: For selection or join, use index

✥ Iteration: Examine each tuple
✥ Partitioning: Partition tuples on a sort key then work on smaller
sets (sorting and hashing)


Access Paths

✥ An access path is a way of retrieving tuples from a table

✥ Two ways to do it:
✦ Scan the file
✦ Use an index, retrieve data (for some queries, we might not
need to retrieve data)
✥ Every relational operator accepts one or more tables as input; the
access paths used to read them have a big impact on its cost.


Access Paths...

✥ Consider a simple selection which is a conjunction of conditions
of the form attr op value, where op is one of <, ≤, =, ≠, ≥, >
✥ Such selections are said to be in conjunctive normal
form (CNF), and each condition is a conjunct
✥ Why are queries in CNF useful?
✥ Intuitively, an index matches a selection condition if the index
can be used to retrieve just the tuples that satisfy the condition.


Access Paths...

✥ A hash index matches a CNF selection if there is a term of the
form attribute = value for each attribute in the index's search
key
✥ A tree index matches a CNF selection if there is a term of the
form attribute op value for each attribute in a prefix of the index's
search key:
✦ For example, ⟨a⟩ and ⟨a, b⟩ are prefixes of key ⟨a, b, c⟩
✦ With a tree index we can also evaluate comparisons other
than equality, but not with a hash index
✥ For a table with an index we have 2 access paths (and potentially
3)
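
The matching rules above can be sketched in a few lines of Python. This is only an illustration that applies the slide's rules literally; the function names and the representation of a conjunct as an (attribute, operator, value) triple are assumptions made for the example, not part of any DBMS.

```python
from typing import List, Tuple

# Assumed representation: a conjunct is (attribute, operator, value),
# and a CNF selection is a list of such conjuncts.
Conjunct = Tuple[str, str, object]


def hash_index_matches(search_key: List[str], selection: List[Conjunct]) -> bool:
    """A hash index matches if every search-key attribute has an equality term."""
    equality_attrs = {attr for attr, op, _ in selection if op == "="}
    return all(attr in equality_attrs for attr in search_key)


def tree_index_matched_prefix(search_key: List[str], selection: List[Conjunct]) -> List[str]:
    """Return the longest search-key prefix whose attributes all appear in a term.

    A non-empty result means the tree index matches the selection.
    """
    constrained = {attr for attr, _, _ in selection}
    prefix = []
    for attr in search_key:
        if attr not in constrained:
            break  # prefixes must be contiguous from the first key attribute
        prefix.append(attr)
    return prefix


selection = [("rname", "=", "Joe"), ("bid", "=", 5), ("sid", "=", 3)]
print(hash_index_matches(["rname", "bid", "sid"], selection))
print(tree_index_matched_prefix(["a", "b", "c"], [("a", "=", 1), ("b", ">", 2)]))
```

Note that a selection on b alone does not match a tree index on ⟨a, b, c⟩, because ⟨b⟩ is not a prefix of the key.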


Selectivity of Access Paths

✥ The selectivity of an access path is the number of pages retrieved
(index pages plus data pages) if we use this access path to retrieve
the tuples
✥ The most selective access path is the one that retrieves the fewest
pages
✥ The selectivity of an access path depends on its conjuncts
✥ Each conjunct acts as a filter
✥ The fraction of tuples that satisfy a given conjunct is called its
reduction factor
✥ When there are several conjuncts, the fraction of tuples that
satisfy all of them can be approximated by the product of their
reduction factors (when is this not true?)
Example of Selectivity

✥ Assume we have a hash index H on Reserves with search key
⟨rname, bid, sid⟩
✥ We are given the CNF query:
rname = 'Joe' ∧ bid = 5 ∧ sid = 3
✥ The catalog contains the number of distinct keys for H,
NKeys(H), and the number of pages NPages(Reserves). The
reduction factor is roughly 1/NKeys(H), so we can approximate
the number of pages retrieved through this access path as:
\[ \frac{NPages(Reserves)}{NKeys(H)} \]


Example of Selectivity...

✥ Assume now that we have an index on ⟨bid, sid⟩ and the CNF
query is bid = 5 ∧ sid = 3
✥ If we know the number of different values for bid we can estimate
the reduction factor of the first conjunct
✥ But usually the DBMS does not know it, so it falls back to a
default reduction factor of 0.1 per conjunct
✥ So we can approximate the combined reduction factor of this
query as 0.1 ∗ 0.1 = 0.01
✥ But the number of pages retrieved depends on whether the index
is clustered or not
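
The arithmetic on this slide can be checked with a short Python sketch. The function names are hypothetical, and the 1/(number of distinct values) estimate for an equality conjunct assumes uniformly distributed values; the 0.1 default and the Reserves size come from the slides.

```python
from math import prod

# Fallback used when the catalog has no statistics for an attribute (from the slide).
DEFAULT_REDUCTION_FACTOR = 0.1


def reduction_factor(distinct_values=None):
    """Estimate the reduction factor of an equality conjunct.

    Assumes values are uniformly distributed when the count is known.
    """
    if distinct_values:
        return 1.0 / distinct_values
    return DEFAULT_REDUCTION_FACTOR


def combined_reduction_factor(factors):
    """Approximate the fraction of tuples satisfying all conjuncts (independence assumed)."""
    return prod(factors)


# bid = 5 AND sid = 3, with no statistics for either attribute.
factors = [reduction_factor(), reduction_factor()]
combined = combined_reduction_factor(factors)

# Reserves holds 1000 pages * 100 tuples/page = 100,000 tuples.
expected_tuples = combined * 100_000
print(combined, expected_tuples)  # about 0.01 and about 1000 (floating point)
```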


Example of Selectivity...

✥ What about a range condition like day > '8/9/2002'?
✥ Assume a uniform distribution
✥ If we have a B-tree T on day, the reduction factor is:
\[ \frac{High(T) - value}{High(T) - Low(T)} \]
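
As a rough sketch, the range estimate can be written as a small Python function. The function name and the clamping to the interval [0, 1] are additions for this example; the formula itself is the one on the slide and relies on the uniform-distribution assumption.

```python
from datetime import date


def range_reduction_factor(low, high, value):
    """Estimate the fraction of keys with key > value, assuming a uniform distribution.

    low and high play the role of Low(T) and High(T) from the catalog.
    """
    if high == low:
        # Degenerate index range: every key has the same value.
        return 0.0 if value >= high else 1.0
    fraction = (high - value) / (high - low)
    # Clamp, since value may lie outside the indexed range.
    return min(1.0, max(0.0, fraction))


print(range_reduction_factor(0, 100, 75))
# Dates work too, because subtracting dates yields timedeltas that can be divided.
print(range_reduction_factor(date(2002, 1, 1), date(2002, 1, 11), date(2002, 1, 6)))
```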


Algorithms for Selection

For a query σ_{R.attr op value}(R) we have the following alternatives:

✥ No index: scan
✥ Index: the cost depends on:
✦ is it clustered or unclustered?
✦ what is the reduction factor of the expression?
✥ Rule of thumb: for an unclustered index, if over 5% of tuples are
expected to match, a file scan is usually cheaper
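
A back-of-the-envelope cost comparison illustrates why the rule of thumb holds. The cost model below is a simplifying assumption made for this sketch, not the lecture's exact model: a scan reads every page, a clustered index reads the matching fraction of pages, an unclustered index pays up to one I/O per matching tuple, and the cost of traversing the index itself is ignored.

```python
def selection_costs(npages, ntuples, reduction_factor):
    """Rough I/O estimates for the three ways of evaluating a selection."""
    return {
        "file scan": npages,
        "clustered index": reduction_factor * npages,
        "unclustered index": reduction_factor * ntuples,
    }


def cheapest_access_path(npages, ntuples, reduction_factor, index=None):
    """Pick the cheaper of a file scan and the available index (if any).

    index is None, "clustered" or "unclustered".
    """
    costs = selection_costs(npages, ntuples, reduction_factor)
    candidates = {"file scan": costs["file scan"]}
    if index is not None:
        name = f"{index} index"
        candidates[name] = costs[name]
    return min(candidates, key=candidates.get)


# Reserves: 1000 pages, 100,000 tuples.
print(cheapest_access_path(1000, 100_000, 0.05, "unclustered"))
print(cheapest_access_path(1000, 100_000, 0.05, "clustered"))
print(cheapest_access_path(1000, 100_000, 0.001, "unclustered"))
```

Under these assumptions, 5% of Reserves means roughly 5000 tuple fetches through an unclustered index against 1000 page reads for a scan.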


Projection

✥ Simple to implement, except when DISTINCT is used
✥ If no duplicates need to be eliminated:
✦ Simply retrieve the tuples and eliminate unwanted columns
✦ We might be able to do this with an index. How?
✥ If we need to drop duplicates, we need to sort the data. Two options:
✦ 1. Remove columns, sort, eliminate duplicates
✦ 2. Remove columns during the first pass of the sort, continue
sorting, and eliminate duplicates in the last pass of the sort
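
Option 1 can be sketched in memory as follows. This is an illustration only: a real DBMS would use an external sort over pages, and the function name and the use of dictionaries for tuples are assumptions of the example.

```python
def project(tuples, columns, distinct=False):
    """Keep only the requested columns; optionally drop duplicates by sorting."""
    projected = [tuple(row[c] for c in columns) for row in tuples]
    if not distinct:
        return projected
    # After sorting, equal rows are adjacent, so one pass removes duplicates.
    projected.sort()
    result = []
    for row in projected:
        if not result or result[-1] != row:
            result.append(row)
    return result


sailors = [
    {"sid": 1, "sname": "Dustin", "rating": 7},
    {"sid": 2, "sname": "Lubber", "rating": 8},
    {"sid": 3, "sname": "Rusty", "rating": 7},
]
print(project(sailors, ["rating"]))
print(project(sailors, ["rating"], distinct=True))
```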


Projection with Indexes

✥ If we have an index with all the fields in the projection, then we
only need to scan the leaf pages of the index
✥ If DISTINCT is used and the projected attributes are a prefix of the
key of a B-tree, then no separate sort is needed; an in-order scan
of the index is enough:
✦ Duplicates are adjacent!


Join

✥ It is a common and expensive operation
✥ For example: a join on Reserves.sid = Sailors.sid
✥ Suppose we have an index for Sailors on the sid column
✥ We can scan Reserves and, for each tuple, use the index to find
the matching Sailors tuple.
✥ This algorithm is known as index nested loops join
✥ Assume we have a hash-based index using Alternative 2 on sid of
Sailors, and that it takes 1.2 I/Os on average to retrieve an index entry.
✦ There is one sailor per sid, hence one tuple to retrieve per
reservation
✦ Reserves is 1000 pages long (100 tuples per page)
✦ Total cost: 1000 + 100,000 ∗ (1.2 + 1) = 221,000 ≈ 2.2 ∗ 10^5 I/Os
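
The following Python sketch shows both the cost estimate and the shape of the algorithm. A dictionary stands in for the hash index on Sailors.sid, and the function names and sample tuples are made up for the example; the cost figures are the ones on the slide.

```python
def index_nested_loops_cost(outer_pages, tuples_per_page, probe_cost, fetch_cost):
    """I/O estimate: scan the outer table, then probe the index once per outer tuple."""
    outer_tuples = outer_pages * tuples_per_page
    return outer_pages + outer_tuples * (probe_cost + fetch_cost)


def index_nested_loops_join(reserves, sailors_by_sid):
    """For each Reserves tuple, look up the matching Sailors tuple through the index."""
    for reservation in reserves:
        sailor = sailors_by_sid.get(reservation["sid"])
        if sailor is not None:
            yield {**sailor, **reservation}


# 1000 pages of Reserves, 100 tuples/page, 1.2 I/Os per probe, 1 I/O per Sailors tuple.
print(index_nested_loops_cost(1000, 100, 1.2, 1.0))  # roughly 221,000 I/Os

sailors_by_sid = {22: {"sid": 22, "sname": "Dustin"}, 31: {"sid": 31, "sname": "Lubber"}}
reserves = [{"sid": 22, "bid": 101}, {"sid": 31, "bid": 102}, {"sid": 22, "bid": 103}]
for row in index_nested_loops_join(reserves, sailors_by_sid):
    print(row)
```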


Another Alternative: sort

✥ Sort both tables, then join

✥ Called Sort-Merge Join
✥ Assume we can sort them in 2 passes each
✥ Cost:
✦ Sailors is 500 pages, Reserves is 1000
✦ Cost of sorting: 4 ∗ (500 + 1000) = 6000
✦ One more scan of sorted tables
✦ Total: 7500 I/Os
✥ Cheaper, and data already sorted!
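
The cost figures above, together with the merge step, can be sketched in Python. This in-memory version is an illustration under assumptions: it uses Python's built-in sort instead of an external sort, and the function names and sample tuples are hypothetical. The factor 2 in the sorting cost reflects that each pass reads and writes every page.

```python
def sort_merge_join_cost(r_pages, s_pages, passes=2):
    """I/O estimate: each sort pass reads and writes every page, then one merging scan."""
    sort_cost = 2 * passes * (r_pages + s_pages)
    merge_cost = r_pages + s_pages
    return sort_cost + merge_cost


def sort_merge_join(left, right, key):
    """Sort both inputs on key, then merge them, pairing tuples with equal keys."""
    left = sorted(left, key=lambda row: row[key])
    right = sorted(right, key=lambda row: row[key])
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        left_key, right_key = left[i][key], right[j][key]
        if left_key < right_key:
            i += 1
        elif left_key > right_key:
            j += 1
        else:
            # Find the group of right tuples that share this key.
            j_end = j
            while j_end < len(right) and right[j_end][key] == left_key:
                j_end += 1
            # Pair every left tuple with this key against that group.
            while i < len(left) and left[i][key] == left_key:
                for k in range(j, j_end):
                    result.append({**left[i], **right[k]})
                i += 1
            j = j_end
    return result


print(sort_merge_join_cost(500, 1000))  # 4 * 1500 + 1500 = 7500

reserves = [{"sid": 31, "bid": 102}, {"sid": 22, "bid": 101}, {"sid": 22, "bid": 103}]
sailors = [{"sid": 22, "sname": "Dustin"}, {"sid": 31, "sname": "Lubber"}]
print(sort_merge_join(reserves, sailors, "sid"))
```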


But sometimes index nested loops joins are desirable

✥ Suppose we only want the join for boat 101
✥ If Reserves has an index on bid, we can retrieve just the
reservations for boat 101 instead of reading every reservation.
✥ The decision of which join algorithm to use is based on the query
as a whole, including selections and projections.


Group By and Set Operations

✥ Set operations usually require sorting the result to eliminate
duplicates
✥ Group-by is also usually implemented with sorting
✦ Aggregates are implemented with in-memory counters
✦ If there is a clustered index with the group-by attributes, it can
be used to scan the table
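
A minimal sketch of sort-based grouping with in-memory counters, assuming tuples are dictionaries and that we want a count and an average per group. The function name is hypothetical.

```python
from itertools import groupby
from operator import itemgetter


def group_by_count_avg(tuples, group_attr, agg_attr):
    """Sort on the grouping attribute, then keep running counters per group."""
    ordered = sorted(tuples, key=itemgetter(group_attr))
    result = []
    # After sorting, all tuples of a group are adjacent.
    for key, group in groupby(ordered, key=itemgetter(group_attr)):
        count = 0
        total = 0.0
        for row in group:
            count += 1
            total += row[agg_attr]
        result.append((key, count, total / count))
    return result


sailors = [
    {"rating": 7, "age": 20.0},
    {"rating": 8, "age": 30.0},
    {"rating": 7, "age": 40.0},
]
print(group_by_count_avg(sailors, "rating", "age"))
```

Only one counter and one running total are needed at a time, because a group is finished as soon as the grouping value changes.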


Introduction to Query Optimization

✥ One of the most important tasks of a DBMS

✥ The same query can be expressed in many ways
✥ This flexibility makes it easy to write queries
✥ A given query can be evaluated in many ways, some cheaper than
others (orders of magnitude difference)
✥ Good performance relies greatly on the quality of the query
optimizer
✥ See figure 12.2


Optimizing Queries

✥ Queries can be seen as σ, π, ⋈ algebra expressions
✥ Optimizing an expression involves these basic steps:
– Enumerating alternative plans for evaluating the expression
– Estimating the cost of each plan
– Choosing the plan with the lowest estimated cost
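
In miniature, the three steps look like this. The sketch reuses the two join estimates worked out on the earlier join slides as the enumerated alternatives; the function name is hypothetical, and a real optimizer considers far more plans.

```python
def choose_cheapest(plan_costs):
    """plan_costs maps a plan name to its estimated I/O cost."""
    name = min(plan_costs, key=plan_costs.get)
    return name, plan_costs[name]


# Step 1 and 2: enumerate alternatives and estimate their cost in I/Os.
estimates = {
    "index nested loops join": 1000 + 100_000 * 2.2,
    "sort-merge join": 4 * (500 + 1000) + (500 + 1000),
}

# Step 3: choose the plan with the lowest estimated cost.
print(choose_cheapest(estimates))
```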


Query Evaluation Plans

✥ A query evaluation plan (or simply plan) consists of an
extended relational algebra tree, with additional annotations at
each node indicating:
✦ the access methods to use for each table
✦ the implementation method to use for each relational operator


Query Evaluation Plan...
SELECT S.sname FROM Reserves R, Sailors S WHERE
R.sid = S.sid AND R.bid = 100 AND S.rating > 5

✥ This query can be expressed as:

π_{sname}(σ_{bid=100 ∧ rating>5}(Reserves ⋈_{sid=sid} Sailors))

            π_{sname}
                |
     σ_{bid=100 ∧ rating>5}
                |
          ⋈_{sid=sid}
           /        \
     Reserves      Sailors


Query Evaluation Plan...

✥ We also have to decide on the implementation of each operator

             π_{sname} (On-the-fly)
                     |
    σ_{bid=100 ∧ rating>5} (On-the-fly)
                     |
      ⋈_{sid=sid} (Simple nested loops)
              /              \
(File Scan) Reserves        Sailors (File Scan)

✥ This tree is a query evaluation plan for the SELECT
✥ Convention: the outer table is the left child of a join
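
One way to hold such an annotated tree in a program is sketched below. The PlanNode class and its field names are invented for this example; the operators and annotations are the ones from the plan above, with the outer table as the first child of the join.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PlanNode:
    operator: str  # relational operator or table name
    method: str    # annotation: access method or implementation method
    children: List["PlanNode"] = field(default_factory=list)

    def render(self, depth=0):
        """Return the plan as indented lines, one node per line."""
        lines = ["  " * depth + f"{self.operator} ({self.method})"]
        for child in self.children:
            lines.extend(child.render(depth + 1))
        return lines


plan = PlanNode("project sname", "on-the-fly", [
    PlanNode("select bid=100 AND rating>5", "on-the-fly", [
        PlanNode("join sid=sid", "simple nested loops", [
            PlanNode("Reserves", "file scan"),  # outer table: left child
            PlanNode("Sailors", "file scan"),   # inner table: right child
        ]),
    ]),
])

print("\n".join(plan.render()))
```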


Multi Operator Queries

✥ When the query involves several operators, sometimes the result
of one is pipelined into the next
✥ In this case, no temporary relation is written to disk
(materialized)
✥ The result is fed to the next operator as soon as it is available
✥ It is cheaper!
✥ When the input table to a unary operator is pipelined into it, we
say it is applied on-the-fly


Pipelining
        ⋈
       / \
      ⋈   C
     / \
    A   B

Result tuples of the first join (A ⋈ B) are pipelined into the join with C

✥ Pipelining is a control strategy
✥ Results are produced one page at a time, used and then discarded


The Iterator Interface

✥ Once the evaluation plan is decided, it is executed by calling the
operators in some order (possibly interleaved)
✥ Each operator has one or more inputs
✥ Each operator passes its result tuples to the next operator
✥ Materialization is usually done at the input stage of an operator
✥ When is materialization needed?
✥ Internally an operator has a uniform iterator interface:
– open, get_next, close
– It encapsulates materialization or on-the-fly processing
– It also encapsulates use of indexes
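
A minimal sketch of the iterator interface in Python, assuming in-memory lists stand in for tables and that get_next returns None at the end of the input. The class names Scan, Select and Project and the execute helper are illustrative and are not taken from any particular DBMS.

```python
class Scan:
    """Leaf operator: iterates over an in-memory list standing in for a table."""

    def __init__(self, tuples):
        self.tuples = tuples
        self.position = None

    def open(self):
        self.position = 0

    def get_next(self):
        if self.position >= len(self.tuples):
            return None  # end of input
        row = self.tuples[self.position]
        self.position += 1
        return row

    def close(self):
        self.position = None


class Select:
    """Unary operator applied on-the-fly: passes on rows that satisfy a predicate."""

    def __init__(self, child, predicate):
        self.child = child
        self.predicate = predicate

    def open(self):
        self.child.open()

    def get_next(self):
        row = self.child.get_next()
        while row is not None and not self.predicate(row):
            row = self.child.get_next()
        return row

    def close(self):
        self.child.close()


class Project:
    """Unary operator applied on-the-fly: keeps only the requested columns."""

    def __init__(self, child, columns):
        self.child = child
        self.columns = columns

    def open(self):
        self.child.open()

    def get_next(self):
        row = self.child.get_next()
        if row is None:
            return None
        return {c: row[c] for c in self.columns}

    def close(self):
        self.child.close()


def execute(root):
    """Drive the plan by pulling tuples from the root operator."""
    root.open()
    rows = []
    row = root.get_next()
    while row is not None:
        rows.append(row)
        row = root.get_next()
    root.close()
    return rows


sailors = [
    {"sid": 22, "sname": "Dustin", "rating": 7},
    {"sid": 29, "sname": "Brutus", "rating": 1},
    {"sid": 31, "sname": "Lubber", "rating": 8},
]
plan = Project(Select(Scan(sailors), lambda row: row["rating"] > 5), ["sname"])
print(execute(plan))
```

Because every operator exposes the same three calls, a parent does not need to know whether its child scans a file, uses an index, or materializes its input first.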
