
Chapter 2: Query Processing

The document describes the process of query processing which involves 3 main steps: 1) Parsing and translating the query into an internal representation like a query tree. 2) Optimization to choose the most efficient execution plan by estimating the cost of different plans. 3) Evaluation which executes the chosen plan and returns results to the user. Query optimization is the key step that determines the best execution strategy to minimize resources like time and space. The optimizer considers different access methods, join algorithms, and database statistics to select the lowest cost plan.

Uploaded by

Musariri Talent


OVERVIEW OF QUERY PROCESSING

Query processing is defined as follows:

i) A three-step process that transforms a high-level query (in relational calculus or SQL) into an equivalent and more efficient lower-level query (in relational algebra).
ii) The entire activity of translating a query into low-level instructions, optimizing it to save resources, estimating the cost of evaluating it, and extracting the data from the database. The goal is to find an efficient execution plan for a given SQL query, one that minimizes cost, especially time.
iii) The activities involved in parsing, validating, optimizing and executing a query.
The basic steps in query processing are shown in the diagram below:

I. Parsing and translation

• Translate the query into its internal form, which is then translated into relational algebra.
• The parser checks syntax and verifies relations.
II. Optimization
This is the activity of choosing an efficient execution strategy for processing a query. Among all equivalent evaluation plans, choose the one with the lowest cost. Cost is estimated using statistical information from the database catalog,
• e.g. number of tuples in each relation, size of tuples, etc.
III. Evaluation
• The query-execution engine takes a query-evaluation plan, executes that plan, and returns the answers to the query.

QUERY PROCESSING EXAMPLE

Relations: EMP(ENO, ENAME, TITLE), ASG(ENO,PNO,RESP,DUR)
Query: Find the names of employees who have an assignment whose duration (DUR) is greater than 37.
– High level query
SELECT ENAME
FROM EMP,ASG
WHERE EMP.ENO = ASG.ENO AND DUR > 37
Step 1: Parsing
In this step, the parser of the query processor module checks the syntax of the query, the
user’s privileges to execute the query, the table names and attribute names etc. The correct
table names, attribute names and the privilege of the users can be taken from the system
catalog (data dictionary).
Step 2: Translation
If we have written a valid query, then it is converted from high level language SQL to low
level instruction in Relational Algebra.
– Two possible transformations of the query are:

∗ Expression 1: Π_{ENAME}(σ_{DUR>37 ∧ EMP.ENO=ASG.ENO}(EMP × ASG))

∗ Expression 2: Π_{ENAME}(EMP ⋈_{ENO} σ_{DUR>37}(ASG))

Step 3: Optimizer
The optimizer uses the statistical data stored as part of the data dictionary. The statistical data
are information about the size of the table, the length of records, the indexes created on the
table etc. The optimizer also checks for the conditions and conditional attributes which are
parts of the query.
Step 4: Execution Plan

A query can be expressed in many ways. At this stage the query processor uses the information collected in step 3 to find different relational algebra expressions that are equivalent to the one we have written, i.e. that return the same result. So far we have two execution plans. The only condition is that both plans give the same result.
Step 5: Evaluation
Although the candidate execution plans return the same result, they differ in the time needed to execute the query and the space it requires. Hence one plan must be chosen, namely the one with the lowest cost.
In our case Expression 2 avoids the large and expensive intermediate Cartesian product, and is therefore typically better.
Output:
The final result is shown to the user.
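
The difference between the two expressions can be illustrated with a minimal Python sketch. The sample tuples and the function names are hypothetical; only the schemas EMP(ENO, ENAME, TITLE) and ASG(ENO, PNO, RESP, DUR) come from the example. It evaluates both expressions and records the size of the intermediate result each one builds.

```python
from itertools import product

# Hypothetical sample data; only the schemas come from the example.
EMP = [
    {"ENO": 1, "ENAME": "Ann", "TITLE": "Engineer"},
    {"ENO": 2, "ENAME": "Bob", "TITLE": "Analyst"},
    {"ENO": 3, "ENAME": "Cat", "TITLE": "Programmer"},
]
ASG = [
    {"ENO": 1, "PNO": "P1", "RESP": "Manager", "DUR": 40},
    {"ENO": 2, "PNO": "P1", "RESP": "Analyst", "DUR": 12},
    {"ENO": 3, "PNO": "P2", "RESP": "Programmer", "DUR": 48},
]

def expression_1(emp, asg):
    """Cartesian product first, then selection, then projection."""
    pairs = list(product(emp, asg))  # EMP x ASG
    selected = [(e, a) for e, a in pairs
                if a["DUR"] > 37 and e["ENO"] == a["ENO"]]
    return {e["ENAME"] for e, _ in selected}, len(pairs)

def expression_2(emp, asg):
    """Selection on ASG first, then join with EMP on ENO, then projection."""
    filtered = [a for a in asg if a["DUR"] > 37]
    by_eno = {}
    for a in filtered:
        by_eno.setdefault(a["ENO"], []).append(a)
    joined = [(e, a) for e in emp for a in by_eno.get(e["ENO"], [])]
    return {e["ENAME"] for e, _ in joined}, len(joined)

names_1, size_1 = expression_1(EMP, ASG)
names_2, size_2 = expression_2(EMP, ASG)
print(names_1 == names_2, size_1, size_2)
```

Both functions return the same set of names, but the first one materialises every EMP and ASG pair before filtering, while the second only ever holds the matching pairs.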
PHASES IN QUERY PROCESSING

a) QUERY DECOMPOSITION

It aims to transform a high level query into a relational algebra query and to check that the
query is syntactically and semantically correct. The stages involved here include:

a) Analysis- where the query is analyzed; this stage also verifies that the relations and attributes specified in the query are defined in the system catalog and that any operations applied to database objects are appropriate for the object type, e.g.

You have the following information:

Table: Staff

Attribute            Format    Length   Primary   Null
EmployeeNo           Int       4        Yes       No
EmployeeName         Varchar   25       No        Yes
Date_of_birth        Date      -        No        Yes
Position             Char      10       No        No
National_id_number   Varchar   11       No        No

SQL Query

SELECT staffNo FROM Staff WHERE position>10

This query would be rejected on two grounds:

i) The attribute staffNo is not defined in the table Staff

ii) In the WHERE clause, the comparison ‘>10’ is incompatible with the data
type for position which is a character
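
The two checks above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the catalog is a plain dictionary built from the Staff table, the function name `analyse` and the `(attribute, operator, constant)` predicate format are hypothetical, and attribute names are matched case-insensitively.

```python
# Catalog entry for Staff, taken from the table above (attribute -> format).
CATALOG = {
    "Staff": {
        "EmployeeNo": "Int",
        "EmployeeName": "Varchar",
        "Date_of_birth": "Date",
        "Position": "Char",
        "National_id_number": "Varchar",
    }
}
NUMERIC_TYPES = {"Int"}

def analyse(table, select_attrs, predicates):
    """Return a list of error messages; predicates are (attr, op, constant)."""
    if table not in CATALOG:
        return [f"relation {table} is not defined"]
    # Assumption: attribute names are compared case-insensitively.
    columns = {name.lower(): fmt for name, fmt in CATALOG[table].items()}
    errors = []
    for attr in select_attrs:
        if attr.lower() not in columns:
            errors.append(f"attribute {attr} is not defined in {table}")
    for attr, op, constant in predicates:
        fmt = columns.get(attr.lower())
        if fmt is None:
            errors.append(f"attribute {attr} is not defined in {table}")
        elif isinstance(constant, (int, float)) and fmt not in NUMERIC_TYPES:
            errors.append(f"{attr} {op} {constant}: {fmt} compared with a number")
    return errors

rejected = analyse("Staff", ["staffNo"], [("position", ">", 10)])
accepted = analyse(
    "Staff",
    ["EmployeeName"],
    [("Position", "=", "manager"), ("Position", "=", "clerk"),
     ("EmployeeNo", ">", 1283)],
)
print(rejected)
print(accepted)
```

The first call reports one error per ground for rejection; the second call returns an empty list because every attribute exists and every comparison matches the attribute's format.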
However let us consider the following query:

SELECT EmployeeName FROM Staff WHERE (Position = 'manager' OR Position = 'clerk') AND EmployeeNo > 1283

This query will be processed since it is correct. On completion of this stage, the high
level query is transformed into some internal representation that is more suitable for
processing. The internal representation chosen is a QUERY TREE which is
constructed as follows:

• A leaf node is created for each base relation in the query.
• A non-leaf node is created for each intermediate relation produced by a relational algebra operation.
• The root of the tree represents the result of the query.
• The sequence of operations is directed from the leaves to the root.
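
The construction rules above can be sketched with a small Python data structure. The `Node` class and `evaluation_order` function are hypothetical names chosen for illustration; the tree built at the end corresponds to the Staff query, with one leaf for the base relation and one non-leaf node per relational algebra operation.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    label: str                                   # relation name or operation
    children: List["Node"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        return not self.children

def evaluation_order(node: Node) -> List[str]:
    """Post-order walk: operations are listed from the leaves up to the root."""
    order = []
    for child in node.children:
        order.extend(evaluation_order(child))
    order.append(node.label)
    return order

# Leaf for the base relation, then one node per operation, root last.
staff = Node("Staff")
selection = Node(
    "select[(Position='manager' OR Position='clerk') AND EmployeeNo>1283]",
    [staff],
)
root = Node("project[EmployeeName]", [selection])

order = evaluation_order(root)
print(order)
```

A post-order traversal is used because it visits children before their parent, which matches the leaves-to-root direction of execution.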

b) Normalization- this stage converts the query into a normalised form that can be more easily manipulated. There are two normal forms here:

i) Conjunctive normal form- a sequence of conjuncts that are connected with the ∧ (AND) operator. Each conjunct contains one or more terms connected by the ∨ (OR) operator, e.g.

(Position = “manager” ∨ Position = “clerk”) ∧ EmployeeNo > 1283

N.B. A conjunctive selection contains only those tuples that satisfy all conjuncts.

ii) Disjunctive normal form- a sequence of disjuncts that are connected with the ∨ (OR) operator. Each disjunct contains one or more terms connected by the ∧ (AND) operator, e.g.

(Position = “manager” ∧ EmployeeNo > 1283) ∨ (Position = “clerk” ∧ EmployeeNo > 1283)

N.B. A disjunctive selection contains those tuples formed by the union of all tuples that satisfy the disjuncts.
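
That the two normal forms above describe the same predicate can be checked with a short truth-table sketch in Python. The three boolean parameters are stand-ins (an assumption made for illustration) for the terms Position = “manager”, Position = “clerk” and EmployeeNo > 1283.

```python
from itertools import product

def cnf(manager, clerk, no_gt_1283):
    # (Position = manager OR Position = clerk) AND EmployeeNo > 1283
    return (manager or clerk) and no_gt_1283

def dnf(manager, clerk, no_gt_1283):
    # (Position = manager AND EmployeeNo > 1283)
    #   OR (Position = clerk AND EmployeeNo > 1283)
    return (manager and no_gt_1283) or (clerk and no_gt_1283)

# Compare the two forms on every combination of truth values.
equivalent = all(
    cnf(*values) == dnf(*values)
    for values in product([False, True], repeat=3)
)
print(equivalent)
```

The check covers all eight combinations, including ones that cannot occur in real data (an employee who is both manager and clerk), so it confirms logical equivalence rather than equivalence on a particular table.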

c) Semantic Analysis- the objective is to reject normalised queries that are incorrectly formulated or contradictory. A query is incorrectly formulated if components do not contribute to the generation of the result, which may happen if some join specifications are missing. A query is contradictory if its predicate cannot be satisfied by any tuple, e.g. an employee cannot be both a manager and a clerk (Position = “manager” ∧ Position = “clerk”). To ascertain the correctness of queries one can construct a Relation Connection Graph or a Normalised Attribute Connection Graph.
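
A very small part of this check, detecting a contradictory conjunction of equality terms, can be sketched in Python. This is a simplified illustration, not the graph-based method named above; the function name and the `(attribute, constant)` representation are assumptions.

```python
def is_contradictory(conjuncts):
    """conjuncts: list of (attribute, constant) equality terms joined by AND.

    The conjunction is contradictory if one attribute is required to equal
    two different constants at the same time.
    """
    required = {}
    for attribute, constant in conjuncts:
        if attribute in required and required[attribute] != constant:
            return True
        required[attribute] = constant
    return False

both_roles = is_contradictory([("Position", "manager"), ("Position", "clerk")])
one_role = is_contradictory([("Position", "manager"), ("EmployeeNo", 1284)])
print(both_roles, one_role)
```

The sketch handles only equality terms; range predicates such as EmployeeNo > 1283 would need interval reasoning, which is left out here.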

d) Simplification- the objective is to detect redundant qualifications, eliminate common
sub-expressions and transform the query to a semantically equivalent but more easily
and efficiently computed form.
e) Query Restructuring- the query is restructured to provide a more efficient
implementation.

b) QUERY OPTIMIZATION

Purpose of Query Optimization

Query optimization attempts to generate the best execution plan for a SQL statement. The
best execution plan is defined as the plan with the lowest cost among all considered candidate
plans. The cost computation accounts for factors of query execution such as I/O, CPU, and
communication.

The best method of execution depends on myriad conditions including how the query is
written, the size of the data set, the layout of the data, and which access structures exist. The
optimizer determines the best plan for a SQL statement by examining multiple access
methods, such as full table scan or index scans, and different join methods such as nested
loops and hash joins.

Because the database has many internal statistics and tools at its disposal, the optimizer is
usually in a better position than the user to determine the best method of statement execution.
For this reason, all SQL statements use the optimizer.

Consider a user who queries records for employees who are managers. If the database
statistics indicate that 80% of employees are managers, then the optimizer may decide that a
full table scan is most efficient. However, if statistics indicate that few employees are
managers, then reading an index followed by a table access by rowid may be more efficient
than a full table scan.
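
The manager example can be made concrete with a toy cost model in Python. Every number and the cost formulas themselves are assumptions chosen for illustration (a full scan reads every block once; an index lookup reads the index levels plus, in the worst case, one block per matching row); real optimizers use far more detailed models.

```python
import math

def choose_access_path(total_rows, rows_per_block, selectivity, index_height=2):
    """Pick the cheaper access path under a toy block-read cost model."""
    # Full table scan: read every block of the table once.
    full_scan_cost = math.ceil(total_rows / rows_per_block)
    # Index access: walk the index, then fetch one block per matching row.
    matching_rows = round(total_rows * selectivity)
    index_cost = index_height + matching_rows
    if full_scan_cost <= index_cost:
        return "full table scan", full_scan_cost
    return "index scan", index_cost

# Hypothetical table: 10,000 employees, 100 rows per block.
many_managers = choose_access_path(10_000, 100, selectivity=0.8)
few_managers = choose_access_path(10_000, 100, selectivity=0.001)
print(many_managers)
print(few_managers)
```

With 80% of rows matching, fetching rows one at a time through the index costs far more block reads than scanning the table; with very few matches the index wins.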

Optimizer Components

The optimizer contains three main components, which are shown below:

Figure 1: Optimizer Components

1. Query transformer

The optimizer determines whether it is helpful to change the form of the query so that
the optimizer can generate a better execution plan.

For some statements, the query transformer determines whether it is advantageous to rewrite the original SQL statement into a semantically equivalent SQL statement with a lower cost. When a viable alternative exists, the database calculates the cost of the alternatives separately and chooses the lowest-cost alternative.

The optimizer employs several query transformation techniques, including OR expansion, view merging, predicate pushing and star transformation.

2. Estimator

The optimizer estimates the cost of each plan based on statistics in the data dictionary. The
estimator is the component of the optimizer that determines the overall cost of a given
execution plan.

The estimator uses three different measures to determine cost:

• Selectivity

The fraction of rows in the row set that the query selects, with 0 meaning no rows and 1 meaning all rows. Selectivity is tied to a query predicate, such as WHERE last_name LIKE 'A%', or a combination of predicates. A predicate becomes more selective as the selectivity value approaches 0 and less selective (or more unselective) as the value approaches 1.

• Cardinality

The cardinality is the number of rows returned by each operation in an execution plan. This input, which is crucial to obtaining an optimal plan, is common to all cost functions. The estimator can derive cardinality from the table statistics collected by DBMS_STATS, or derive it after accounting for effects from predicates (filter, join, and so on), DISTINCT or GROUP BY operations, and so on. The Rows column in an execution plan shows the estimated cardinality.

• Cost

This measure represents units of work or resource used. The query optimizer uses disk I/O, CPU usage, and memory usage as units of work.
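
The relationship between selectivity and cardinality can be sketched in Python. The sample of last names and the table size are hypothetical, and computing selectivity by testing a sample is a simplification: an actual estimator would work from stored statistics rather than from the rows themselves.

```python
def selectivity(rows, predicate):
    """Fraction of rows (between 0 and 1) that satisfy the predicate."""
    return sum(1 for row in rows if predicate(row)) / len(rows)

# Hypothetical sample of last_name values.
sample = ["Adams", "Allen", "Baker", "Clark", "Avery", "Davis", "Evans", "Abbot"]

# Equivalent of: WHERE last_name LIKE 'A%'
sel = selectivity(sample, lambda name: name.startswith("A"))

# Estimated cardinality = selectivity x number of rows in the table.
table_rows = 10_000  # assumed table size
estimated_cardinality = round(sel * table_rows)
print(sel, estimated_cardinality)
```

Four of the eight sampled names start with "A", so the predicate's selectivity is 0.5 and the estimate for the assumed 10,000-row table is 5,000 rows.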

3. Plan Generator

The optimizer compares the costs of plans and chooses the lowest-cost plan, known as
the execution plan, to pass to the row source generator.

The plan generator explores various plans for a query block by trying out different access
paths, join methods, and join orders. Many plans are possible because of the various
combinations that the database can use to produce the same result. The optimizer picks the
plan with the lowest cost.
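
The search over join orders can be sketched as an exhaustive enumeration in Python. The row counts, the PROJ relation, the single join selectivity applied to every pair of relations, and the cost function (the sum of intermediate result sizes of a left-deep plan) are all assumptions for illustration; a real plan generator also varies access paths and join methods and prunes the search space.

```python
from itertools import permutations

# Hypothetical row counts; PROJ is an invented third relation.
ROWS = {"EMP": 1000, "ASG": 5000, "PROJ": 50}
JOIN_SELECTIVITY = 0.001  # assumed to be the same for every join

def plan_cost(order):
    """Sum of intermediate result sizes for a left-deep join in this order."""
    size = ROWS[order[0]]
    cost = 0.0
    for name in order[1:]:
        size = size * ROWS[name] * JOIN_SELECTIVITY  # estimated join output
        cost += size
    return cost

candidates = list(permutations(ROWS))
best_order = min(candidates, key=plan_cost)
print(best_order, plan_cost(best_order))
```

Orders that bring in the small relation early keep the intermediate results small, which is why they receive a lower cost in this model than orders that join the two largest relations first.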

c) CODE GENERATION

Code is generated to execute the selected plan.

d) RUNTIME QUERY EXECUTION

The query is executed at run time and the final result displayed.
