29-Query Optimization-04-10-2024
• SQL
SELECT LNAME
FROM EMPLOYEE, WORKS_ON, PROJECT
WHERE PNAME = 'Aquarius' AND PNUMBER = PNO
AND ESSN = SSN AND BDATE > '1957-12-31';
Steps in converting a query tree during heuristic optimization.
Canonical query tree
(a) Executing this tree directly would first create a very large file
containing the CARTESIAN PRODUCT of the entire
EMPLOYEE, WORKS_ON, and PROJECT files. For this reason
the initial query tree is never executed as is; it is
transformed into an equivalent tree that is efficient to
execute. This particular query needs only one record from
the PROJECT relation (the one for the 'Aquarius' project) and
only the EMPLOYEE records for those whose date of birth
is after '1957-12-31'.
Moving SELECT operations down the query tree
Figure (b) shows an improved query tree that first applies the
SELECT operations to reduce the number of tuples that
appear in the CARTESIAN PRODUCT.
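The effect of pushing SELECT operations down can be sketched on a toy instance. The rows below are made-up assumptions (they are not data from the slides), and the helper name `join_filter` is hypothetical; the point is only to compare how many tuples the CARTESIAN PRODUCT holds in tree (a) versus tree (b).

```python
from itertools import product

# Tiny illustrative instances (assumed data, not from the slides).
EMPLOYEE = [
    {"SSN": 1, "LNAME": "Smith", "BDATE": "1965-01-09"},
    {"SSN": 2, "LNAME": "Wong", "BDATE": "1955-12-08"},
    {"SSN": 3, "LNAME": "English", "BDATE": "1972-07-31"},
]
WORKS_ON = [
    {"ESSN": 1, "PNO": 10},
    {"ESSN": 2, "PNO": 10},
    {"ESSN": 3, "PNO": 20},
]
PROJECT = [
    {"PNUMBER": 10, "PNAME": "Aquarius"},
    {"PNUMBER": 20, "PNAME": "ProductX"},
]

def join_filter(rows):
    """Keep combined rows that satisfy the two join conditions."""
    return [(e, w, p) for e, w, p in rows
            if p["PNUMBER"] == w["PNO"] and w["ESSN"] == e["SSN"]]

# (a) Canonical tree: full product first, every condition afterwards.
full_product = list(product(EMPLOYEE, WORKS_ON, PROJECT))
canonical = [e["LNAME"] for e, w, p in join_filter(full_product)
             if p["PNAME"] == "Aquarius" and e["BDATE"] > "1957-12-31"]

# (b) SELECTs pushed down: filter each relation before the product.
young = [e for e in EMPLOYEE if e["BDATE"] > "1957-12-31"]
aquarius = [p for p in PROJECT if p["PNAME"] == "Aquarius"]
small_product = list(product(young, WORKS_ON, aquarius))
pushed = [e["LNAME"] for e, w, p in join_filter(small_product)]

print(len(full_product), len(small_product))  # 18 6
print(canonical, pushed)                      # ['Smith'] ['Smith']
```

Both trees return the same answer; only the size of the intermediate product changes.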
(c) Applying the more restrictive SELECT operation first
SELECT LNAME
FROM EMPLOYEE, WORKS_ON, PROJECT
WHERE PNAME = 'Aquarius' AND
PNUMBER = PNO AND
ESSN = SSN AND
BDATE > '1957-12-31'
(c) A further improvement is achieved by switching the
positions of the EMPLOYEE and PROJECT relations in the
tree, as shown in Figure (c). This uses the information that
Pnumber is a key attribute of the PROJECT relation, and
hence the SELECT operation on the PROJECT relation will
retrieve a single record only.
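A back-of-the-envelope calculation shows why the order matters. The cardinalities below are assumed numbers chosen purely for illustration; they do not come from the slides.

```python
# Hypothetical cardinalities after the SELECTs have been pushed down
# (assumed numbers, used only to illustrate the ordering argument).
n_employee_selected = 400   # employees born after 1957-12-31
n_works_on = 5000           # all WORKS_ON tuples
n_project_selected = 1      # the single 'Aquarius' project

# Tree (b): the first product pairs the selected EMPLOYEE tuples with WORKS_ON.
first_product_b = n_employee_selected * n_works_on

# Tree (c): the first product pairs the single PROJECT tuple with WORKS_ON.
first_product_c = n_project_selected * n_works_on

print(first_product_b, first_product_c)  # 2000000 5000
```

Under these assumptions, starting from the single PROJECT record makes the first intermediate result 400 times smaller.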
Replacing CARTESIAN PRODUCT and SELECT with JOIN
(d) We can further improve the query tree by replacing any
CARTESIAN PRODUCT operation that is followed by a SELECT on a join
condition with a JOIN operation, as shown in Figure (d).
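The equivalence behind this step can be sketched as follows. The rows are assumed sample data, and the hash join is only one possible JOIN implementation, picked here for brevity; the slides do not prescribe a join algorithm.

```python
from itertools import product

# Assumed sample rows (not from the slides).
WORKS_ON = [{"ESSN": 1, "PNO": 10}, {"ESSN": 2, "PNO": 10}, {"ESSN": 3, "PNO": 20}]
PROJECT = [{"PNUMBER": 10, "PNAME": "Aquarius"}]

# SELECT over CARTESIAN PRODUCT: build every pair, then test the join condition.
pairs_built = 0
via_product = []
for p, w in product(PROJECT, WORKS_ON):
    pairs_built += 1
    if p["PNUMBER"] == w["PNO"]:
        via_product.append({**p, **w})

# JOIN (here a simple hash join): index one input on the join attribute
# and probe it, so non-matching pairs are never materialized.
index = {}
for p in PROJECT:
    index.setdefault(p["PNUMBER"], []).append(p)
via_join = [{**p, **w} for w in WORKS_ON for p in index.get(w["PNO"], [])]

print(pairs_built, len(via_join))  # 3 2
print(via_product == via_join)     # True
```

The two forms produce the same tuples, but the JOIN never has to build the pairs that fail the condition.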
Moving PROJECT operations down
(e) Another improvement is to keep only the attributes
needed by subsequent operations in the intermediate
relations, by including PROJECT (π) operations as early as
possible in the query tree, as shown in Figure (e). This
reduces the attributes (columns) of the intermediate
relations, whereas the SELECT operations reduce the
number of tuples (records).
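A small sketch of an early PROJECT, assuming two made-up EMPLOYEE rows and a hypothetical helper named `project`. After the SELECT on BDATE, the rest of the tree needs only SSN (for the join with WORKS_ON) and LNAME (for the final result).

```python
# Assumed sample rows (not from the slides).
EMPLOYEE = [
    {"SSN": 1, "FNAME": "John", "LNAME": "Smith",
     "ADDRESS": "Houston", "BDATE": "1965-01-09"},
    {"SSN": 2, "FNAME": "Franklin", "LNAME": "Wong",
     "ADDRESS": "Houston", "BDATE": "1955-12-08"},
]

def project(rows, attrs):
    """PROJECT without duplicate elimination: keep only the listed attributes."""
    return [{a: r[a] for a in attrs} for r in rows]

# SELECT reduces the number of tuples ...
selected = [e for e in EMPLOYEE if e["BDATE"] > "1957-12-31"]
# ... and PROJECT reduces the number of attributes carried up the tree.
slim = project(selected, ["SSN", "LNAME"])

print(len(selected[0]), len(slim[0]))  # 5 2
print(slim)                            # [{'SSN': 1, 'LNAME': 'Smith'}]
```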
SQL Query with an Uncorrelated Subquery
Find the movies with stars born in 1960
MovieStar(name, address, gender, birthdate)
StarsIn(title, year, starName)
SELECT title
FROM StarsIn
WHERE starName IN (
SELECT name
FROM MovieStar
WHERE birthdate LIKE '%1960'
);
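Parse Tree
[Figure: parse tree of the query. The root <Query> expands to an <SFW> whose WHERE condition has the form <tuple> IN <Query>. The <tuple> is the <attribute> starName; the nested <Query> is an <SFW> that selects name from MovieStar with the condition birthdate LIKE '%1960'.]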
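Applying the Rewrite Rule
[Figure: two trees.
Before: π_{title} over σ_{<condition>} over StarsIn, where <condition> is <tuple> IN π_{name}(σ_{birthdate LIKE '%1960'}(MovieStar)).
After: π_{title}(σ_{starName=name}(StarsIn × δ(π_{name}(σ_{birthdate LIKE '%1960'}(MovieStar))))).]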
Improving the Logical Query Plan
[Figure: two trees.
Left: π_{title}(σ_{starName=name}(StarsIn × δ(π_{name}(σ_{birthdate LIKE '%1960'}(MovieStar))))).
Right: π_{title}(StarsIn ⋈_{starName=name} π_{name}(σ_{birthdate LIKE '%1960'}(MovieStar))).]
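The subquery form and a join-based form of this query can be compared on a toy database. This is a sketch using Python's built-in sqlite3 module; the rows, the 'MM/DD/YYYY' date format, and the alias `s` are assumptions made for illustration. SELECT DISTINCT plays the role of the duplicate-elimination operator δ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MovieStar(name TEXT PRIMARY KEY, address TEXT,
                           gender TEXT, birthdate TEXT);
    CREATE TABLE StarsIn(title TEXT, year INTEGER, starName TEXT);
    -- Assumed sample rows, not from the slides.
    INSERT INTO MovieStar VALUES ('Star A', 'Addr 1', 'F', '03/15/1960');
    INSERT INTO MovieStar VALUES ('Star B', 'Addr 2', 'M', '07/04/1971');
    INSERT INTO StarsIn VALUES ('Movie One', 1990, 'Star A');
    INSERT INTO StarsIn VALUES ('Movie Two', 1995, 'Star B');
    INSERT INTO StarsIn VALUES ('Movie Three', 1999, 'Star A');
""")

# Original form: uncorrelated subquery under IN.
subquery_form = conn.execute("""
    SELECT title FROM StarsIn
    WHERE starName IN (SELECT name FROM MovieStar
                       WHERE birthdate LIKE '%1960')
    ORDER BY title
""").fetchall()

# Rewritten form: join with the duplicate-free result of the subquery.
join_form = conn.execute("""
    SELECT title
    FROM StarsIn JOIN (SELECT DISTINCT name FROM MovieStar
                       WHERE birthdate LIKE '%1960') AS s
         ON starName = s.name
    ORDER BY title
""").fetchall()

print(subquery_form)  # [('Movie One',), ('Movie Three',)]
print(join_form)      # [('Movie One',), ('Movie Three',)]
```

Removing duplicates from the subquery result before joining keeps each StarsIn tuple from being matched more than once, which is what makes the join form return the same titles as the IN form.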
SQL Queries and Relational Algebra (1)
• Example
SELECT Lname, Fname
FROM EMPLOYEE
WHERE Salary > ( SELECT MAX(Salary)
FROM EMPLOYEE
WHERE Dno = 5 )
• Inner block and outer block
• Relational algebra:
π_{PNUMBER, DNUM, LNAME, ADDRESS, BDATE}(((σ_{PLOCATION='STAFFORD'}(PROJECT))
⋈_{DNUM=DNUMBER} (DEPARTMENT)) ⋈_{MGRSSN=SSN} (EMPLOYEE))
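The inner block and outer block of the nested Salary query above can be evaluated separately, as in the following sketch with Python's built-in sqlite3 module. The EMPLOYEE rows and the variable name `c` are assumptions made for illustration. Because the inner block does not refer to the outer one, it can be computed once and its result used as a constant in the outer block.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE(Fname TEXT, Lname TEXT, Salary INTEGER, Dno INTEGER);
    -- Assumed sample rows, not from the slides.
    INSERT INTO EMPLOYEE VALUES ('John', 'Smith', 30000, 5);
    INSERT INTO EMPLOYEE VALUES ('Franklin', 'Wong', 40000, 5);
    INSERT INTO EMPLOYEE VALUES ('Jennifer', 'Wallace', 43000, 4);
    INSERT INTO EMPLOYEE VALUES ('James', 'Borg', 55000, 1);
""")

# Inner block: evaluated once, yields a single constant c.
(c,) = conn.execute(
    "SELECT MAX(Salary) FROM EMPLOYEE WHERE Dno = 5").fetchone()

# Outer block: uses c in place of the nested query.
outer = conn.execute(
    "SELECT Lname, Fname FROM EMPLOYEE WHERE Salary > ? ORDER BY Lname",
    (c,)).fetchall()

# The nested query as written, for comparison.
nested = conn.execute("""
    SELECT Lname, Fname FROM EMPLOYEE
    WHERE Salary > (SELECT MAX(Salary) FROM EMPLOYEE WHERE Dno = 5)
    ORDER BY Lname
""").fetchall()

print(c)      # 40000
print(outer)  # [('Borg', 'James'), ('Wallace', 'Jennifer')]
```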