Queryopt

This document discusses PostgreSQL query optimization. It introduces EXPLAIN and optimization concepts like indexes, execution plans, and the query optimizer. Indexes can improve query performance but slow down updates. The order of columns in a concatenated index is important. The query optimizer determines the best execution plan based on factors like indexes and statistics.


PostgreSQL query optimization

Nguyễn Hải Châu

Email: [email protected]
University of Engineering and Technology
Vietnam National University, Hanoi

N. H. Châu (VNU-UET) Query optimization 1 / 67

Introduction

SQL is a declarative language: SQL statements describe what users want to get, but not how to get it
In an imperative language, by contrast, users specify how to get the desired results, step by step
In PostgreSQL, the database optimizer chooses the "best" way to execute an SQL statement
The "best" way is determined by many different factors, such as storage structures, indexes, and data statistics
Two types of databases: OLTP and OLAP
    OLTP (Online Transaction Processing): for business applications
    OLAP (Online Analytical Processing): for BI (Business Intelligence) and reporting

Roadmap for query optimization

EXPLAIN statement and tools
Theory:
    Query processing overview
    Algorithm cost models
    Index structures
    Execution plans

Building blocks of SQL

DDL (Data Definition Language): CREATE, ALTER, DROP, RENAME
DML (Data Manipulation Language): SELECT, INSERT, UPDATE, DELETE
DCL (Data Control Language): GRANT, REVOKE
TCL (Transaction Control Language): COMMIT, ROLLBACK, SAVEPOINT
Querying and filtering: WHERE, JOIN, UNION, INTERSECT, GROUP BY, ORDER BY, HAVING

pgadmin EXPLAIN


EXPLAIN visualizer resources

https://www.postgresqltutorial.com/postgresql-tutorial/postgresql-explain/
https://theartofpostgresql.com/explain-plan-visualizer/ → https://explain.dalibo.com/
https://explain.depesz.com/

Anatomy of an index


Introduction

An index is a distinct structure in the database that is built using the CREATE INDEX statement
An index is very similar to the index at the end of a book
Unlike a printed index, a database index changes frequently
Using an index increases search speed but slows down writes (insert/delete/update statements); fast searching is the first power of indexing
Important data structures of an index: a doubly linked list and a search tree

The index leaf nodes


The search tree (B-tree)
B-tree = balanced tree, not binary tree


B-tree traversal


Indexes can be slow

An index lookup requires three steps:
    Tree traversal
    Following the doubly linked list of leaf nodes
    Fetching the data from the table
Multiple leaf nodes may need to be read to find all matching entries
Table access may require multiple block reads
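The three lookup steps can be sketched in a few lines of Python. This is a toy model under stated assumptions, not PostgreSQL's implementation: `table`, `leaves`, `branch` and `index_lookup` are hypothetical names chosen for illustration, and a single sorted list stands in for the branch levels of the tree.

```python
from bisect import bisect_left

# Toy "table": row id -> row. In a real database rows live in table blocks.
table = {
    0: ("Alice", "Smith"),
    1: ("Bob", "Jones"),
    2: ("Carol", "Smith"),
    3: ("Dave", "Young"),
    4: ("Eve", "Smith"),
}

# Toy index on last_name: each leaf holds sorted (key, row_id) entries and
# the leaves are chained left to right (here: consecutive list positions).
leaves = [
    [("Jones", 1), ("Smith", 0)],
    [("Smith", 2), ("Smith", 4)],
    [("Young", 3)],
]
# Branch level: the largest key of each leaf, used to pick the first leaf.
branch = [leaf[-1][0] for leaf in leaves]


def index_lookup(key):
    """Return (rows, leaves_read) for an equality search on the toy index."""
    # Step 1: tree traversal, find the first leaf that may contain the key.
    leaf_no = bisect_left(branch, key)
    rows, leaves_read = [], 0
    # Step 2: follow the leaf chain while entries can still match.
    while leaf_no < len(leaves):
        leaves_read += 1
        for entry_key, row_id in leaves[leaf_no]:
            if entry_key == key:
                # Step 3: fetch the row from the table.
                rows.append(table[row_id])
        if leaves[leaf_no][-1][0] > key:
            break  # the last entry of this leaf is already past the key
        leaf_no += 1
    return rows, leaves_read
```

In this sketch `index_lookup("Smith")` has to read all three leaf nodes, because the matching entries span two leaves and the scan can stop only once it sees a larger key, while `index_lookup("Jones")` reads a single leaf.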

The WHERE clause
... defines search conditions


The equality operator

The equality operator is the most trivial and most frequently used SQL operator
Indexing mistakes can still affect WHERE clauses, especially those that combine multiple conditions

Primary key search

    create table employees (

Primary key search

    select first_name, last_name from employees where employee_id = 6569;

     first_name   | last_name
    --------------+-----------------
     56M8GMZ4xv0w | K4ATJi4SVLAURe6
    (1 row)

    explain (analyze, verbose, costs, buffers)
    select first_name, last_name from employees where employee_id = 6569;

    QUERY PLAN
    ------------------------------------------------------------------------------------
     Index Scan using employees_pkey on public.employees (cost=0.29..8.30 rows=1 width=29) (actual time=0.025..0.028 rows=1 loops=1)
       Output: first_name, last_name
       Index Cond: (employees.employee_id = 6569)
       Buffers: shared hit=3
     Planning Time: 0.091 ms
     Execution Time: 0.052 ms
    (6 rows)

Concatenated index

    alter table employees add column subsidiary_id integer;

Concatenated index

    select first_name, last_name from employees where employee_id=214164 and subsidiary_id = 20;

     first_name   | last_name
    --------------+-----------------
     vtJiHCvQxHtP | syiYv8EhsZ9aMpK
    (1 row)

    explain (analyze, verbose, costs, buffers)
    select first_name, last_name from employees where employee_id=214164 and subsidiary_id = 20;

    QUERY PLAN
    ------------------------------------------------------------------------------------
     Index Scan using update_idx on public.employees (cost=0.42..8.44 rows=1 width=29) (actual time=0.032..0.034 rows=1 loops=1)
       Output: first_name, last_name
       Index Cond: ((employees.employee_id = 214164) AND (employees.subsidiary_id = 20))
       Buffers: shared hit=4
     Planning Time: 0.123 ms
     Execution Time: 0.060 ms
    (6 rows)

Concatenated index

    select first_name, last_name from employees where subsidiary_id = 20 limit 1;

     first_name   | last_name
    --------------+-----------------
     vtJiHCvQxHtP | syiYv8EhsZ9aMpK
    (1 row)

    explain (analyze, verbose, costs, buffers)
    select first_name, last_name from employees where subsidiary_id = 20 limit 1;

    QUERY PLAN
    ------------------------------------------------------------------------------------
     Limit (cost=0.00..0.08 rows=1 width=29) (actual time=0.011..0.011 rows=1 loops=1)
       Output: first_name, last_name
       Buffers: shared hit=1
       -> Seq Scan on public.employees (cost=0.00..6929.00 rows=84573 width=29) (actual time=0.009..0.009 rows=1 loops=1)
             Output: first_name, last_name
             Filter: (employees.subsidiary_id = 20)
             Buffers: shared hit=1
     Planning Time: 0.078 ms
     Execution Time: 0.025 ms

Concatenated index


Concatenated index

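A concatenated index (multi-column, composite or combined index) is one index across multiple columns
The order of columns in a concatenated index is important: the index can be searched efficiently by its leading column(s), but not by a later column alone
Maintaining multiple indexes is slow: every additional index adds work to each write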
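Why the column order matters can be illustrated with a sorted list of tuples standing in for the index. This is only a sketch: the `index` list and both search functions are hypothetical names, and the column order (employee_id, subsidiary_id) is an assumption made for the example.

```python
from bisect import bisect_left, bisect_right

# Toy concatenated index on (employee_id, subsidiary_id): entries are kept
# sorted by the first column, then by the second.
index = sorted([
    (101, 20), (102, 10), (103, 20), (104, 30), (105, 10), (106, 20),
])


def search_by_leading_column(employee_id):
    """Binary search works: entries with the same employee_id are adjacent."""
    lo = bisect_left(index, (employee_id,))
    hi = bisect_right(index, (employee_id, float("inf")))
    return index[lo:hi], hi - lo          # (matches, size of scanned range)


def search_by_second_column(subsidiary_id):
    """No usable order: matching entries are scattered, so scan every entry."""
    matches = [entry for entry in index if entry[1] == subsidiary_id]
    return matches, len(index)            # (matches, size of scanned range)
```

In this toy model `search_by_leading_column(103)` scans a range of one entry, whereas `search_by_second_column(20)` has to inspect all six entries to find its three matches.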

Exercise

What happens if we use

    create unique index update_idx on employees(subsidiary_id, employee_id);
    -- and
    create index additional_idx on employees(subsidiary_id);

Conduct experiments to find the answer

The query optimizer

The query optimizer, or query planner, is the database component that transforms an SQL statement into an execution plan; this step is also called compiling or parsing
Cost-based optimizers (CBO) generate many execution plan variations and calculate a cost value for each plan based on its operations and the estimated row numbers, then choose the "best" execution plan
Rule-based optimizers (RBO) generate the execution plan using a hard-coded rule set; they are less flexible and seldom used
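The idea of a cost-based choice can be sketched as follows. The three constants are the default values of PostgreSQL's `seq_page_cost`, `random_page_cost` and `cpu_tuple_cost` settings, but the two formulas are a deliberately simplified assumption for illustration and not the planner's real cost model; all function names are hypothetical.

```python
# Default values of PostgreSQL's planner cost constants. The formulas below
# are a simplified toy model, not the planner's actual calculations.
SEQ_PAGE_COST = 1.0
RANDOM_PAGE_COST = 4.0
CPU_TUPLE_COST = 0.01


def seq_scan_cost(table_pages, table_rows):
    # Read every page sequentially and inspect every row.
    return table_pages * SEQ_PAGE_COST + table_rows * CPU_TUPLE_COST


def index_scan_cost(tree_depth, matching_rows):
    # Descend the tree, then assume one random page read per matching row.
    page_reads = tree_depth + matching_rows
    return page_reads * RANDOM_PAGE_COST + matching_rows * CPU_TUPLE_COST


def choose_plan(table_pages, table_rows, tree_depth, estimated_matches):
    """Cost every candidate plan and return (cheapest plan name, all costs)."""
    plans = {
        "Seq Scan": seq_scan_cost(table_pages, table_rows),
        "Index Scan": index_scan_cost(tree_depth, estimated_matches),
    }
    return min(plans, key=plans.get), plans
```

With these assumptions, `choose_plan(200, 7400, 2, 1)` prefers the index scan, while `choose_plan(200, 7400, 2, 3700)` prefers the sequential scan: the same query shape gets a different plan when the estimated number of matching rows changes, which is why accurate statistics matter.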

Changing index

    -- No additional index
    select first_name, last_name, subsidiary_id, phone_number
    from employees where last_name = 'xyz' and subsidiary_id = 10;

    QUERY PLAN
    ------------------------------------------------------------------------------------
     Seq Scan on public.employees (cost=0.00..259.57 rows=1 width=346) (actual time=1.647..1.647 rows=0 loops=1)
       Output: first_name, last_name, subsidiary_id, phone_number
       Filter: (((employees.last_name)::text = 'xyz'::text) AND (employees.subsidiary_id = 10))
       Rows Removed by Filter: 7376
       Buffers: shared hit=202
     Planning Time: 0.069 ms
     Execution Time: 1.673 ms
    (7 rows)

Changing index
    -- create unique index update_idx on employees(subsidiary_id, employee_id)
    select first_name, last_name, subsidiary_id, phone_number
    from employees where last_name = 'xyz' and subsidiary_id = 10;

    QUERY PLAN
    ------------------------------------------------------------------------------------
     Bitmap Heap Scan on public.employees (cost=4.56..99.27 rows=1 width=346) (actual time=1.908..1.909 rows=0 loops=1)
       Output: first_name, last_name, subsidiary_id, phone_number
       Recheck Cond: (employees.subsidiary_id = 10)
       Filter: ((employees.last_name)::text = 'xyz'::text)
       Rows Removed by Filter: 3711
       Heap Blocks: exact=84
       Buffers: shared hit=84 read=22
       -> Bitmap Index Scan on update_idx (cost=0.00..4.56 rows=37 width=0) (actual time=0.687..0.688 rows=7376 loops=1)
             Index Cond: (employees.subsidiary_id = 10)
             Buffers: shared read=22
     Planning:
       Buffers: shared hit=18 read=1
     Planning Time: 0.282 ms
     Execution Time: 1.943 ms
    (14 rows)

Changing index

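The query with the new index is slower than the sequential scan (1.943 ms vs. 1.673 ms)
Reason: the index is not selective for this query. The planner estimated rows=37 for subsidiary_id = 10, but the bitmap index scan returned 7376 entries, and last_name = 'xyz' could only be checked as a filter after the rows were fetched
Choosing the best execution plan depends on
    The table's data distribution
    How the optimizer uses statistics about the contents of the database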

Statistics

A cost-based optimizer uses statistics about tables, columns, and indexes
    Column: the number of distinct values, the smallest and largest values, the number of NULL occurrences, and the column's data distribution
    Table: size in rows and blocks
    Index: the tree depth, the number of leaf nodes, the number of distinct keys, and the clustering factor

Indexes with functions


A case-insensitive search

    explain (analyze, costs, verbose, buffers)
    select first_name, last_name, phone_number
    from employees where upper(last_name)=upper('xyz');

    QUERY PLAN
    ------------------------------------------------------------------------------------
     Seq Scan on public.employees (cost=0.00..312.64 rows=37 width=40) (actual time=2.766..2.766 rows=0 loops=1)
       Output: first_name, last_name, phone_number
       Filter: (upper((employees.last_name)::text) = 'XYZ'::text)
       Rows Removed by Filter: 7376
       Buffers: shared hit=202
     Planning Time: 0.071 ms
     Execution Time: 2.780 ms
    (7 rows)

The case-insensitive search with index

    create index emp_up_name on employees (upper(last_name));

    explain (analyze, costs, verbose, buffers)
    select first_name, last_name, phone_number
    from employees where upper(last_name)=upper('xyz');

    QUERY PLAN
    ------------------------------------------------------------------------------------
     Bitmap Heap Scan on public.employees (cost=4.57..99.28 rows=37 width=40) (actual time=0.027..0.028 rows=0 loops=1)
       Output: first_name, last_name, phone_number
       Recheck Cond: (upper((employees.last_name)::text) = 'XYZ'::text)
       Buffers: shared read=2
       -> Bitmap Index Scan on emp_up_name (cost=0.00..4.56 rows=37 width=0) (actual time=0.025..0.025 rows=0 loops=1)
             Index Cond: (upper((employees.last_name)::text) = 'XYZ'::text)
             Buffers: shared read=2
     Planning:
       Buffers: shared hit=17 read=1
     Planning Time: 0.318 ms
     Execution Time: 0.057 ms
    (11 rows)

Function-based index

An index whose definition contains functions or expressions is a so-called function-based index (FBI)
A function-based index applies the function first and puts the result into the index
The database can use a function-based index if the exact expression of the index definition appears in an SQL statement
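A function-based index can be pictured as an ordinary sorted index whose keys are the function results. The sketch below is an illustration only; `build_expression_index` and `lookup` are hypothetical helpers, and the sample rows are made up.

```python
from bisect import bisect_left, bisect_right

rows = [
    {"id": 1, "last_name": "Winand"},
    {"id": 2, "last_name": "WINAND"},
    {"id": 3, "last_name": "Smith"},
]


def build_expression_index(rows, expression):
    """Store expression(row) as the key, like an index on upper(last_name)."""
    return sorted((expression(row), row["id"]) for row in rows)


def lookup(index, key):
    """Return the row ids whose stored key equals the search key."""
    keys = [stored_key for stored_key, _ in index]
    lo, hi = bisect_left(keys, key), bisect_right(keys, key)
    return [row_id for _, row_id in index[lo:hi]]


upper_idx = build_expression_index(rows, lambda r: r["last_name"].upper())
```

Searching with the same expression applied to the search term, `lookup(upper_idx, "winand".upper())`, finds both spellings, while `lookup(upper_idx, "Winand")` finds nothing because the index holds only upper-case keys. This mirrors the rule that the query must use the exact indexed expression.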

User-defined functions

    create or replace function get_age(date_of_birth date)
    returns int as
    $$
    begin return round((current_date-date_of_birth)/365.0); end;
    $$ language plpgsql;

    explain (analyze, costs, verbose, buffers)
    select first_name, last_name, get_age(date_of_birth)
    from employees where get_age(date_of_birth) = 42;

    QUERY PLAN
    ------------------------------------------------------------------------------------
     Seq Scan on public.employees (cost=0.00..2142.49 rows=37 width=33) (actual time=0.461..14.245 rows=110 loops=1)
       Output: first_name, last_name, get_age(date_of_birth)
       Filter: (get_age(employees.date_of_birth) = 42)
       Rows Removed by Filter: 7289
       Buffers: shared hit=191
     Planning:
       Buffers: shared hit=21
     Planning Time: 0.250 ms
     Execution Time: 14.277 ms
    (9 rows)

Immutable functions
    create or replace function get_age(date_of_birth date)
    returns int as
    $$
    begin return round((current_date-date_of_birth)/365.0); end;
    $$ language plpgsql;
    create index emp_up_name on employees (get_age(date_of_birth));
    ERROR: functions in index expression must be marked IMMUTABLE

    create or replace function get_age(date_of_birth date)
    returns int as
    $$
    begin return round((current_date-date_of_birth)/365.0); end;
    $$ immutable language plpgsql;
    create index emp_up_name on employees (get_age(date_of_birth));
    CREATE INDEX

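The user must ensure that the function really is immutable; PostgreSQL does not verify the declaration
Although get_age is declared IMMUTABLE and accepted by PostgreSQL in a function-based index, the index will not work correctly: get_age depends on current_date, so it is not deterministic, and the values stored in the index become outdated as time passes
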
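The problem with indexing a date-dependent function can be reproduced outside the database. The sketch below is an assumption-laden illustration: `get_age` takes the current date as an explicit `today` argument so that the passage of time can be simulated, and the dictionary `age_index` stands in for the stored index entries.

```python
from datetime import date


def get_age(date_of_birth, today):
    """Age in years, rounded like the slide's get_age; `today` is explicit."""
    return round((today - date_of_birth).days / 365.0)


birthdays = {1: date(1980, 6, 1), 2: date(1990, 6, 1)}

# "CREATE INDEX": the function result is computed once and stored as the key.
index_built_on = date(2020, 1, 1)
age_index = {emp: get_age(dob, index_built_on) for emp, dob in birthdays.items()}

# Five years later the stored keys have not been recomputed, but the function
# now returns different values for the same rows.
query_day = date(2025, 1, 1)
stale = {
    emp: (age_index[emp], get_age(dob, query_day))
    for emp, dob in birthdays.items()
}
```

Each entry of `stale` pairs the indexed value with the value the function returns at query time; they differ for every row, so a search through the index would return wrong results.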
SQL parameterized queries and bind parameters

Bind parameters, also called dynamic parameters or bind variables, pass data to the database through placeholders such as ?, :name or @name
Advantages of bind parameters:
    Security: bind variables are the best way to prevent SQL injection
    Performance: databases with an execution plan cache can reuse the execution plan when the same statement is executed again
Note: bind parameter placeholders are not the same as the LIKE wildcards (%, _)
Bind parameters cannot change the structure of an SQL statement. To change the structure of an SQL statement at runtime, use dynamic SQL
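The security advantage can be demonstrated with a small runnable sketch. It uses SQLite through Python's standard `sqlite3` module purely as a stand-in that needs no server, so the behaviour shown is SQLite's, not PostgreSQL's; the table contents and the hostile `user_input` value are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table employees (employee_id integer, last_name text, subsidiary_id integer)"
)
conn.executemany(
    "insert into employees values (?, ?, ?)",
    [(1, "Smith", 10), (2, "Jones", 20), (3, "Young", 20)],
)

user_input = "10 or 1=1"   # hostile value pretending to be a subsidiary id

# String concatenation: the input becomes part of the SQL text and changes
# the structure of the statement.
unsafe_sql = "select last_name from employees where subsidiary_id = " + user_input
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Bind parameter: the input is passed as data and cannot change the statement.
safe_rows = conn.execute(
    "select last_name from employees where subsidiary_id = ?", (user_input,)
).fetchall()
```

The concatenated statement returns every employee because the injected `or 1=1` is always true, whereas the parameterized statement compares the column with the literal string and returns no rows.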

Example: MySQL bind parameters and PHP

    // No bind parameters
    $mysqli->query("select first_name, last_name"
                 . " from employees"
                 . " where subsidiary_id = " . $subsidiary_id);

    // Using a bind parameter
    if ($stmt = $mysqli->prepare("select first_name, last_name"
                               . " from employees"
                               . " where subsidiary_id = ?"))
    {
        $stmt->bind_param("i", $subsidiary_id);
        $stmt->execute();
    } else {
        /* handle SQL error */
    }

PostgreSQL PREPARE and EXECUTE

A prepared statement is a server-side object that can be used to optimize performance
When the PREPARE statement is executed, the specified statement is parsed, analyzed, and rewritten
When an EXECUTE command is subsequently issued, the prepared statement is planned and executed; this avoids repetitive parse analysis work while allowing the execution plan to depend on the specific parameter values supplied

PostgreSQL PREPARE and EXECUTE

    -- Prepare statements
    prepare add_employee(int, varchar(64), varchar(64), date, varchar(16), int) as
        insert into employees values ($1, $2, $3, $4, $5, $6);

    prepare get_employee(int) as
        select * from employees where employee_id = $1;

    -- Execute statements (the fifth parameter is declared varchar(16), so it is passed as a string)
    start transaction;
    execute add_employee(10, 'FN', 'LN', '2005-01-02', '123', 12);
    execute get_employee(10);
    rollback;

    -- Deallocate prepared statements one by one
    deallocate prepare add_employee;
    deallocate prepare get_employee;
    -- or deallocate all prepared statements of the session at once
    deallocate prepare all;

Searching for ranges

The access predicates are the start and stop conditions for an index lookup; they define the scanned index range
Inequality operators (<, >, between) can use indexes just like the equality operator
Performance risk of a range search: the leaf node traversal. The golden rule is to keep the scanned index range as small as possible
In a multi-column index, a range condition (<, >, between) on one column may prevent the columns after it from narrowing the scanned index range
Rule of thumb: index for equality first, then for ranges

Indexing like filters

LIKE filters can only use the characters before the first wildcard during tree traversal
Only the part before the first wildcard serves as an access predicate
The remaining characters are just filter predicates that do not narrow the scanned index range
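How the part before the first wildcard becomes a scanned index range can be sketched as follows. The helper names are hypothetical, the sorted list `names` stands in for the index, and incrementing the last character of the prefix assumes plain code-point ordering (comparable to a byte-wise collation), which is what Python string comparison provides.

```python
import re
from bisect import bisect_left


def like_to_range(pattern):
    """Return (lower, upper) index bounds for a LIKE pattern, or (None, None)."""
    positions = [pattern.find(c) for c in "%_" if c in pattern]
    cut = min(positions, default=len(pattern))
    prefix = pattern[:cut]
    if not prefix:
        return None, None               # leading wildcard: no access predicate
    # Upper bound: the prefix with its last character incremented.
    return prefix, prefix[:-1] + chr(ord(prefix[-1]) + 1)


def like_match(value, pattern):
    """Filter predicate: evaluate the complete LIKE pattern on one value."""
    regex = "".join(
        ".*" if c == "%" else "." if c == "_" else re.escape(c) for c in pattern
    )
    return re.fullmatch(regex, value, flags=re.DOTALL) is not None


def like_search(sorted_names, pattern):
    """Return (matches, scanned): rows found and index entries visited."""
    lower, upper = like_to_range(pattern)
    if lower is None:
        candidates = sorted_names       # the whole index must be scanned
    else:
        start = bisect_left(sorted_names, lower)
        stop = bisect_left(sorted_names, upper)
        candidates = sorted_names[start:stop]
    return [n for n in candidates if like_match(n, pattern)], len(candidates)


names = sorted(["abcxyz", "xavier", "xyla", "xyz", "xyzzy", "yak"])
```

In this sketch `like_search(names, "xyz%")` scans two entries, `like_search(names, "x%z")` scans four entries to find one match, and `like_search(names, "%xyz")` scans all six: the earlier the first wildcard appears, the wider the scanned range.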

Indexing like filters

    create index ln_idx on employees (last_name);

    explain (analyze, verbose, costs, settings, buffers, wal, timing, summary)
    select * from employees where last_name like 'xyz%';

    QUERY PLAN
    ------------------------------------------------------------------------------------
     Seq Scan on public.employees (cost=0.00..245.49 rows=1 width=52) (actual time=0.766..0.767 rows=0 loops=1)
       Output: employee_id, first_name, last_name, date_of_birth, phone_number, subsidiary_id
       Filter: ((employees.last_name)::text ~~ 'xyz%'::text)
       Rows Removed by Filter: 7399
       Buffers: shared hit=153
     Planning:
       Buffers: shared hit=16 read=1
     Planning Time: 0.167 ms
     Execution Time: 0.779 ms
    (9 rows)

Indexing like filters

    -- With a non-C collation a plain B-tree index cannot serve LIKE prefix
    -- searches; the varchar_pattern_ops operator class can.
    -- (Drop the plain ln_idx first if it still exists.)
    create index ln_idx on employees (last_name varchar_pattern_ops);

    explain (analyze, verbose, costs, settings, buffers, wal, timing, summary)
    select * from employees where last_name like 'xyz%';

    QUERY PLAN
    ------------------------------------------------------------------------------------
     Index Scan using ln_idx on public.employees (cost=0.28..8.30 rows=1 width=52) (actual time=0.023..0.023 rows=0 loops=1)
       Output: employee_id, first_name, last_name, date_of_birth, phone_number, subsidiary_id
       Index Cond: (((employees.last_name)::text ~>=~ 'xyz'::text) AND ((employees.last_name)::text ~<~ 'xy{'::text))
       Filter: ((employees.last_name)::text ~~ 'xyz%'::text)
       Buffers: shared read=2
     Planning:
       Buffers: shared hit=16 read=1
     Planning Time: 0.314 ms
     Execution Time: 0.048 ms
    (9 rows)

Index merge

One index with multiple columns is better than multiple separate indexes
One index scan is faster than two or more
Bitmap indexes are almost unusable for OLTP (PostgreSQL has no persistent bitmap indexes; it builds bitmaps in memory at run time, as in the Bitmap Index Scan plans)

Partial index
A partial index is useful for commonly used where conditions that use constant values
For example, a common query (find unprocessed messages) in queueing systems:

    SELECT message FROM messages
     WHERE processed = 'N' AND receiver = ?

A normal index:

    CREATE INDEX messages_todo
        ON messages (receiver, processed)

A better solution is the partial index:

    CREATE INDEX messages_todo ON messages (receiver)
     WHERE processed = 'N'
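The size difference between the two index definitions can be illustrated with plain Python lists. This is a toy model with made-up rows; `full_index`, `partial_index` and `unprocessed_for` are hypothetical names.

```python
messages = [
    {"id": 1, "receiver": "ann", "processed": "Y"},
    {"id": 2, "receiver": "bob", "processed": "N"},
    {"id": 3, "receiver": "ann", "processed": "N"},
    {"id": 4, "receiver": "ann", "processed": "Y"},
    {"id": 5, "receiver": "bob", "processed": "Y"},
]

# Normal index on (receiver, processed): one entry for every row.
full_index = sorted((m["receiver"], m["processed"], m["id"]) for m in messages)

# Partial index on (receiver) WHERE processed = 'N': unprocessed rows only.
partial_index = sorted(
    (m["receiver"], m["id"]) for m in messages if m["processed"] == "N"
)


def unprocessed_for(receiver):
    """Answer the queue query from the partial index alone."""
    return [row_id for recv, row_id in partial_index if recv == receiver]
```

Here the partial index holds two entries instead of five and still answers `unprocessed_for("ann")`. In a real queue, where most messages are already processed, the partial index stays small no matter how large the table grows.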

Partial index

    create index partial_idx on employees (phone_number) where subsidiary_id=10;

Obfuscation conditions


Obfuscation conditions
Obfuscated conditions are where clauses that are phrased in a way that prevents proper index usage
Most obfuscations involve DATE types
A solution: function-based index, for example:

    CREATE INDEX index_name ON table_name (TRUNC(sale_date))

but then every query must use TRUNC(sale_date) in its where clause to benefit from the index

Alternative solution: use an explicit range condition

    -- Range query
    SELECT ... FROM sales
     WHERE sale_date BETWEEN quarter_begin(?)
                         AND quarter_end(?)
    -- Index: a straight index on SALE_DATE is enough to optimize this query

Obfuscation conditions PostgreSQL

    CREATE FUNCTION quarter_begin(dt timestamp with time zone)
    RETURNS timestamp with time zone AS $$
    BEGIN
        RETURN date_trunc('quarter', dt);
    END;
    $$ LANGUAGE plpgsql;

    CREATE FUNCTION quarter_end(dt timestamp with time zone)
    RETURNS timestamp with time zone AS $$
    BEGIN
        RETURN date_trunc('quarter', dt)
             + interval '3 month'
             - interval '1 microsecond';
    END;
    $$ LANGUAGE plpgsql;

Numeric strings

Numeric strings are numbers that are stored in text columns
This is normally bad practice, even though a numeric string column can be indexed
Rule: use numeric types to store numbers

The JOIN operation


The JOIN operation

Building block: the two-table join; joins of more tables are built from two-table joins
Join order affects performance
Bind parameters are very important for complex join statements: they allow a cached execution plan to be reused instead of recompiling the statement

Nested loop using PHP

    $qb = $em->createQueryBuilder();
    $qb->select('e')
       ->from('Employees', 'e')
       ->where("upper(e.last_name) like :last_name")
       ->setParameter('last_name', 'WIN%');
    $r = $qb->getQuery()->getResult();
    foreach ($r as $row) {
        // process Employee
        foreach ($row->getSales() as $sale) {
            // process Sale for Employee
        }
    }

Indexing for a nested loop is like indexing for the individual SELECT statements
An SQL join is more efficient than nested selects issued from the application

Hash join

The hash join fixes the weak spot of the nested loop join: the many B-tree traversals
A hash join requires an entirely different indexing approach than the nested loop join
Indexing strategy:
    No need to index the join columns
    Only indexes for independent where predicates improve hash join performance
Important: indexing the join predicates does not improve hash join performance
Indexing a hash join is independent of the join order
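The reason an index on the join columns does not help becomes visible in a minimal sketch of the algorithm: the join keys are looked up in a hash table built at run time, never in an index. The sample rows and the `hash_join` helper are illustrative assumptions, not PostgreSQL's implementation.

```python
from collections import defaultdict

employees = [
    {"subsidiary_id": 1, "employee_id": 1, "last_name": "Smith"},
    {"subsidiary_id": 1, "employee_id": 2, "last_name": "Jones"},
    {"subsidiary_id": 2, "employee_id": 1, "last_name": "Young"},
]
sales = [
    {"subsidiary_id": 1, "employee_id": 1, "amount": 100},
    {"subsidiary_id": 2, "employee_id": 1, "amount": 250},
    {"subsidiary_id": 1, "employee_id": 1, "amount": 75},
]


def join_key(row):
    return (row["subsidiary_id"], row["employee_id"])


def hash_join(build_rows, probe_rows, key):
    # Build phase: load one side into a hash table keyed by the join columns.
    buckets = defaultdict(list)
    for row in build_rows:
        buckets[key(row)].append(row)
    # Probe phase: one hash lookup per row of the other side, no tree traversal.
    return [{**b, **p} for p in probe_rows for b in buckets.get(key(p), [])]


joined = hash_join(employees, sales, join_key)
```

Both inputs are read in full exactly once, so the only way an index can speed this up is by reducing the rows that enter the join, which is what an index on an independent where predicate does.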

Hash join example

    -- Join (Oracle syntax; in PostgreSQL the date expression could be written
    -- as current_date - interval '6 months')
    SELECT *
      FROM sales s
      JOIN employees e ON (s.subsidiary_id = e.subsidiary_id
                      AND s.employee_id = e.employee_id)
     WHERE s.sale_date > trunc(sysdate) - INTERVAL '6' MONTH

    -- Index for WHERE predicate
    CREATE INDEX sales_date ON sales (sale_date);

MySQL Community Edition supports hash joins since version 8.0.18

Sort merge

The sort-merge join combines two sorted lists like a zipper
Both sides of the join must be sorted by the join predicates
A sort-merge join needs the same indexes as the hash join, that is, an index for the independent conditions to read all candidate records in one shot
Indexing the join predicates is useless; up to this point the sort-merge join is just like the hash join
The sort-merge join is absolutely symmetric (the join order makes no difference), which makes it very useful for outer joins
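The zipper-like merge can be sketched as below. This is a minimal inner-join illustration with made-up tuples whose first element is the join key; `sort_merge_join` is a hypothetical helper, not the database's implementation.

```python
def sort_merge_join(left, right, key):
    """Inner join of two row lists on key(row) using sort and merge."""
    # Both inputs must be sorted by the join key before merging.
    left = sorted(left, key=key)
    right = sorted(right, key=key)
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = key(left[i]), key(right[j])
        if lk < rk:
            i += 1                      # left row has no partner yet, advance
        elif lk > rk:
            j += 1                      # right row has no partner, advance
        else:
            # Emit every right row with this key, then advance the left side.
            k = j
            while k < len(right) and key(right[k]) == lk:
                out.append((left[i], right[k]))
                k += 1
            i += 1
    return out


left_rows = [(2, "b"), (1, "a"), (2, "c")]
right_rows = [(2, "x"), (3, "y"), (1, "z"), (2, "w")]
pairs = sort_merge_join(left_rows, right_rows, key=lambda row: row[0])
```

Swapping the two inputs yields the same pairs with left and right exchanged, which illustrates the symmetry of the algorithm.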

Clustering data


Clustering data

Clustering data means storing consecutively accessed data closely together so that accessing it requires fewer I/O operations
Example: column-oriented databases are common in OLAP processing, which accesses many rows but only a few columns
Indexes allow one to cluster data: the second power of indexing

Index filter predicates used intentionally

Index filter predicates can be used intentionally to group consecutively accessed data together
WHERE clause predicates that cannot serve as access predicates are good candidates for this technique
The query performance depends on the physical distribution of the accessed rows
Physically reordering the table rows on disk is impractical because the table can be stored in only one sequence
The index clustering factor is an indirect measure of the probability that two succeeding index entries refer to the same table block
One can add many columns to an index so that they are automatically stored in a well-defined order: the second power of indexing

Index-only scan

The index-only scan is one of the most powerful tuning methods of all
It not only avoids accessing the table to evaluate the where clause, but avoids accessing the table completely if the database can find the selected columns in the index itself
To cover an entire query, an index must contain all columns from the SQL statement; such an index is called a covering index
The performance advantage of an index-only scan depends on the number of accessed rows and the index clustering factor

Sorting and grouping


Indexing ORDER BY

If the index order corresponds to the order by clause, the database can omit the explicit sort operation; for this, the same index that is used for the where clause must also cover the order by clause
Tip: use the full index definition in the order by clause to find the reason for an explicit sort operation

Indexing ASC, DESC and NULLS FIRST/LAST

A DBMS can read indexes in both directions
When using mixed ASC and DESC modifiers in the order by clause, one must define the index likewise in order to use it for a pipelined order by

Indexing GROUP BY

Databases generally use two GROUP BY algorithms:
    Hash algorithm: aggregates the input records in a temporary hash table; once all input records are processed, the hash table is returned as the result
    Sort/group algorithm: first sorts the input data by the grouping key; afterwards, the DBMS just needs to aggregate them
The sort/group algorithm can use an index to avoid the sort operation
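Both algorithms can be sketched in a few lines. The sample data and function names are illustrative assumptions; the `presorted` flag models input that already arrives in grouping-key order, for example from an index.

```python
from itertools import groupby
from operator import itemgetter

sales = [("north", 10), ("south", 5), ("north", 7), ("east", 3), ("south", 1)]


def hash_group(rows):
    """Hash algorithm: aggregate into a temporary hash table, return it at the end."""
    totals = {}
    for key, amount in rows:
        totals[key] = totals.get(key, 0) + amount
    return totals


def sort_group(rows, presorted=False):
    """Sort/group algorithm: sort by the grouping key, then aggregate runs."""
    ordered = rows if presorted else sorted(rows, key=itemgetter(0))
    for key, group in groupby(ordered, key=itemgetter(0)):
        # Each group is complete as soon as the key changes, so results can
        # be delivered before the whole input has been read.
        yield key, sum(amount for _, amount in group)
```

The hash variant can only return results after it has seen every row, while the sort/group variant, given presorted input, can emit each group as soon as its run ends; this is the pipelining benefit that an index in grouping order provides.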

Modifying data


INSERT

The number of indexes on a table is the most dominant factor for insert performance
The more indexes a table has, the slower the execution becomes
The insert statement is the only operation that cannot directly benefit from indexing because it has no WHERE clause

Improve INSERT performance

Use indexes deliberately and sparingly, and avoid redundant indexes
This is also beneficial for delete and update statements

DELETE

Unlike the insert statement, the delete statement has a where clause that can use all the indexing methods described for the WHERE clause
In fact, the delete statement works like a select that is followed by an extra step to delete the identified rows

UPDATE

An update statement must relocate the changed index entries to maintain the index order
The response time is basically the same as for the respective delete and insert statements together
Update performance, just like insert and delete performance, also depends on the number of indexes on the table