Ch12-Query Processing
Primary index
A primary index is an index on a set of fields that
includes the unique primary key of the table and is
guaranteed not to contain duplicates.
Also called a **clustered index**.
E.g., Employee ID is an example.
Secondary index
A secondary index is an index that is not a primary
index and may have duplicates.
E.g., Employee name is an example, because several
employees can share the same name.
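The difference between the two kinds of index can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the sample records and the helper names `build_primary_index` and `build_secondary_index` are hypothetical, and in-memory dictionaries stand in for an on-disk index structure.

```python
from collections import defaultdict

# Hypothetical sample records: (employee_id, name).
employees = [(101, "Asha"), (102, "Ravi"), (103, "Asha")]

def build_primary_index(records):
    """Map each unique employee ID to the position of its record."""
    index = {}
    for pos, (emp_id, _name) in enumerate(records):
        if emp_id in index:
            # A primary index must never contain duplicate keys.
            raise ValueError(f"duplicate primary key: {emp_id}")
        index[emp_id] = pos
    return index

def build_secondary_index(records):
    """Map each name to the positions of all records with that name."""
    index = defaultdict(list)
    for pos, (_emp_id, name) in enumerate(records):
        index[name].append(pos)  # duplicates are allowed here
    return dict(index)

primary = build_primary_index(employees)
secondary = build_secondary_index(employees)
print(primary[102])       # 1
print(secondary["Asha"])  # [0, 2]
```

A lookup on the primary index yields at most one record, while a lookup on the secondary index can yield several.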
Database System Concepts - 6th Edition 12.11 Silberschatz, Korth and Sudarshan
Selections Using Indices
Index scan: search algorithms that use an index.
The selection condition must be on the search key of the index.
A2 (primary index, equality on key): Retrieve a single record that
satisfies the corresponding equality condition.
Cost = $(h_i + 1) \times (t_T + t_S)$, where $h_i$ denotes the height of the index,
$t_T$ the time to transfer one block, and $t_S$ the time for one seek.
Index lookup traverses the height of the tree plus one I/O to fetch the record;
each of these I/O operations requires a seek and a block transfer.
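As a quick check of the A2 cost, the formula can be evaluated directly. The sketch below is illustrative only: the function name `cost_a2` and the sample values (index height 3, block transfer 100 microseconds, seek 4000 microseconds) are assumptions, not figures from this section.

```python
def cost_a2(h_i: int, t_T: int, t_S: int) -> int:
    """Cost of A2: h_i index-level I/Os plus one I/O to fetch the record.

    Each I/O is charged one seek (t_S) and one block transfer (t_T).
    """
    return (h_i + 1) * (t_T + t_S)

# Assumed sample values, in microseconds.
print(cost_a2(h_i=3, t_T=100, t_S=4000))  # 16400
```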
A3 (primary index, equality on nonkey): Retrieve multiple records.
Records will be on consecutive blocks.
Let $b$ = number of blocks containing matching records.
Cost = $h_i \times (t_T + t_S) + t_S + t_T \times b$
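The A3 cost differs from A2 in that the matching records sit on consecutive blocks, so a single seek is followed by b sequential block transfers. A minimal sketch, with the function name `cost_a3` and the sample values (in microseconds) being assumptions for illustration:

```python
def cost_a3(h_i: int, t_T: int, t_S: int, b: int) -> int:
    """Cost of A3: traverse the index, then read b consecutive blocks."""
    index_traversal = h_i * (t_T + t_S)  # one seek + one transfer per index level
    record_fetch = t_S + t_T * b         # one seek, then b sequential transfers
    return index_traversal + record_fetch

# Assumed sample values: height 3, 100 us transfer, 4000 us seek, 5 blocks.
print(cost_a3(h_i=3, t_T=100, t_S=4000, b=5))  # 16800
```

With b = 1 the result equals the A2 cost, as expected when only one block holds matching records.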
Block Nested-Loop Join
The primary difference in cost between the block nested-loop join and
the basic nested-loop join is that, in the worst case, each block in the
inner relation s is read only once for each block in the outer relation,
instead of once for each tuple in the outer relation.
Worst case estimate: $b_r \times b_s + b_r$ block transfers + $2 \times b_r$ seeks.
Each block in the inner relation s is read once for each block in the outer
relation.
Each scan of the inner relation requires one seek, and the scan of the
outer relation requires one seek per block, leading to a total of $2 \times b_r$
seeks.
Clearly, it is more efficient to use the smaller relation as the outer relation
if neither of the relations fits in memory.
Best case: when the inner relation fits in memory, there will be $b_r + b_s$
block transfers + 2 seeks (we would choose the smaller relation as the
inner relation in this case).
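The worst-case transfer count can be seen by simulating the join over small in-memory "blocks". This is a sketch under stated assumptions: relations are modeled as lists of blocks (each block a list of tuples), the function name `block_nested_loop_join` and the sample data are hypothetical, and the counter only mimics the block reads a one-buffer-per-relation execution would perform.

```python
def block_nested_loop_join(outer_blocks, inner_blocks, match):
    """Join two relations stored as lists of blocks.

    Returns the joined tuples and the number of simulated block transfers
    for the worst case (one buffer block per relation).
    """
    result = []
    transfers = 0
    for outer_block in outer_blocks:
        transfers += 1                  # read one block of the outer relation
        for inner_block in inner_blocks:
            transfers += 1              # inner block is re-read per outer block
            for r in outer_block:
                for s in inner_block:
                    if match(r, s):
                        result.append(r + s)
    return result, transfers

# Hypothetical data: student(id, name) in 2 blocks, takes(id, course) in 3 blocks.
student = [[(1, "Ann"), (2, "Bob")], [(3, "Cy")]]
takes = [[(1, "CS101"), (3, "MA201")], [(2, "CS101")], [(1, "PH100")]]

joined, transfers = block_nested_loop_join(
    student, takes, match=lambda r, s: r[0] == s[0]
)
print(len(joined))  # 4
print(transfers)    # 8, i.e. 2 * 3 + 2
```

The inner relation is scanned once per outer block rather than once per outer tuple, which is where the saving over the basic nested-loop join comes from.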
Example: Block Nested-Loop Join
Computing student ⋈ takes, using the block nested-loop join
algorithm.
In the worst case, we have to read each block of takes once for
each block of student.
Thus, in the worst case, a total of $100 \times 400 + 100$ = 40,100 block
transfers plus $2 \times 100$ = 200 seeks are required.
This cost is a significant improvement over the $5000 \times 400 + 100$ =
2,000,100 block transfers plus 5100 seeks needed in the worst
case for the basic nested-loop join.
The best-case cost remains the same, namely, 100 + 400 = 500
block transfers and 2 seeks.
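The figures in this example can be reproduced with a few helper functions. The function names are hypothetical; the inputs (student with 5000 tuples in 100 blocks, takes in 400 blocks) are inferred from the arithmetic in the example, and the basic nested-loop seek count of n_r + b_r is the standard textbook estimate, which agrees with the 5100 seeks quoted here.

```python
def block_nested_loop_worst(b_r: int, b_s: int) -> tuple[int, int]:
    """Worst case for block nested-loop join: (block transfers, seeks)."""
    return b_r * b_s + b_r, 2 * b_r

def basic_nested_loop_worst(n_r: int, b_r: int, b_s: int) -> tuple[int, int]:
    """Worst case for basic nested-loop join: (block transfers, seeks)."""
    return n_r * b_s + b_r, n_r + b_r

def best_case(b_r: int, b_s: int) -> tuple[int, int]:
    """Best case (inner relation fits in memory): (block transfers, seeks)."""
    return b_r + b_s, 2

# student is the outer relation: 5000 tuples in 100 blocks; takes has 400 blocks.
print(block_nested_loop_worst(100, 400))        # (40100, 200)
print(basic_nested_loop_worst(5000, 100, 400))  # (2000100, 5100)
print(best_case(100, 400))                      # (500, 2)
```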
Indexed Nested-Loop Join
Handling of Overflows