Ext Sorting
EXTERNAL SORTING
(CH. 13 IN THE COW BOOK)
[Figure: two-way merge. Buffers INPUT 1 and INPUT 2 are merged into the OUTPUT buffer; sample page contents: 6,6 / 7,8 / 9]
2/7/17 CS 564: Database Management Systems 6
General External Merge Sort
• Sort a file with N pages using B buffer pages:
– Pass 0: use B buffer pages (run size = B pages).
Produce ⌈N/B⌉ sorted runs of B pages each.
– Pass 1, 2, …: merge B-1 runs at a time.
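The passes described above can be sketched in Python. This is a minimal in-memory simulation, not an actual disk-based implementation: the "file" is assumed to be a list of pages, each page a list of keys, and the names external_merge_sort and to_pages are hypothetical. heapq.merge from the standard library stands in for the B-1 way merge.

```python
import heapq


def external_merge_sort(pages, B):
    """Sort a simulated file (a list of pages, each a list of keys) with B buffer pages.

    Returns (sorted_pages, number_of_passes). Runs are held in memory here;
    a real system would read and write them on disk.
    """
    if B < 3:
        raise ValueError("need at least 3 buffer pages (2 inputs + 1 output)")
    if not pages:
        return [], 0
    # Assumption: records per page is taken from the largest input page.
    page_size = max(len(p) for p in pages) or 1

    def to_pages(records):
        # Cut a sorted list of records back into fixed-size pages.
        return [records[i:i + page_size] for i in range(0, len(records), page_size)]

    # Pass 0: read B pages at a time, sort them in memory, emit one sorted run.
    runs = []
    for i in range(0, len(pages), B):
        records = sorted(r for page in pages[i:i + B] for r in page)
        runs.append(to_pages(records))
    passes = 1

    # Later passes: merge up to B-1 runs at a time (one buffer per input run,
    # one buffer for the output) until a single run remains.
    while len(runs) > 1:
        merged = []
        for i in range(0, len(runs), B - 1):
            group = runs[i:i + B - 1]
            streams = [(r for page in run for r in page) for run in group]
            merged.append(to_pages(list(heapq.merge(*streams))))
        runs = merged
        passes += 1
    return runs[0], passes


if __name__ == "__main__":
    file_pages = [[9, 3], [7, 1], [8, 2], [6, 4], [5, 0], [11, 10], [13, 12]]
    sorted_pages, passes = external_merge_sort(file_pages, B=3)
    print(sorted_pages, passes)
```

With N = 7 pages and B = 3, Pass 0 produces ⌈7/3⌉ = 3 runs, and two 2-way merge passes reduce them to one run, for 3 passes in total.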
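[Figure: B-1 way merge. Pages are read from disk into the input buffers INPUT 1, INPUT 2, ..., INPUT B-1 and merged into one OUTPUT buffer, which is written back to disk. Total buffer pages: B.]
Where are the main memory buffer pages allocated?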
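Cost of External Merge Sort
• # passes = 1 + ⌈log_(B-1) ⌈N/B⌉⌉
• I/O cost = # passes * 2N (each pass reads and writes every page once)
• Consider sorting a file with 1000 pages, using 11 buffer pages.
– At the end of the first pass (Pass 0), we have ⌈1000/11⌉ = 91 runs of size 11 pages (the last run holds only 10 pages)
– The next pass produces ⌈91/10⌉ = 10 runs of size 110 pages each (the last run holds only 10 pages)
– The next pass produces the fully sorted file
– Total: 3 passes, so I/O cost = 3 * 2 * 1000 = 6000 page I/Os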
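The pass count and I/O cost can be checked with a short Python sketch. The function name external_sort_cost is hypothetical, and the sketch counts passes by repeatedly dividing the number of runs by B-1 (rounding up) rather than evaluating a floating-point logarithm, which avoids rounding trouble at exact powers of B-1.

```python
def external_sort_cost(N, B):
    """Return (runs_per_pass, passes, io_cost) for sorting N pages with B buffers.

    runs_per_pass[i] is the number of sorted runs that exist after pass i.
    """
    if N <= 0:
        return [], 0, 0
    if B < 3:
        raise ValueError("need at least 3 buffer pages (2 inputs + 1 output)")
    runs = -(-N // B)            # ceil(N / B): runs produced by Pass 0
    runs_per_pass = [runs]
    while runs > 1:
        runs = -(-runs // (B - 1))   # each merge pass combines B-1 runs into one
        runs_per_pass.append(runs)
    passes = len(runs_per_pass)
    io_cost = passes * 2 * N     # every pass reads and writes all N pages
    return runs_per_pass, passes, io_cost


if __name__ == "__main__":
    print(external_sort_cost(1000, 11))
```

For the 1000-page file with 11 buffer pages this gives run counts 91, 10, 1, that is 3 passes and 6000 page I/Os.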
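[Figure: k-way merge in which each buffer has a second copy. Input buffers INPUT 1, INPUT 2, ..., INPUT k are paired with INPUT 1', INPUT 2', ..., INPUT k', and OUTPUT is paired with OUTPUT'. Each buffer holds one block (block size b); data moves from disk to the input buffers and from the output buffers back to disk.]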
[Figure: data entries ("sequence set") and data records]