Spring 2017

EXTERNAL SORTING
(CH. 13 IN THE COW BOOK)

2/7/17 CS 564: Database Management Systems; (c) Jignesh M. Patel, 2013


Motivation for External Sort
• Often have a large relation (size greater than the available
main memory) that we need to sort.
• Why are we sorting:
– Query processing: e.g. there are sort-based join and
aggregate algorithms
– Bulkload B+-tree: recall you had to sort the data
entries in the leaf level for this.
– One can specify ORDER BY in SQL, which sorts the
output of the query
–…
Problem Statement
• Given M memory pages, and a relation R of size N pages,
where N > M, sort R on a sort key, to produce an output
relation R’ that is sorted on the sort key.
• Example: Sort the following table on zipcode
CREATE TABLE Tweets (
uniqueMsgID INTEGER, -- unique message id
tstamp TIMESTAMP, -- when was the tweet posted
uid INTEGER, -- unique id of the user
msg VARCHAR (140), -- the actual message
zip INTEGER, -- zipcode when posted
retweet BOOLEAN -- retweeted?
);

• Another example:
SELECT * FROM Tweets
WHERE tstamp = TODAY
ORDER BY zip

Note the sort key can be composite
Goal of a good sort algorithm
• Sort efficiently!
• Sort well!
– Able to sort large relations with “small” amounts of main memory
– (Where does the memory come from?)
• What does “sort efficiently” mean?
– Minimize the number of disk I/Os
– Try using sequential I/Os rather than random I/Os
– Minimize the CPU costs
– Overlap I/O operations with CPU operations
Quick note: Sorting is very important in MapReduce. The reducer
expects data to arrive in sorted order from the mappers.
2-Way Sort: Requires 3 Buffers
• Pass 0: Read a page, sort it, write it (a run).
– only one buffer page is used
• Pass 1, 2, …, etc.: merge pairs of runs.
– three buffer pages used.

Aside: Algorithms for sorting in memory?

[Figure: Disk → INPUT 1, INPUT 2 → OUTPUT (main memory buffers) → Disk]



Two-Way External Merge Sort
• Read & write entire file in each pass
• N pages, # passes = ⌈log2 N⌉ + 1
• So total cost is: 2N (⌈log2 N⌉ + 1)
• Divide and conquer

How can we utilize more than three buffer pages?

[Figure: two-way external merge sort of a 7-page file, 2 records per page; "|" separates runs]
Input file:           3,4 | 6,2 | 9,4 | 8,7 | 5,6 | 3,1 | 2
PASS 0, 1-page runs:  3,4 | 2,6 | 4,9 | 7,8 | 5,6 | 1,3 | 2
PASS 1, 2-page runs:  2,3 4,6 | 4,7 8,9 | 1,3 5,6 | 2
PASS 2, 4-page runs:  2,3 4,4 6,7 8,9 | 1,2 3,5 6
PASS 3, 8-page runs:  1,2 2,3 3,4 4,5 6,6 7,8 9
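The passes in the figure can be replayed in a few lines of Python. This is a minimal sketch, not the course's implementation: the names `paginate`, `merge_two_runs`, and `two_way_external_sort` are hypothetical, and each run is flattened in memory for brevity, whereas a real system would stream one page per input buffer.

```python
from typing import List

Page = List[int]   # a page holds a few records
Run = List[Page]   # a run is a sorted sequence of pages

PAGE_SIZE = 2      # records per page, matching the figure


def paginate(records: List[int]) -> Run:
    """Cut a sorted record list back into pages."""
    return [records[i:i + PAGE_SIZE] for i in range(0, len(records), PAGE_SIZE)]


def merge_two_runs(left: Run, right: Run) -> Run:
    """Merge two sorted runs (conceptually 2 input buffers + 1 output buffer)."""
    a = [r for page in left for r in page]    # flattened only for brevity
    b = [r for page in right for r in page]
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return paginate(out)


def two_way_external_sort(pages: List[Page]) -> List[List[Run]]:
    """Return the list of runs after every pass (index 0 = pass 0)."""
    runs = [[sorted(p)] for p in pages]       # pass 0: 1-page runs
    history = [runs]
    while len(runs) > 1:                      # passes 1, 2, ...: merge pairs
        runs = [
            merge_two_runs(runs[i], runs[i + 1]) if i + 1 < len(runs) else runs[i]
            for i in range(0, len(runs), 2)
        ]
        history.append(runs)
    return history


if __name__ == "__main__":
    figure_input = [[3, 4], [6, 2], [9, 4], [8, 7], [5, 6], [3, 1], [2]]
    for pass_no, runs in enumerate(two_way_external_sort(figure_input)):
        print("PASS", pass_no, runs)
```

For the 7-page input of the figure this performs 4 passes (pass 0 to pass 3), which agrees with ⌈log2 7⌉ + 1 = 4.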
General External Merge Sort
• Sort a file with N pages using B buffer pages:
– Pass 0: use B buffer pages (run size = B pgs).
Produce ⌈N/B⌉ sorted runs of B pages each.
– Pass 1, 2, …: merge B-1 runs.
• B-1 way merge. Total buffer pages: B

Where are the main memory buffer pages allocated?

[Figure: Disk → INPUT 1, INPUT 2, …, INPUT B-1 → OUTPUT → Disk]
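The B-buffer scheme can be sketched as follows. This is an illustrative sketch under stated assumptions: `external_merge_sort` is a hypothetical name, pages are plain Python lists rather than disk pages, and the standard-library `heapq.merge` stands in for the (B-1)-way merge that would normally read one page per input buffer.

```python
import heapq
from typing import List, Tuple


def external_merge_sort(pages: List[List[int]], B: int) -> Tuple[List[int], int]:
    """Sort the records in `pages` with B buffer pages.

    Returns (sorted records, number of passes over the file).
    """
    if B < 3:
        raise ValueError("need at least 3 buffer pages (2 inputs + 1 output)")

    # Pass 0: read B pages at a time, sort them in memory, write one run.
    runs = []
    for i in range(0, len(pages), B):
        chunk = [r for page in pages[i:i + B] for r in page]
        runs.append(sorted(chunk))
    passes = 1

    # Passes 1, 2, ...: merge B-1 runs at a time (one buffer is the output).
    while len(runs) > 1:
        runs = [
            list(heapq.merge(*runs[i:i + B - 1]))
            for i in range(0, len(runs), B - 1)
        ]
        passes += 1

    return (runs[0] if runs else []), passes


if __name__ == "__main__":
    pages = [[3, 4], [6, 2], [9, 4], [8, 7], [5, 6], [3, 1], [2]]
    print(external_merge_sort(pages, B=3))
    print(external_merge_sort(pages, B=4))
```

With one extra buffer page (B=4 instead of B=3) the 7-page example needs one pass fewer, because pass 0 produces fewer runs and each merge combines more of them.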
Cost of External Sort Merge
• # passes = 1 + ⌈log_(B-1) ⌈N/B⌉⌉ (log base B-1)
• I/O Cost = # passes * 2N
• Consider sorting a file with 1000 pages, using 11
buffer pages.
– At the end of the first pass, we have ⌈1000/11⌉ = 91 runs of
size 11 pages (the last run is shorter)
– Next pass produces ⌈91/10⌉ = 10 runs
of size 110 pages each (the last run is shorter)
– The next pass produces the fully sorted file
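The arithmetic of this example can be checked mechanically. The sketch below counts passes by simulating the merge rounds instead of evaluating a logarithm, which avoids floating-point rounding; `external_sort_cost` and `ceil_div` are hypothetical helper names.

```python
from typing import Tuple


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling of a / b."""
    return -(-a // b)


def external_sort_cost(n_pages: int, n_buffers: int) -> Tuple[int, int]:
    """Return (# passes, I/O cost in page reads + writes)."""
    runs = ceil_div(n_pages, n_buffers)        # pass 0 produces ceil(N/B) runs
    passes = 1
    while runs > 1:
        runs = ceil_div(runs, n_buffers - 1)   # each merge pass is (B-1)-way
        passes += 1
    return passes, 2 * n_pages * passes        # every pass reads and writes N pages


if __name__ == "__main__":
    print(external_sort_cost(1000, 11))
```

For 1000 pages and 11 buffers the run counts go 91, 10, 1, so the sort takes 3 passes and 3 * 2 * 1000 = 6000 I/Os.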



Number of Passes of External Sort
N (# of pages)    B=3   B=17   B=257
100                 7      2       1
10,000             13      4       2
1,000,000          20      5       3
10,000,000         23      6       3
100,000,000        26      7       4
1,000,000,000      30      8       4

Callout (last row, B=257): with a 32K pg size this is a 32TB relation;
4 passes @1ms per read = 1111 hours = 46 days!
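The "46 days" figure can be reproduced with a short calculation. This is a sketch under one assumption that is not stated on the slide: only reads are counted, one per page per pass, which is the reading that yields 1111 hours. The function name `total_read_time_hours` is hypothetical.

```python
def total_read_time_hours(n_pages: int, passes: int, ms_per_read: float) -> float:
    """Time spent on page reads alone, assuming one read per page per pass."""
    reads = n_pages * passes
    return reads * ms_per_read / 1000 / 3600   # ms -> s -> hours


N_PAGES = 1_000_000_000   # last row of the table
PAGE_KB = 32              # 32K page size
PASSES = 4                # table entry for B = 257

relation_tb = N_PAGES * PAGE_KB / 1e9          # KB -> TB (decimal units)
hours = total_read_time_hours(N_PAGES, PASSES, ms_per_read=1.0)

if __name__ == "__main__":
    print(relation_tb, "TB;", round(hours), "hours;", round(hours / 24, 1), "days")
```

Counting the writes of each pass as well would double the estimate.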
Internal Sort Algorithm: Replacement Sort
(Size of the buffer pool?)
Example: M = 2 pages, 2 tuples per page.
Input Sequence: 10, 20, 30, 40, 25, 35, 9, 8, 7, 6, 5, …
1. In-memory: 10, 20, 30, 40
2. Read 25, Output 10. In-memory: 20, 25, 30, 40
3. Read 35, Output 20. In-memory: 25, 30, 35, 40
4. Read 9, Output 25. In-memory: 9, 30, 35, 40
5. Read 8, Output 30. In-memory: 8, 9, 35, 40
6. Read 7, Output 35. In-memory: 7, 8, 9, 40
7. Read 6, Output 40. In-memory: 6, 7, 8, 9
8. Read 5, Flush output, Start new run. In-memory …
On Disk: 10, 20, 25, 30, 35, 40

Average length of a run in replacement sort is 2M
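The example can be replayed with a heap-based sketch of replacement sort (often called replacement selection). This is one common formulation and an assumption on my part, not necessarily the exact variant taught in the course: records smaller than the last one written are held back for the next run. `replacement_selection` is a hypothetical name, and `capacity` is the number of tuples that fit in memory (M pages times tuples per page).

```python
import heapq
from itertools import islice
from typing import Iterable, Iterator, List

_END = object()  # sentinel marking the end of the input


def replacement_selection(records: Iterable[int], capacity: int) -> Iterator[List[int]]:
    """Yield sorted runs produced with room for `capacity` records in memory."""
    it = iter(records)
    heap = list(islice(it, capacity))    # fill memory with the first records
    heapq.heapify(heap)
    pending: List[int] = []              # records held back for the next run
    run: List[int] = []

    while heap:
        smallest = heapq.heappop(heap)   # output the smallest eligible record
        run.append(smallest)
        nxt = next(it, _END)             # read one new record into the freed slot
        if nxt is not _END:
            if nxt >= smallest:
                heapq.heappush(heap, nxt)    # can still join the current run
            else:
                pending.append(nxt)          # too small: belongs to the next run
        if not heap:                     # current run is finished
            yield run
            run = []
            heap, pending = pending, []
            heapq.heapify(heap)


if __name__ == "__main__":
    stream = [10, 20, 30, 40, 25, 35, 9, 8, 7, 6, 5]
    print(list(replacement_selection(stream, capacity=4)))
```

On the slide's input the first run is 10, 20, 25, 30, 35, 40, longer than the 4 tuples that fit in memory. Feeding the sketch already-sorted input gives a single run, and reverse-sorted input gives runs of exactly `capacity` records, which are the two extremes.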


Internal Sort Algorithm
• Quicksort is a fast way to sort in memory.
• An alternative is replacement sort, which is also called tournament
sort or heapsort
– Top: Read in M pages of the relation R
– Output: move smallest record to output buffer
– Read in a new record r
– insert r into “sorted heap”
– if r not smallest, then GOTO Output
– else remove r from “heap”
– output “heap” in order; GOTO Top
• Worst-Case: What is min length of a run? How does this arise?
• Best-Case: What is max length of a run? How does this arise?
• Quicksort is faster, but longer runs often mean fewer passes!
Blocked I/Os
• So far we have been reading/writing one page at a time, but we
know that reading a block of pages sequentially is faster.
• Make each buffer (input/output) be a block of pgs.
– Will reduce fan-out during merge passes! Side-effect?
– Reduces per page I/O cost.
– First Pass: Each run 2B pages, ⌈N/(2B)⌉ runs (where B is the size
of the buffer pool in #pages)
– Merge Tree Fanout: F = ⌊B/b⌋ - 1, b is block size
– # passes: ⌈logF …⌉ + 1
– In practice, buffer pools are large, so most files are sorted in 2-3
passes
• Which internal sort algorithm are we using?
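The effect of the block size on fan-out can be tried out numerically. This is a hedged sketch: `blocked_sort_plan` is a hypothetical helper, the input numbers are made up for illustration, and the pass count is obtained by simulating merge rounds with fan-out F, which is my reading of the slide rather than a formula it states in full.

```python
from typing import Tuple


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling of a / b."""
    return -(-a // b)


def blocked_sort_plan(n_pages: int, buffer_pages: int, block_pages: int) -> Tuple[int, int, int]:
    """Return (fan-out F, # initial runs, total # passes) for blocked I/O."""
    fan_out = buffer_pages // block_pages - 1         # F = floor(B/b) - 1
    if fan_out < 2:
        raise ValueError("block size too large: need a fan-out of at least 2")
    initial_runs = ceil_div(n_pages, 2 * buffer_pages)   # runs of 2B pages
    runs, merge_passes = initial_runs, 0
    while runs > 1:
        runs = ceil_div(runs, fan_out)                # one F-way merge pass
        merge_passes += 1
    return fan_out, initial_runs, 1 + merge_passes


if __name__ == "__main__":
    # Hypothetical numbers: 1,000,000-page file, 1000 buffer pages, 32-page blocks.
    print(blocked_sort_plan(1_000_000, 1000, 32))
```

With these illustrative inputs the fan-out is 30, pass 0 leaves 500 runs, and two merge passes finish the sort, which fits the "2-3 passes" remark.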
Double Buffering

• Overlap CPU and IO processing
• Prefetch into shadow block.
– Potentially, more passes; in practice, 2-3 passes.

Reduces response time. What about throughput?

[Figure: B main memory buffers, k-way merge. Disk → INPUT 1/INPUT 1', INPUT 2/INPUT 2', …, INPUT k/INPUT k' → OUTPUT/OUTPUT' → Disk; each buffer is a block of size b]
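The prefetch idea can be illustrated for a single input stream with a background thread and a one-slot queue playing the role of the shadow block. This is only a sketch: `double_buffered` is a hypothetical name, the `blocks` iterable stands in for blocking disk reads, and a real merge would keep one such pair of buffers per input run plus one for the output.

```python
import queue
import threading
from typing import Iterable, Iterator, List

_DONE = object()  # sentinel: no more blocks


def double_buffered(blocks: Iterable[List[int]]) -> Iterator[List[int]]:
    """Yield blocks while a background thread fetches the next one."""
    shadow: "queue.Queue" = queue.Queue(maxsize=1)   # holds the prefetched block

    def prefetch() -> None:
        for block in blocks:          # each iteration stands in for a disk read
            shadow.put(block)         # blocks while the shadow slot is still full
        shadow.put(_DONE)

    threading.Thread(target=prefetch, daemon=True).start()

    while True:
        block = shadow.get()          # usually ready: it was fetched during processing
        if block is _DONE:
            return
        yield block                   # the consumer's CPU work overlaps the next read


if __name__ == "__main__":
    for blk in double_buffered([[1, 2], [3, 4], [5]]):
        print(sum(blk))
```

The design choice here is to bound the queue at one entry, so that memory use stays at roughly two blocks per stream instead of growing with the file.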


Using B+ Trees for Sorting
• Scenario: Table to be sorted has B+ tree index on
sorting column(s).
• Idea: Can retrieve records in order by traversing leaf
pages.
• Is this a good idea?
• Cases to consider:
– B+ tree is clustered: Good idea!
– B+ tree is not clustered: Could be a very bad idea!



Clustered B+ Tree Used for Sorting
• Go to the left-most leaf, then retrieve all leaf pages
• If data entry has records, then we are done!
• If the data entries have rids, each data page is fetched just once
(since this is a clustered index)

Faster than external sorting! Why not scan the data file directly?

[Figure: Index (Directs search) → Data Entries ("Sequence set") → Data Records]



Unclustered B+ Tree Used for Sorting
• Unclustered B+-trees only have rids in the data entries
• So, in general, one I/O per data record!

When can this be useful?

[Figure: Index (Directs search) → Data Entries ("Sequence set") → Data Records]
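To get a feel for "one I/O per data record", the rough comparison below puts numbers on it. It is a back-of-the-envelope sketch: `index_scan_vs_sort` is a hypothetical helper, the figure of 100 records per page is an assumption chosen for illustration, and the cost of reading the leaf pages themselves is ignored.

```python
from typing import Tuple


def index_scan_vs_sort(n_pages: int, records_per_page: int, sort_passes: int) -> Tuple[int, int]:
    """Return (I/Os via unclustered index, I/Os via external sort)."""
    unclustered_ios = n_pages * records_per_page   # one I/O per data record
    sort_ios = 2 * n_pages * sort_passes           # read + write N pages per pass
    return unclustered_ios, sort_ios


if __name__ == "__main__":
    # 1000-page file sorted in 3 passes (the 11-buffer example from earlier in the deck),
    # with a hypothetical 100 records per page.
    print(index_scan_vs_sort(1000, 100, 3))
```

Under these assumptions the unclustered index costs 100,000 I/Os against 6,000 for the external sort, so the index only pays off when few records are needed.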



Sorting Records!
• Sorting is a competitive sport!
• See http://sortbenchmark.org/
– Task is to sort 100 byte records.
– Different flavors of metrics that people compete on.
– Sort a trillion records as fast as you can,
• using general purpose sorting code (Daytona) or
• code specialized just for the benchmark (Indy)
