ADS Unit-1
SORTING
Introduction to Sorting
Sorting is the storage of data in sorted order, either ascending or descending. Sorting is closely tied to searching: in real life there are many things we need to search for, such as a particular record in a database, a roll number in a merit list, a particular telephone number, or a particular page in a book. Sorting arranges data in a sequence that makes searching easier. Every record to be sorted contains one key, and the records are sorted based on this key. For example, suppose we have a file of student records, where every record contains the following data:
Roll No.
Name
Age
Class
Here the student roll number can be taken as the key for sorting the records in ascending or descending order. Now suppose we have to search for the student with roll number 15. We do not need to search the complete file; we can simply search among the students with roll numbers 10 to 20.
Sorting Efficiency
There are many techniques for sorting, and the choice of a particular technique depends on the situation. The efficiency of a sorting technique depends mainly on two parameters. The first is the execution time, i.e., the time taken by the program to run. The second is the space, i.e., the memory taken by the program.
Types of Sorting Techniques
Sorting can be classified in two ways:
I. Internal Sorting:
This method uses only primary memory during the sorting process: all data items are held in main memory, and no secondary memory is required for this sorting process. If all the data to be sorted can be accommodated in main memory at one time, the sort is called an internal sort. Internal sorting has a limitation: it can only process relatively small lists, due to memory constraints. The internal sorting methods include:
1. Selection Sort: selection sort algorithm, heap sort algorithm.
2. Insertion Sort:
3. Exchange Sort:
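As a concrete illustration of the first category, here is a minimal selection sort in C. This is a generic sketch rather than code from the course material; the array contents are illustrative.

```c
#include <stdio.h>

/* Selection sort: on each pass, find the smallest remaining key and
   swap it into the next position of the sorted prefix. */
void selection_sort(int a[], int n) {
    for (int i = 0; i < n - 1; i++) {
        int min = i;
        for (int j = i + 1; j < n; j++)
            if (a[j] < a[min])
                min = j;
        int t = a[i]; a[i] = a[min]; a[min] = t;
    }
}

int main(void) {
    int roll[] = {23, 15, 42, 8, 16, 4};   /* keys, e.g. student roll numbers */
    int n = sizeof roll / sizeof roll[0];

    selection_sort(roll, n);
    for (int i = 0; i < n; i++)
        printf("%d ", roll[i]);            /* prints: 4 8 15 16 23 42 */
    printf("\n");
    return 0;
}
```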
II. External Sorting:
Sorting a large amount of data requires external (secondary) memory. This process uses external memory, such as a hard disk, to store the data that does not fit into main memory; primary memory holds only the portion of the data currently being sorted. All external sorts are based on the process of merging: different parts of the data are sorted separately and then merged together.
Merge Sort
For a disk, there are three factors contributing to the read/write time:
(1) Seek time: time taken to position the read/write heads to the correct cylinder. This will
depend on the number of cylinders across which the heads have to move.
(2) Latency time: time until the right sector of the track is under the read/write head.
(3) Transmission time: time to transmit the block of data to/from the disk.
The most popular method for sorting on external storage devices is merge sort. This method
consists of two distinct phases.
First, segments of the input list are sorted using a good internal sort method. These sorted
segments, known as runs, are written onto external storage as they are generated.
Second, the runs generated in phase one are merged together until only one run is left.
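The basic operation in phase two is merging two sorted runs into one. The sketch below (in C, with illustrative data) merges two in-memory runs; in the external setting the same logic reads the runs block by block from disk and writes the merged run out as it is produced.

```c
#include <stdio.h>

/* Merge sorted runs a[0..m-1] and b[0..n-1] into out[0..m+n-1]. */
void merge_runs(const int a[], int m, const int b[], int n, int out[]) {
    int i = 0, j = 0, k = 0;
    while (i < m && j < n)
        out[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];
    while (i < m) out[k++] = a[i++];   /* copy the tail of run a */
    while (j < n) out[k++] = b[j++];   /* copy the tail of run b */
}

int main(void) {
    int r1[] = {1, 3, 5, 7, 8, 9};      /* first sorted run */
    int r2[] = {2, 4, 6, 15, 20, 25};   /* second sorted run */
    int out[12];

    merge_runs(r1, 6, r2, 6, out);
    for (int k = 0; k < 12; k++)
        printf("%d ", out[k]);          /* 1 2 3 4 5 6 7 8 9 15 20 25 */
    printf("\n");
    return 0;
}
```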
Example: A list containing 4500 records is to be sorted using a computer with an internal
memory capable of sorting at most 750 records. The input list is maintained on disk and has
a block length of 250 records. We have available another disk that may be used as a scratch
pad. The input disk is not to be written on. One way to accomplish the sort using the general method outlined above is to:
(1) Internally sort three blocks at a time (i.e., 750 records) to obtain six runs R1 to R6.
(2) Set aside three blocks of internal memory, each capable of holding 250 records. Two of these blocks are used as input buffers and the third as an output buffer. Merge runs R1 and R2: this merge is carried out by first reading one block of each of these runs into the input buffers.
We shall assume that each time a block is read from or written onto the disk, the maximum seek and latency times are experienced. Although this is not true in general, it simplifies the analysis. The computing times for the various operations in our 4500-record example are given in the accompanying figure (omitted here).
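To make the merge strategy outlined above concrete, the record counts work out as follows (this is arithmetic implied by the example, not taken from the omitted figure):

Phase 1: 4500 / 750 = 6 internally sorted runs of 750 records each (R1 to R6).
Merge pass 1: merge (R1, R2), (R3, R4), and (R5, R6) to get three runs of 1500 records.
Merge pass 2: merge two of the 1500-record runs to get one run of 3000 records.
Merge pass 3: merge the 3000-record run with the remaining 1500-record run to obtain the sorted list of 4500 records.

Every record is read and written once in the sort phase and once in each merge pass in which it participates, which is why reducing the number of merge passes (see k-way merging below) reduces the total input/output time.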
k-Way Merging
The two-way merge function merge (Program 7.7) is almost identical to the merge function just described (Figure 7.20). In general, if we start with m runs, the merge tree corresponding to Figure 7.20 will have ⌈log2 m⌉ + 1 levels, for a total of ⌈log2 m⌉ passes over the data list. The number of passes over the data can be reduced by using a higher-order merge, in which k ≥ 2 runs are merged at a time.
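Below is a minimal k-way merge in C that selects the next output record by a linear scan over the k run heads; the winner and loser trees described later reduce this selection step from O(k) to O(log k) comparisons. The runs are illustrative.

```c
#include <stdio.h>
#include <limits.h>

#define K 3   /* merge order: number of runs merged at a time */

int main(void) {
    int run0[] = {1, 5, 9}, run1[] = {2, 6, 15}, run2[] = {3, 4, 25};
    int *runs[K] = {run0, run1, run2};
    int len[K]   = {3, 3, 3};
    int pos[K]   = {0, 0, 0};   /* next unread record of each run */

    for (;;) {
        /* Linear scan for the run whose head record has the smallest key. */
        int best = -1, bestKey = INT_MAX;
        for (int i = 0; i < K; i++)
            if (pos[i] < len[i] && runs[i][pos[i]] < bestKey) {
                best = i;
                bestKey = runs[i][pos[i]];
            }
        if (best < 0) break;        /* all runs exhausted */
        printf("%d ", bestKey);     /* prints: 1 2 3 4 5 6 9 15 25 */
        pos[best]++;
    }
    printf("\n");
    return 0;
}
```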
With two output buffers we can overlap disk output with computation: while one buffer is being written to the disk, the merge fills the other. If the buffer lengths are chosen correctly, then the time to output one buffer will be the same as the CPU time needed to fill the second buffer.
Example: Assume that a two-way merge is carried out using four input buffers and two output buffers, ou[0] and ou[1]. Each buffer is capable of holding two records. The first few records of run 0 have key values 1, 3, 5, 7, 8, 9. The first few records of run 1 have key values 2, 4, 6, 15, 20, 25.
Buffering Algorithm:
Step 1: Input the first block of each of the k runs, setting up k linked queues, each having one
block of data. Put the remaining k input blocks into a linked stack of free input blocks. Set ou
to 0.
Step 2: Let LastKey[i] be the last key input from run i. Let NextRun be the run for which LastKey is minimum. If LastKey[NextRun] ≠ +∞, then initiate the input of the next block from run NextRun.
Step 3: Use a k-way merge function to merge records from the k input queues into the output buffer ou. Merging continues until either the output buffer gets full or a record with key +∞ is merged into ou. If, during this merge, an input buffer becomes empty before the output buffer gets full or before +∞ is merged into ou, the k-way merge advances to the next buffer on the same queue and returns the empty buffer to the stack of empty buffers. However, if an input buffer becomes empty at the same time as the output buffer gets full or +∞ is merged into ou, the empty buffer is left on the queue, and the k-way merge does not advance to the next buffer on the queue; rather, the merge terminates.
Step 4: Wait for any ongoing disk input/output to complete.
Step 5: If an input buffer has been read, add it to the queue for the appropriate
run. Determine the next run to read from by determining NextRun such that LastKey
[NextRun ] is minimum.
Step 6: If LastKey[NextRun] ≠ +∞, then initiate reading the next block from run NextRun into a free input buffer.
Step 7: Initiate the writing of output buffer ou. Set ou to 1 - ou.
Step 8: If a record with key +∞ has not been merged into the output buffer, go back to Step 3. Otherwise, wait for the ongoing write to complete and then terminate.
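The heart of Steps 4 through 8 is the alternation between the two output buffers. The toy sketch below shows only that toggle; start_write and wait_for_io are hypothetical stand-ins for asynchronous disk routines and merely print what a real implementation would initiate.

```c
#include <stdio.h>

#define BUFLEN 2                 /* records per output buffer */
int outbuf[2][BUFLEN];           /* the two output buffers ou[0] and ou[1] */

/* Hypothetical stand-ins for asynchronous disk I/O. */
void start_write(int which) { printf("initiate write of output buffer %d\n", which); }
void wait_for_io(void)      { printf("wait for ongoing I/O to complete\n"); }

int main(void) {
    int ou = 0;                           /* Step 1: set ou to 0 */
    for (int pass = 0; pass < 3; pass++) {
        for (int i = 0; i < BUFLEN; i++)  /* Step 3 (simulated): fill outbuf[ou] */
            outbuf[ou][i] = pass * BUFLEN + i;
        wait_for_io();                    /* Step 4 */
        start_write(ou);                  /* Step 7: initiate the write ... */
        ou = 1 - ou;                      /* ... and switch output buffers  */
    }
    wait_for_io();                        /* Step 8: wait for the final write */
    return 0;
}
```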
Ex: Each run consists of four blocks of two records each; the last key in the fourth block of each of these three runs is +∞. We have six input buffers and two output buffers. Figure 7.25 shows the status of the input buffer queues, the run from which the next block is being read, and the output buffer being output at the beginning of each iteration of the loop of Steps 3 through 8 of the buffering algorithm.
Run Generation:
With a conventional internal sort, the runs generated are only as large as the number of records that can be held in internal memory at one time. Using a loser tree (replacement selection, described below), it is possible to generate runs that are, on average, about twice that size.
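Below is a minimal replacement-selection sketch in C using a binary min-heap ordered first by run number and then by key; a loser tree plays the same role with fewer comparisons. The capacity and data are illustrative. A record smaller than the last record output cannot extend the current run, so it is tagged with the next run number.

```c
#include <stdio.h>

#define CAP 3                      /* records held in internal memory at a time */

typedef struct { int run, key; } Rec;
static Rec heap[CAP];
static int hn;                     /* current heap size */

static int before(Rec a, Rec b) {  /* order by run number, then by key */
    return a.run != b.run ? a.run < b.run : a.key < b.key;
}

static void siftdown(int i) {
    for (;;) {
        int c = 2 * i + 1;
        if (c >= hn) return;
        if (c + 1 < hn && before(heap[c + 1], heap[c])) c++;
        if (!before(heap[c], heap[i])) return;
        Rec t = heap[i]; heap[i] = heap[c]; heap[c] = t;
        i = c;
    }
}

int main(void) {
    int input[] = {17, 3, 25, 1, 8, 30, 2, 12, 5};
    int n = sizeof input / sizeof input[0], next = 0;

    for (hn = 0; hn < CAP && next < n; hn++)   /* fill the workspace; */
        heap[hn] = (Rec){1, input[next++]};    /* all records start in run 1 */
    for (int i = hn / 2 - 1; i >= 0; i--) siftdown(i);

    while (hn > 0) {
        Rec out = heap[0];                     /* smallest record of current run */
        printf("run %d: %d\n", out.run, out.key);
        if (next < n) {
            /* Replace the root with the next input record; if it cannot
               extend the current run, tag it for the next run. */
            int k = input[next++];
            heap[0] = (Rec){k >= out.key ? out.run : out.run + 1, k};
        } else {
            heap[0] = heap[--hn];              /* input exhausted: shrink heap */
        }
        siftdown(0);
    }
    return 0;
}
```

With a workspace of only 3 records, this input produces a first run of 4 records (3, 17, 25, 30), illustrating how replacement selection generates runs longer than the memory capacity.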
Winner Trees
A winner tree is a complete binary tree in which each internal node represents a match played between its two children; the winner of the match (e.g., the record with the smaller key) is stored at the internal node. The overall winner, the smallest record among all the run heads, appears at the root.
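A minimal winner tree for a k-way merge in C. The tree is stored in an array: internal node i holds the index of the winning player, and player p occupies the implicit leaf K + p. This is an illustrative sketch (K fixed to a power of two); an exhausted run is represented by the key +∞, as in the buffering algorithm above.

```c
#include <stdio.h>
#include <limits.h>

#define K 4                        /* number of players (runs); a power of 2 */
#define INF INT_MAX                /* stands in for the key +infinity */

static int key[K];                 /* current key of each player */
static int winner[K];              /* winner[i]: winning player at internal node i */

static int player_at(int i) { return i >= K ? i - K : winner[i]; }

/* Replay the matches on the path from player p's leaf to the root. */
static void replay(int p) {
    for (int i = (K + p) / 2; i >= 1; i /= 2) {
        int l = player_at(2 * i), r = player_at(2 * i + 1);
        winner[i] = (key[l] <= key[r]) ? l : r;
    }
}

int main(void) {
    int run0[] = {1, 5, 9, INF}, run1[] = {2, 6, INF, INF},
        run2[] = {3, 4, INF, INF}, run3[] = {7, 8, INF, INF};
    int *runs[K] = {run0, run1, run2, run3};
    int pos[K] = {0, 0, 0, 0};

    for (int p = 0; p < K; p++) key[p] = runs[p][pos[p]];
    for (int i = K - 1; i >= 1; i--)           /* initialize: O(K) matches */
        winner[i] = (key[player_at(2 * i)] <= key[player_at(2 * i + 1)])
                        ? player_at(2 * i) : player_at(2 * i + 1);

    for (;;) {
        int w = winner[1];                     /* get the winner: O(1) */
        if (key[w] == INF) break;              /* all runs exhausted */
        printf("%d ", key[w]);                 /* prints: 1 2 3 4 5 6 7 8 9 */
        key[w] = runs[w][++pos[w]];            /* replace the winner ... */
        replay(w);                             /* ... and replay: O(log K) */
    }
    printf("\n");
    return 0;
}
```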
Time to Sort (using a winner tree)
• Initialize the winner tree: O(n) time.
• Get the winner: O(1) time.
• Replace the winner and replay the matches: O(log n) time.
• Total time to sort n records this way: O(n log n).
Loser Tree: Each match node stores the loser of the match rather than the winner; the overall winner is recorded separately, above the root. The advantage is that, when the winner is replaced and the matches are replayed, each node on the path to the root is compared only against the loser already stored there.
Replace and Replay: After the overall winner is output, it is replaced by the next record from its input, and the matches along the path from that leaf to the root are replayed.
Analysis of runs: When the input list is already sorted, only one run is generated. On average, the run size is almost 2k. The time required to generate all the runs for an n-record list is O(n log k), since it takes O(log k) time to adjust the loser tree each time a record is output.
The circular nodes represent a two-way merge using as input the data of the children nodes.
The square nodes represent the initial runs. We shall refer to the circular nodes as internal
nodes and the square ones as external nodes. Each figure is a merge tree.
In the first merge tree, we begin by merging the runs of size 2 and 4 to get one of size 6; next this is merged with the run of size 5 to get a run of size 11; finally, this run of size 11 is merged with the run of size 15 to get the desired sorted run of size 26. When merging is done using the first merge tree, some records are involved in only one merge, while others are involved in up to three merges. In the second merge tree, each record is involved in exactly two merges. Since the time for a merge is linear in the number of records being merged, the total merge time is obtained by summing the products of the run lengths and the distances of the corresponding external nodes from the root. This sum is called the weighted external path length.
For the two trees of Figure 7.26, the respective weighted external path lengths are
2·3 + 4·3 + 5·2 + 15·1 = 43
and
2·2 + 4·2 + 5·2 + 15·2 = 52.
There is another application for binary trees with minimum weighted external path length. Suppose we wish to obtain an optimal set of codes for messages M1, ..., Mn+1. Each code is a binary string that will be used for transmission of the corresponding message. At the receiving end, the code will be decoded using a decode tree. A decode tree is a binary tree in which external nodes represent messages.
The binary bits in the code word for a message determine the branching needed at each
level of the decode tree to reach the correct external node.
If we interpret a zero as a left branch and a one as a right branch, then the decode tree of the figure above corresponds to the codes 000, 001, 01, and 1 for the messages M1, M2, M3, and M4, respectively. These codes are called Huffman codes.
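A small sketch in C of decoding with such a tree, hardcoding the decode tree for the codes 000, 001, 01, and 1 above; the structure and names are illustrative.

```c
#include <stdio.h>

/* A decode-tree node: external nodes carry a message label; internal
   nodes branch left on a 0 bit and right on a 1 bit. */
typedef struct Node { const char *msg; struct Node *left, *right; } Node;

int main(void) {
    /* Decode tree for M1 = 000, M2 = 001, M3 = 01, M4 = 1. */
    Node m1 = {"M1", 0, 0}, m2 = {"M2", 0, 0}, m3 = {"M3", 0, 0}, m4 = {"M4", 0, 0};
    Node a = {0, &m1, &m2};          /* subtree reached by the prefix 00 */
    Node b = {0, &a, &m3};           /* subtree reached by the prefix 0  */
    Node root = {0, &b, &m4};

    const char *bits = "0100011";    /* encodes M3, M1, M4, M4 */
    Node *p = &root;
    for (const char *c = bits; *c; c++) {
        p = (*c == '0') ? p->left : p->right;
        if (p->msg) {                /* reached an external node */
            printf("%s ", p->msg);   /* prints: M3 M1 M4 M4 */
            p = &root;
        }
    }
    printf("\n");
    return 0;
}
```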
A very nice solution to the problem of finding a binary tree with minimum
weighted external path length has been given by D. Huffman.
Ex: Suppose we have the weights q1 = 2, q2 = 3, q3 = 5, q4 = 7, q5 = 9, and q6 = 13. Construct a Huffman tree and find its weighted external path length.
NOTE: The number in a circular (internal) node represents the sum of the weights of the external nodes in its subtree.
The weighted external path length for the above tree is:
2·4 + 3·4 + 5·3 + 13·2 + 7·2 + 9·2 = 93
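The weighted external path length can be computed without building the tree explicitly: each time Huffman's method combines the two smallest weights, the combined weight is exactly the contribution of that internal node to the sum, so the weighted external path length equals the total of all combined weights. The sketch below uses an O(n^2) minimum scan to stay short; the heap-based version analyzed next achieves O(n log n).

```c
#include <stdio.h>

/* Index of the smallest element of w[0..n-1], skipping index `skip`. */
static int argmin(const int w[], int n, int skip) {
    int best = -1;
    for (int i = 0; i < n; i++)
        if (i != skip && (best < 0 || w[i] < w[best]))
            best = i;
    return best;
}

int main(void) {
    int w[] = {2, 3, 5, 7, 9, 13};   /* external-node weights q1..q6 */
    int n = sizeof w / sizeof w[0];
    int wepl = 0;                     /* weighted external path length */

    while (n > 1) {
        int i = argmin(w, n, -1);     /* smallest weight */
        int j = argmin(w, n, i);      /* second smallest weight */
        int combined = w[i] + w[j];
        printf("combine %d + %d -> %d\n", w[i], w[j], combined);
        wepl += combined;             /* each merge adds its weight to the sum */
        w[i] = combined;              /* keep the combined node ... */
        w[j] = w[--n];                /* ... and delete the other one */
    }
    printf("weighted external path length = %d\n", wepl);   /* prints 93 */
    return 0;
}
```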
Analysis of the Huffman tree algorithm:
Heap initialization takes O(n) time.
Each push and pop requires only O(log n) time.
Therefore, the asymptotic time for the algorithm is O(n log n).