A Variant of Bucket Sort: Shell Sort Vs Insertion Sort
Abstract— Sorting is a technique for rearranging the elements of a list according to a comparison operator on those elements. There are a large number of sorting algorithms, such as Insertion Sort, Merge Sort, Bucket Sort and Shell Sort. The efficiency of a sorting algorithm depends on many factors, such as memory usage patterns (the number of times sections of memory must be copied or swapped to and from the disk), the total number of comparisons and the time the algorithm requires to run.

In this paper, a method is proposed which combines two sorting algorithms (Bucket Sort and Shell Sort) in a way that takes advantage of the strengths of each to improve overall performance.

Keywords— Shell Sort, Bucket Sort, Insertion Sort, Sorting, time complexity, algorithm comparison

I. INTRODUCTION
When a list of numbers is given, special algorithms are used to arrange these numbers in order. These algorithms are called sorting algorithms. The numbers may be alike, different, in a pattern or completely random. The most frequently used orders are numerical and lexicographical. Sorting is a crucial step in various algorithms; hence, optimizing it is essential.

There have been many attempts to analyze the complexity of sorting algorithms [1], which led to the invention of various interesting and efficient algorithms as well as improvements to the existing ones. To derive new and improved algorithms like the one discussed in this paper, it is necessary to study and understand the existing algorithms. The following algorithms have been used for comparison with the proposed one.

Insertion Sort [2], [3] is a simple algorithm which iterates, consuming one input element at a time and growing a sorted output list. It repeats until no input elements remain. Its average-case time complexity is quadratic, O(n^2).

java.util.Arrays.sort() is a built-in method used to sort a list in Java. For arrays of primitive types it uses a variation of Quicksort (dual-pivot Quicksort) to sort the numbers in a list. The average time complexity of this algorithm is of the order O(n log(n)).

Merge Sort [4], [5] divides the input list into two halves, sorts each half, and then merges the two sorted halves. Its average time complexity is of the order O(n log(n)).

Bucket Sort [6], [7], [8] is a non-comparison sorting algorithm in which buckets, distributed over a range, are created. Each element is then taken in turn and placed into its corresponding bucket.

Algorithm (let n be the size of the list):
1. Create buckets to hold the elements according to their magnitudes.
2. Insert every list element into its corresponding bucket.
3. Apply Insertion Sort to sort each bucket.
4. Concatenate all the sorted buckets.

Bucket Sort is preferable mainly when the input is uniformly distributed over a range. It is also very effective when an optimal number of buckets is used, so as to best complement the sort. One of the advantages of Bucket Sort is that it runs in linear time in the average case; its time requirements are as low as O(n+k), where k is a parameter (the number of buckets).

Bucket Sort also has some limitations. To determine the number of buckets required, the value of the maximum element in the input list must be known. Another limitation is that if all the elements fall into one bucket, the algorithm degenerates into an ordinary Insertion Sort, which reduces its efficiency.

Shell Sort [9], [10] is a variation of Insertion Sort which allows the exchange of non-adjacent numbers in a list. In Shell Sort, the given list is made gap-sorted for a large value of the gap, and the gap is then reduced repeatedly until it becomes 1.

Algorithm:
1. Initialize the value of the gap.
2. Divide the list into sub-lists of elements that lie one gap apart.
3. Sort each of these sub-lists.
4. Reduce the gap and repeat until the complete list is sorted.
One of the main advantages of Shell Sort is that it is efficient for small lists and for sorting them repeatedly, which is of great significance when arranging the elements of a bucket. It can also be many times faster than Bubble Sort and Insertion Sort. Its time complexity depends on the gap sequence used during execution; in the best case, Shell Sort needs O(n log(n)) time to sort a given list.

Shell Sort's major limitation is that it is complex to understand. Another disadvantage is that its running time depends heavily on the gap sequence it uses. It is not as efficient as Merge Sort or Quicksort, but it can provide improvements over Insertion Sort.

The improvements provided by Shell Sort are explained in the subsequent section on the proposed algorithm.

II. PROPOSED WORK
As mentioned in the previous section, the proposed algorithm is an improvement over the original Bucket Sort algorithm, which sorts the numbers of a list by using buckets. The buckets define the magnitudes (ranges) of the elements which they can hold. After the elements are put into their corresponding buckets, Insertion Sort is used to arrange them in the correct order. In the proposed algorithm, Shell Sort is used instead of Insertion Sort to arrange the elements inside each bucket, as it generally requires less time.

A. Detailed Algorithm
    return {m, floor value of sqrt of length of input}

function hash(int i, int[] code) is
    return (i/code[0] * (code[1] - 1))
end function hash

function sort(List bucket) is
    n <- size of bucket
    for gap := n/2 while gap > 0 (gap := gap/2 after each iteration)
        for i := gap while i < n (i := i + 1 after each iteration)
            temp <- element at index i in the bucket
            for j := i while (j >= gap and element at index (j - gap) > temp) (j := j - gap after each iteration)
                element at index j in the bucket <- element at index (j - gap)
            end for
            element at index j in the bucket <- temp
        end for
    end for
end function sort

III. FLOW CHART
The flow chart of this algorithm remains the same as that of the unimproved Bucket Sort, because the only change is in the way the buckets are sorted (using Shell Sort instead of Insertion Sort). The following section provides a case study of the proposed algorithm with a sample list of ten numbers.
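
As a rough illustration of the proposed method, the Java sketch below distributes non-negative integers into roughly sqrt(n) buckets and sorts each bucket with Shell Sort using a halving gap sequence. The class name BucketShellSort, the method names and the bucket-index formula value * (k - 1) / max are assumptions made for this sketch. They are modeled on the pseudocode of the detailed algorithm but are not the authors' implementation.

```java
import java.util.ArrayList;
import java.util.List;

class BucketShellSort {

    // Shell Sort with gaps n/2, n/4, ..., 1; sorts the bucket in place.
    static void shellSort(List<Integer> bucket) {
        int n = bucket.size();
        for (int gap = n / 2; gap > 0; gap /= 2) {
            for (int i = gap; i < n; i++) {
                int temp = bucket.get(i);
                int j = i;
                // Shift larger elements that lie one gap apart to the right.
                while (j >= gap && bucket.get(j - gap) > temp) {
                    bucket.set(j, bucket.get(j - gap));
                    j -= gap;
                }
                bucket.set(j, temp);
            }
        }
    }

    // Returns a sorted copy of the input. Assumption: all values are >= 0.
    static int[] sort(int[] input) {
        int n = input.length;
        if (n == 0) {
            return new int[0];
        }
        // The maximum is found here, so the caller does not have to supply it.
        int max = 0;
        for (int v : input) {
            max = Math.max(max, v);
        }
        // Assumed bucket count: floor(sqrt(n)), at least 1.
        int k = Math.max(1, (int) Math.sqrt(n));
        List<List<Integer>> buckets = new ArrayList<>();
        for (int b = 0; b < k; b++) {
            buckets.add(new ArrayList<>());
        }
        for (int v : input) {
            // Assumed index formula; long arithmetic avoids int overflow.
            int index = (max == 0) ? 0 : (int) ((long) v * (k - 1) / max);
            buckets.get(index).add(v);
        }
        // Sort each bucket with Shell Sort and concatenate the buckets.
        int[] out = new int[n];
        int pos = 0;
        for (List<Integer> bucket : buckets) {
            shellSort(bucket);
            for (int v : bucket) {
                out[pos++] = v;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[] sorted = sort(new int[] {5, 3, 9, 0, 7, 3, 1, 8, 2, 6});
        System.out.println(java.util.Arrays.toString(sorted));
    }
}
```

Because the index formula is monotonic in the element value, concatenating the sorted buckets yields a sorted list. When most values fall into a single bucket, the work inside that bucket is done by Shell Sort rather than Insertion Sort.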
IV. CASE STUDY
To explain the flow chart from the previous section, a case study is presented here. A list of ten integers is used to demonstrate the implementation of the proposed algorithm. The algorithm first performs Bucket Sort on the given list by putting all of the elements into their respective buckets:

Figure 2: Implementation of Bucket Sort algorithm

Once the numbers have been put into their respective buckets, Shell Sort is applied to arrange the numbers within the individual buckets. Here, one of the above buckets is sorted using Shell Sort to demonstrate the method:

As seen in the case study, Shell Sort uses a gap sequence to sort the elements. The following section shows how well the proposed algorithm performs by comparing it with other sorting techniques which either closely resemble it or are among the most efficient ones for sorting lists. It also describes the experimental setup used.

V. RESULT
This section presents the results obtained for this paper and explains the setup used to obtain them.

A. Experimental Setup
The experimental setup used to obtain the results of this paper is as follows.
• While recording the time taken by the different algorithms, all network connections were cut and all background processes were reduced to a bare minimum.
• The NetBeans IDE was used to measure the time requirements of the different algorithms. Care was taken that each sorting algorithm was provided with similar conditions for taking the readings.
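
A measurement of this kind could be sketched in Java as follows. The paper does not state how the times were read in NetBeans, so the use of System.nanoTime(), the fresh copy of the input for every run and the fixed random seed are assumptions of this sketch. SortTimer, timeMillis and randomList are hypothetical names.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.function.Consumer;

class SortTimer {

    // Times one run of the sorter on a private copy of the data,
    // so that every algorithm receives the same unsorted input.
    static long timeMillis(Consumer<int[]> sorter, int[] data) {
        int[] copy = Arrays.copyOf(data, data.length);
        long start = System.nanoTime();
        sorter.accept(copy);
        return (System.nanoTime() - start) / 1_000_000;
    }

    // Builds n pseudo-random integers in [0, bound); the fixed seed
    // makes the list reproducible between runs.
    static int[] randomList(int n, int bound, long seed) {
        Random rng = new Random(seed);
        int[] a = new int[n];
        for (int i = 0; i < n; i++) {
            a[i] = rng.nextInt(bound);
        }
        return a;
    }

    public static void main(String[] args) {
        int[] list = randomList(1_000_000, 1_000_001, 42L);
        long ms = timeMillis(a -> Arrays.sort(a), list);
        System.out.println("Arrays.sort: " + ms + " ms");
    }
}
```

A single run like this is sensitive to JIT warm-up and garbage collection, so repeating each measurement and comparing the repeated readings is a reasonable precaution.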
As mentioned before, the proposed algorithm is an upgrade over Bucket Sort, as the latter requires the length of the list to be given as a parameter along with the list of numbers, which the improved algorithm does not.

Internally, Bucket Sort uses Insertion Sort to arrange the numbers in a bucket. The proposed algorithm uses Shell Sort for the same purpose, which leads to a reduction in the time requirements.

The improved algorithm has been put to the test, and the results have been computed and compared with other sorting techniques [11], [12] which either closely resemble it or are among the most efficient ones for sorting lists.

B. Time Complexity of Various Algorithms Used

Method               Best            Average           Worst
Bucket Sort          -               n+k               n^2 * k
Shell Sort           n log(n)        Depends on gap    Depends on gap
Insertion Sort       n               n^2               n^2
Proposed Algorithm   n log(n + k)    Depends on gap    Depends on gap
Quicksort            n log(n)        n log(n)          n^2
Merge Sort           n log(n)        n log(n)          n log(n)

Chart 1: Time complexity of the various algorithms, in terms of n [13], [14].

Method\List          List 1    List 2    List 3    List 4    List 5
Bucket Sort          2274      4355      70        49        1147092
Shell Sort           199       33        15        833773    9111
Insertion Sort       107125    224729    5         6         106269
Proposed Algorithm   278       178       177       115       251
Quicksort            167       31        6         6         85
Merge Sort           213       112       103       111       169

Chart 2: Time taken (in milliseconds) by different algorithms to execute.

Each list mentioned in the test cases contains 1,000,000 (one million) integers.
1. The first list, List 1, contains random integers lying between 0 and 1,000,000 (one million), 0 included. This list may contain repetitions.
2. The second list, List 2, contains integers from 0 to 1,000,000 (one million), 0 included. The numbers are arranged in descending order.
3. The third list, List 3, contains integers from 0 to 1,000,000 (one million), 0 included. The numbers are arranged in ascending order.
4. The fourth list, List 4, contains the integer '42' repeated 1,000,000 times.
5. The fifth list, List 5, contains integers ranging from 1 to 100, except for one of them, which is 1,000,000 (one million). This list contains repetitions.

The following section summarizes the results obtained by stating and explaining the limitations surpassed.

VI. CONCLUSION
As seen in the previous section, the proposed algorithm performed poorly on a list with all elements already in sorted order. However, it has overcome various limitations faced by the corresponding traditional algorithms. It should be noted that the following conclusions have been drawn from Chart 2 in the previous section.

A. Limitations Surpassed
1. There is no need to supply the maximum value of the input list to the proposed algorithm.
2. The proposed algorithm overcomes Bucket Sort's limitation with unevenly distributed numbers in the list.
3. If a list contains the same number occurring multiple times, Shell Sort takes more time than the proposed algorithm.

B. Explanation
The original Bucket Sort algorithm uses Insertion Sort internally to sort the elements of a bucket, whereas the improved method uses Shell Sort to do the same.

When two elements in a bucket are out of order, a swap occurs. In Shell Sort, the swap takes place between non-adjacent positions and moves the item over a greater distance within the overall list. Insertion Sort, on the other hand, moves an item only one position at a time. This means that in Shell Sort a swapped item is more likely to end up close to its final position than in Insertion Sort.

Since the items are more likely to be close to their final positions, the list itself becomes partially sorted. Also, when the gap (the distance between the two elements being compared) equals 1, Shell Sort behaves essentially like Insertion Sort. This generally happens when most of the sorting process has already been carried out, leaving the list almost sorted. In this case, Shell Sort works very fast, since Insertion Sort is fast when the list is almost in order.
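
The argument above can be checked on a small scale by counting element moves. The sketch below runs the same gapped insertion routine twice on a descending list: once with the single gap 1 (plain Insertion Sort) and once with the halving gap sequence n/2, n/4, ..., 1 (Shell Sort). MoveCounter and its methods are hypothetical names, and the list size of 1,000 is an arbitrary choice for illustration, not one of the paper's test lists.

```java
class MoveCounter {

    // Sorts a in place with one gapped insertion pass per gap and
    // returns the number of single-element moves performed.
    // With gaps = {1} this is plain Insertion Sort.
    static long sortCountingMoves(int[] a, int[] gaps) {
        long moves = 0;
        for (int gap : gaps) {
            for (int i = gap; i < a.length; i++) {
                int temp = a[i];
                int j = i;
                while (j >= gap && a[j - gap] > temp) {
                    a[j] = a[j - gap]; // the element jumps 'gap' positions
                    j -= gap;
                    moves++;
                }
                a[j] = temp;
            }
        }
        return moves;
    }

    // Gap sequence n/2, n/4, ..., 1.
    static int[] halvingGaps(int n) {
        int count = 0;
        for (int g = n / 2; g > 0; g /= 2) {
            count++;
        }
        int[] gaps = new int[count];
        int idx = 0;
        for (int g = n / 2; g > 0; g /= 2) {
            gaps[idx++] = g;
        }
        return gaps;
    }

    // The list n, n-1, ..., 1.
    static int[] descending(int n) {
        int[] a = new int[n];
        for (int i = 0; i < n; i++) {
            a[i] = n - i;
        }
        return a;
    }

    public static void main(String[] args) {
        int n = 1000;
        long insertionMoves = sortCountingMoves(descending(n), new int[] {1});
        long shellMoves = sortCountingMoves(descending(n), halvingGaps(n));
        System.out.println("Insertion Sort moves: " + insertionMoves);
        System.out.println("Shell Sort moves:     " + shellMoves);
    }
}
```

On a descending list of n elements, Insertion Sort performs n(n-1)/2 moves, since every pair of elements is out of order and each move corrects exactly one pair. A move across a gap larger than 1 corrects several pairs at once, which is why the Shell Sort count comes out lower.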