Insertion Sort Vs Merge Sort in Matlab
Abstract: This paper compares two algorithms of different types, recursive and iterative, represented by merge sort and insertion sort respectively, using Matlab to simulate the real-world task of sorting a large amount of data. Both algorithms are compared in terms of runtime and its relation to their time complexity, and the impact on performance, stability, and usability is discussed.

Keywords: algorithms, sorting algorithms, insertion sort, merge sort

I. INTRODUCTION
Sorting is the process of rearranging a set of data elements into a definite order; it may be carried out through a mathematical approach as well as by computation. Sorting algorithms have a long history and a wide range of uses, from basic sorting tasks to complex search-engine algorithms, and they are pervasive in modern technology. Most programs choose a sorting algorithm according to their particular purpose so that it runs as fast as possible even under the worst conditions. A fast and efficient algorithm that suits a particular use case can save a significant amount of time, especially when sorting thousands to millions of data items.
A considerable number of factors affect the performance of a sorting algorithm and may need to be weighed when choosing one. These factors range from code complexity, which determines the algorithm's time complexity, to effective memory usage and even the computer hardware. It is nearly impossible for one algorithm to cover every performance weakness; therefore, different algorithms are used for different constraints.
Htwe Htwe Aung [5] in 2019 analyzed and compared the efficiency of well-known sorting algorithms. Aung split the algorithms by type, iteration and recursion, and compared them by time complexity: O(n^2), which includes bubble, selection, and insertion sort, and O(n log n), which includes heap, merge, and quick sort. The results showed that the O(n log n) algorithms, which rely on recursion or multiple arrays, are significantly faster than the iterative algorithms with O(n^2) time complexity.

Vignesh R. and Tribikram Pradhan [4] in 2016 created a modified merge sort algorithm with O(n) best-case and O(n log n) worst-case time complexity. They found that their algorithm is faster than the normal merge sort, but that it is also unstable and slows down for very large values of n; it may be improved by reducing the number of iterations made and increasing the number of pivots for stability.

Song Qin [2] in 2008 evaluated the O(n log n) time complexity of the merge sort algorithm theoretically and empirically, comparing it with insertion sort, which has quadratic time complexity. The results showed that merge sort is slightly faster than insertion sort when n is small and becomes rapidly faster than insertion sort for very large n.

In this paper we compare two sorting algorithms with different types of operation, recursive and iterative, to analyze how efficiently each handles very large n. The fastest algorithms of the two groups tested in [5] are merge sort and insertion sort; they are therefore taken as the representatives of the recursive and iterative types respectively, also because of their stability [2] and room for improvement [4]. Matlab is used to simulate the sorting algorithms in order to present a more realistic scenario for real-life usage, which also makes the comparison and efficiency analysis more realistic.
This paper is organized as follows. Section II discusses the basic theory of the sorting algorithms used, merge sort and insertion sort; it also contains the pseudocode for both algorithms and their worst-case (big-O) analysis. Section III presents the runtime comparison data for both algorithms, including the comparison graph and table, and relates the pseudocode analysis to the actual runtime, breaking down each algorithm to look for a more efficient one. Finally, Section III-D draws the conclusions and outlines future improvements.
II. SORTING ALGORITHM THEORY
A sorting algorithm rearranges the elements of an array into a new order, in this case ascending.

A. Merge Sort algorithm
Merge sort is a divide-and-conquer sorting algorithm: it divides a problem into smaller subproblems, solves them, and then combines the subproblem solutions to solve the main problem. Divide and conquer involves three steps:

Divide: divide the array into two parts; if the array has an odd number of elements, the extra element is included in the first part. Keep dividing until the base case is reached.

Conquer: sort the two base-case subarrays.
Combine: combine the sorted subarrays, creating one sorted array from two subarrays; keep combining until the main sorted array is assembled.

The Merge Sort algorithm can be illustrated as follows:
[Figure 1: binary tree illustrating the merge sort algorithm]

To determine the worst-case running time of the merge sort algorithm, we use the binary tree in Figure 1. Repeatedly dividing the array in half is described by the base-2 logarithm lg n, so the maximum number of levels is lg n + 1, and each level of the tree adds cn time, because merging the subarrays obtained from the original n-element array takes cn time per level. Therefore, the time complexity of the algorithm is:
T(n) = cn lg n + cn (1)
By the hierarchy of growth rates, we drop the lower-order term cn and the constant factor c, which gives:
T(n) = O(n lg n) (2)
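As a worked check of (1) and (2), the level-by-level cost of the recursion tree can be summed explicitly. This is a sketch that assumes n is a power of two and reuses the constant c from the text; the index i (tree level, starting at 0 for the root) is introduced here only for illustration.

```latex
\begin{align*}
T(n) &= \sum_{i=0}^{\lg n} 2^{i} \cdot c\,\frac{n}{2^{i}}
      = \sum_{i=0}^{\lg n} cn
      = cn\,(\lg n + 1)
      = cn \lg n + cn, \\
cn \lg n + cn &\le 2\,cn \lg n \quad \text{for } n \ge 2
      \;\Longrightarrow\; T(n) = O(n \lg n).
\end{align*}
```

Level i holds 2^i subarrays of n/2^i elements each, so every level costs cn in total, and there are lg n + 1 levels.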
Because merge sort always divides the array into two halves, and merging two halves takes linear time regardless of the input order, the worst-case, best-case, and average-case time complexities are all O(n lg n).
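The divide, conquer, and combine steps described above can be sketched in code. The following is a minimal Python illustration, not the Matlab implementation used in the experiment; the function names `merge` and `merge_sort` are chosen here for the example.

```python
def merge(left, right):
    """Combine step: merge two already sorted lists into one sorted list."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        # "<=" takes from the left list on ties, so equal keys keep their order.
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # At most one of the two lists still has elements; append the remainder.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def merge_sort(values):
    """Return a new list with the elements of `values` in ascending order."""
    n = len(values)
    if n <= 1:
        # Base case: zero or one element is already sorted.
        return list(values)
    # Divide: with an odd length, the extra element goes to the first part.
    mid = (n + 1) // 2
    left = merge_sort(values[:mid])    # conquer the first part
    right = merge_sort(values[mid:])   # conquer the second part
    return merge(left, right)          # combine


print(merge_sort([38, 27, 43, 3, 9, 82, 10]))  # [3, 9, 10, 27, 38, 43, 82]
```

Each call to `merge` touches every element of its two inputs once, which is the linear per-level cost cn used in the analysis.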
B. Insertion sort algorithm
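The insertion sort algorithm works similarly to the way we sort playing cards in our hand: each element is taken in turn and inserted into its correct position among the elements that are already sorted. The way insertion sort works can be illustrated as follows.
[Figure: insertion sort illustration]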
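The card-sorting idea can also be written as a short program. This is a minimal Python sketch, not the Matlab code used in the experiment; the function name `insertion_sort` and the `shifts` counter are additions made here for illustration, the counter being included only to make the cost of each run visible.

```python
def insertion_sort(values):
    """Sort `values` in place in ascending order.

    Returns the number of element shifts performed (illustrative counter).
    """
    shifts = 0
    for i in range(1, len(values)):
        key = values[i]  # the "card" being inserted into the sorted prefix
        j = i - 1
        # Shift larger elements of the sorted prefix one slot to the right.
        while j >= 0 and values[j] > key:
            values[j + 1] = values[j]
            j -= 1
            shifts += 1
        values[j + 1] = key  # drop the card into the gap
    return shifts


hand = [5, 2, 4, 6, 1, 3]
insertion_sort(hand)
print(hand)  # [1, 2, 3, 4, 5, 6]

reverse = list(range(10, 0, -1))  # 10, 9, ..., 1
print(insertion_sort(reverse))    # 45, which is 10 * 9 / 2
```

On a reverse-ordered input of n elements, every insertion has to shift the whole sorted prefix, so the total number of shifts is n(n-1)/2, which grows quadratically with n.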
When all elements are in reverse order, it becomes:
(3)
Substituting (2) and (3) into (1), we get:
(4)
The worst-case running time can be expressed as an^2 + bn + c for constants a, b, and c. By the hierarchy of growth rates, the worst-case time complexity is:
T(n) = an^2 + bn + c = O(n^2) (5)

III. EXPERIMENT AND ANALYSIS
The experiment measures the runtime of each sorting algorithm. We then compare the runtimes of both algorithms and analyze the results against the worst-case running time derived from the pseudocode.

A. Machine Specification and Conditioning
The machine used for the experiment is a laptop with the following specification:
1) CPU: Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz (12 CPUs), up to 4.10GHz
2) RAM: DDR4 8GB single channel
3) GPU: GTX 1050ti 4GB mobile
4) OS: Windows 10 Home
5) Matlab version: r2016a

[Table: runtime by dataset size; surviving row: 1000, 0.014, 0.027]

[Figure: "Runtime to dataset size". Time (s), 0 to 6000, against dataset size n, 100 to 1000000, for merge sort and insertion sort]

C. Analysis
The graph of runtime against dataset size shows that insertion sort has a major weakness when it comes to very large datasets. For a dataset of size n with values in the range 1-n, the insertion sort runtime is at first still below that of merge sort; as the dataset size increases further, insertion sort runs slower than merge sort, and its runtime takes a major leap for arrays above 100 000 elements, reaching about 1.5 hours. Although the data range was limited to 100, the runtime could not go below the worst-case time complexity of the algorithm, and the insertion sort runtime kept deteriorating as n grew.
The runtime of the merge sort algorithm increases almost linearly as the array gets larger, which agrees with its O(n lg n) time complexity. The runtime of insertion sort, whose worst-case time complexity is O(n^2), grows drastically as the array size increases; the graph starts to skew at 10^5 data items and continues to steepen quadratically. The major difference comes from how the algorithms work: the recursive sort does not need to go through all the data multiple times, but instead breaks it into manageable subarrays, while the iterative sort goes through each data item and moves it to its correct position. We can therefore say that, overall, merge sort, of the recursive type, performs better than insertion sort, of the iterative type.
Admittedly, the experiment was not run to completion: the process made the system unstable, and it was stopped to avoid damaging the computer. One reason the run took so long is that Matlab is not truly a compiler. It does not produce machine-code executables as compilers do; instead, it interprets and optimizes the code the user enters and is designed to solve many kinds of mathematical problems. This paper was written to compare and analyze sorting algorithms and to observe the effect of both algorithms when used to solve a real-world problem with Matlab. Simpler tools such as Excel are available, but the goal is to implement the algorithms, and since algorithms are not always code but steps for solving a problem, we consider Matlab well suited for the task.

D. Conclusion and future improvement
This comparison concludes that each algorithm has its own constraints: merge sort is fast and stable for sorting very large datasets, while insertion sort slows down heavily beyond a certain point. The runtime of the algorithms is closely tied to their worst-case time complexity, and it may get even worse with an unsuitable implementation, such as the choice of language or compiler.
Future improvements can enhance the performance of the program, for example by minimizing loops and jumps in the implementation. When performance and speed are the priority, other programming languages offer more lightweight execution and better performance while still being based on the same pseudocode.

REFERENCES
[1] V. S. Adamchik, Sorting. Carnegie Mellon University, [online document], 2009. Available: Carnegie Mellon University Computer Science Online, https://www.cs.cmu.edu/~adamchik/15-121/lectures/Sorting%20Algorithms/sorting.html [Accessed: Mar. 22, 2020].
[2] S. Qin, Merge Sort Algorithm. Florida Institute of Technology, [online document], 2008. Available: Semantic Scholar, https://www.semanticscholar.org/paper/Merge-Sort-Algorithm-Qin/6804987ab63d1879aa55ba68224dced142ce8774 [Accessed: Mar. 22, 2020].
[3] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein, Introduction to Algorithms. Cambridge, MA: MIT Press, 2009.
[4] R. Vignesh and T. Pradhan, Merge Sort Enhanced in Place Sorting Algorithm. Manipal Institute of Technology, [online document], 2016. Available: ResearchGate, https://www.researchgate.net/publication/312963714_Merge_sort_enhanced_in_place_sorting_algorithm [Accessed: Mar. 22, 2020].
[5] H. Aung, "Analysis and Comparative of Sorting Algorithms". Journal of Trend in Scientific Research and Development, vol. 3, issue 5, August 2019. [online serial]. Available: https://www.academia.edu/40250937/Analysis_and_Comparative_of_Sorting_Algorithms [Accessed: Mar. 22, 2020].
[6] Parewa Labs, "Merge Sort". [online]. Available: https://www.programiz.com/dsa/merge-sort [Accessed: Mar. 22, 2020].
[7] "Merge Sort Algorithm". [online]. Available: https://www.studytonight.com/data-structures/merge-sort [Accessed: Mar. 22, 2020].
[8] Parewa Labs, "Insertion Sort". [online]. Available: https://www.programiz.com/dsa/insertion-sort [Accessed: Mar. 22, 2020].
[9] "Insertion Sort Algorithm". [online]. Available: https://www.studytonight.com/data-structures/insertion-sort [Accessed: Mar. 22, 2020].