Paper 11 PDF
Review
ARTICLE INFO

Keywords:
Cloud computing
Load balancing
Task scheduling
Hadoop MapReduce

ABSTRACT

Cloud computing is a modern paradigm to provide services through the Internet. Load balancing is a key aspect of cloud computing and avoids the situation in which some nodes become overloaded while the others are idle or have little work to do. Load balancing can improve the Quality of Service (QoS) metrics, including response time, cost, throughput, performance and resource utilization.
In this paper, we study the literature on task scheduling and load-balancing algorithms and present a new classification of such algorithms, namely, the Hadoop MapReduce load balancing category, the Natural Phenomena-based load balancing category, the Agent-based load balancing category, the General load balancing category, the application-oriented category, the network-aware category, and the workflow specific category. Furthermore, we provide a review in each of these seven categories. We also provide insights into the identification of open issues and guidelines for future research.
Cloud computing is a modern technology in the computer field to provide services to clients at any time. In a cloud computing system, resources are distributed all around the world for faster servicing to clients (Dasgupta et al., 2013; Apostu et al., 2013). The clients can easily access information via various devices such as laptops, cell phones, PDAs, and tablets. Cloud computing has faced many challenges, including security, efficient load balancing, resource scheduling, scaling, QoS management, data center energy consumption, data lock-in and service availability, and performance monitoring (Kaur et al., 2014; Malladi et al., 2015). Load balancing is one of the main challenges and concerns in cloud environments (Jadeja and Modi, 2012); it is the process of assigning and reassigning the load among available resources in order to maximize throughput, while minimizing the cost and response time, improving performance and resource utilization as well as energy saving (Singh et al., 2016; Goyal et al., 2016). Service Level Agreement (SLA) and user satisfaction could be provided by excellent load balancing techniques. Therefore, providing efficient load-balancing algorithms and mechanisms is a key to the success of cloud computing environments. Several studies have been conducted in the field of load balancing and task scheduling in cloud environments. However, our studies showed that despite the key role of load-balancing algorithms in cloud computing, especially with the advent of big data, there are few comprehensive reviews of these algorithms. First, we mention a few recent papers that have reviewed the load-

• Milani and Navimipour (2016) have presented a systematic review of the existing load balancing techniques. They classified the existing techniques based on different parameters. The authors compared some popular load-balancing algorithms and presented their main properties, including their advantages and disadvantages. They also addressed the challenges of these algorithms and mentioned the open issues. However, their work lacks a discussion regarding the load balancing and task scheduling techniques in Hadoop MapReduce, which is a current issue.
• Mesbahi and Rahmani (2016) have studied state-of-the-art load balancing techniques and the necessary requirements and considerations for designing and implementing suitable load-balancing algorithms for cloud environments. They presented a new classification of load balancing techniques, evaluated them based on suitable metrics and discussed their pros and cons. They also found that the recent load balancing techniques are focusing on energy saving. However, their work does not evaluate the load balancing techniques with simulator tools; in addition, a discussion of open issues and future topics that researchers should focus on is also missing.
• Kanakala et al. (2015a, 2015b) have analyzed the performance of load balancing techniques in cloud computing environments. They studied several popular load-balancing algorithms and compared
⁎ Corresponding author.
E-mail addresses: [email protected] (E. Jafarnejad Ghomi), [email protected] (A. Masoud Rahmani), [email protected] (N. Nasih Qader).
http://dx.doi.org/10.1016/j.jnca.2017.04.007
Received 31 December 2016; Received in revised form 6 March 2017; Accepted 7 April 2017
Available online 08 April 2017
1084-8045/ © 2017 Elsevier Ltd. All rights reserved.
E. Jafarnejad Ghomi et al. Journal of Network and Computer Applications 88 (2017) 50–71
2.2. Taxonomy of load-balancing algorithms

In this subsection, we present the existing classification of load-balancing algorithms. In some studies (Rastogi, 2015; Mishra et al., 2015; Bhatia et al., 2012) load-balancing algorithms were classified based on two factors: the state of the system and who initiated the process. Algorithms based on the state of the system are classified as static and dynamic. Some static algorithms are Round Robin, Min-Min and Max-Min Algorithms, and Opportunistic Load Balancing (OLB) (Aditya et al., 2015). Dynamic algorithms include Ant Colony Optimization (ACO) (Nishant et al., 2012), Honey Bee Foraging (Babu et al., 2013), and Throttled (Bhatia et al., 2012). Nearly all dynamic algorithms follow four steps (Neeraj et al., 2014; Rathore and Chana, 2013; Rathore et al., 2013):

• Load monitoring: In this step, the load and the state of the resources are monitored.
• Synchronization: In this step, the load and state information is exchanged.
• Rebalancing Criteria: It is necessary to calculate a new work distribution and then make load-balancing decisions based on this new calculation.
• Task Migration: In this step, the actual movement of the data occurs. When the system decides to transfer a task or process, this step will run.

The characteristics of static algorithms are:

1. They decide based on a fixed rule, for example, input load.
2. They are not flexible.
3. They need prior knowledge about the system.

The characteristics of dynamic algorithms are:

1. They decide based on the current state of the system.
2. They are flexible.
3. They improve the performance of the system.

Dynamic algorithms are divided into two classes: distributed and non-distributed. In the distributed approach, all nodes execute the dynamic load-balancing algorithm in the system and the task of load balancing is shared among them (Rastogi et al., 2015). The interactions of the system nodes take two forms: cooperative and non-cooperative. In the cooperative form, the nodes work together to achieve a common objective, for example, to decrease the response time of all tasks. In the non-cooperative form, each node works independently to achieve a local goal, for example, to decrease the response time of a local task. Non-distributed algorithms are divided into two classes: centralized and semi-distributed. In the centralized form, a single node called the central node executes the load-balancing algorithms and it is completely responsible for load balancing. The other nodes interact with the central node. In the semi-distributed approach, nodes in the system are divided into clusters and each cluster is of centralized form. The central nodes of the clusters achieve load balancing of the system. Static algorithms are divided into two categories: optimal and sub-optimal (Neeraj et al., 2014). In optimal algorithms, the data center controller determines information about the tasks and resources and the load balancer can make an optimal allocation in a reasonable time. If the load balancer cannot calculate an optimal decision for any reason, a sub-optimal allocation is calculated. In an approximate mechanism, the load-balancing algorithm terminates after finding a good solution; namely, it does not search the whole solution space. After that, the solution is evaluated by an objective function. In a heuristic manner, load-balancing algorithms make reasonable assumptions about tasks and resources. In this way, these algorithms make more adaptive decisions that are not limited by the assumptions. Algorithms in a sender-initiated strategy make decisions on arrival or creation of tasks, while algorithms in a receiver-initiated strategy make load-balancing decisions on the departure of finished tasks. In a symmetric strategy, either sender or receiver makes load-balancing decisions (Daraghmi et al., 2015; Alakeel et al., 2010; Rathore and Channa, 2011). A state of the art classification schema is shown in Fig. 2.

Fig. 2. State of the art classification of load balancing strategies.

2.3. Policies in dynamic load-balancing algorithms

As mentioned before, dynamic load-balancing algorithms use the current state of the system. For this purpose, they apply some policies (Daraghmi et al., 2015; Kanakala et al., 2014; Alakeel et al., 2010; Yahaya et al., 2011; Mukhopadhyay et al., 2010; Babu et al., 2013; Kumar and Rana, 2015). These policies are:

Transfer Policy: This policy determines the conditions under which a task should be transferred from one node to another. Incoming tasks enter the transfer policy, which, based on a rule, either transfers the task or processes it locally. This rule relies on the workload of each of the nodes. This policy includes task re-scheduling and task migration.
Selection Policy: This policy determines which task should be transferred. It considers some factors for task selection, including the amount of overhead required for migration, the number of non-local system calls, and the execution time of the task.
Location Policy: This policy determines which nodes are underloaded, and transfers tasks to them. It checks the availability of necessary services for task migration or task rescheduling in the targeted node.
Information Policy: This policy collects all information regarding the nodes in the system and the other policies use it for making their decisions. It also determines the time when the information should be gathered. The relationships among different policies are as follows. Incoming tasks are intercepted by the transfer policy, which decides if they should be transferred to a remote node for the purpose of load balancing. If the task is not eligible for transferring, it will be processed locally. If the transfer policy decides that a task should be transferred, the location policy is triggered in order to find a remote
Table 1
Summary of load balancing policies.

Description
Transfer policy: Includes:
Selection policy: Factors for selecting a task to transfer:
Location policy: • Finds a suitable partner for the transferred task. • Checks the availability of the services necessary for migration within the partner.
Information policy: • Determines the time when the information about nodes has to be gathered.
node for processing the task. If a remote partner is not found, the task will be processed locally; otherwise, the task will be transferred to the remote node. The information policy provides the necessary information for both the transfer and location policies to assist them in making their decisions. These descriptions are summarized in Table 1.

3.5. Emergence of small data centers in cloud computing

Small data centers are cheaper and consume less energy than large data centers. Therefore, computing resources are distributed all around the world. The challenge here is to design load-balancing algorithms for an adequate response time.
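
The interplay of the transfer, location, and information policies described in Section 2.3 can be illustrated with a minimal Python sketch. It is not taken from any of the reviewed works: the `Node` class, the `capacity` threshold, and all function names are hypothetical, the overload rule (task count at or above a fixed capacity) is an assumption, and the selection policy is left out because only a single incoming task is routed.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    capacity: int                       # hypothetical overload threshold (task count)
    tasks: list = field(default_factory=list)

    @property
    def load(self) -> int:
        return len(self.tasks)


def information_policy(nodes):
    """Collect the current load of every node as a simple snapshot."""
    return {n.name: n.load for n in nodes}


def transfer_policy(local, snapshot):
    """Assumed rule: transfer when the local node has reached its capacity."""
    return snapshot[local.name] >= local.capacity


def location_policy(local, nodes, snapshot):
    """Return the least-loaded underloaded remote node, or None if none exists."""
    candidates = [n for n in nodes
                  if n is not local and snapshot[n.name] < n.capacity]
    return min(candidates, key=lambda n: snapshot[n.name], default=None)


def dispatch(task, local, nodes):
    """Route one incoming task through the policy chain."""
    snapshot = information_policy(nodes)
    if transfer_policy(local, snapshot):
        partner = location_policy(local, nodes, snapshot)
        if partner is not None:
            partner.tasks.append(task)  # remote partner found: transfer
            return partner
    local.tasks.append(task)            # not eligible, or no partner: run locally
    return local
```

For example, with a local node of capacity 1 and a remote node of capacity 2, the first task stays local, the next two are transferred, and a fourth falls back to local processing because no underloaded partner remains.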
A large volume of data is produced daily, for example, from Facebook, Twitter, Telegram, and the Web. These data sources together form big data. Hadoop is an open source framework for the storage and processing of big data on clusters of commodity machines (Hefny et al., 2014; Chethana et al., 2016; Dsouza et al., 2015). We have summarized the architecture of Hadoop in Fig. 3. Hadoop consists of two core components, namely the Hadoop Distributed File System (HDFS) for data storage and MapReduce for data processing. HDFS and MapReduce follow a master/slave architecture. The master node in HDFS is called the NameNode and the slaves or workers are called DataNodes. To store a file, HDFS splits it into fixed-size blocks (e.g., 64 MB per block) and sends them to DataNodes. The NameNode maps blocks to workers. In MapReduce, the master node is called the JobTracker and the slaves are called TaskTrackers. Users' jobs are delivered to the JobTracker, which is responsible for managing the jobs over a cluster and assigning tasks to TaskTrackers. MapReduce provides two interfaces called Map and Reduce for parallel processing. In general, the Map and Reduce functions divide the data that they operate on for load balancing purposes (Sui et al., 2011). A TaskTracker executes each map and reduce task in a corresponding slot. Nodes in Hadoop spread over racks contained in one or several servers.

4.1.1. Load balancing schedulers in Hadoop

Hadoop simplifies cluster programming as it takes care of load balancing, parallelization, task scheduling, and fault tolerance automatically (Chethana et al., 2016; Vaidya et al., 2012; Rao et al., 2011). In other words, MapReduce, as the Google privacy strategy, hides the details of parallelization and distribution. Scheduling in Hadoop MapReduce is achieved at two levels: job level and task level (Dsouza et al., 2015). In job-level scheduling, jobs are selected from a job queue (based on a scheduling strategy); in task-level scheduling, tasks of the job are scheduled. Scheduling strategies decide when and to which machine a task is to be transferred for processing (load balancing). Hadoop uses the First-In-First-Out (FIFO) strategy as its default scheduling, but it is pluggable for new scheduling algorithms. The scheduler is

4.1.1.2. Fair scheduler. Facebook developed the fair scheduler (Zaharia et al., 2009). In this algorithm, jobs are entered into pools (multiple queues) and, in the case of multiple users, one pool is assigned to each user. The fair scheduler distributes the available resources among the pools and tries to give each user a fair share of the cluster over time, with each pool allocated a minimum number of Map and Reduce slots. If there are free slots in an idle pool, they may be allocated to other pools, while extra capacity in a pool is shared among the jobs. In contrast to FIFO, the fair scheduler supports preemption; therefore, if a pool has not received its fair share for a long time, the scheduler will preempt tasks in pools running over capacity in order to give the slots to the pool running under capacity. In this way, a long batch job cannot block short jobs for a long time (Polato et al., 2014; Xia et al., 2011; Zaharia et al., 2008).

4.1.1.3. Capacity scheduler. Yahoo! developed the Capacity scheduler to guarantee a fair allocation of resources among a large number of cluster users (Zaharia et al., 2009). For this purpose, it uses queues with a configurable number of task slots (Map or Reduce). Available resources are assigned to queues according to the priorities. If there are free resources in some queues, they are allocated to other queues (Hefny et al., 2014; Chethana et al., 2016; Polato et al., 2014). Within a queue, the priority of jobs is determined based on the job arrival time, class of the job, and priority settings for users according to the Service Level Agreement (SLA). When a slot in a TaskTracker becomes free, the scheduler chooses the job with the longest waiting time from the queue with the lowest load. Therefore, the capacity scheduler enforces cluster sharing among users, rather than among jobs, as is the case in the fair scheduler (Dsouza et al., 2015; Gautam et al., 2015).

4.1.1.4. Delay scheduler. The delay scheduler is an optimization of the fair scheduler, which eliminates the locality issues of the latter (Zaharia
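
The pool-based sharing of the fair scheduler (Section 4.1.1.2) can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Hadoop's implementation or API: the function name `fair_allocate`, the `(min_share, demand)` representation of a pool, and the "give the next free slot to the poorest pool that still has demand" rule are all hypothetical, and preemption is not modeled.

```python
def fair_allocate(total_slots, pools):
    """Distribute slots among pools.

    pools: {pool_name: (min_share, demand)}  (hypothetical representation)
    returns: {pool_name: allocated_slots}
    """
    # Step 1: each pool receives its guaranteed minimum, capped by its demand,
    # so an idle pool does not hold on to slots it cannot use.
    alloc = {name: min(min_share, demand)
             for name, (min_share, demand) in pools.items()}
    free = total_slots - sum(alloc.values())
    if free < 0:
        raise ValueError("minimum shares exceed cluster capacity")

    # Step 2: hand out the remaining slots one at a time to the pool that
    # currently holds the fewest slots and still has unmet demand.
    while free > 0:
        hungry = [n for n, (_, demand) in pools.items() if alloc[n] < demand]
        if not hungry:
            break
        poorest = min(hungry, key=lambda n: alloc[n])
        alloc[poorest] += 1
        free -= 1
    return alloc
```

With ten slots and three pools that each have a minimum share of two, a pool that only needs one slot keeps one, and the slots it leaves unused are split between the two busier pools.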
[...] we have to select a task of the job in front of a queue to process. It is [...] processes. These tasks are called speculative tasks. The LATE scheduler tries to find a slow task and execute an equivalent backup task on another node. This execution is called speculative execution. If the new copy of the task executes faster, the whole job performance will improve. The LATE Scheduler assigns priorities to slow or failed tasks for speculative execution and then selects the fastest nodes for [...] time of Hadoop in heterogeneous environments.

4.1.1.6. Deadline constraint scheduler. The deadline constraint scheduler was designed to satisfy the user constraints (Kc et al., 2010). The goals of this scheduler are: (1) to be able to give users immediate feedback on whether the job can be completed within the given deadline or not and proceed with the execution if the deadline can be met; otherwise, users have the option to resubmit with modified deadline requirements; (2) to maximize the number of jobs that can be run in a cluster while satisfying the time requirements of all jobs (Dsouza et al., 2015, 2015). Experiment results showed that when [...] their experiments.
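
The first goal of the deadline constraint scheduler, immediate feedback on whether a job can meet its deadline, can be illustrated with a deliberately simple admission check. The sketch below does not reproduce the cost model of Kc et al. (2010); it assumes a hypothetical "wave" model in which all tasks take the same average time and run in rounds on the free slots. All function and parameter names are illustrative.

```python
import math


def can_meet_deadline(num_tasks, avg_task_seconds, free_slots, deadline_seconds):
    """Assumed model: tasks run in waves of `free_slots` tasks each."""
    if free_slots <= 0:
        return False
    waves = math.ceil(num_tasks / free_slots)
    return waves * avg_task_seconds <= deadline_seconds


def min_slots_needed(num_tasks, avg_task_seconds, deadline_seconds):
    """Smallest slot count for which the job fits, or None if no count works."""
    waves_allowed = int(deadline_seconds // avg_task_seconds)
    if waves_allowed == 0:
        return None  # even a single wave would miss the deadline
    return math.ceil(num_tasks / waves_allowed)
```

Under this model, a scheduler would admit the job when `can_meet_deadline` holds and otherwise report back so that the user can resubmit with a modified deadline.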
Table 2
An overview of the current load balancing scheduler in Hadoop MapReduce.
[Table body not recoverable from the extracted text; legible column headings include Algorithm, Starvation, Advantages, and Disadvantages, and legible rows include FIFO, Fair, Delay, and LATE.]

[...] replicas on three different DataNodes with two rules: (1) no two copies are on the same DataNode, (2) no two copies are on the same rack, provided that there are enough racks. However, in replica placement, [...] static load balancing. The authors have found that the loose coupling
locality for many applications. Rather than viewing the file system and execution engine as separate and loosely coupled components, Cogset combines them closely into a distributed storage system that supports parallel processing of data at the actual storage nodes. Cogset consists of two stages: (1) data storage is distributed over the cluster through a partitioning and replication stage, (2) data access is achieved through a traversal stage. Due to the importance of load balancing and fault tolerance, the replication mechanism is an integral part of Cogset. The work provided a system with significantly better performance than Hadoop, in particular for small and moderate data volumes; however, it is not fully scalable.
• Ahmad et al. (2012) proposed Tarazu, a suite of optimizations of MapReduce, to address the problem of poor performance of MapReduce in heterogeneous clusters. The authors believe that the poor performance of MapReduce is due to two factors: (1) MapReduce causes excessive and bursty network communication, (2) heterogeneity amplifies the Reduce load imbalance (Fadika et al., 2011). Tarazu consists of (1) Communication-Aware Load Balancing of Map computation (CALB) across the nodes, (2) Communication-Aware Scheduling of Map computation (CAS) to avoid bursty network traffic, and (3) Predictive Load Balancing of Reduce computation (PLB) across the nodes. The authors showed by simulation that using Tarazu significantly improves the performance over a traditional Hadoop MapReduce in heterogeneous clusters.
• Kolb et al. (2011) proposed a block-based load-balancing algorithm, BlockSplit, to reduce the search space of Entity Resolution (ER). ER is the task of identifying entities referring to the same real-world object. ER techniques usually compare pairs of entities by evaluating multiple similarity measures. They utilize a blocking key based on the values of one or several entity attributes to divide the input data into multiple partitions (blocks) and restrict the subsequent matching to entities of the same block. For example, it is sufficient to compare entities of the same manufacturer when matching product offers. The BlockSplit approach takes the size of the blocks into account and assigns entire blocks to reduce tasks if this does not violate the load balancing constraints. Larger blocks are split into smaller chunks based on the input partitions to enable their parallel matching within multiple Reduce tasks (Kolb et al., 2012). The evaluation in a real cloud environment demonstrated that the proposed algorithm was robust against data skew and scaled with the number of available nodes.
• Hsueh et al. (2014) proposed a block-based load-balancing algorithm for Entity Resolution with multiple keys in MapReduce. The authors extended the BlockSplit algorithm presented in Kolb et al. (2011) by considering more than one blocking key. In their algorithm, the load distribution in the Reduce phase is more precise because an entity pair may exist in a block only when the number of common blocking keys between the pair exceeds a certain threshold (i.e., kc). Since an entity may have more than one kc key, it needs to generate all the combinations of kc keys for potential key comparisons. The proposed algorithm features combination-based blocking and load-balanced matching. Experiments using the well-known CiteSeerX digital library showed that the proposed algorithm was both scalable and efficient.
• Hou et al. (2014) proposed a dynamic load-balancing algorithm for Hadoop MapReduce. Their algorithm balances the workload on a rack, while previous works tried to balance the load between individual DataNodes. In the standard MapReduce and its optimizations, there was no way for Hadoop to guarantee that higher capability racks have more workload than lower capability racks. In other words, when assigning workload to DataNodes, the processing capacity was irrelevant. Their work has two novelties: (1) they concentrate on load balancing between racks; (2) they use Software Defined Network (SDN) to improve the data transfer. The results of simulation experiments showed that by moving the tasks from the busiest rack to a less busy one, the finish time of these tasks decreased substantially by adopting the algorithm.
• Vernica et al. (2012) proposed a suite of adaptive techniques to improve the MapReduce performance. The authors broke the key assumption of MapReduce that mappers run in isolation. They used an asynchronous channel called the Distributed Meta Data Store (DMDS) to share the situation information between mappers. They used these mappers, called Situation-Aware-Mappers (SAMs), to make traditional MapReduce more dynamic through: (1) Adaptive Mappers, (2) Adaptive Combiners, (3) Adaptive Sampling and Partitioning. Adaptive Mappers merge small partitions into a virtual split, thus making more splits that avoid frequent checkpointing and load imbalance (Doulkeridis et al., 2013). Adaptive Combiners perform a hash-based aggregation instead of a sort-based one. In contrast to standard MapReduce, Adaptive Sampling creates local samples dynamically, aggregates them, and produces a histogram. Adaptive Partitioning can exploit the global histogram to produce partitions of the same size for better load balancing. Although SAMs can solve the data skew problem, they cannot solve the computational skew in reducers (Shadkam et al., 2014). Experimental evaluation showed that the adaptive techniques dramatically improve the MapReduce performance and especially performance stability.
• Yang and Chen (2015) proposed an adaptive task allocation scheduler to improve MapReduce performance in heterogeneous clouds. The paper makes improvements on the original speculative execution method of Hadoop (called Hadoop Speculative) and the LATE Scheduler by proposing a new scheduling scheme known as Adaptive Task Allocation Scheduler (ATAS). The ATAS adopts more accurate methods to determine the response time and backup tasks that affect the system, which is expected to enhance the success ratio of backup tasks and thereby effectively increase the system's ability to respond. Simulation experiments showed that the proposed ATAS scheme could effectively enhance the processing performance of MapReduce.
• Bok et al. (2016) proposed a scheduling scheme to minimize the deadline miss of jobs to which deadlines are assigned when processing large multimedia data such as video and image in MapReduce frameworks. The proposed scheme improves job task processing speed by utilizing a replica node of the same data required to process jobs if a node where I/O load is excessive is about to process the jobs. A replica node refers to another node that has the data block required to process jobs at available nodes. If available nodes are not found despite the expected job completion time exceeding the deadline, the most non-urgent job is searched and the corresponding job task is temporarily suspended to speed up the job completion time. The performance evaluation result showed that the proposed scheme reduced completion time and improved the deadline success ratio.
• Ghoneem and Kulkarni (2016) introduced an adaptive scheduling technique for the MapReduce scheduler to increase efficiency and performance when it is used in a heterogeneous environment. In this model, the scheduler is made aware of cluster resources and job requirements by providing it with a classification algorithm. This algorithm classifies jobs into two categories: executable and non-executable. Then the executable jobs are assigned to the proper nodes to be executed successfully without failures, which would increase the execution time of the job. This scheduler overcomes the problems of previous schedulers such as small job starvation, a sticky node in the fair scheduler, and the mismatch between resource and job. The adaptive scheduler increases the performance of the MapReduce model in the heterogeneous environment while minimizing master node overhead and network traffic.
• Benifa and Dejey (2017) proposed a scheduling strategy named efficient locality and replica-aware scheduling (ELRAS) integrated with an autonomous replication scheme (ARS) to enhance the data locality and performs consistently in the heterogeneous environ-
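
The combination-based blocking attributed to Hsueh et al. (2014) in the list above can be illustrated with a short Python sketch. It is not the authors' MapReduce implementation: the function names and the in-memory dictionaries are hypothetical, and the sketch reads the threshold as "a pair is compared when it shares at least kc blocking keys", which is an interpretation of the description rather than a confirmed detail.

```python
from collections import defaultdict
from itertools import combinations


def combination_blocks(entities, kc):
    """Group entities by every combination of kc of their blocking keys.

    entities: {entity_id: iterable of blocking keys}
    returns:  {key_combination (tuple): [entity_id, ...]}
    """
    blocks = defaultdict(list)
    for entity_id, keys in entities.items():
        # Sorting makes the combination independent of the key order.
        for combo in combinations(sorted(set(keys)), kc):
            blocks[combo].append(entity_id)
    return blocks


def candidate_pairs(blocks):
    """Entity pairs that fall into at least one common block."""
    pairs = set()
    for members in blocks.values():
        pairs.update(combinations(sorted(members), 2))
    return pairs
```

Two entities land in a common block exactly when they share at least kc keys, so an entity with many keys contributes many combinations; this is the source of both the more precise pairing and the risk of duplicated comparisons noted for this approach.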
Table 3
An overview of the current load balancing strategies for Hadoop MapReduce.
Columns: Year; Authors; Static/Dynamic; Key idea; Main objective; Advantages; Disadvantages; Evaluation techniques; Journal/Conference.

2017; Ghoneem and Kulkarni (2016); Dynamic
Key idea: • Handling heterogeneity and scalability of Hadoop
Main objective: • Increasing the performance of MapReduce using an efficient MapReduce scheduling
Advantages: • Considering job requirements and node capabilities • Reducing makespan • No starvation • Promote performance • Considering nodes heterogeneity • Data locality, job types, and job importance are considered • Backup tasks quickly
Disadvantages: • Finding content information is computationally expensive and time-consuming
Evaluation techniques: • Implementing the scheduler on a cluster consisted of three nodes • Using VMWare for managing
Journal/Conference: Proceedings of the International Conference on Data Engineering and Communication Technology (Springer)

2014; Hsueh et al. (2014); Dynamic
Key idea: • Solving the Entity Resolution problem for a huge collection of entities with multiple blocking keys
Main objective: • Load balancing among reducers • Reducing the response time of a job
Advantages: • Using multiple keys is used to sort the entities, so the matching step can be accelerated • Efficiently solves the ER problem
Disadvantages: • It may lead to duplicated comparison
Evaluation techniques: • Experiments performed in a 30-nodes cluster which contains three types of nodes,
Journal/Conference: Twelfth Australian symposium

2014; Hou et al. (2014); Dynamic
Key idea: • Balancing the workload between different racks on a Hadoop cluster by considering the capability of DataNodes
Main objective: • Decreasing the completion time of job tasks • Maintain load balancing of clusters • Increasing the Map-Reduce performance
Advantages: • Increasing the entire performance of Hadoop • Decreasing task completion time • Do load balancing between racks rather than DataNodes
Disadvantages: • Moving data between racks makes communication overhead and consumes network bandwidth
Evaluation techniques: • Using Mumak, which is Apache's Hadoop Map-Reduce simulator
Journal/Conference: IEEE, Fourth International Conference on Big Data and Cloud Computing

2012; Ahmad et al. (2012); Dynamic
Key idea: • Proposing a suite of optimization for Map-Reduce
Main objective: • Eliminating the load balancing problem of traditional Map-Reduce • Eliminating the communication overhead of traditional MapReduce in heterogeneous clusters
Advantages: • Eliminate the bottleneck due to shuffle or Map phase • Load balancing in Map-Side • Load balancing in Reduce-Side
Disadvantages: • Tarazu considers clusters with two classes of hardware: Atom, Xeon, while literature showed that hardware in clusters has closely-related performance
Evaluation techniques: • Using a heterogeneous cluster of 90 servers comprising 10 Xeon-based and 80 Atom-based server nodes
Journal/Conference: Proceedings of the seventeenth international conference on Architectural Support for Programming Languages and Operating Systems (ACM)

2012; Vernica et al. (2012); Dynamic
Key idea: • Breaking the key assumption of isolation execution of Mappers in standard Map-Reduce
Main objective: • Improving cluster performance and simplify job tuning • Make Map-Reduce more dynamic
Advantages: • Adding new runtime options to Hadoop and made them Situation Aware • Flexible Map-Reduce • Tasks in Situation-Aware mappers can alter their execution at runtime
Disadvantages: • Situation-Aware Mappers continuously monitor the execution of mappers. However, Situation-Aware Mappers cannot handle computational skew at the reducers.
Evaluation techniques: • Running experiments on a 42-node IBM system x iDataPlex dx340. Each node had two quad-core Intel Xeon E5540 64 bit 2.83 GHz processors, 32 GB RAM, and four SATA disks. The cluster consisted of 336 cores and 168 disks.
Journal/Conference: ACM, International Conference on Extending Database Technology

2011; Kolb et al. (2011); Dynamic
Key idea: • Even redistribution of data between map and reduce tasks
Main objective: • Increasing the effectiveness and scalability of Map-Reduce • Using blocking-techniques to facilitate entity resolution
Advantages: • Handling data skew in Map-Reduce • Suitable for heterogeneous cluster • Is used for all kind of paired-wise similarity computation such as article comparison
Disadvantages: • Considering one block key for any entity may lead to imbalance in reduce phase due to using different-sized sub-block • It consider multiple blocking key as many individual blocking key that is time-consuming
Evaluation techniques: • Running experiments with real-world datasets on the Amazon EC2 cloud computing using Hadoop
Journal/Conference: Proceedings of the 20th ACM international conference on Information and knowledge management

2009; Valvåg et al. (2009); Static
Key idea: • Deterministic split of input dataset to some partitions
Main objective: • Increasing the system efficiency • Decreasing the request response time
Advantages: • Ease of implementation • Supports appending to, readings and deleting files in a name space • Avoid bottleneck • Reducing the layering overhead of software running on top of the Map-Reduce
Disadvantages: • Reconfiguring the placement of partitions in the presence of failure, entail copying a large amount of data between nodes • Does not consider the heterogeneity
Evaluation techniques: • Using a cluster of 12 Dell Power Edge 1995 machines interconnected by an HP ProCurve 4208VL with a 24-port 1 Gbps switched Ethernet module
Journal/Conference: Sixth IFIP International Conference on Network and Parallel Computing (IEEE)
E. Jafarnejad Ghomi et al. Journal of Network and Computer Applications 88 (2017) 50–71
considering its popularity and removes the replica as it is idle. The results proved the efficiency of the algorithm for heterogeneous clusters and workloads.

Now that we have reviewed some approaches to load balancing in MapReduce, it is time to investigate and analyze them. In Table 3, we have summarized our analysis. The analysis table contains article year, authors, key ideas, main objectives, advantages and disadvantages, evaluation techniques, and the journal or conference that the article presented. We also showed the name of the publisher.

International Conference on
Publication/ Presentation
Conference on Modeling,
Conference on Modeling
Simulation, and Applied
Intelligence: Modeling
Applications(Elsevier)
Techniques and
and Simulation
Optimization
(Elsevier)
on
on
on
Evaluation techniques
CloudSim toolkit
No simulation or
CloudAnalyst
implementation
policy for detecting over-/ • Does not provide any security policy for • Simulation
• Simulation
• Simulation
4.2. Natural phenomena-based load balancing category
toolkit
•
for lower priority load
make nests for themselves. Cuckoos lay eggs in the nests of other
Lack of scalability
of scalability
power saving
transmission break
Low throughput
Avoidance
birds with similar eggs to raise their young. For this, cuckoos search
VM migration
Disadvantages
for the most suitable nests to lay eggs in order to maximize their
• Starvation
the first step, the COA is applied to detect over-utilized hosts. In the
•
•
•
second step, one or more VMs are selected to migrate from the over-
There is a single result set The task of each
ant is specialized To avoid overloads due to
utilized host to other hosts. For this, they considered all the hosts
Detection of over-/ under-loaded nodes
ant creation, it uses a timer to suicide.
system performance
to migrate all their VMs to the other host and switch them to sleep
and doing operations accordingly
• Improve resource utilization
the throughput
Time (MMT) policy is used for selecting VMs from over-utilized and
migration
• Maximizing
• Improving
• Reducing
•
• Simple
• IsUsing
waiting time of
resource
the completion
maximizing throughput
submitted jobs
• Minimizing
• Minimizing
utilization
Cuckoo • Reducing
Serve (FCFS).
cloud computing
•
balancing in cloud
Using Ant Colony
computing
• Using
• Using
behavior, authors of Kabir et al. (2015) have used ACO for load
balancing. In this algorithm, there is a head node that is chosen in
such a way that it has the highest number of neighbor nodes. Ants
Yakhchi et al.
Nishant et al.
et al. (2013)
Babu et al.
(2013)
(2012)
2015
2013
2013
2012
Year
among the cloud nodes. The main benefit of this approach lies in its
detections of over-loaded and under-loaded nodes and thereby
performing operations based on the identified nodes.
• Babu et al. (2013) proposed HBB-LB, a load balancing technique inspired by honeybee foraging behavior. This technique takes into account the priorities of tasks to minimize the waiting time of tasks in the queue. The algorithm models the behavior of honeybees in finding and reaping food. In cloud computing environments, whenever a VM is overloaded with multiple tasks, these tasks have to be removed and submitted to the under-loaded VMs of the same data center. Inspired by this natural phenomenon, the authors considered the removal of tasks from overloaded nodes as the honeybees do. When a task is submitted to a VM, it updates the number of priority tasks and the load of that VM and informs other tasks to help them in choosing a VM. In this scenario, the tasks are the honeybees and the VMs are the food sources. The experimental results showed that the algorithm improved the execution time and reduced the waiting time of tasks in the queue.

We investigated and analyzed the NPH-based category of load-balancing algorithms. The results are presented in Table 4. The analysis table contains article year, authors, key ideas, main objectives, advantages and disadvantages, evaluation techniques, and the journal or conference that the article presented. We also showed the name of the publisher.

4.3. Agent-based load balancing techniques

In this section, we have reviewed the literature that proposed agent-based techniques for load balancing in cloud nodes. The dynamic nature of cloud computing is suitable for agent-based techniques. An agent is a piece of software that functions automatically and continuously decides for itself and figures out what needs to be done to satisfy its design objectives. A multi-agent system comprises a number of agents, which interact with each other. To be successful, the agents have to be able to cooperate, coordinate and negotiate with each other. Cooperation is the process of working together, coordination is the process of reaching a state in which their actions are well suited, and in the negotiation process, they agree on some parameters (Singha et al., 2015; Sim et al., 2011).

• Singh et al. (2015) proposed a novel autonomous agent-based load-balancing algorithm called A2LB for cloud environments. Their algorithm tries to balance the load among VMs through three agents: load agent, channel agent, and migration agent. Load and channel agents are static agents whereas the migration agent is an ant, which is a special category of mobile agents. The load agent controls the information policy and calculates the load of VMs after allocating a job. A VM Load Fitness table supports the load agent. The fitness table maintains the list of all details of the VM properties in a data center such as id, memory, a fitness value, and load status of all VMs. The channel agent controls the transfer policy, selection policy, and location policy. Finally, the channel agent initiates the migration agents. They move to other data centers and communicate with the load agent of that data center to acquire the status of VMs present there, looking for the desired configuration. Results obtained through implementation showed that this algorithm works satisfactorily.
• Gutierrez-Garcia and Ramirez-Nafarrate (2015) proposed an agent-based load balancing technique for cloud data centers. The authors proposed a collaborative agent-based problem-solving technique capable of balancing workloads across commodity and heterogeneous servers by making use of VM live migration. They proposed an agent-based load balancing architecture composed of VM agents, server manager agents, and a front-end agent. They also proposed an agent-based load balancing mechanism for cloud environments composed of (1) migration heuristics that determine which VM should be migrated and its destination, (2) migration policies to select a VM for migration, (3) acceptance policies which determine which VMs should be accepted, and (4) a set of load balancing heuristics of the front-end to select the initial hosts of VMs. Simulation experiments showed that agents, through autonomous and dynamic collaboration, could efficiently balance loads in a distributed manner, outperforming centralized approaches.
• Keshvadi and Faghih (2016) proposed a multi-agent load balancing system in an IaaS cloud environment. Their mechanism performs both receiver-initiated and sender-initiated approaches to balance the IaaS load, minimize the waiting time of the tasks, and guarantee the Service Level Agreement (SLA). The mechanism presented in the paper comprises three agents: (1) VMM Agent, (2) Datacenter Monitor (DM), and (3) Negotiator Ant (NA). The VMM agent collects the CPU, memory and bandwidth utilization of the individual VM hosted by different types of tasks to monitor the load. A table for storing the state of the VMs supports this agent. The DM agent performs information policy in a datacenter by monitoring the VMM's information. This agent is supported by a table that maintains all information about the status and characteristics of all VMs in a datacenter. It categorizes the VMs based on their characteristics. DCM agents initiate NA agents. They move to other datacenters and communicate with the DCM agent of those datacenters to acquire the status of VMs there, searching for the desired configuration. Simulation results showed that the proposed algorithm was more efficient and that load balance, response time, and makespan improved.
• Tasquier (2015) proposed an agent-based load balancer for multi-cloud environments. The author proposed an application-aware, multi-cloud load balancer based on a mobile agent paradigm. The proposed architecture uses agents to monitor the status of the cloud infrastructure and detects the overload and/or under-utilization conditions. The multi-agent framework provides provisioning facilities to scale the application automatically to the under-loaded resources and/or to new resources acquired from other cloud providers. Furthermore, the agents are able to deallocate unused resources, thus leading to cost saving. The proposed architecture consists of three agents: (1) an executor agent, which represents the application running in multi-cloud environments, (2) a provisioner agent, which is responsible for managing the cloud infrastructure through adding and removing resources, (3) a monitor agent, which is responsible for monitoring the overload and/or under-utilization conditions. Users can overview the current state of the cloud environment through an additional agent called controllers. Moreover, each agent has mobility capabilities in order to migrate itself autonomously on the multi-cloud infrastructure. The proposed algorithm overcame the provider lock-in challenge in the cloud and it was flexible to exploit the extreme elasticity.

We investigated and analyzed the agent-based load balancing techniques. The results are presented in Table 5. The analysis table contains article year, authors, key ideas, main objectives, advantages and disadvantages, evaluation techniques, and the journal or conference that the article presented. We also showed the name of the publisher.

4.4. General load balancing techniques

In this section, we have surveyed and overviewed the literature in the field of general load balancing techniques. Although several algorithms are provided in this category, we have focused on new ones. For example, techniques such as First-In-First-Out (FIFO), Min-Min, Max-Min, Throttled, and Equally Spread Current Execution Load (ESCEL) all belong to this category.

• Komarasamy and Muthuswamy (2016) proposed a novel approach for dynamic load balancing in a cloud environment. They called it
Table 5
An overview of agent-based load balancing techniques.

2016, Keshvadi et al. (2015)
Key idea: • Using Multiagent paradigm for dynamic load balancing across virtual machines • Using both sender-initiated and receiver-initiated approaches
Main objectives: • Maximizing resource utilization • Load balancing across virtual machines • Reducing the response time • Guarantee the SLA
Advantages: • Increase the resource utilization • Avoid or reduce dynamic migration • Reducing the migration cost • Reducing the waiting time of tasks in queue
Disadvantages: • Datacenter management ants do not have a timer for self-destroying and wait for message from parent
Evaluation techniques: • Simulation using CloudSim toolkit • Agents are programmed using Java language
Journal/Conference: International Robotics and Automation Journal (MedCrave)

2015, Singh et al. (2015)
Key idea: • Using software agents for load balancing in cloud computing
Main objectives: • Load balancing VMs • Reducing service time
Advantages: • Improves resource utilization within a datacenter and multiple datacenters • Reduces response time
Disadvantages: • Includes heavy computations • Migration agent does search for available VMs and is time-consuming
Evaluation techniques: • Implementation using Java technology
Journal/Conference: International Conference on Advanced Computing Technologies and Applications (Elsevier)

2015, Gutierrez-Garcia and Ramirez-Nafarrate (2015)
Key idea: • Using agent-based problem-solving technique for load balancing in a heterogeneous environment, live VM migration
Main objectives: • Efficient load balancing in a distributed manner • Considering the heterogeneity of servers and VMs
Advantages: • Agents do load balancing using partial information about cloud datacenters
Disadvantages: • Does not estimate VM migration overhead • Provide no usage prediction mechanism • High migration overhead • It is a central approach and is not fully scalable
Evaluation techniques: • Experiments performed using agent-based test-bed such as MapLoad and Red Hat • Test-bed was implemented in Java and JADE agent platform
Journal/Conference: Cluster Computing (Springer)

2015, Tasquier (2015)
Key idea: • Using agent-based paradigm for developing an application aware multi-cloud load balancer
Main objectives: • Multi-cloud load balancing • Using full elasticity of cloud environments
Advantages: • Using multi-cloud resources for load balancing • Overcoming the provider lock-in challenge in cloud • Is flexible to exploit the extreme elasticity
Disadvantages: • Not implemented or simulated • Does not consider Quality of Service (QoS)
Evaluation techniques: • Not implemented or simulated
Journal/Conference: Columbia International Publishing Journal of Cloud Computing Research
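
To make the A2LB workflow described in Section 4.3 more concrete, the following minimal Python sketch shows one way the three agents could interact. It is an illustration under stated assumptions, not the implementation of Singh et al. (2015): the VM fields, the fitness definition (spare capacity divided by total capacity), the 0.2 overload threshold, and all function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class VM:
    vm_id: str
    memory_mb: int
    capacity: float  # work units the VM can hold (assumed unit)
    load: float      # work units currently assigned

    @property
    def fitness(self) -> float:
        # Assumed fitness: fraction of capacity that is still free.
        return (self.capacity - self.load) / self.capacity


def load_agent(vms, threshold=0.2):
    """Build a fitness table with one row per VM (id, memory, fitness, status)."""
    return {
        vm.vm_id: {
            "memory_mb": vm.memory_mb,
            "fitness": vm.fitness,
            "status": "overloaded" if vm.fitness < threshold else "normal",
        }
        for vm in vms
    }


def migration_agent(datacenters, home, min_memory_mb, needed):
    """Visit the other data centers and return the first VM with the desired configuration."""
    for name, vms in datacenters.items():
        if name == home:
            continue
        for vm in vms:
            if vm.memory_mb >= min_memory_mb and vm.capacity - vm.load >= needed:
                return name, vm.vm_id
    return None


def channel_agent(datacenters, home, job_size, min_memory_mb):
    """Place a job on the fittest local VM, or start a migration agent if none fits."""
    table = load_agent(datacenters[home])
    candidates = [
        vm for vm in datacenters[home]
        if table[vm.vm_id]["status"] == "normal"
        and vm.memory_mb >= min_memory_mb
        and vm.capacity - vm.load >= job_size
    ]
    if candidates:
        best = max(candidates, key=lambda vm: vm.fitness)
        return home, best.vm_id
    return migration_agent(datacenters, home, min_memory_mb, job_size)
```

In this sketch the load agent only reports state, while the channel agent decides whether a job stays local or triggers a search in other data centers, which mirrors the division of policies described for A2LB.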
Table 6
An overview of current GLB-category load balancing techniques.

2016, Komarasamy and Muthuswamy (2016)
Key idea: • Using Bin Packing algorithm and VM reconfiguration for load balancing in cloud environment
Main objectives: • Load balancing virtual machines • Reducing job waiting time
Advantages: • Handles the user requests during peak situation • Improves throughput • Increases resource utilization
Disadvantages: • Using the second table, reservation table is space consuming • Does not consider energy saving
Evaluation techniques: • Simulation in CloudSim toolkit
Journal/Conference: Indian Journal of Science and Technology

• Improving response time and processing time

2015, Domanal and Reddy (2015)
Key idea: • Combining the Divide-and-Conquer methodology and throttled algorithm to load balancing in cloud environment
Main objectives: • Maximizing resource utilization • Intelligently assign jobs to VM for load balancing • Reducing the total execution time of tasks
Advantages: • Maximize resource utilization • Reduces total execution time considerably
Disadvantages: • Did not simulate in different workload situation • Does not consider deadline constraints
Evaluation techniques: • Simulation in CloudSim toolkit
Journal/Conference: IEEE International Conference on Cloud Computing in Emerging Markets

2015, Kulkarni and BA (2015)
Key idea: • Modifying Active VM Algorithm implemented in CloudAnalyst to load balancing VMs • Using reservation table between selection and allocation phases
Main objectives: • Uniform allocation of requests to VMs even during peak hours • Reduce the response time
Advantages: • Load balancing VMs • It works well during the peak hours • Improve the elasticity
Disadvantages: • Does not allocate the load uniformly to VMs across datacenters deployed at different geographical locations
Evaluation techniques: • Simulation using CloudAnalyst toolkit
Journal/Conference: IEEE International Conference on Signal Processing, Informatics, Communication and Energy Systems
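
Two of the general techniques reviewed in Section 4.4 reduce to short selection rules, sketched below in Python. The first function follows the end-of-service-time idea attributed to Chien et al. (2016): estimate when each VM would finish its queue plus the incoming job and pick the earliest. The second follows the reservation-table idea attributed to Kulkarni and BA (2015): count pending reservations together with confirmed allocations when choosing a VM. The data layout, the job-size and power units, and all names are assumptions made for illustration; neither snippet reproduces the authors' code.

```python
def earliest_finish_vm(vms, job_size):
    """Return the id of the VM with the earliest estimated end of service.

    vms maps a VM id to {"power": units per second, "queue": [job sizes]}.
    """
    def finish_time(vm):
        # Time to drain the queued jobs plus the incoming job at the VM's current power.
        return (sum(vm["queue"]) + job_size) / vm["power"]

    return min(vms, key=lambda vm_id: finish_time(vms[vm_id]))


class ReservationBalancer:
    """Least-loaded selection that counts reservations as well as allocations."""

    def __init__(self, vm_ids):
        self.allocated = {vm_id: 0 for vm_id in vm_ids}
        self.reserved = {vm_id: 0 for vm_id in vm_ids}

    def select(self):
        # Selection phase: pick the VM with the fewest allocated plus reserved requests.
        vm_id = min(self.allocated, key=lambda v: self.allocated[v] + self.reserved[v])
        self.reserved[vm_id] += 1
        return vm_id

    def confirm(self, vm_id):
        # Allocation phase notification: move the reservation into the allocation table.
        self.reserved[vm_id] -= 1
        self.allocated[vm_id] += 1

    def release(self, vm_id):
        # A request finished on this VM.
        self.allocated[vm_id] -= 1
```

Without the reservation counts, several requests arriving before any allocation notification would all see the same allocation table and could be sent to the same VM, which is the peak-traffic problem the reservation table is meant to avoid.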
dynamic load balancing with effective bin packing and VM reconfiguration (DLBPR). DLBPR maps jobs into VMs based on the required processing speed of the job. The main objectives of their work were to process the jobs within their deadline and to balance the load among the resources. In the proposed approach, the VMs are dynamically clustered as small, medium and large according to processing speed, and the jobs are mapped into a suitable VM existing in the cluster. The clusters are sometimes overloaded due to the arrival of a similar kind of job. In that situation, the VMs in the data center may be either split or integrated based on the request of the job using a receiver-initiated approach. After reconfiguration, the VMs will dynamically regroup based on the processing speed of the VMs. The proposed methodology is composed of three tiers: (1) web tier, (2) scheduler tier, (3) resource allocation tier. Users’ requests are submitted to the web tier at any arbitrary time and are forwarded to the scheduler tier. The deadline-based scheduler classifies and prioritizes the incoming jobs. These jobs are processed efficiently by VMs in the resource allocation tier. The proposed approach automatically improves the throughput and also increases the utilization of the resources.
• Domanal and Reddy (2015) proposed a hybrid scheduling algorithm for load balancing in a distributed environment by combining the methodology of Divide-and-Conquer and Throttled algorithms, referred to as DCBT. The authors defined two scenarios. In scenario 1, they deployed a distributed environment that consists of a client, a load balancer and n nodes, which act as Request Handlers (RH) or servers. The requests come from different clients and the load balancer assigns incoming requests or tasks to the available RHs or servers. In scenario 2, the CloudSim simulator was used for simulation, which consisted of a data center, VMs, servers, and the load balancer. Here, the clients' requests were coming from Internet users. In both scenarios, the DCBT algorithm was used for scheduling the incoming clients' requests to the available RHs or VMs depending on the load of each machine. The proposed DCBT utilizes the VMs more efficiently while reducing the execution time of the tasks.
• Chien et al. (2016) proposed a novel load-balancing algorithm based on the method of estimating the end of service time. In their algorithm, they considered the actual instant processing power of the VM and the size of assigned jobs. They included two factors in the method of estimating the end-of-service time in VMs: (1) the selected VM should be able to finish the job as soon as possible, (2) on the next allocation request, the load-balancing algorithm has to estimate the time at which all queuing jobs and the next incoming job are completely done in every VM. The VM that corresponds to the earliest time will be chosen to receive the job. The simulation results showed that the proposed algorithm improves response time and processing time.
• Kulkarni and BA (2015) proposed a novel VM load-balancing algorithm that ensures a uniform assignment of requests to VMs even during peak hours (i.e., when the frequency of received requests in the data center is very high) to ensure faster response times to users. They modified the active VM algorithm implemented in the CloudAnalyst toolkit, which has problems during peak traffic situations. For this purpose, in addition to an allocation table, they used a reservation table between the phases of selection and allocation of VMs. The reservation table maintains the information of the VM reservations suggested by the load balancer to the data center controller, but they did not update the allocation table until the notification arrives from the allocation phase. The proposed load balancer takes into account both the reservation table entry and the allocation statistics table entry for a particular VM id to select a VM for the next request. The simulation results showed that the algorithm allocated requests to VMs uniformly even during peak traffic situations.

We have investigated and analyzed the general load balancing category; the results are presented in Table 6. The analysis table contains article year, authors, key ideas, main objectives, advantages and disadvantages, evaluation techniques, and the journal or conference that the article presented. We also showed the name of the publisher.

4.5. Application oriented load balancing techniques

In this section, we have surveyed and overviewed the literature in the field of application-oriented load balancing techniques.

• Wei et al. (2015) proposed an efficient application scheduling in mobile cloud computing based on MAX–MIN ant system. Firstly, the authors presented a local mobile cloud model with a detailed application scheduling structure. Secondly, they presented a scheduling algorithm for the mobile cloud model based on MAX–MIN Ant System (MMAS). Experimental results showed that the algorithm could effectively promote the performance of the mobile cloud.
• Wei et al. (2013) defined the Hybrid Local Mobile Cloud Model (HLMCM) consisting of cloudlet and mobile devices, where the cloudlet plays the role of a central broker while both neighboring mobile devices and the cloudlet play the role of service provider. The objective of application scheduling is to maximize the profit as well as the lifetime of HLMCM while considering the capacity limitations of service providers. They proposed the Hybrid Ant Colony-based Application Scheduling (HACAS) algorithm to solve the scheduling problem. The algorithm only considers the available resources and does not consider overhead when calculating the advantage ratio of mobile devices for joining the cloudlet. Simulation results revealed that when the load of the system was heavy, the HACAS algorithm could select those applications with maximum profit and minimum energy consumption.
• Deye et al. (2013) proposed an approach to make load balancing more dynamic to better manage the QoS of multi-instance applications in the cloud. The approach mainly limits the number of requests through a load balancer equipped with a queue for incoming user requests at a given time to send and process the requests effectively. Simulation results showed that the approach improved the system performance.
• Sarood et al. (2012) developed techniques that reduce the gap between application performance on cloud and supercomputers. The scheme uses object migration to achieve load balance for tightly coupled parallel applications executing in virtualized environments that suffer from interfering jobs. While restoring load balance, it not only reduces the timing penalty caused by interfering jobs but also reduces energy consumption significantly.

We have investigated and analyzed the application-oriented load balancing techniques; the results are presented in Table 7. The analysis table contains article year, authors, key ideas, main objectives, advantages and disadvantages, evaluation techniques, and the journal or conference that the article presented. We also showed the name of the publisher.

4.6. Network-aware task scheduling and load balancing

In this section, we have surveyed and overviewed the literature in the field of network-aware task scheduling and load balancing techniques.

• Shen et al. (2016) proposed a probabilistic network-aware task placement for MapReduce scheduling to minimize overall data transmission cost and delays and hence to reduce job completion time while balancing the transmission cost reduction and resource
utilization. They found that a task is faced with three challenges: (1) the available servers for running tasks dynamically change due to resource allocation and release over time; (2) the data fetching time of reduce tasks depends on both the placement of reduce tasks and the locations and sizes of the intermediate data produced by map tasks; (3) the link load on the routing path also has a significant impact on the data access latency. In order to reduce the latency, the link status of the network must be considered in the scheduling decision. The experimental results showed that the scheduling
Processing Workshops(IEEE)
International Conference on
mathematics (Hindawi)
Conference on Parallel
Journal of Applied
41st International
Data (IEEE)
CloudSim
CloudSim
implementation in a public
considering dynamic resource
network resources.
time of application in the
•
load balancer is invoked
Shen et al. (2016) proposed a new cloud job scheduler with elastic
scheduling algorithm
number of instances
cloud
• No
• No
• No
• No
• No
• No
rithms.
•
profit
response time
• Inapplication
• Decreasing
• Decreasing
• scheduling
• Integrated
• Reducing
• Reducing
• Reducing
Advantages
•
•
interference of sharing
improved efficiency.
Main Objectives
resources
dynamic
Table 8. The analysis table contains article year, authors, key ideas,
•
•
•
the definitions of •
•
a Bio-inspiring •
application scheduling
application processing
algorithm
• Providing
• Verifying
• Limiting
• Using
•
et al. (2012)
Deye et al.
Wei et al.
Authors
Sarood
(2013)
(2013)
2016
2013
2013
2012
Year
queue has proposed to hold the requests, which have been removed
temporarily from the VM due to the arrival of higher priority request
Table 8
An overview of network-aware task scheduling and load balancing.

2016, Shen et al. (2016)
Key ideas: • Considering network topology and transmission cost in job scheduling
Main objectives: • Minimizing data transmission cost • Balancing transmission cost reduction and resource utilization
Advantages: • Reducing job completion time • Increasing cluster utilization • Minimizing delay
Disadvantages: • The optimality of exponential model is not known • Performance of the model was not evaluated under different network conditions
Evaluation techniques: • Implementing the algorithm on Apache Hadoop • Conduct experiments on a high-performance computing platform
Publication/Presentation: IEEE International Conference on Cluster Computing

2016, Shen et al. (2016)
Key ideas: • Using elastic bandwidth reservation in clouds
Main objectives: • Finding a job schedule to satisfy the deadline requirements
Advantages: • Minimizing total job rewards • Reducing job execution time • Efficiency and effectiveness • Real implementation
Disadvantages: • No automatic bandwidth reservation
Evaluation techniques: • Using Facebook synthesized workload • Simulated datacenter with a rack consisting of 40 machines • Implementing the algorithm
Publication/Presentation: IEEE International Conference on Cloud Computing Technology and Science

2016, Kliazovich (2016)
Key ideas: • Using a communication-aware model in cloud computing (CA-DAG)
Main objectives: • Assigning processors to handle computing jobs • Using network resources for transmission
Advantages: • Considering the dynamics of cloud environment • Using context-data information including network topology for scheduling • Optimizing makespan
Disadvantages: • No practical validation of proposed solution • No considering heterogeneity of cloud environment
Evaluation techniques: • Using Winkler graph generator to produce workload • Testbed system architecture composed of a set of identical servers
Publication/Presentation: Journal of Grid Computing (Springer)

2015, Scharf et al. (2015)
Key ideas: • Network-aware placement of instances by taking into account bandwidth constraints
Main objectives: • Extension of OpenStack Scheduler
Advantages: • Increasing throughput • More predictable performance
Disadvantages: • Fixing time granularity of control loop to 10 s • Relying on existing filters prevents the use of bandwidth as a metric • No considering the actual topology of network
Evaluation techniques: • A testbed setup of OpenStack “Icehouse” consisting of a few servers
Publication/Presentation: IEEE 24th International Conference on Computer Communication and Network (ICCCN)
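
As a rough illustration of probabilistic network-aware placement of the kind summarized in Table 8, the Python sketch below scores each candidate server by the cost of fetching a reduce task's intermediate data and then turns the costs into placement probabilities. The exponential weighting, the `temperature` parameter, the `hop_cost` table, and the function names are assumptions chosen for this example; the source does not give the exact model used by Shen et al. (2016).

```python
import math


def transmission_cost(server, inputs, hop_cost):
    """Cost of moving every piece of intermediate data to one server.

    inputs maps a data location to the data size stored there;
    hop_cost maps (location, server) to an assumed per-unit transfer cost.
    """
    return sum(size * hop_cost[(src, server)] for src, size in inputs.items())


def placement_probabilities(servers, inputs, hop_cost, temperature=1.0):
    """Give cheaper servers a higher chance of receiving the task."""
    costs = {s: transmission_cost(s, inputs, hop_cost) for s in servers}
    lowest = min(costs.values())
    # Subtracting the lowest cost keeps the exponent small without changing the ratios.
    weights = {s: math.exp(-(c - lowest) / temperature) for s, c in costs.items()}
    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()}
```

Choosing a server at random according to these probabilities, rather than always taking the cheapest one, is one simple way to trade transmission cost reduction against spreading load over the cluster.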
Table 9
An overview of workflow specific scheduling algorithms.

2017, Cai et al. (2017)
Key idea: • Providing dynamic cloud resource scheduling algorithm for BoT workflow
Main objectives: • Fulfill the workflow deadline
Advantages: • Minimizing the cloud resource renting cost • Fully use the bag structure • A single type based greedy method for each ready BoT
Disadvantages: • Expectation and variance based task execution time estimation method overestimates the practical task execution times to some degree
Evaluation techniques: • Using ElasticSim tool
Journal/Conference: Journal of Future Generation Computer Systems (Elsevier)
Focus on: Bag of tasks

2016, Ghosh and Banerjee (2016)
Key idea: • Priority based service allocation of each user request
Main objectives: • Improving average execution time • Improving service quality
Advantages: • Reducing response time • Reducing execution time
Disadvantages: • May cause starvation for low priority requests • Long response time for low priority jobs
Evaluation techniques: • CloudSim simulation tool
Journal/Conference: International Conference on Inventive Computation Technology (IEEE)
Focus on: Priority based

2016, Cinque et al. (2016)
Key idea: • Providing failure-aware scheduling approaches and scalable monitoring for grid
Main objectives: • Improving the execution of heavy job batches
Advantages: • Scalable • Improving performance • Reducing bandwidth consumption • Improving throughput
Disadvantages: • Does not consider the situations where multicast is not available
Evaluation techniques: • Implementing on a real grid
Journal/Conference: Proceedings of the 31st Annual ACM Symposium on Applied Computing (ACM)
Focus on: Dependable scheduling

2016, Bellavista et al. (2016)
Key idea: • Using the publish/subscribe paradigm for intra-domain scenarios
Main objectives: • Providing scalable monitoring and novel enhanced dependable job scheduling
Advantages: • Decentralized scheduler • A troubleshooting algorithm • Use standard technology to avoid vendor lock-in • Providing failure-aware scheduling
Disadvantages: • Not fully scalable • No job completion efficiency
Evaluation techniques: • Implemented and tested in a real deployment on distributed data centers across Europe
Journal/Conference: Journal of Future Generation Computer Systems (Elsevier)
Focus on: Dependable scheduling

2016, Kianpisheh et al. (2016)
Key idea: • Using ant colony system to develop a robust workflow scheduler
Main objectives: • Minimizing the violation of workflow constraints • Minimizing probability of violations • Decreasing makespan and cost • Considering budget and deadline of workflows
Advantages: • Reducing the probability of violation of workflow constraints • Reducing expected penalty at run-time
Disadvantages: • Does not avoid run-time violations • Does not track the workflow at runtime
Evaluation techniques: • Simulation on real world workflow
Journal/Conference: Cluster Computing (Springer)
Focus on: Workflow scheduling

2015, Moschakisa and Karatzaa (2015)
Key idea: • Scheduling of bag of tasks applications
Main objectives: • Optimizing interlinked cloud systems • Optimizing performance and cost
Advantages: • Reducing makespan • Maintaining a good cost-performance trade-off • Increasing utilization • Improving performance
Disadvantages: • The policy of spreading parallel tasks between the clouds is not clear • No real implementation in cloud system
Evaluation techniques: • Proposed approach tested in a scientific federated cloud
Journal/Conference: Journal of Systems and Software (Elsevier)
Focus on: Bag of tasks

2015, Zhang and Li (2015)
Key idea: • Using an adaptive heuristic algorithm for workflow scheduling
Main objectives: • Proper mapping of tasks to resources • Optimizing load balancing • Optimizing failure rate of tasks
Advantages: • Reducing response time • Optimizing makespan
Disadvantages: • Not fully adaptive • Focus only on compute intensive applications • Does not consider communication-intensive applications
Evaluation techniques: • CloudSim simulation tool
Journal/Conference: Third International Conference on Advanced Cloud and Big Data (IEEE)
Focus on: Workflow scheduling

2014, Jaikar et al. (2014)
Key idea: • Presenting priority-based VM allocation algorithm
Main objectives: • Increasing resource utilization • Reducing energy consumption
Advantages: • Considering a scientific federated cloud • Reducing total job execution time • Improving resource utilization • Increasing system performance
Disadvantages: • Does not provide migration policies • There is no cost function for cross-cloud VM migration
Evaluation techniques: • Implementing the algorithm in OpenNebula
Journal/Conference: IEEE 3rd International Conference on Cloud Computing (CloudNet)
Focus on: Priority based
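
The priority-based entries in Table 9 list starvation of low-priority requests as a disadvantage. The short Python sketch below shows why, using a plain priority queue. It is a generic illustration built on the standard `heapq` module, not the algorithm of Ghosh and Banerjee (2016); the class name, the request ids, and the convention that a smaller number means a higher priority are assumptions.

```python
import heapq
import itertools


class PriorityScheduler:
    """Serve requests strictly by priority; ties are served in arrival order."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # tie-breaker that preserves arrival order

    def submit(self, request_id, priority):
        # Smaller priority value means more urgent (assumed convention).
        heapq.heappush(self._heap, (priority, next(self._arrival), request_id))

    def next_request(self):
        # Return the most urgent waiting request, or None when the queue is empty.
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]
```

If urgent requests keep arriving, a request submitted with a low priority stays at the back of the heap indefinitely, which is the starvation effect noted in the table.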
Table 10
Load balancing QoS metrics in the reviewed techniques.
# References Energy saving Migration time Response time Scalability Resource utilization Throughput Makespan
Fig. 8. Studies venue types.

have not been comprehensively and completely addressed. In our literature review, we found
that no single technique improves all of the load balancing metrics. For example, some
techniques considered response time, resource utilization, and migration time, while
others ignored these metrics and considered other ones. However, some metrics appear to be
mutually exclusive. For example, relying on VM migration for load balancing may increase
the response time. Service cost is another metric that is not considered in the studied
articles. Presenting a comprehensive technique that improves as many metrics as possible
is, therefore, very desirable.

Furthermore, our study showed that energy consumption and carbon emission are two
important drawbacks of the continuing growth in the number of datacenters. However, just a
few articles addressed these two drawbacks. Energy consumption is regarded as an economic
efficiency factor, while carbon emission is regarded as a health-related and/or
environmental factor. Each of these issues is critically important. Therefore, providing
load balancing mechanisms in a cloud environment that also address these two problems is
very desirable.

Currently, a large volume of data is produced daily from social networks, medical records,
e-commerce, e-shopping, e-pay, banking records, etc. This huge volume of data constitutes
big data, and therefore needs near-perfect distribution for fast servicing. Our study
showed that in recent years just a few articles addressed this topic. Further optimization
of Hadoop MapReduce for processing big data is a promising direction for future research.

Recently, in addition to the existing popular cloud providers such as Google, Microsoft,
and Amazon, other cloud providers have been growing too. In some situations, a cloud
provider needs to send part of its workload to another cloud provider for processing in
order to balance the load. In other words, using the resources of more than one cloud
provider will be a critical requirement for load balancing in the future. In this case,
the cloud providers will face data lock-in problems. Our study shows that just a few
articles have paid attention to these topics. Therefore, another interesting line of
future research is the investigation of data lock-in and cross-cloud servicing problems.

References
Abdolhamid, M., Shafi’i, M., Bashir, M.B., 2014. Scheduling techniques in on-demand
grid as a service cloud: a review. J. Theor. Appl. Inf. Technol. 63 (1), 10–19.
Abdullahi, M., Md, Asri Ngadi, Md.A., Abdulhamid, S.M., 2015. Symbiotic organism
search optimization based task scheduling in cloud computing environment. Future
Gener. Comput. Syst. 56, 640–650.
Aditya, A., Chatterjee, U., Gobata, S., 2015. A comparative study of different static and
dynamic load-balancing algorithm in cloud computing with special emphasis on time
factor. Int. J. Curr. Eng. Technol. 3 (5).
Ahmad, F., Chakradhar, S.T., Raghunathan, A., Vijaykumar, T.N., 2012. Tarazu:
optimizing MapReduce on heterogeneous clusters. International Conference on
Architectural Support for Programming Languages and Operating Systems
(ASPLOS), 40 (1), 61–74.
Ahmad, R.W., Gani, A., Hamid, S.H.A., Shiraz, M., 2015. A survey on virtual machine
migration and server consolidation frameworks for cloud data centers. J. Netw.
Comput. Appl. 52, 11–25.
Alakeel, A.M., 2010. A guide to dynamic load balancing in distributed computer systems.
Int. J. Comput. Sci. Netw. Secur. 10 (6), 153–160.
Apostu, A., Puican, F., Ularu, G., George Suciu, G., Todoran, G., 2013. Study on
advantages and disadvantages of cloud computing – the advantages of telemetry
applications in the cloud. Recent Adv. Appl. Comput. Sci. Digit. Serv.
Babu, L.D.D., Krishna, P.V., 2013. Honey bee behavior inspired load balancing of tasks in
cloud computing environments. Appl. Soft Comput. 13 (5), 2292–2303.
Bellavista, P., Cinque, M., Corradi, A., Foschini, L., Frattini, F., Molina, J.P., 2016.
GAMESH: a grid architecture for scalable monitoring and enhanced dependable job
scheduling. Future Gener. Comput. Syst.
Benifa, J.V.B., Dejey, 2017. Performance improvement of MapReduce for heterogeneous
clusters based on efficient locality and Replica aware scheduling (ELRAS) strategy.
Wirel. Personal. Commun., 1–25.
Bhatia, J., Patel, T., Trivedi, H., Majmudar, V., 2012. HTV Dynamic Load-balancing
algorithm for Virtual Machine Instances in Cloud. International Symposium on
Cloud and Services Computing, 15–20.
Bok, K., Hwang, J., Jongtae Lim, J., Kim, Y., Yoo, J., 2016. An efficient MapReduce
scheduling scheme for processing large multimedia data. Multimed. Tools Appl.,
1–24.
Cai, Z., Li, X., Ruizc, R., Lia, Q., 2017. A delay-based dynamic scheduling algorithm for
bag-of-task workflows with stochastic task execution times in clouds. Future
Gener. Comput. Syst. 71, 57–72.
Chethana, R., Neelakantappa, B.B., Ramesh, B., 2016. Survey on adaptive task
assignment in heterogeneous Hadoop cluster. IEAE Int. J. Eng. 1 (1).
Chien, N.K., Son, N.H., HD, 2016. Load-balancing algorithm Based on Estimating Finish
Time of Services in Cloud Computing, International Conference on Advanced
Communication Technology (ICACT), 228–233.
Cinque, M., Corradi, A., Luca Foschini, L., Frattini, F., Mol, J.P., 2016. Scalable
Monitoring and Dependable Job Scheduling Support for Multi-domain Grid
Infrastructures. In: Proceedings of the 31st Annual ACM Symposium on Applied
Computing.
Dagli, M.K., Mehta, B.B., 2014. Big data and Hadoop: a review. Int. J. Appl. Res. Eng. Sci.
2 (2), 192.
Daraghmi, E.Y., Yuan, S.M., 2015. A small world based overlay network for improving
dynamic load-balancing. J. Syst. Softw. 107, 187–203.
Dasgupta, K., Mandalb, B., Duttac, P., Mondald, J.K., Dame, S., 2013. A Genetic
Algorithm (GA) based Load-balancing strategy for Cloud Computing, International
Conference on Computational Intelligence: Modeling Techniques and Applications
(CIMTA), 10, 340–347.
Destanoğlu, O., Sevilgen, F.E., 2008. Randomized Hydrodynamic Load Balancing
Approach, IEEE International Conference on Parallel Processing, 1, 196–203.
Deye, M.M., Slimani, Y., Sene, M., 2013. Load Balancing approach for QoS management
of multi-instance applications in Clouds. Proceeding on International Conference on
Cloud Computing and Big Data, 119–126.
Domanal, S.G., Reddy, G.R.M., 2015. Load Balancing in Cloud Environment using a
Novel Hybrid Scheduling Algorithm. IEEE International Conference on Cloud
Computing in Emerging Markets, 37–42.
Doulkeridis, C., Nørvåg, K., 2013. A survey of large-scale analytical query processing in
MapReduce. VLDB J., 1–26.
Dsouza, M.B., 2015. A survey of Hadoop MapReduce scheduling algorithms. Int. J.
Innov. Res. Comput. Commun. Eng. 3 (7).
Fadika, Z., Dede, E., Govidaraju, M., 2011. Benchmarking MapReduce Implementations
for Application Usage Scenarios. In: 2011 IEEE/ACM Proceedings of the 12th
International Conference on Grid Computing, 0, 90–97.
Farrag, A.A.S., Mahmoud, S.A., 2015. Intelligent Cloud Algorithms for Load Balancing
problems: A Survey. IEEE In: Proceedings of the Seventh International Conference
on Intelligent Computing and Information Systems (ICICIS '15), 210–216.
Gautam, J.V., Prajapati, H.B., Dabhi, V.K., Chaudhary, S., 2015. A Survey on Job
Scheduling Algorithms in Big Data Processing. IEEE International Conference on
Electrical, Computer and Communication Technologies (ICECCT’15), 1–11.
Ghoneem, M., Kulkarni, L., 2016. An Adaptive MapReduce Scheduler for Scalable
Heterogeneous Systems. Proceeding of the International Conference on Data
Engineering and Communication Technology, 603–6011.
Ghosh, S., Banerjee, C., 2016. Priority Based Modified Throttled Algorithm in Cloud
Computing. International Conference on Inventive Computation Technology.
Goyal, S., Verma, M.K., 2016. Load balancing techniques in cloud computing
environment: a review. Int. J. Adv. Res. Comput. Sci. Softw. Eng. 6 (4).
Gupta, H., Sahu, K., 2014. Honey bee behavior based load balancing of tasks in cloud
computing. Int. J. Sci. Res. 3 (6).
Gutierrez-Garcia, J.O., Ramirez-Nafarrate, A., 2015. Agent-based load balancing in
Cloud data centers. Clust. Comput. 18 (3), 1041–1062.
Hefny, H.A., Khafagy, M.H., Ahmed, M.W., 2014. Comparative study load balance
algorithms for MapReduce environment. Int. Appl. Inf. Syst. 106 (18), 41.
Hou, X., Kumar, A., Varadharajan, V., 2014. Dynamic Workload Balancing for Hadoop
MapReduce. Proceeding of International Conference on Big data and Cloud
Computing, 56–62.
Hsueh, S.C., Lin, M.Y., Chiu, Y.C., 2014. A load-balanced MapReduce algorithm for
blocking-based entity-resolution with multiple keys. Parallel Distrib. Comput.
(AusPDC), 3.
Hwang, K., Dongarra, J., Fox, G.C., 2013. Distributed and Cloud Computing: from
Parallel Processing to the Internet of Things.
Ivanisenko, I.N., Radivilova, T.A., 2015. Survey of Major Load-balancing algorithms in
Distributed System. Information Technologies in Innovation Business Conference
(ITIB).
Jadeja, Y., Modi, K., 2012. Cloud Computing - Concepts, Architecture and Challenges.
International Conference on Computing, Electronics and Electrical Technologies
[ICCEET].
Jaikar, A., Dada, H., Kim, G.R., Noh, S.Y., 2014. Priority-based Virtual Machine Load
Balancing in a Scientific Federated Cloud. IEEE In: Proceedings of the 3rd
International Conference on Cloud Computing.
Kabir, M.S., Kabir, K.M., Islam, R., 2015. Process of load balancing in cloud computing
using genetic algorithm. Electr. Comput. Eng.: Int. J. 4 (2).
Kanakala, V.R.T., Reddy, V.K., 2015a. Performance analysis of load balancing techniques
in cloud computing environment. TELKOMNIKA Indones. J. Electr. Eng. 13 (3),
568–573.
Kanakala, V.R.T., Reddy, V.K., 2015b. Performance analysis of load balancing techniques
in cloud computing environment. TELKOMNIKA Indones. J. Electr. Eng. 13 (3),
568–573.
Kansal, N.J., Inderveer Chana, I., 2012. Cloud load balancing techniques: a step towards
green computing. Int. J. Comput. Sci. Issues 9 (1), 238–246.
Kaur, R., Luthra, P., 2014. Load Balancing in Cloud Computing, International
Conference on Recent Trends in Information, Telecommunication and Computing,
ITC, 1–8.
Kc, K., Anyanwu, K., 2010. Scheduling Hadoop Jobs to Meet Deadlines. In: Proceedings
of the 2nd IEEE International Conference on Cloud Computing Technology and
Science (CloudCom), 388–392.
Keshvadi, S., Faghih, B., 2016. A multi-agent based load balancing system in IaaS cloud
environment. Int. Robot. Autom. J. 1 (1).
Khalil, S., Salem, S.A., Nassar, S., Saad, E.M., 2013. Mapreduce performance in
heterogeneous environments: a review. Int. J. Sci. Eng. Res. 4 (4), 410–416.
Khiyaita, A., Zbakh, M., Bakkali, H.E.I., Kettani, D.E.I., 2012. Load balancing cloud
computing: state of art. Netw. Secur. Syst. (JNS2), 106–109.
Kianpisheh, S., Charkari, N.M., Kargahi, M., 2016. Ant colony based constrained
workflow scheduling for heterogeneous computing systems. Clust. Comput. 19,
1053–1070.
Kliazovich, D., Pecero, J.E., Tchernykh, A., Bouvry, P., Khan, S.U., Zomaya, A.Y., 2016.
CA-DAG: modeling communication-aware applications for scheduling in cloud
computing. J. Grid Comput., 1–17.
Kolb, L., Thor, A., Rahm, E., 2011. Block-based Load Balancing for Entity Resolution
with MapReduce. International Conference on Information and Knowledge
Management (CIKM), 2397–2400.
Kolb, L., Thor, A., Rahm, E., 2012. Load Balancing for MapReduce-based Entity
Resolution, IEEE In: Proceedings of the 28th International Conference on Data
Engineering, 618–629.
Komarasamy, D., Muthuswamy, V., 2016. A novel approach for dynamic load balancing
with effective Bin packing and VM reconfiguration in cloud. Indian J. Sci. Technol. 9
(11), 1–6.
Koomey, J.G., 2008. Worldwide electricity used in datacenters. Environ. Res. Lett. 3 (3),
034008.
Kulkarni, A.K., B, A, 2015. Load-balancing strategy for Optimal Peak Hour Performance
in Cloud Datacenters. In: Proceedings of the IEEE International Conference on Signal
Processing, Informatics, Communication and Energy Systems (SPICES).
Kumar, S., Rana, D.H., 2015. Various dynamic load-balancing algorithms in cloud
environment: a survey. Int. J. Comput. Appl. 129 (6).
Lee, K.H., Choi, H., Moon, B., 2011. Parallel data processing with MapReduce: a survey.
SIGMOD Rec. 40 (4), 11–20.
Li, R., Hu, H., Li, H., Wu, Y., Yang, J., 2015. MapReduce parallel programming model: a
state-of-the-art survey. Int. J. Parallel Program., 1–35.
Lin, C.Y., Lin, Y.C., 2015. A Load-Balancing Algorithm for Hadoop Distributed File
System, International Conference on Network-Based Information Systems.
Lua, Y., Xie, Q., Klito, G., Geller, A., Larus, J.R., Greenberg, A., 2011. Join-Idle-Queue: a
novel load-balancing algorithm for dynamically scalable web services. Int. J.
Perform. Eval. 68, 1056–1071.
Malladi, R.R., 2015. An approach to load balancing in cloud computing. Int. J. Innov.
Res. Sci. Eng. Technol. 4 (5), 3769–3777.
Manjaly, J.S., A, CE, 2013. Relative study on task schedulers in Hadoop MapReduce. Int.
J. Adv. Res. Comput. Sci. Softw. Eng. 3 (5).
Mesbahi, M., Rahmani, A.M., 2016. Load balancing in cloud computing: a state of the art
survey. Int. J. Mod. Educ. Comput. Sci. 8 (3), 64.
Milani, A.S., Navimipour, N.J., 2016. Load balancing mechanisms and techniques in the
cloud environments: systematic literature review and future trends. J. Netw.
Comput. Appl. 71, 86–98.
Mishra, N.K., Misha, N., 2015. Load balancing techniques: need, objectives and major
challenges in cloud computing: a systematic review. Int. J. Comput. 131 (18).
Moschakisa, I.A., Karatzaa, H.D., 2015. Multi-criteria scheduling of Bag-of-Tasks
applications on heterogeneous interlinked clouds with simulated annealing. J. Syst.
Softw. 101, 1–14.
Mukhopadhyay, R., Ghosh, D., Mukherjee, N., 2010. A Study on the Application of
Existing Load-balancing Algorithms for Large, Dynamic, and Heterogeneous Distributed
Systems. In: Proceedings of the 9th International Conference on Software
Engineering, Parallel and Distributed Systems, ACM, 238–243.
Neeraj, R., Chana, I., 2014. Load balancing and job migration techniques in grid: a survey
of recent trends. Wirel. Personal. Commun. 79 (3), 2089–2125.
Nishant, K., Sharma, P., Krishna, V., Gupta, C., Singh, K.P., Nitin, N., Rastogi, R., 2012.
Load Balancing of Nodes in Cloud Using Ant Colony Optimization. In: Proceedings of
the 14th International Conference on Modelling and Simulation, 3–8.
Nuaimi, K., Mohamed, N., Mariam Al-Nuaimi, M., Al-Jaroodi, J., 2012. A Survey of Load
Balancing in Cloud Computing: Challenges and Algorithms, IEEE In: Proceedings of
the Second Symposium on Network Cloud Computing and Applications.
Palta, R., Jeet, R., 2014. Load balancing in the cloud computing using virtual machine
migration: a review. Int. J. Appl. Innov. Eng. Manag. 3 (5), 437–441.
Patel, H.M., 2015. A comparative analysis of MapReduce scheduling algorithms for
Hadoop. Int. J. Innov. Emerg. Res. Eng. 2 (2).
Polato, I., Re, R., Goldman, A., Kon, F., 2014. A comprehensive view of Hadoop research
– a systematic literature review. J. Netw. Comput. Appl. 46, 1–25.
Rajabioun, R., 2011. Cuckoo optimization algorithm. Appl. Soft Comput. 11, 5508–5518.
Randles, M., Lamb, D., Tareb-Bendia, A., 2010. A Comparative Study into Distributed
Load-balancing algorithms for Cloud Computing, IEEE In: Proceedings of the 24th
International Conference on Advanced Information Networking and Applications
Workshops, 551–556.
Rao, B.T., Reddy, L.S.S., 2011. Survey on improved scheduling in Hadoop MapReduce in
cloud environments. Int. J. Comput. Appl. 34 (9).
Rastogi, G., Sushil, R., 2015. Analytical Literature Survey on Existing Load Balancing
Schemes in Cloud Computing. International Conference on Green Computing and
Internet of Things (ICGCIoT).
Rathore, N., Channa, I., 2011. A Cognitive Analysis of Load Balancing and job migration
Technique in Grid. World Congress on Information and Communication
Technologies (WICT), 77–82.
Rathore, N., Chana, I., 2013. A Sender Initiate Based Hierarchical Load Balancing
Technique for Grid Using Variable Threshold Value. Signal Processing, Computing
and Control (ISPCC), IEEE International Conference.
Ray, S., Sarkar, A.D., 2012. Execution analysis of load-balancing algorithms in cloud
computing environment. Int. J. Cloud Comput.: Serv. Archit. (IJCCSA) 2 (5).
Sarood, O., Gupta, A., Kale, L.V., 2012. Cloud Friendly Load Balancing for HPC
Applications: Preliminary Work. International Conference on Parallel Processing
Workshops, 200–205.
Scharf, M., Stein, M., Voith, T., Hilt, V., 2015. Network-aware Instance Scheduling in
OpenStack. International Conference on Computer Communication and Network
(ICCCN), 1–6.
Selvi, R.T., Aruna, R., 2016. Longest approximate time to end scheduling algorithm in
Hadoop environment. Int. J. Adv. Res. Manag. Archit. Technol. Eng. 2 (6).
Shadkam, E., Bijari, M., 2014. Evaluation the efficiency of cuckoo optimization
algorithm. Int. J. Comput. Sci. Appl. 4 (2), 39–47.
Shaikh, B., Shinde, K., Borde, S., 2017. Challenges of big data processing and scheduling
of processes using various Hadoop Schedulers: a survey. Int. Multifaceted Multiling.
Stud. 3, 12.
Shen, H., Sarker, A., Yuy, L., Feng Deng, F., 2016. Probabilistic Network-Aware Task
Placement for MapReduce Scheduling. In: Proceedings of the IEEE International
Conference on Cluster Computing.
Shen, H., Yu, L., Chen, L., Li, Z., 2016. Goodbye to Fixed Bandwidth Reservation: Job
Scheduling with Elastic Bandwidth Reservation in Clouds. In: Proceedings of the
International Conference on Cloud Computing Technology and Science.
Sidhu, A.K., Kinger, S., 2013. Analysis of load balancing techniques in cloud computing.
Int. J. Comput. Technol. 4 (2).
Sim, K.M., 2011. Agent-based cloud computing. IEEE Trans. Serv. Comput. 5 (4),
564–577.
Singh, P., Baaga, P., Gupta, S., 2016. Assorted load-balancing algorithms in cloud
computing: a survey. Int. J. Comput. Appl. 143 (7).
Singha, A., Juneja, D., Malhotra, M., 2015. Autonomous Agent Based Load-balancing
algorithm in Cloud Computing. International Conference on Advanced Computing
Technologies and Applications (ICACTA), 45, 832–841.
Sui, Z., Pallickara, S., 2011. A survey of load balancing techniques for Data intensive
computing. In: Furht, Borko, Escalante, Armando (Eds.), Handbook of Data
Intensive Computing. Springer, New York, 157–168.
Tasquier, L., 2015. Agent based load-balancer for multi-cloud environments. Columbia
Int. Publ. J. Cloud Comput. Res. 1 (1), 35–49.
Vaidya, M., 2012. Parallel processing of cluster by Map Reduce. Int. J. Distrib. Parallel
Syst. 3 (1).
Valvåg, S.V., 2011. Cogset: A High-Performance MapReduce Engine. Faculty of Science
and Technology Department of Computer Science, University of Tromsö, 14.
Valvåg, S.V., Johansen, D., 2009. Cogset: A unified engine for reliable storage and
parallel processing, In: Proceedings of the Sixth IFIP International Conference on
Network and Parallel Computing, 174–181.
Vasic, N., Barisits, M., Salzgeber, V., 2009. Making Cluster Applications Energy-Aware.
In: ACDC ’09: Proceedings of the 1st Workshop on Automated Control for
Datacenters and Clouds, ACM, New York, NY, USA, 37–42.
Vernica, R., Balmin, A., Beyer, K.S., Ercegovac, V., 2012. Adaptive MapReduce using
situation-aware mappers. International Conference on Extending Database
Technology (EDBT), 420–431.
Wei, X., Fan, J., Lu, Z., Ding, K., 2013. Application scheduling in mobile cloud
computing with load balancing. J. Appl. Math., 1–13.
Wei, X., Fan, J., Wang, T., Wang, Q., 2015. Efficient application scheduling in mobile
cloud computing based on MAX–MIN ant system. Soft Comput., 1–15.
Xia, Y., Wang, L., Zhao, Q., Zhang, G., 2011. Research on job scheduling algorithm in
Hadoop. J. Comput. Inf. Syst. 7, 5769–5775.
Yahaya, B., Latip, R., Othman, M., Abdullah, A., 2011. Dynamic load balancing policy
with communication and computation elements in grid computing with multi-agent
system integration. Int. J. New Comput. Archit. Appl. (IJNCAA) 1 (3), 757–765.
Yakhchi, M., Ghafari, S.M., Yakhchi, S., Fazeliy, M., Patooghi, A., 2015. Proposing a Load
Balancing Method Based on Cuckoo Optimization Algorithm for Energy
Management in Cloud Computing Infrastructures. In: Proceedings of the
6th International Conference on Modeling, Simulation, and Applied Optimization
(ICMSAO).
Yang, S.J., Chen, Y.R., 2015. Design adaptive task allocation scheduler to improve
MapReduce performance in heterogeneous clouds. J. Netw. Comput. Appl. 57,
61–70.
Zaharia, M., 2009. Job Scheduling with the Fair and Capacity Schedulers 9. Berkeley
University.
Zaharia, M., Borthakur, D., Sarma, J.S., 2010. Delay Scheduling: A Simple Technique for
Achieving Locality and Fairness in Cluster Scheduling. In: Proceedings of the
European Conference on Computer Systems (EuroSys'10), 265–278.
Zaharia, M., Konwinski, A., Joseph, A.D., Katz, R., Stoica, I., 2008. Improving
MapReduce Performance in Heterogeneous Environments. In: Proceedings of the
8th Conference on Symposium on Operating Systems Design and Implementation,
29–42.
Zhang, Y., Li, Y., 2015. An improved Adaptive workflow scheduling Algorithm in cloud
environments. In: Proceedings of the Third International Conference on Advanced
Cloud and Big Data, 112–116.