
MODULE IV

• ANEKA
• DATA INTENSIVE COMPUTING
Topic 1:
ANATOMY OF THE ANEKA CONTAINER / SERVICES
PROVIDED BY ANEKA / FRAMEWORK (15 MARK)
• Aneka is a cloud application development and management platform
developed by Manjrasoft, an Australian company.
• It provides a framework for building, deploying, and managing
applications on private or public cloud infrastructures.
• Aneka is a pure PaaS (Platform as a Service) solution for cloud
computing.
• One of Aneka's key advantages is its extensible set of APIs
associated with different types of programming models.
Services installed in the Aneka container
The services installed in the Aneka container can be classified into
three major categories:

1. fabric services
2. foundation services
3. application services
• Foundation Services
1. Storage Management
Aneka offers two different facilities for storage management:
• Centralized file storage, where all files are stored on a single
server or storage system; this is mostly used for the execution of
compute-intensive applications.
- Compute-intensive applications mostly require powerful processors
and do not have high demands in terms of storage, which in many
cases is used to store small files that are easily transferred
from one node to another.
• A distributed file system, where files are stored across multiple
servers or locations; this is more suitable for the execution of
data-intensive applications.
- Data-intensive applications are characterized by large data files
(gigabytes or terabytes), and for these the processing power
required by tasks does not constitute the performance bottleneck.
2. Accounting, Billing, and Resource Pricing
- A complete history of application execution and storage, as well as
other resource utilization parameters, is captured and mined by the
accounting services. This information constitutes the foundation on
top of which users are charged in Aneka.
- Billing is another important feature of accounting. The Aneka billing
service provides detailed information about the resource usage of
each user with the associated costs.
- Each resource can be priced differently according to the set of
services that are available on the corresponding Aneka container or
the software installed on the node, as in the pricing sketch below.
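The charging rule above is essentially an accumulation of rate × usage per resource. A minimal Python sketch, with illustrative resource names and rates only (not Aneka's actual accounting API):

```python
# Minimal sketch of per-resource pricing. Each resource carries its own
# hourly rate, reflecting the services/software installed on it; a user's
# bill sums rate * hours over all captured usage records.

usage_records = [
    # (user, resource, hours_used) -- hypothetical data
    ("alice", "node-gpu-01", 3.0),
    ("alice", "node-std-02", 10.0),
    ("bob",   "node-gpu-01", 1.5),
]

hourly_rate = {"node-gpu-01": 0.90, "node-std-02": 0.10}  # priced differently

def bill(user: str) -> float:
    """Total cost for one user across all captured usage records."""
    return sum(hours * hourly_rate[res]
               for u, res, hours in usage_records if u == user)

print(bill("alice"))  # 3.0*0.90 + 10.0*0.10, roughly 3.70
```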
3. Resource Reservation
• Resource reservation allows reserving resources for exclusive use by
specific applications.
• Resource reservation is built out of two different kinds of services:
the Resource Reservation Service and the Allocation Service. The former
keeps track of all the reserved time slots in the Aneka Cloud, while
the latter manages the database of information regarding the allocated
slots on the local node (see the sketch below).
• 3 types of reservation: (a) Basic Reservation (b) Libra Reservation
(c) Relay Reservation
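A minimal sketch of how these two services could cooperate, assuming hypothetical class names and integer time slots for simplicity (these are not the real Aneka classes):

```python
from dataclasses import dataclass

@dataclass
class Slot:
    node: str
    start: int   # e.g., hour of day; kept to integers for the sketch
    end: int

class AllocationService:
    """Per-node service: manages the slots allocated on the local node."""
    def __init__(self, node):
        self.node, self.slots = node, []

    def is_free(self, start, end):
        # Free when the requested window overlaps no existing slot.
        return all(end <= s.start or start >= s.end for s in self.slots)

    def allocate(self, start, end):
        self.slots.append(Slot(self.node, start, end))

class ReservationService:
    """Master-side service: global view of all reserved slots in the Cloud."""
    def __init__(self, allocators):
        self.allocators = allocators   # one AllocationService per node

    def reserve(self, start, end):
        for a in self.allocators:
            if a.is_free(start, end):
                a.allocate(start, end)
                return Slot(a.node, start, end)
        return None  # no node can honor the reservation

rs = ReservationService([AllocationService("n1"), AllocationService("n2")])
print(rs.reserve(9, 11))   # Slot(node='n1', start=9, end=11)
print(rs.reserve(10, 12))  # n1 is busy, so this falls through to n2
```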
• Application Services
1. Scheduling
Common tasks that are performed by the scheduling component are
the following (a toy sketch follows this list):
● Job-to-node (virtual machine) mapping
● Rescheduling of failed jobs
● Job status monitoring
● Application status monitoring
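A toy sketch of the first two duties, job-to-node mapping and rescheduling of failed jobs. The data structures, the round-robin policy, and the random failure stub are illustrative assumptions, not Aneka's actual scheduling algorithm:

```python
import random
from collections import deque

def run_on(job, node):
    """Stub for remote execution; fails randomly to exercise rescheduling."""
    return random.random() > 0.2

def schedule(jobs, nodes, max_retries=3):
    queue = deque((job, 0) for job in jobs)
    completed, failed, turn = {}, [], 0
    while queue:
        job, tries = queue.popleft()
        node = nodes[turn % len(nodes)]        # round-robin job-to-node mapping
        turn += 1
        if run_on(job, node):
            completed[job] = node              # job status: completed
        elif tries + 1 < max_retries:
            queue.append((job, tries + 1))     # reschedule the failed job
        else:
            failed.append(job)                 # give up after max_retries
    return completed, failed

print(schedule(["j1", "j2", "j3"], ["node-a", "node-b"]))
```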
2. Execution
• Execution services control the execution of the single jobs that
compose applications.
• They are in charge of setting up the runtime environment
hosting the execution of jobs. Common tasks include:
• Unpacking the jobs received from the scheduler
• Retrieval of input files required for the job execution
• Sandboxed execution of jobs
• Submission of output files at the end of execution
• Execution failure management (i.e., capturing sufficient
contextual information useful to identify the nature of the
failure)
• Performance monitoring
• Fabric Services

1. Profiling and Monitoring
❖ Heartbeat Service
✅ Sends periodic heartbeat signals from worker nodes to the Aneka master to
indicate that they are active.
✅ If a node stops sending heartbeats, it is marked as inactive or failed,
triggering task reallocation (see the sketch below).
✅ Ensures fault detection and maintains system reliability by proactively
identifying failures.
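A minimal single-process sketch of heartbeat-based failure detection; the function names and the TIMEOUT value are assumptions for illustration, and the real service runs distributed across nodes:

```python
import time

TIMEOUT = 3.0                      # seconds without a beat => node failed
last_beat = {}                     # node -> timestamp of last heartbeat

def heartbeat(node):
    """Called periodically by each worker to signal it is alive."""
    last_beat[node] = time.monotonic()

def failed_nodes():
    """Master-side check: nodes whose last beat is older than TIMEOUT."""
    now = time.monotonic()
    return [n for n, t in last_beat.items() if now - t > TIMEOUT]

heartbeat("worker-1")
heartbeat("worker-2")
time.sleep(0.1)
print(failed_nodes())   # [] -- both workers still within the timeout window
```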
❖ Monitoring Service
✅ Continuously tracks the status and performance of worker nodes and
resources.
✅ Detects failures and resource underutilization to optimize execution.
✅ Helps in decision-making for workload balancing.
❖ Reporting Service
✅ Collects execution logs, resource usage data, and application performance
metrics.
✅ Generates detailed reports for performance analysis, debugging, and
optimization.
2. Resource Management
❖ Resource Membership (Index Service or Membership Catalogue)
✅ Maintains a catalogue of available resources in the Aneka cloud
environment.
✅ The Index Service keeps an updated list of all active, idle, and busy
resources.
❖ Resource Reservation
✅ Allows users or applications to pre-book specific computing resources for
dedicated use.
✅ Ensures guaranteed availability of resources for critical tasks or
high-priority jobs.
❖ Resource Provisioning
✅ Dynamically allocates and deallocates resources based on workload demand.
✅ Enables elastic scaling, where resources are added when demand is high and
released when demand decreases, as in the provisioning sketch below.
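A toy threshold rule illustrating elastic provisioning; the policy and its parameters (jobs_per_node, node limits) are assumptions, not Aneka's actual provisioning logic:

```python
def rescale(queued_jobs, jobs_per_node=4, min_nodes=1, max_nodes=10):
    """Return the node count to provision for the current queue length."""
    needed = -(-queued_jobs // jobs_per_node)      # ceiling division
    return max(min_nodes, min(max_nodes, needed))  # clamp to the pool limits

print(rescale(queued_jobs=17))  # 5: scale out while demand is high
print(rescale(queued_jobs=2))   # 1: release nodes when demand decreases
```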
Platform Abstraction Layer (PAL)
• The PAL (Platform Abstraction Layer) in Aneka is a software layer that
hides the heterogeneity of the underlying operating system and hardware
from the rest of the framework.
• PAL provides a uniform interface through which the container and its
services interact with the hosting platform, for example to collect
performance information such as CPU and memory usage.
• By abstracting these platform-specific details, PAL lets the same
container code run unchanged across different operating systems.
• It helps developers focus on application logic while Aneka manages the
cloud execution environment.
• On top of this portable runtime, Aneka supports multiple programming
models, allowing developers to use different execution models such as
Task, Thread, and MapReduce.
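Aneka's real PAL is part of its .NET runtime; the Python class below is only a sketch of the idea, one uniform call that hides OS-specific details from the rest of the framework:

```python
import os
import platform
import socket

class PlatformAbstraction:
    """Illustrative stand-in for a platform abstraction layer."""

    def host_info(self):
        """Uniform view of the hosting platform, whatever the OS is."""
        return {
            "hostname": socket.gethostname(),
            "os": platform.system(),      # 'Linux', 'Windows', 'Darwin', ...
            "arch": platform.machine(),
            "cpus": os.cpu_count(),
        }

# Callers never branch on the OS themselves; they just ask the PAL.
print(PlatformAbstraction().host_info())
```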
Aneka container
• Aneka containers are the execution environments, installed on each
node, that host Aneka services and run cloud applications.
• Task Isolation – Containers isolate tasks, ensuring they don’t interfere
with each other’s execution.
• Resource Management – They manage resource allocation, ensuring
tasks are distributed efficiently across available nodes.
• Scalability – Containers allow dynamic scaling of applications based
on demand.
• Fault Tolerance – They help ensure reliability by automatically
recovering from failures.
Topic 2: CLOUD PROGRAMMING AND MANAGEMENT

a. Aneka SDK
• The Aneka SDK (Software Development Kit) is a Platform-as-a-Service
(PaaS) framework for developing and deploying parallel and distributed
applications on cloud infrastructure.
• Multiple Programming Models: It supports Task Model, Thread Model, and
MapReduce Model, allowing developers to create applications based on
different computational needs.
• Dynamic Resource Management: Aneka enables scalable and flexible resource
provisioning, utilizing public, private, and hybrid cloud environments efficiently.
• The Aneka SDK provides support for both programming models and services by
means of the Application Model and the Service Model.
a.1. Service Model
The Aneka Service Model defines the basic requirements needed to
implement a service that can be hosted in the Aneka Cloud.
The container defines the runtime environment where services are
hosted.
Each service that is hosted in the container must be compliant with the
IService interface, which exposes the following methods and properties
(a sketch of this contract follows):
● Name and status
● Control operations such as the Start, Stop, Pause, and Continue methods
● Message handling by means of the HandleMessage method
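A Python rendering of the contract just described. Aneka's actual IService interface is defined in .NET; this sketch merely mirrors its shape (name/status properties, lifecycle control operations, and message handling):

```python
from abc import ABC, abstractmethod

class IService(ABC):
    """Shape of the container's service contract (illustrative only)."""

    @property
    @abstractmethod
    def name(self) -> str: ...

    @property
    @abstractmethod
    def status(self) -> str: ...

    # Control operations
    @abstractmethod
    def start(self): ...
    @abstractmethod
    def stop(self): ...
    @abstractmethod
    def pause(self): ...
    @abstractmethod
    def resume(self): ...  # 'Continue' in the .NET interface; renamed here
                           # because 'continue' is a Python keyword

    # Message handling
    @abstractmethod
    def handle_message(self, message): ...
```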
a.2. Application Model
• The Aneka SDK provides a flexible application model that enables
developers to design and execute parallel and distributed
applications on cloud infrastructure.
Workflow in the Aneka SDK Application Model (sketched after this list):
• Application Submission: The application is designed using Aneka APIs
and submitted for execution.
• Job Scheduling: Aneka's resource manager schedules tasks or threads
to available computing resources.
• Task Execution: Tasks are executed across distributed cloud resources.
• Result Collection: The results are collected and merged after
execution.
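A pseudocode sketch of these four steps. All names here are hypothetical; Aneka's real client API is .NET-based and distributes execution across cloud nodes rather than running tasks in a local loop:

```python
class Task:
    """Hypothetical unit of work, standing in for an Aneka task."""
    def __init__(self, payload):
        self.payload = payload

    def execute(self):
        return self.payload.upper()        # stand-in for real computation

def run_application(payloads):
    tasks = [Task(p) for p in payloads]    # 1. application submission
    # 2.-3. In Aneka, the scheduler maps each task to a node and the
    # execution services run it remotely; here we just run them in turn.
    results = [t.execute() for t in tasks]
    return results                         # 4. result collection

print(run_application(["alpha", "beta"]))  # ['ALPHA', 'BETA']
```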
b. Management Tools
1. Infrastructure Management (IaaS)
2. Platform Management (PaaS)
3. Application Management (SaaS)
Topic 3: BUILDING ANEKA CLOUDS

• Aneka is primarily a platform for developing distributed applications
for Clouds.
• As a software platform, it requires infrastructure to be deployed on,
which needs to be managed.
• Infrastructure management tools are specifically designed for this
task, and building Clouds is one of the primary tasks of
administrators.
• Different deployment models for Public, Private and Hybrid Clouds
are supported.
Logical Organization(15M)
The logical organization of Aneka Clouds can be very diverse, since it strongly depends on the
configuration selected for each of the container instances belonging to the Cloud. The most
common scenario is using a master-worker configuration with separated nodes for storage as
shown in Fig. 5.4.
A common configuration of the master node is the following:
● Index service (master copy)
● Heartbeat service
● Logging service
● Reservation service
● Resource provisioning service
● Accounting service
● Reporting and monitoring service
● Scheduling services for the supported programming models
Private Cloud Deployment Mode
Public Cloud Deployment Mode
Hybrid Cloud Deployment Mode
Topic 4: DATA-INTENSIVE COMPUTING (15 MARK)
• Data-intensive computing focuses on processing, analyzing, and
managing extremely large volumes of data, often reaching terabytes or
petabytes, using cloud infrastructure.
• Example: Platforms like Netflix analyze massive datasets of user
preferences to provide personalized recommendations.
• Key features:
1. Large-Scale Data Handling: Manages terabytes to petabytes of data.
2. Distributed and Parallel Processing: Tasks are divided and processed
across multiple servers simultaneously.
3. Scalability on Demand: Resources are scaled up or down based on
workload.
TECHNOLOGIES FOR DATA-INTENSIVE
COMPUTING
1. STORAGE SYSTEMS
2. PROGRAMMING PLATFORMS

1. STORAGE SYSTEMS:
Category 1: High-Performance Distributed File Systems and Storage Clouds
a) Lustre
• High-Speed Parallel File System: Lustre distributes data across multiple
servers to enable fast data access.
• Optimized for Large Workloads: Handles petabytes of data, making it
suitable for big data applications.
• Applications: Commonly used in supercomputing and scientific research
like genomics and climate modeling.
b) IBM General Parallel File System (GPFS)
• Scalable and Fault-Tolerant: Developed by IBM, it supports large-scale data
storage and ensures reliability by replicating data.
• Efficient Data Management: Enables quick access and sharing of large
datasets.
• Applications: Used in enterprise systems, artificial intelligence, and
analytics workloads.
c) Google File System (GFS)
• Distributed and High Throughput: Stores and processes massive datasets
with an emphasis on batch processing over low latency.
• Manages Large Files: Designed for handling terabytes of data efficiently.
• Applications: Powers Google’s services like search engines, Gmail, and
YouTube.
d) Sector
• Open-Source Distributed Storage: Sector is an affordable storage
solution designed for handling big data tasks.
• Optimized for Analytics: Provides high performance and is suitable
for data analytics and computational workloads.
• Applications: Used in academic research and small-scale businesses
for big data projects.
e) Amazon Simple Storage Service (S3)
(Refer to Module 5 notes: AWS)
Category 2: Not Only SQL (NoSQL) Systems

(a) Apache CouchDB and MongoDB.


• Apache CouchDB is an open-source NoSQL database designed to store and
manage large amounts of data in the form of documents.
• It follows a document-oriented model, where data is stored as JSON documents,
making it flexible and easy to work with for applications dealing with semi-
structured data.
• MongoDB is an open-source, document-oriented NoSQL database designed for
storing and managing large volumes of unstructured or semi-structured data.
• It stores data in a flexible, JSON-like format called BSON (Binary JSON), which
makes it ideal for applications that require rapid changes to the schema.
Notes:
1. NoSQL databases typically do not require a fixed schema. NoSQL (Not Only SQL) refers to a category of non-
relational databases designed to store, retrieve, and manage data that doesn't fit neatly into tables and
rows like in traditional relational databases (RDBMS).
2. JSON is text-based, while BSON is binary. JSON is a lightweight, text-based data interchange format that is
easy for humans to read and write. BSON is a binary-encoded serialization of JSON-like documents (see the
document sketch below).
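A short sketch of a schema-free document as CouchDB/MongoDB would present it. The field names are made up, and only the text-based JSON side is shown; MongoDB serializes the same structure to binary BSON internally:

```python
import json

# A document with nested and list-valued fields; no fixed schema is
# required, so a second document may carry different fields entirely.
doc = {"_id": "user:42", "name": "Asha", "tags": ["cloud", "nosql"],
       "visits": 17}

text = json.dumps(doc)            # text-based JSON, as applications see it
print(text)
print(json.loads(text)["tags"])   # ['cloud', 'nosql']
```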
(b) Amazon Dynamo
• NoSQL Key-Value Database: Amazon DynamoDB is a fully managed,
serverless NoSQL database designed for high-speed, low-latency
applications, storing data in a key-value format.
• Scalable and Reliable: It automatically scales to handle any amount of
traffic and ensures high availability with built-in replication across multiple
regions.
(c) Google Bigtable
• Google Bigtable is a distributed, scalable NoSQL database designed for
handling large amounts of structured data across many machines.
• It was developed by Google to handle high-volume, low-latency workloads
and is used internally by many of Google's services, such as Search, Gmail,
and Google Maps.
• Integration with Google Cloud: It integrates with other Google Cloud
products like Google Cloud Storage and Google Cloud Datastore for
enhanced functionality.
(d) Apache Cassandra
• Apache Cassandra is an open-source, distributed NoSQL database designed to
manage large amounts of data across multiple servers while ensuring high
availability and fault tolerance
• Cassandra was initially developed by Facebook and is now a top-level
Apache project.
• Currently, it provides storage support for several very large Web applications such
as Facebook itself, Digg, and Twitter.
(e) Hadoop Hbase
• HBase is the distributed database supporting the storage needs of the Hadoop
distributed programming platform(open-source framework for processing and
storing large datasets across distributed clusters of computers).
• HBase is designed by taking inspiration from Google Bigtable, and its main goal is
to offer real-time read/write operations for tables with billions of rows and
millions of columns.
2. Programming Platforms

2.1. The MapReduce Programming Model
• MapReduce expresses a computation as two functions: map, which reads the
input and emits intermediate (key, value) pairs, and reduce, which
aggregates all the values that share the same key.
• Working example (word count): for the input "cat dog cat", the map phase
emits (cat, 1), (dog, 1), (cat, 1); after grouping by key, the reduce
phase sums the counts and outputs (cat, 2), (dog, 1).
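The same word count written as plain, sequential Python to make the model concrete; a real MapReduce run distributes the map and reduce tasks across many nodes:

```python
from collections import defaultdict
from itertools import chain

def map_fn(line):
    """Map: emit an intermediate (word, 1) pair for every word."""
    return [(word, 1) for word in line.split()]

def reduce_fn(word, counts):
    """Reduce: aggregate all values that share the same key."""
    return word, sum(counts)

lines = ["cat dog cat"]

grouped = defaultdict(list)                    # shuffle: group pairs by key
for key, value in chain.from_iterable(map_fn(l) for l in lines):
    grouped[key].append(value)

print([reduce_fn(w, c) for w, c in grouped.items()])
# [('cat', 2), ('dog', 1)]
```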
2.2 Variations and Extensions of MapReduce
(a) Hadoop
• Hadoop is an open-source framework for distributed storage and processing of large datasets.
• It allows data to be stored across many machines and processed in parallel, making it highly
scalable and fault-tolerant.
• Hadoop was created by Doug Cutting and Mike Cafarella in 2005.
(b) Pig
• Pig is a high-level platform built on top of Hadoop used for analyzing large datasets. Pig was
developed by Yahoo!
• It provides a scripting language called Pig Latin, which is easier to use than writing raw
MapReduce code.
(c) Hive
• Hive is a data warehouse infrastructure built on top of Hadoop that provides a SQL-like query
language called HiveQL for querying and managing large datasets stored in HDFS (Hadoop
Distributed File System).
• Developed by Facebook in 2007.
