
1.1 System Models For Distributed and Cloud Computing

The document discusses various system models for distributed and cloud computing, including clusters of cooperative computers, grid computing infrastructures, peer-to-peer networks, and cloud computing over the Internet. It highlights the architecture, design issues, and challenges associated with these systems, emphasizing the importance of middleware and resource management. Additionally, it outlines the advantages of cloud computing, such as cost-effectiveness and scalability, while addressing security and reliability concerns.

Uploaded by

2k5preethi

SYSTEM MODELS FOR DISTRIBUTED AND CLOUD COMPUTING
• Clusters of Cooperative Computers
  • Cluster Architecture
  • Single-System Image
  • Hardware, Software, and Middleware Support
  • Major Cluster Design Issues
• Grid Computing Infrastructures
  • Computational Grids
  • Grid Families
• Peer-to-Peer Network Families
  • P2P Systems
  • Overlay Networks
  • P2P Application Families
  • P2P Computing Challenges
• Cloud Computing over the Internet
  • Internet Clouds
  • The Cloud Landscape
SYSTEM MODELS FOR DISTRIBUTED AND CLOUD COMPUTING

• Distributed and cloud computing systems are built over a large number of
autonomous computer nodes.

• These node machines are interconnected by SANs, LANs, or WANs in a hierarchical manner.

• With today’s networking technology, a few LAN switches can easily connect hundreds
of machines as a working cluster.

• A WAN can connect many local clusters to form a very large cluster of clusters.

• In this sense, one can build a massive system with millions of computers connected to
edge networks.
Clusters of Cooperative Computers

• Clusters are most popular in supercomputing applications.


• In 2009, 417 of the Top 500 supercomputers were built with cluster architecture.

• Clusters laid the necessary foundation for building large-scale grids and clouds.

• P2P networks appeal most to business applications.

• Many national grids built in the past decade were underutilized for lack of reliable middleware or well-coded
applications.

• Potential advantages of cloud computing include its low cost and simplicity for both providers and users.

• A computing cluster consists of interconnected stand-alone computers which work cooperatively as a single
integrated computing resource.

• In the past, clustered computer systems have demonstrated impressive results in handling heavy workloads with
large data sets.
Cluster Architecture
• Architecture of a typical server cluster built around a low-latency, high-bandwidth interconnection network.
• This network can be as simple as a SAN (e.g., Myrinet) or a LAN (e.g., Ethernet).

• To build a larger cluster with more nodes, the interconnection network can be
built with multiple levels of Gigabit Ethernet, Myrinet, or InfiniBand switches.

• Through hierarchical construction using a SAN, LAN, or WAN, one can build
scalable clusters with an increasing number of nodes.

• The cluster is connected to the Internet via a virtual private network (VPN)
gateway.

• The gateway IP address locates the cluster.

• Most clusters have loosely coupled node computers.

• All resources of a server node are managed by its own OS.
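
To make the hierarchical construction above concrete, here is a rough Python sketch of how node capacity grows with switch levels. It assumes a simple tree in which every non-root switch reserves one port as an uplink; the function name `max_nodes` and the 48-port figure are illustrative assumptions, not values from the slides, and real fabrics (fat trees, redundant uplinks) will differ.

```python
def max_nodes(ports_per_switch: int, levels: int) -> int:
    """Upper bound on attachable nodes in a simple switch tree.

    Assumption: the root switch uses all ports as downlinks, and every
    other switch gives up exactly one port for its uplink.
    """
    if ports_per_switch < 2 or levels < 1:
        raise ValueError("need at least 2 ports per switch and 1 level")
    if levels == 1:
        # A single switch: every port can hold a node.
        return ports_per_switch
    # Root fans out to `ports_per_switch` subtrees; each switch below it
    # fans out to `ports_per_switch - 1` children.
    return ports_per_switch * (ports_per_switch - 1) ** (levels - 1)


for levels in (1, 2, 3):
    print(levels, max_nodes(48, levels))
```

Under these assumptions, one hypothetical 48-port switch holds 48 nodes, two levels hold 2,256, and three levels hold 106,032.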


Single-System Image
• An ideal cluster should merge multiple system images into a single-system image (SSI).

• Cluster designers desire a cluster operating system or some middleware to support SSI at various levels, including the sharing of CPUs, memory, and I/O across all cluster nodes.

• An SSI is an illusion created by software or hardware that presents a collection of resources as one integrated, powerful resource.

• SSI makes the cluster appear like a single machine to the user.

• A cluster with multiple system images is nothing but a collection of independent computers.
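
The SSI idea can be illustrated with a minimal Python sketch. The `Node` record and `SingleSystemImage` facade below are hypothetical names introduced only for illustration; the sketch merely simulates the single-machine view by pooling per-node resources and hiding node selection, whereas a real SSI layer would live in the cluster OS or middleware.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    cpus: int        # free CPUs on this node
    memory_gb: int


class SingleSystemImage:
    """Presents several nodes as one pooled resource (illustrative only)."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    @property
    def cpus(self):
        # The user sees one CPU count for the whole cluster.
        return sum(n.cpus for n in self.nodes)

    @property
    def memory_gb(self):
        return sum(n.memory_gb for n in self.nodes)

    def place(self, cpus_needed):
        """Reserve CPUs on the first node that fits; the caller never names a node."""
        for n in self.nodes:
            if n.cpus >= cpus_needed:
                n.cpus -= cpus_needed
                return n.name
        raise RuntimeError("no single node has enough free CPUs")


ssi = SingleSystemImage([Node("n1", 4, 16), Node("n2", 8, 32)])
print(ssi.cpus, ssi.memory_gb)
```

The user of `ssi` asks for resources from "the machine"; which node actually serves the request is an internal detail, which is the illusion the bullets above describe.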
Hardware, Software, and Middleware Support

• Clusters exploiting massive parallelism are commonly known as MPPs.

• Almost all HPC clusters in the Top 500 list are also MPPs.

• The building blocks are computer nodes (PCs, workstations, servers, or SMPs), special communication
software such as PVM or MPI, and a network interface card in each computer node.

• Most clusters run under the Linux OS.

• The computer nodes are interconnected by a high-bandwidth network (such as Gigabit Ethernet,
Myrinet, InfiniBand, etc.).

• Special cluster middleware support is needed to create SSI or high availability (HA).

• Both sequential and parallel applications can run on the cluster.

• Using virtualization, one can build many virtual clusters dynamically, upon user demand.
Major Cluster Design Issues

• A cluster-wide OS for complete resource sharing is not available yet.

• Middleware or OS extensions were developed in user space to achieve SSI at selected functional levels.

• Without this middleware, cluster nodes cannot work together effectively to achieve cooperative computing.

• The software environments and applications must rely on the middleware to achieve high performance.
Grid Computing Infrastructures

• In the past 30 years, users have experienced a natural growth path from
Internet to web and grid computing services.

• Internet services such as the Telnet command enable a local computer to connect to a remote computer.

• A web service such as HTTP enables access to remote web pages.

• Grid computing is envisioned to allow close interaction among applications running on distant computers simultaneously.
Computational Grids
• Like an electric utility power grid, a computing grid offers an infrastructure that couples computers,
software/middleware, special instruments, people, and sensors together.

• The grid is often constructed across LAN, WAN, or Internet backbone networks at a regional, national, or global
scale.

• The computers used in a grid are primarily workstations, servers, clusters, and supercomputers.

• Personal computers, laptops, and PDAs can be used as access devices to a grid system.

• Figure 1.16 shows an example computational grid built over multiple resource sites owned by different
organizations.

• The resource sites offer complementary computing resources, including workstations, large servers, a mesh of processors, and Linux clusters to satisfy a chain of computational needs.

• The grid is built across various IP broadband networks including LANs and WANs already used by enterprises or organizations over the Internet.

• The grid is presented to users as an integrated resource pool as shown in the upper half of the figure.

• Special instruments may be involved, such as the radio telescope used in the SETI@Home search for life in the galaxy.

• At the server end, the grid is a network.

• At the client end, we see wired or wireless terminal devices.

• The grid integrates the computing, communication, contents, and transactions as rented services.

• Enterprises and consumers form the user base, which then defines the usage trends and service characteristics.
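
The "integrated resource pool" view can be sketched in a few lines of Python. The function `pick_site`, the site names, and the core counts are all hypothetical; the sketch assumes a broker that simply picks the site with the most free cores, which is only one of many possible scheduling policies and not a description of any real grid middleware.

```python
def pick_site(sites, cores_needed):
    """Return the name of the site with the most free cores that fits, or None.

    `sites` maps a site name to its number of free cores (assumed data model).
    """
    candidates = {name: free for name, free in sites.items() if free >= cores_needed}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


# Hypothetical resource sites owned by different organizations.
sites = {"univ_cluster": 64, "lab_servers": 16, "supercomputer": 512}
print(pick_site(sites, 100))
```

The user submits a request to the pool as a whole; the broker decides which organization's resources serve it.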
Grid Families
• Grid technology demands new distributed computing models, software/middleware
support, network protocols, and hardware infrastructures.

• National grid projects are followed by industrial grid platform development by IBM,
Microsoft, Sun, HP, Dell, Cisco, EMC, Platform Computing, and others.

• New grid service providers (GSPs) and new grid applications have emerged rapidly,
similar to the growth of Internet and web services in the past two decades.

• In Table 1.4, grid systems are classified in essentially two categories: computational or
data grids and P2P grids.

• Computing or data grids are built primarily at the national level.


Peer-to-Peer Network Families

• An example of a well-established distributed system is the client-server architecture.

• Client machines (PCs and workstations) are connected to a central server for compute, e-mail, file access, and database applications.

• The P2P architecture offers a distributed model of networked systems.

• First, a P2P network is client-oriented instead of server-oriented.


P2P Systems

• In a P2P system, every node acts as both a client and a server, providing
part of the system resources.

• Peer machines are simply client computers connected to the Internet.

• All client machines act autonomously to join or leave the system freely.

• This implies that no master-slave relationship exists among the peers.

• No central coordination or central database is needed.

• The system is self-organizing with distributed control.


• Figure 1.17 shows the architecture of a P2P network at two abstraction levels.

• Initially, the peers are totally unrelated.

• Each peer machine joins or leaves the P2P network voluntarily.

• Only the participating peers form the physical network at any time.

• Unlike the cluster or grid, a P2P network does not use a dedicated interconnection network.

• The physical network is simply an ad hoc network formed at various Internet domains randomly using the TCP/IP and NAI protocols.

• Thus, the physical network varies in size and topology dynamically due to the free membership in the P2P network.
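
As a minimal illustration of a node acting as both client and server with no central coordinator, consider the Python sketch below. The `Peer` class and its methods are hypothetical and run in a single process; a real P2P system would exchange these requests over the network.

```python
class Peer:
    """A peer that both serves files and fetches them from neighbors."""

    def __init__(self, peer_id):
        self.peer_id = peer_id
        self.files = {}      # content this peer can serve (server role)
        self.neighbors = []  # peers it knows about; there is no central directory

    def serve(self, name):
        # Server role: answer a request from another peer.
        return self.files.get(name)

    def fetch(self, name):
        # Client role: ask each known neighbor directly.
        for peer in self.neighbors:
            data = peer.serve(name)
            if data is not None:
                self.files[name] = data  # this peer can now serve the file too
                return data
        return None


a, b, c = Peer("a"), Peer("b"), Peer("c")
b.files["song"] = b"xyz"
a.neighbors = [b]
c.neighbors = [a]
a.fetch("song")           # a obtains the file from b
print(c.fetch("song"))    # c can now obtain it from a
```

Once a peer has fetched a file it becomes a source for others, so resources are contributed by the participants themselves rather than by a master node.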
Overlay Networks

• Data items or files are distributed in the participating peers.

• Based on communication or file-sharing needs, the peer IDs form an overlay network at the
logical level.

• This overlay is a virtual network formed by mapping each physical machine with its ID,
logically, through a virtual mapping as shown in Figure 1.17.

• When a new peer joins the system, its peer ID is added as a node in the overlay network.

• When an existing peer leaves the system, its peer ID is removed from the overlay network
automatically.

• Therefore, it is the P2P overlay network that characterizes the logical connectivity among the
peers.
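
The join and leave behavior of an overlay can be sketched as follows. This is only an illustration under stated assumptions: the `Overlay` class is hypothetical, peer IDs are derived by hashing a made-up address string, and logical neighbors are arranged in a simple ring by sorted ID. The slides do not prescribe any particular ID scheme or topology.

```python
import hashlib


class Overlay:
    """Logical network of peer IDs mapped onto physical addresses."""

    def __init__(self):
        self.members = {}  # peer ID -> physical address

    @staticmethod
    def peer_id(address):
        # Assumption: the ID is a truncated SHA-1 hash of the address.
        return hashlib.sha1(address.encode()).hexdigest()[:8]

    def join(self, address):
        pid = self.peer_id(address)
        self.members[pid] = address  # new peer becomes a node in the overlay
        return pid

    def leave(self, pid):
        self.members.pop(pid, None)  # departing peer is removed

    def ring(self):
        """Link each peer ID to the next ID in sorted order (assumed topology)."""
        ids = sorted(self.members)
        return {pid: ids[(i + 1) % len(ids)] for i, pid in enumerate(ids)}


overlay = Overlay()
p1 = overlay.join("10.0.0.1:6881")
p2 = overlay.join("192.168.1.5:6881")
print(overlay.ring())
overlay.leave(p1)
print(overlay.ring())
```

The logical links depend only on the set of current peer IDs, so the overlay reshapes itself whenever membership changes, independent of where the machines sit physically.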
P2P Application Families

• Based on application, P2P networks are classified into four groups, as shown in Table 1.5.

• The first family is for distributed file sharing of digital contents (music, videos, etc.) on the P2P
network.

• This includes many popular P2P networks such as Gnutella, Napster, and BitTorrent, among others.

• Collaboration P2P networks include MSN or Skype chatting, instant messaging, and collaborative
design, among others.

• The third family is for distributed P2P computing in specific applications. For example, SETI@home
provides 25 Tflops of distributed computing power, collectively, over 3 million Internet host machines.

• Other P2P platforms, such as JXTA, .NET, and FightAIDS@Home, support naming, discovery,
communication, security, and resource aggregation in some P2P applications.
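
A quick back-of-the-envelope check on the SETI@home figures quoted above shows what they imply for the average contribution per host. Only the two quoted numbers are used; the variable names are arbitrary.

```python
total_flops = 25e12   # 25 Tflops, as quoted above
hosts = 3_000_000     # 3 million Internet host machines, as quoted above

per_host_mflops = total_flops / hosts / 1e6
print(f"{per_host_mflops:.1f} Mflops per host on average")
```

This works out to roughly 8.3 Mflops per host, which illustrates that the aggregate power comes from the sheer number of participating machines rather than from any individual one.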
P2P Computing Challenges
• P2P computing faces three types of heterogeneity problems in hardware, software, and network
requirements.

• There are too many hardware models and architectures to select from; incompatibility exists between
software and the OS; and different network connections and protocols make it too complex to apply in real
applications.

• We need system scalability as the workload increases.

• System scaling is directly related to performance and bandwidth.

• Data location also affects collective performance.

• Data locality, network proximity, and interoperability are three design objectives in distributed P2P
applications.
• P2P performance is affected by routing efficiency and self-organization by participating
peers.

• Fault tolerance, failure management, and load balancing are other important issues in
using overlay networks.

• Lack of trust among peers poses another problem. Peers are strangers to one another.

• Security, privacy, and copyright violations are major concerns for industry when applying P2P technology
in business applications.

• In a P2P network, all clients provide resources including computing power, storage space,
and I/O bandwidth.

• The distributed nature of P2P networks also increases robustness, because limited peer
failures do not form a single point of failure.
• By replicating data in multiple peers, one can avoid losing data when individual nodes fail.

• On the other hand, disadvantages of P2P networks do exist. Because the system is not
centralized, managing it is difficult.

• In addition, the system lacks security. Anyone can log on to the system and cause
damage or abuse.

• Further, the client computers connected to a P2P network cannot all be assumed to be reliable or virus-free.

• In summary, P2P networks are reliable for a small number of peer nodes.

• They are only useful for applications that require a low level of security and have no
concern for data sensitivity.
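
The point about replicating data in multiple peers can be quantified with a small Python sketch. It assumes, purely for illustration, that peers fail independently with the same probability; the function names and the numbers are hypothetical, and correlated failures in a real network would make the picture less favorable.

```python
def loss_probability(p_fail, replicas):
    """Probability that every replica is lost, assuming independent failures."""
    return p_fail ** replicas


def replicas_needed(p_fail, target):
    """Smallest replica count whose loss probability is at most `target`."""
    if not (0 < p_fail < 1) or target <= 0:
        raise ValueError("need 0 < p_fail < 1 and target > 0")
    r = 1
    while loss_probability(p_fail, r) > target:
        r += 1
    return r


# Hypothetical numbers: each peer is offline half of the time.
print(loss_probability(0.5, 3))
print(replicas_needed(0.5, 0.01))
```

Even with very unreliable peers, a handful of replicas drives the chance of losing a data item down quickly, which is why limited peer failures need not become a single point of failure.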
Cloud Computing over the Internet

• Computational science is becoming data-intensive. Supercomputers must be balanced systems, not just CPU farms but also petascale I/O and networking arrays.

• In the future, working with large data sets will typically mean sending the
computations (programs) to the data, rather than copying the data to the
workstations.

• This reflects the trend in IT of moving computing and data from desktops to large
data centers, where there is on-demand provision of software, hardware, and data as
a service.

• This data explosion has promoted the idea of cloud computing.


• Cloud computing has been defined differently by many users and designers.

• For example, IBM, a major player in cloud computing, has defined it as follows: “A cloud is a
pool of virtualized computer resources.

• A cloud can host a variety of different workloads, including batch-style backend jobs and
interactive and user-facing applications.”

• Based on this definition, a cloud allows workloads to be deployed and scaled out quickly
through rapid provisioning of virtual or physical machines.

• The cloud supports redundant, self-recovering, highly scalable programming models that allow workloads to recover from many unavoidable hardware/software failures.

• Finally, the cloud system should be able to monitor resource use in real time to enable
rebalancing of allocations when needed.
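
The idea of monitoring resource use and rebalancing allocations can be sketched with a toy scaling rule in Python. The function `desired_instances` and the 60% target utilization are assumptions made for illustration; production clouds use richer signals and policies.

```python
import math


def desired_instances(loads, target_utilization=0.6):
    """Suggest an instance count that brings average utilization near the target.

    `loads` holds the currently observed utilization of each running
    instance, each in the range 0..1 (assumed monitoring input).
    """
    if not loads:
        return 1  # keep at least one instance running
    total_load = sum(loads)
    return max(1, math.ceil(total_load / target_utilization))


print(desired_instances([0.9, 0.9, 0.9]))  # overloaded: scale out
print(desired_instances([0.1, 0.1]))       # mostly idle: scale in
```

With three instances each at 90% load the rule suggests five instances; with two instances at 10% load it suggests one, which is the kind of rapid provisioning and release the definition above points to.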
Internet Clouds
• Cloud computing applies a virtualized platform with elastic resources on demand by provisioning hardware,
software, and data sets dynamically (see Figure 1.18).

• The idea is to move desktop computing to a service-oriented platform using server clusters and huge
databases at data centers.

• Cloud computing leverages its low cost and simplicity to benefit both users and providers.

• Machine virtualization has enabled such cost-effectiveness.

• Cloud computing intends to satisfy many user applications simultaneously.

• The cloud ecosystem must be designed to be secure, trustworthy, and dependable.

• Some computer users think of the cloud as a centralized resource pool.

• Others consider the cloud to be a server cluster which practices distributed computing over all the servers
used.
The Cloud Landscape

• Traditionally, a distributed computing system tends to be owned and operated by an autonomous administrative domain (e.g., a research laboratory or company) for on-premises computing needs.

• However, these traditional systems have encountered several performance bottlenecks: constant system maintenance, poor utilization, and increasing costs associated with hardware/software upgrades.

• Cloud computing, as an on-demand computing paradigm, resolves or relieves us of these problems.

• Figure 1.19 depicts the cloud landscape and major cloud players, based on three cloud
service models.
• Internet clouds offer four deployment modes: private, public, managed,
and hybrid.

• These modes carry different levels of security implications.

• The different SLAs imply that the security responsibility is shared among all the cloud providers, the cloud resource consumers, and the third-party cloud-enabled software providers.

• Advantages of cloud computing have been advocated by many IT experts, industry leaders, and computer science researchers.
• The following list highlights eight reasons to adopt the cloud for upgraded Internet applications and web services:

1. Desired location in areas with protected space and higher energy efficiency

2. Sharing of peak-load capacity among a large pool of users, improving overall utilization

3. Separation of infrastructure maintenance duties from domain-specific application development

4. Significant reduction in cloud computing cost, compared with traditional computing paradigms

5. Cloud computing programming and application development

6. Service and data discovery and content/service distribution

7. Privacy, security, copyright, and reliability issues

8. Service agreements, business models, and pricing policies
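
Reason 2 in the list above, sharing peak-load capacity among a pool of users, can be illustrated with a small Python calculation. The demand figures are invented for the example and assume three users whose peaks fall at different times; the helper `utilization` is a hypothetical name.

```python
def utilization(demand, capacity):
    """Average demand divided by provisioned capacity."""
    return sum(demand) / (capacity * len(demand))


# Hypothetical hourly demand (in servers) for three users whose peaks
# fall in different hours.
users = [
    [10, 2, 2, 2],
    [2, 10, 2, 2],
    [2, 2, 10, 2],
]

# Combined demand per hour across all users.
total_demand = [sum(hour) for hour in zip(*users)]

# Separate provisioning: every user buys enough servers for its own peak.
separate_capacity = sum(max(u) for u in users)
separate_util = utilization(total_demand, separate_capacity)

# Pooled provisioning: capacity only has to cover the combined peak.
pooled_capacity = max(total_demand)
pooled_util = utilization(total_demand, pooled_capacity)

print(f"separate: {separate_util:.0%}, pooled: {pooled_util:.0%}")
```

In this made-up case the same workload needs 30 servers when provisioned separately but only 14 when pooled, so utilization rises from 40% to about 86%. The gain shrinks if all users peak at the same time.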
