Lecture 01

The document discusses distributed systems and provides examples. It defines a distributed system and its key characteristics. The document also discusses different networking concepts and protocols related to distributed systems like HTTP.

Uploaded by

atik
Copyright
© All Rights Reserved
CSE-813(Distributed & Cloud Computing) -Distributed Systems

Even-’22

Dr. Atiqur Rahman


ড. আতিকুর রহমান
Ph.D.(CQUPT, China), MS.Engg.(CU), B.Sc.(CU)
Associate Professor
Department of Computer Science and Engineering
University of Chittagong

Lecture 1: Welcome, and Introduction


Our Main Goal Today
To Define the Term Distributed System
Can you name some examples of
Operating Systems?

Linux WinXP Vista 7/8 Unix FreeBSD Mac OSX
2K Aegis Scout Hydra Mach SPIN
OS/2 Express Flux Hope Spring
AntaresOS EOS LOS SQOS LittleOS TINOS
PalmOS WinCE TinyOS iOS

What is an Operating System?

• User interface to hardware (device driver)


• Provides abstractions (processes, file system)
• Resource manager (scheduler)
• Means of communication (networking)
• …
FOLDOC definition
(FOLDOC = Free On-Line Dictionary of Computing)

Operating System: The low-level software
which handles the interface to peripheral
hardware, schedules tasks, allocates storage,
and presents a default interface to the user
when no application program is running.
Can you name some examples of
Distributed Systems?
• Client-Server (NFS)
• The Web
• The Internet
• A wireless network
• DNS
• Gnutella or BitTorrent (peer to peer overlays)
• A “cloud”, e.g., Amazon EC2/S3, Microsoft Azure
• A datacenter, e.g., NCSA, a Google datacenter, The Planet
What is a Distributed System?
FOLDOC definition

A collection of (probably heterogeneous) automata whose distribution is transparent to
the user so that the system appears as one local machine. This is in contrast to a network,
where the user is aware that there are several machines, and their location, storage
replication, load balancing and functionality is not transparent. Distributed systems
usually use some kind of client-server organization.
Textbook definitions

• A distributed system is a collection of independent computers that
appear to the users of the system as a single computer.
[Andrew Tanenbaum]

• A distributed system is several computers doing something together.
Thus, a distributed system has three primary characteristics: multiple
computers, interconnections, and shared state.
[Michael Schroeder]
Unsatisfactory
• Why are these definitions short?
• Why do these definitions look inadequate to us?
• Because we are interested in the insides of a
distributed system
– design and implementation
– Maintenance
– Algorithmics (“protocols”)
“I shall not today attempt further to define the kinds of material I
understand to be embraced within that shorthand description; and
perhaps I could never succeed in intelligibly doing so. But I know it
when I see it, and the motion picture involved in this case is not that.”

[Potter Stewart, Associate Justice, US Supreme Court (talking about
his interpretation of a technical term laid down in the law, case
Jacobellis versus Ohio 1964)]
Which is a Distributed System – (A) or (B)?

(A) Facebook Social Network Graph among humans
Source: https://www.facebook.com/note.php?note_id=469716398919

(B) The Internet (Internet Mapping Project, color coded by ISPs)


A working definition for us
A distributed system is a collection of entities, each of which is
autonomous, programmable, asynchronous and failure-prone, and
which communicate through an unreliable communication medium.

• Entity = a process on a device (PC, PDA)
• Communication Medium = Wired or wireless network
• Our interest in distributed systems involves
– design and implementation, maintenance, algorithmics
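
The working definition can be made concrete with a small simulation. This is only a sketch under stated assumptions: the class `UnreliableChannel`, the helper `send_with_retries`, the drop probability and the retry limit are all hypothetical names and values chosen for illustration, not part of the lecture.

```python
import random


class UnreliableChannel:
    """Hypothetical medium that silently drops each message with probability drop_prob."""

    def __init__(self, drop_prob, seed=None):
        self.drop_prob = drop_prob
        self.rng = random.Random(seed)  # private generator so runs are repeatable

    def send(self, message):
        """Return the message if it was delivered, or None if the medium lost it."""
        if self.rng.random() < self.drop_prob:
            return None
        return message


def send_with_retries(channel, message, max_attempts):
    """Resend until one copy gets through; return the number of attempts used.

    Returns None if every attempt was lost. From the sender's side this
    looks exactly like a receiver that has crashed.
    """
    for attempt in range(1, max_attempts + 1):
        if channel.send(message) is not None:
            return attempt
    return None


if __name__ == "__main__":
    channel = UnreliableChannel(drop_prob=0.3, seed=1)
    print(send_with_retries(channel, "hello", max_attempts=5))
```

The sender here is an "entity" that can only observe whether a reply came back, which is why an unreliable medium and a failure-prone peer are hard to tell apart.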
Gnutella Peer to Peer System
What are the “entities” (nodes)?
What is the communication medium (links)?
Source: GnuMap Project
Web Domains
What are the “entities”
(nodes)?

What is the
communication medium
(links)?

Source: http://www.vlib.us/web/worldwideweb3d.html
Datacenter

What are the “entities”


(nodes)?

What is the
communication medium
(links)?
The Internet – Quick Refresher
• Underlies many distributed systems.
• A vast interconnected collection of computer networks of many types.
• Intranets – subnetworks operated by companies and organizations.
• Intranets contain subnets and LANs.
• WANs – wide area networks, each consisting of LANs
• ISPs – companies that provide modem links and other types of
connections to users.
• Intranets (actually the ISPs’ core routers) are linked by backbones –
network links of large bandwidth, such as satellite connections, fiber
optic cables, and other high-bandwidth circuits.
• UC2B? Google Fiber?
An Intranet & a distributed system
[Figure: an intranet in which desktop computers, a Web server, email servers, a file server, and print and other servers are connected by local area networks; a router/firewall links the intranet to the rest of the Internet.]
• Running over this Intranet is a distributed file system
• Router/firewall: prevents unauthorized messages from leaving/entering;
implemented by filtering incoming and outgoing messages
Networking Stacks
Application               Application layer protocol         Underlying transport protocol
                          (Distributed System Protocols!)    (Networking Protocols)
e-mail                    smtp [RFC 821]                     TCP
remote terminal access    telnet [RFC 854]                   TCP
Web                       http [RFC 2068]                    TCP
file transfer             ftp [RFC 959]                      TCP
streaming multimedia      proprietary (e.g. RealNetworks)    TCP or UDP
remote file server        NFS                                TCP or UDP
Internet telephony        proprietary (e.g., Skype)          typically UDP

TCP=Transmission Control Protocol
UDP=User Datagram Protocol
(Implemented via sockets)
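
The two transport protocols in the table are both reached through sockets. The sketch below, which assumes only Python's standard `socket` module and the loopback address, sends one small payload each way; the function names `tcp_send_once` and `udp_send_once` are hypothetical and exist only for this illustration.

```python
import socket


def tcp_send_once(payload):
    """Send a small payload over a loopback TCP connection; return what arrives.

    TCP is connection-oriented: connect/accept set up a byte stream first.
    Intended for small payloads only, since sender and receiver share a thread.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
        listener.listen(1)
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
            client.connect(listener.getsockname())
            conn, _ = listener.accept()
            with conn:
                client.sendall(payload)
                client.shutdown(socket.SHUT_WR)   # signal end of the stream
                chunks = []
                while True:
                    chunk = conn.recv(4096)
                    if not chunk:                 # empty read: peer finished sending
                        break
                    chunks.append(chunk)
    return b"".join(chunks)


def udp_send_once(payload):
    """Send one datagram over loopback UDP; return what arrives.

    UDP is connectionless: there is no handshake, and on a real network the
    datagram could be lost, so the receiver uses a timeout.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver:
        receiver.bind(("127.0.0.1", 0))
        receiver.settimeout(2.0)
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
            sender.sendto(payload, receiver.getsockname())
        data, _ = receiver.recvfrom(65535)
    return data
```

The only difference in setup is `SOCK_STREAM` versus `SOCK_DGRAM`; the handshake, ordering and retransmission that TCP adds are what an application gives up when it chooses UDP.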
The History of Internet Standards

Source: http://xkcd.com/927/
The Heart of the World Wide Web:
the HTTP Standard
HTTP: hypertext transfer protocol
• WWW’s application layer protocol
• client/server model
– client: browser that requests, receives,
and “displays” WWW objects
– server: WWW server, which is storing
the website, sends objects in response
to requests
• http1.0: RFC 1945
• http1.1: RFC 2068
– Leverages same connection to download
images, scripts, etc.
[Figure: a PC running Explorer and a Mac running Safari each send http requests to, and receive http responses from, a server running the Apache Web server.]
The HTTP Protocol: More
http: TCP transport service:
• client initiates a TCP connection
(creates socket) to server, port 80
• server accepts the TCP
connection from client
• http messages (application-layer
protocol messages) exchanged
between browser (http client) and
WWW server (http server)
• TCP connection closed

http is “stateless”
• server maintains no information
about past client requests

Why?
Protocols that maintain session “state” are
complex!
• past history (state) must be maintained
and updated.
• if server/client crashes, their views of
“state” may be inconsistent, and hence
must be reconciled.
• RESTful protocols are stateless.
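
The four transport steps and the stateless property can be seen in a loopback experiment. This is a minimal sketch using Python's standard `http.server` and `socket` modules; the names `HelloHandler`, `fetch` and `demo` are hypothetical, and the server is a toy stand-in rather than a real Web server such as Apache.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Stateless handler: each reply depends only on the current request."""

    def do_GET(self):
        body = ("you asked for " + self.path).encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch(host, port, path):
    """One exchange: open TCP connection, send GET, read until the server closes."""
    request = "GET {} HTTP/1.0\r\nHost: {}\r\n\r\n".format(path, host)
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(request.encode("ascii"))
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:          # server closed the TCP connection
                break
            chunks.append(chunk)
    return b"".join(chunks)


def demo():
    """Start a loopback server, send the same request twice, return both replies."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)   # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        host, port = server.server_address
        return [fetch(host, port, "/a"), fetch(host, port, "/a")]
    finally:
        server.shutdown()
        server.server_close()
```

Because the handler stores nothing between requests, the second request for `/a` gets the same body as the first; the server has no memory that the client ever asked before.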
HTTP Example
Suppose user enters URL www.cs.uiuc.edu/ (contains text,
references to 10 jpeg images)
(steps are listed in time order)
1a. http client initiates a TCP connection to
http server (process) at
www.cs.uiuc.edu. Port 80 is default for
http server.
1b. http server at host www.cs.uiuc.edu
waiting for a TCP connection at port
80. “accepts” connection, notifying
client
2. http client sends a http request message
(containing URL) into TCP connection
socket
3. http server receives request message,
forms a response message containing
requested object (index.html), sends
message into socket
HTTP Example (cont.)
4. http server closes the TCP connection (if
necessary).
5. http client receives a response
message containing html file,
displays html. Parses html file,
finds 10 referenced jpeg objects
6. Steps 1-5 are then repeated for each of 10 jpeg
objects

For fetching referenced objects, have 2 options:
• non-persistent connection: only one object fetched per TCP connection
– some browsers create multiple TCP connections simultaneously - one per object
• persistent connection: multiple objects transferred within one TCP connection
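
The difference between the two options can be measured by counting accepted TCP connections. The sketch below is an illustration under assumptions, not the lecture's own code: it uses Python's standard `http.server` and `http.client` on the loopback interface, the names `CountingHandler` and `count_connections` are hypothetical, and the object paths are made up to mirror the one-page-plus-10-images example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class CountingHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"   # allows several requests per connection
    connections = 0                 # one handler instance is created per TCP connection

    def setup(self):
        super().setup()
        CountingHandler.connections += 1

    def do_GET(self):
        body = b"object " + self.path.encode("ascii")
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def count_connections(paths, persistent):
    """Fetch every path and return how many TCP connections the server accepted."""
    CountingHandler.connections = 0
    server = HTTPServer(("127.0.0.1", 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    try:
        if persistent:
            conn = http.client.HTTPConnection(host, port, timeout=5)
            for path in paths:                 # all objects share one connection
                conn.request("GET", path)
                conn.getresponse().read()
            conn.close()
        else:
            for path in paths:                 # a fresh connection per object
                conn = http.client.HTTPConnection(host, port, timeout=5)
                conn.request("GET", path, headers={"Connection": "close"})
                conn.getresponse().read()
                conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return CountingHandler.connections


if __name__ == "__main__":
    paths = ["/index.html"] + ["/img%d.jpg" % i for i in range(10)]
    print(count_connections(paths, persistent=False))
    print(count_connections(paths, persistent=True))
```

With 11 objects, the non-persistent run pays the connection setup cost 11 times, while the persistent run pays it once.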
Your Shell as a browser
1. Telnet to your favorite WWW server:

telnet www.google.com 80

Opens TCP connection to port 80 (default http server port) at www.google.com.
Anything typed in is sent to port 80 at www.google.com.
2. Type in a GET http request:

GET /index.html
Or
GET /index.html HTTP/1.0

By typing this in (may need to hit return twice), you send
this minimal (but complete) GET request to http server.
3. Look at response message sent by http server!
What do you think the response is?
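
Whatever the server sends back, an HTTP/1.x response has a fixed shape: a status line, header lines, a blank line, then the body. The sketch below splits a response into those parts. The function name `parse_http_response` is hypothetical, and the sample bytes are made up for illustration; they are not what www.google.com actually returns.

```python
def parse_http_response(raw):
    """Split raw HTTP/1.x response bytes into (version, status, reason, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")     # blank line ends the headers
    lines = head.decode("iso-8859-1").split("\r\n")
    parts = lines[0].split(" ", 2)                 # e.g. "HTTP/1.0 200 OK"
    version = parts[0]
    status = int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()   # header names are case-insensitive
    return version, status, reason, headers, body


if __name__ == "__main__":
    # Made-up sample response, for illustration only.
    sample = (b"HTTP/1.0 200 OK\r\n"
              b"Content-Type: text/html\r\n"
              b"Content-Length: 13\r\n"
              b"\r\n"
              b"<html></html>")
    print(parse_http_response(sample))
```

Pasting the text that telnet prints into such a parser is one way to check your guess about the response against what actually came back.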
Does our Working Definition work for the http
Web?
A distributed system is a collection of entities, each of which is
autonomous, programmable, asynchronous and failure-prone, and
that communicate through an unreliable communication medium.

• Entity=a process on a device (PC, PDA)


• Communication Medium=Wired or wireless network
• Our interest in distributed systems involves
– design and implementation, maintenance, study, algorithmics
“Important” Distributed Systems
Issues
• No global clock; no single global notion of the correct time
(asynchrony)
• Unpredictable failures of components: lack of response may be due to
either failure of a network component, network path being down, or a
computer crash (failure-prone, unreliable)
• Highly variable bandwidth: from 16Kbps (slow modems or Google
Balloon) to Gbps (Internet2) to Tbps (in between DCs of same big
company)
• Possibly large and variable latency: few ms to several seconds
• Large numbers of hosts: 2 to several million
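
The first two issues interact: without a global clock, a node can only judge its peers by how long it has been since it last heard from them. The sketch below is a minimal timeout-based failure detector under stated assumptions; the class name `TimeoutFailureDetector`, its methods and the timeout value are hypothetical, and the caller supplies local clock readings explicitly so the example stays deterministic.

```python
class TimeoutFailureDetector:
    """Suspect any node not heard from within `timeout` units of local time."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_heard = {}   # node name -> local time of its latest heartbeat

    def heartbeat(self, node, now):
        """Record that a heartbeat from `node` arrived at local time `now`."""
        self.last_heard[node] = now

    def suspected(self, now):
        """Return the nodes whose latest heartbeat is older than the timeout."""
        return sorted(node for node, seen in self.last_heard.items()
                      if now - seen > self.timeout)


if __name__ == "__main__":
    detector = TimeoutFailureDetector(timeout=3.0)
    detector.heartbeat("a", now=0.0)
    detector.heartbeat("b", now=0.0)
    detector.heartbeat("a", now=2.0)
    print(detector.suspected(now=4.0))   # "b" is suspected: silent for 4.0 > 3.0
    detector.heartbeat("b", now=4.5)     # "b" was only slow, not crashed
    print(detector.suspected(now=5.0))
```

A late heartbeat clears the suspicion, which shows the limitation named in the list: silence may mean a crash, a broken path, or just a slow network, and a timeout alone cannot say which.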
Many Interesting Design Problems


• Real distributed systems
– Cloud Computing, Peer to peer systems, Hadoop, key-value stores/NoSQL, distributed file
systems, sensor networks, measurements, graph processing, stream processing, …
• Classical Problems
– Failure detection, Asynchrony, Snapshots, Multicast, Consensus, Mutual Exclusion,
Election, …
• Concurrency
– RPCs, Concurrency Control, Replication Control, …
• Security
– Byzantine Faults, …
• Others…

Typical Distributed Systems
Design Goals
• Common Goals:
– Heterogeneity – can the system handle a large variety of types of PCs and devices?
– Robustness – is the system resilient to host crashes and failures, and to the network
dropping messages?
– Availability – are data+services always there for clients?
– Transparency – can the system hide its internal workings from the users?
– Concurrency – can the server handle multiple clients simultaneously?
– Efficiency – is the service fast enough? Does it utilize 100% of all resources?
– Scalability – can it handle 100 million nodes without degrading service?
(nodes=clients and/or servers) How about 6 B? More?
– Security – can the system withstand hacker attacks?
– Openness – is the system extensible?
“Important” Issues
• If you’re already complaining that the list of topics we’ve
discussed so far has been perplexing…
– You’re right!
– It was meant to be (perplexing)

• The Goal for the Rest of the Course: see enough examples and
learn enough concepts so these topics and issues will make sense
– We will revisit many of these slides in the very last lecture of the course!
“Concepts”?
• Which of the following inventions do you
think is the most important?
1. Car
2. Wheel
3. Bicycle

“What lies beneath?” Concepts!


How will We Learn?
• Textbooks (recommended, but not limited to)
– Distributed and Cloud Computing: From Parallel Processing to the Internet of
Things by Kai Hwang, Geoffrey C. Fox, Jack J. Dongarra (any edition)
– Fundamentals of Cloud Computing by A. Kannammal
– Cloud Computing by Michael Miller
• Lectures
• Homework
– Approx. one every two-three weeks
– Solutions need to be typed, figures can be hand-drawn
– May have extra problems for 4-credit students
• Programming assignments (3-4)
• Exams/quizzes
– CTS’ + Final
What assistance is available to
you?
• Lectures
– lecture slides will be placed online at Google Classroom or Students’ Group
• “Tentative” version before lecture
• “Final” version after lecture
• Homework – office hours and discussion forum to help you (without
giving you the solution).
• Programming Assignments (MPs) – First two assignments will be in C
– Office hours and Piazza discussion forum to help you (without giving you the
solution).

• Course Prerequisite: CSE511 or CSE613 or equivalent
OS/Networking course (latter needs instructor permission)
You can find us almost
everywhere
In person:
• Myself (Dr. Atiq): Associate Professor, CSE department, CU.
– Office Hours Every Tuesday and Thursday right after class.
• TAs:
(Check Google Classroom or Group for office hours. There are > 0 office hours every day of the
week.)

Virtually:
• Discussion Forum: Piazza (most preferable, monitored daily) – 24 hour turnaround time for
questions!
• Email (turnaround time may be longer than Piazza) – use [email protected]
• Email individuals (instructor, TA) only if absolutely necessary (e.g., private matter)
Wrap-Up
• (Reading for today’s lecture: Relevant parts of Chapter 1)

• All students: Must join the Google Classroom and Group ASAP.

• Next lecture
– Topic: “Introduction to Cloud Computing”
