Distributed Systems: Course Information
DISTRIBUTED SYSTEMS (TDDD25)

Examination: written

Labs & Lessons:
Sergiu Rafiliu
Institutionen för Datavetenskap (IDA)
email: [email protected]
http://www.ida.liu.se/~serra
phone: 28 2281
B building, 329:228

Basic Issues
4. Design Issues with Distributed Systems
5. Preliminary Course Topics
Automatic banking (teller machine) system
[Figure labels: Teller machines; Bank_1 data; Bank_1 backup]
• Primary requirements: security and reliability.
• Consistency of replicated data.
• Concurrent transactions (operations which involve accounts in different banks; simultaneous access from several users, etc.).
• Fault tolerance

Automotive system (a distributed real-time system)
[Figure labels: Gateway; Safety critical network; Non-safety critical high-speed network; Non-safety critical low-speed network; Entertainment network]
• Scheduling with hard time constraints
• Real-time communication
• Fault tolerance

Reliability (fault tolerance): if some of the machines crash, the system can survive.

Incremental growth: as requirements on processing power grow, new machines can be added incrementally.
Networking problems: several problems are created by the network infrastructure, which have to be dealt with: loss of messages, overloading, ...

Security problems: sharing generates the problem of data security.

• Communication
• Performance & scalability
• Heterogeneity
• Openness
• Reliability & fault tolerance
• Security
☞ How to achieve the single system image?
☞ How to "fool" everyone into thinking that the collection of machines is a "simple" computer?

• Replication transparency
- the system is free to make additional copies of files and other resources (for the purpose of performance and/or reliability), without the users noticing.
Example: several copies of a file exist; for a given request, the copy closest to the client is accessed.
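The replication example above can be sketched as follows. This is a minimal illustration, assuming each copy has a known distance to the client (for instance a measured latency). The names `Replica`, `closest_replica`, `read_file` and the catalog contents are hypothetical and chosen only for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    """One copy of a file; distance_ms is an assumed client-to-server latency."""
    server: str
    distance_ms: float
    data: str


def closest_replica(replicas):
    """Return the copy with the smallest distance to the client."""
    if not replicas:
        raise ValueError("no replica available")
    return min(replicas, key=lambda r: r.distance_ms)


def read_file(name, catalog):
    """Read a file by name only; the caller never learns which copy was used."""
    return closest_replica(catalog[name]).data


# Hypothetical catalog: one file stored as three identical copies.
catalog = {
    "report.txt": [
        Replica("server-a", 40.0, "quarterly figures"),
        Replica("server-b", 5.0, "quarterly figures"),
        Replica("server-c", 120.0, "quarterly figures"),
    ]
}

contents = read_file("report.txt", catalog)
```

The client code only passes the file name, so copies can be added or removed in the catalog without any change on the client side.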
• Access transparency
- local and remote resources are accessed using identical operations.

• Location transparency
- users cannot tell where hardware and software resources (CPUs, files, databases) are located; the name of the resource shouldn't encode the location of the resource.

• Migration (mobility) transparency
- resources should be free to move from one location to another without having their names changed.

• Concurrency transparency
- the users will not notice the existence of other users in the system (even if they access the same resources).

• Failure transparency
- applications should be able to complete their task despite failures occurring in certain components of the system.

• Performance transparency
- load variation should not lead to performance degradation.
This could be achieved by automatic reconfiguration in response to changes of the load; it is difficult to achieve.
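Access, location and migration transparency from the list above can be illustrated together with a toy directory that maps location-independent names to nodes. This is only a sketch under simplifying assumptions (a single in-process directory, no failures, no concurrency); the class `DistributedStore` and the names used in it are hypothetical.

```python
class DistributedStore:
    """Toy system: several nodes plus one directory from names to nodes."""

    def __init__(self, node_ids):
        self._nodes = {node_id: {} for node_id in node_ids}  # per-node storage
        self._directory = {}  # resource name -> node currently holding it

    def create(self, name, value, node_id):
        """Place a new resource on a node and record where it lives."""
        self._nodes[node_id][name] = value
        self._directory[name] = node_id

    def read(self, name):
        """Same operation wherever the resource lives (access transparency)."""
        node_id = self._directory[name]  # location is resolved internally
        return self._nodes[node_id][name]

    def migrate(self, name, target_node):
        """Move a resource to another node; its name does not change."""
        source_node = self._directory[name]
        self._nodes[target_node][name] = self._nodes[source_node].pop(name)
        self._directory[name] = target_node


store = DistributedStore(["node1", "node2"])
# The name carries no hint about the node that holds the resource.
store.create("accounts/alice", 100, "node1")
before = store.read("accounts/alice")
store.migrate("accounts/alice", "node2")
after = store.read("accounts/alice")
```

The caller uses the same name and the same `read` operation before and after the move, which is the property the three transparencies ask for.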
Heterogeneity
☞ Distributed applications are typically heterogeneous:
- different hardware: mainframes, workstations, PCs, servers, etc.;
- different software: UNIX, MS Windows, IBM OS/2, Real-time OSs, etc.;
- unconventional devices: teller machines, telephone switches, robots, manufacturing systems, etc.;
- diverse networks and protocols: Ethernet, FDDI, ATM, TCP/IP, Novell Netware, etc.

The solution
Middleware, an additional software layer to mask heterogeneity

Openness
☞ One of the important features of distributed systems is openness and flexibility:
- every service is equally accessible to every client (local or remote);
- it is easy to implement, install and debug new services;
- users can write and install their own services.

☞ Key aspect of openness:
- Standard interfaces and protocols (like Internet communication protocols)
- Support of heterogeneity (by adequate middleware, like CORBA)
[Figure: layered view of Node 1 and Node 2. Labels: Middleware; Operating System; Hardware: Computer & Network; "the platform" (platform 1, platform 2)]
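One way a middleware layer can mask hardware heterogeneity is to convert data into a common wire format before sending it, so that nodes with different native byte orders still agree on the contents of a message. The sketch below uses Python's standard `struct` module, where the `!` prefix selects network (big-endian) byte order with standard sizes. The message layout (a 32-bit account id followed by a 64-bit signed amount) and the function names are assumptions made for illustration; this is not how CORBA or any particular middleware encodes data.

```python
import struct

# Assumed wire layout: unsigned 32-bit account id, signed 64-bit amount.
# "!" means network byte order (big-endian), standard sizes, no padding.
WIRE_FORMAT = "!Iq"


def marshal(account_id, amount):
    """Encode the values identically on every host, whatever its native byte order."""
    return struct.pack(WIRE_FORMAT, account_id, amount)


def unmarshal(payload):
    """Decode a message produced by marshal() back into (account_id, amount)."""
    return struct.unpack(WIRE_FORMAT, payload)


message = marshal(42, -1500)
account_id, amount = unmarshal(message)
```

Applications on both nodes work with ordinary integers; only the marshalling layer knows about the byte layout.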
Distributed systems should allow communication between programs/users/resources on different computers.

Data on the system must not be lost, and copies stored redundantly on different servers must be kept consistent.
• The more copies kept, the better the availability, but keeping consistency becomes more difficult.
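The trade-off in the bullet above can be made concrete with a small calculation. It rests on two assumptions that are not from the slides: each copy is available independently with the same probability p, so at least one of n copies is available with probability 1 - (1 - p)^n, and every update is sent to all copies (a write-all scheme), so the number of update messages grows with n. The function names are hypothetical.

```python
def availability(p, n):
    """Probability that at least one of n independent copies is available."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    if n < 1:
        raise ValueError("need at least one copy")
    return 1.0 - (1.0 - p) ** n


def messages_per_update(n):
    """Under the assumed write-all scheme, one update message per copy."""
    return n


# Each extra copy raises availability but also the cost of every update.
for copies in (1, 2, 3):
    print(copies, round(availability(0.9, copies), 6), messages_per_update(copies))
```

With real systems failures are rarely independent and update protocols are more elaborate, so the numbers only indicate the direction of the trade-off.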
Basics
• Introduction
• Models of Distributed Systems
• Communication in Distributed Systems

Middleware
• Distributed Heterogeneous Applications and CORBA
• Peer-to-Peer Systems

Theoretical Aspects/Distributed Algorithms
• Time and State in Distributed Systems
• Distributed Mutual Exclusion
• Election and Agreement

Distributed Data and Fault Tolerance
• Replication
• Recovery and Fault Tolerance

Distributed Real-Time Systems

• Introduction
- just finished!
• Communication in Distributed Systems
- Message passing and the client/server model
- Remote Procedure Call
- Group Communication
• Distributed Heterogeneous Applications and CORBA
- Heterogeneity in distributed systems
- Middleware
- Objects in distributed systems
- The CORBA approach
• Peer-to-Peer Systems
- Basic design issues
- The Napster file sharing system
- Peer-to-peer middleware
• Time and State in Distributed Systems
- Time in distributed systems
- Logical clocks
- Vector clocks
- Causal ordering of messages
- Global states and state recording
• Replication
- Motivation for replication
- Consistency and ordering
- Total and causal ordering
- Update protocols and voting