Chapter 1: Characterization of Distributed Systems

Objectives
To define what a distributed system is
To know the consequences of the definition
To identify the challenges in designing and building a distributed system

Chapter Outline
Introduction
Examples of Distributed Systems
Challenges of Distributed Systems

♦ Data are Distributed
» If data must exist on multiple computers for administrative and ownership reasons
♦ Computation is Distributed
» Applications taking advantage of parallelism, multiple processors, or a particular feature
» Scalability and heterogeneity of the distributed system
♦ Users are Distributed
» If users communicate and interact via an application (shared objects)

Definition of A Distributed System
“A distributed system is a collection of independent
computers that appear to the users of the system as a
single computer.” [Tanenbaum]
“A distributed system is a collection of autonomous
computers linked by a network with software designed
to produce an integrated computing facility.”
[Coulouris, Dollimore, Kindberg]
“A system of multiple autonomous processing
elements, cooperating in a common purpose or to
achieve a common goal.” [Burns & Wellings 1997]

“A system that consists of a collection of two or more
independent computers, that are connected by a
network, which coordinate their processing through the
exchange of synchronous or asynchronous message
passing. They may be on separate continents, in the
same building or the same room.”
[Definition from our textbook]

Introduction
Motivation for a Distributed System
♦ Load balancing / distribution
» Breaking a problem into smaller pieces lets you solve larger problems without resorting to a larger computer
» Mainframe: 10x faster but 1000x more expensive
♦ Increased processing power
» Independent processors working on the same task
» Distributed systems consisting of collections of microcomputers may have processing power that no single computer will ever achieve
» 10,000 CPUs, each running at 50 MIPS, yield 500,000 MIPS → one instruction executed every 0.002 nsec → equivalent to a light-travel distance of 0.6 mm → any single processor chip of that size would melt immediately
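
The arithmetic behind the 10,000-CPU figure can be checked with a short sketch. All input figures come from the slide; the rounded speed of light (3 x 10^8 m/s) is an assumption that matches the slide's 0.6 mm result.

```python
# Worked version of the slide's arithmetic; inputs are the slide's figures.
CPUS = 10_000
MIPS_PER_CPU = 50
SPEED_OF_LIGHT_M_PER_S = 3.0e8  # assumed rounded value

total_mips = CPUS * MIPS_PER_CPU                     # aggregate rate in MIPS
instructions_per_second = total_mips * 1_000_000     # MIPS -> instructions/s
seconds_per_instruction = 1 / instructions_per_second
nanoseconds_per_instruction = seconds_per_instruction * 1e9

# Distance light travels in the time available for one instruction, in mm.
light_distance_mm = SPEED_OF_LIGHT_M_PER_S * seconds_per_instruction * 1000

print(total_mips)
print(round(nanoseconds_per_instruction, 6))
print(round(light_distance_mm, 3))
```

A single processor matching this rate would have to fit within that light-travel distance, which is the slide's point about why the work must be spread over many machines.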

Introduction
Motivation for a Distributed System
♦ Fault tolerance
» If any one machine goes down, the others can still run
♦ Availability
» Anytime, anywhere access
♦ Resource sharing
» Any client can also act as a server, and vice versa, to provide resources (data, files, services)
» The main motivator for DS

Examples
Source : http://setiathome.ssl.berkeley.edu/

Examples
Source : http://www.girardin.org/fabien/blog/wp-content/mobile_computing_30s.png

Challenges in Distributed Systems

Challenge: Heterogeneity
Heterogeneity = variety and difference
Heterogeneity of
♦ underlying network infrastructure (Ethernet, ISDN, token ring, etc.)
♦ computer hardware and software (e.g., operating systems; compare UNIX sockets with Winsock calls)
♦ programming languages (Java, C, Python), in particular their data representations
♦ implementations by different developers
Heterogeneity needs to be masked
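
One concrete case of differing data representations is byte order. The sketch below is only an illustration using Python's standard `struct` module; the helper names `encode` and `decode` are hypothetical. It shows how the same integer has two byte layouts and how agreeing on one wire format masks the difference.

```python
import struct

value = 1

# The same 32-bit integer laid out by a big-endian and a little-endian machine.
big_endian = struct.pack(">I", value)
little_endian = struct.pack("<I", value)

# Reading big-endian bytes as if they were little-endian gives a wrong number.
misread = struct.unpack("<I", big_endian)[0]

def encode(v: int) -> bytes:
    """Hypothetical helper: always send in network (big-endian) byte order."""
    return struct.pack("!I", v)

def decode(data: bytes) -> int:
    """Hypothetical helper: always read network byte order, on any host."""
    return struct.unpack("!I", data)[0]

print(big_endian != little_endian)
print(misread)
print(decode(encode(value)))
```

Both ends calling `encode` and `decode` never need to know each other's native byte order, which is the kind of masking the slide asks for.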

Challenge: Openness
Openness determines whether the system can be extended and re-implemented in various ways
Determined primarily by the degree to which new resource-sharing services can be added and made available for use by a variety of client programs
Detailed interfaces of components need to be standardized and published

Challenge: Security
Security for information resources has three components:
Confidentiality
» protection against disclosure to unauthorized individuals
Integrity
» protection against alteration or corruption
Availability
» protection against interference with the means of accessing the resources
The challenge: sending sensitive information in a network message securely and efficiently
Not just to conceal the information, but also to ensure that the sender and recipients are the legitimate parties to the message
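
As a rough illustration of the integrity and sender-authenticity side of this challenge, a message can carry an authentication tag computed with a shared secret. This is a minimal sketch using Python's standard `hmac` module. The key and function names are hypothetical, and it does not provide confidentiality, which would additionally require encryption.

```python
import hashlib
import hmac

SHARED_KEY = b"example-shared-secret"  # hypothetical key known to both parties

def sign(message: bytes, key: bytes = SHARED_KEY) -> bytes:
    """Compute an HMAC-SHA256 tag that only a holder of the key can produce."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify(message: bytes, tag: bytes, key: bytes = SHARED_KEY) -> bool:
    """Check the tag in constant time; False means altered or wrong sender."""
    return hmac.compare_digest(sign(message, key), tag)

message = b"transfer 10 units to account 42"
tag = sign(message)

print(verify(message, tag))                              # untouched message
print(verify(b"transfer 99 units to account 42", tag))   # altered in transit
print(verify(message, tag, key=b"some-other-key"))       # wrong key holder
```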

Challenge: Scalability
A distributed system is scalable if it remains effective
as the number of users and/or resources increases
Challenges:
Controlling resource costs
Controlling performance loss
Preventing resources from running out
Avoiding performance bottlenecks

Challenge: Failure Handling
Failures more common than in centralized systems
and usually partial
Failure handling includes
Detection (may be impossible)
Masking/hiding
Tolerance
Recovery
Redundancy

Detection
Some possible (e.g., detecting transmission errors via checksums)
Some impossible (e.g., telling a crashed remote server from a slow one)
Challenge: managing failures that cannot be detected, only suspected
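
Both cases can be sketched briefly. The checksum check below uses CRC-32 from Python's standard `zlib` module. The timeout rule is an assumed, simplified heuristic: it can only suspect a failure, because a slow server and a crashed one look the same to the caller. All function names are hypothetical.

```python
import zlib

def with_checksum(payload: bytes) -> tuple[bytes, int]:
    """Attach a CRC-32 checksum to the payload before sending."""
    return payload, zlib.crc32(payload)

def is_intact(payload: bytes, checksum: int) -> bool:
    """Detectable failure: recompute the checksum on receipt and compare."""
    return zlib.crc32(payload) == checksum

def suspect_failed(last_heard_at: float, now: float, timeout: float) -> bool:
    """Undetectable failure: silence past the timeout only raises suspicion."""
    return now - last_heard_at > timeout

payload, checksum = with_checksum(b"hello")
print(is_intact(payload, checksum))     # delivered unchanged
print(is_intact(b"hellp", checksum))    # corrupted in transit
print(suspect_failed(last_heard_at=0.0, now=6.0, timeout=5.0))
```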

Masking/hiding
Some failures can be hidden or made less severe
Replication in space/time
» Space: e.g., writing to multiple disks
» Time: e.g., retransmitting a message multiple times
May not work in the worst case, e.g., all disks may have been corrupted
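
Replication in time can be sketched as a retransmission loop. This is a simplified illustration, not a real protocol: `make_lossy_channel` is a hypothetical stand-in for a network that drops a fixed number of messages, and `send_with_retries` hides those losses from the caller unless every attempt fails.

```python
def send_with_retries(send, message, max_attempts=3):
    """Call send(message) until it succeeds; return the attempt number or None."""
    for attempt in range(1, max_attempts + 1):
        if send(message):
            return attempt
    return None  # worst case: masking failed, the caller sees the failure

def make_lossy_channel(losses):
    """Hypothetical channel that drops the first `losses` messages."""
    remaining = {"n": losses}

    def send(message):
        if remaining["n"] > 0:
            remaining["n"] -= 1
            return False  # message lost
        return True       # message delivered

    return send

print(send_with_retries(make_lossy_channel(2), "hello"))  # succeeds on 3rd try
print(send_with_retries(make_lossy_channel(5), "hello"))  # all attempts lost
```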

Tolerance
Sometimes not feasible to hide all failures
E.g., the user is told that a web service has failed rather than being made to wait until it is up again
Only feasible for certain classes of applications/systems, e.g., DNS
vs. Internet addresses

Recovery
Restoring a correct system state
Roll back using log files
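
Rolling back with a log can be sketched with an in-memory undo log. This is a toy illustration under simplifying assumptions: a real system would write the log to stable storage, and the class and method names here are hypothetical.

```python
class LoggedStore:
    """Key-value store that records old values so updates can be undone."""

    def __init__(self):
        self.data = {}
        self.log = []  # entries: (key, key_existed_before, old_value)

    def put(self, key, value):
        # Record the previous state before changing it.
        self.log.append((key, key in self.data, self.data.get(key)))
        self.data[key] = value

    def checkpoint(self):
        """Declare the current state correct; earlier updates are final."""
        self.log.clear()

    def rollback(self):
        """Undo logged updates in reverse order, back to the last checkpoint."""
        while self.log:
            key, existed, old_value = self.log.pop()
            if existed:
                self.data[key] = old_value
            else:
                self.data.pop(key, None)

store = LoggedStore()
store.put("balance", 100)
store.checkpoint()
store.put("balance", 40)    # suppose a failure happens after these updates
store.put("pending", True)
store.rollback()
print(store.data)
```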

Redundancy
Tolerate failures by using redundant components
Provided through replication
E.g., redundant routes in network, replication of name tables in
multiple domain name servers
Goal of failure handling: high availability
Availability of a system is a measure of the proportion of time that the system is available for use
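
The definition of availability translates directly into a ratio. The second function below goes beyond the slide and rests on a labeled assumption: replicas fail independently and the service is up whenever at least one replica is up. Function names and figures are illustrative only.

```python
def availability(uptime_hours: float, downtime_hours: float) -> float:
    """Proportion of time the system is available for use."""
    return uptime_hours / (uptime_hours + downtime_hours)

def replicated_availability(single: float, replicas: int) -> float:
    """Availability with redundancy, ASSUMING independent replica failures
    and that one working replica is enough to provide the service."""
    return 1 - (1 - single) ** replicas

one_server = availability(uptime_hours=99, downtime_hours=1)
two_servers = replicated_availability(one_server, replicas=2)

print(round(one_server, 4))
print(round(two_servers, 4))
```

In practice failures are often correlated (shared power, shared network), so the replicated figure is an upper bound rather than a prediction.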

Challenge: Concurrency
Concurrency control
Handling several simultaneous requests for a resource
» Consistent scheduling of concurrent threads (so that dependencies are preserved, e.g., in concurrent transactions)
Synchronized operations (semaphores)
» Safest, but limits throughput
Shared objects/resources must guarantee correctness in a concurrent environment
Avoidance of deadlocks
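
A synchronized operation on a shared object can be sketched with a binary semaphore from Python's standard `threading` module. The `Counter` class and `run` function are hypothetical names for illustration. Only one thread at a time may execute the update, which keeps the result correct at the cost of throughput.

```python
import threading

class Counter:
    """Shared object whose update is guarded by a binary semaphore."""

    def __init__(self):
        self.value = 0
        self._semaphore = threading.Semaphore(1)

    def increment(self):
        # Acquire before the read-modify-write, release afterwards.
        with self._semaphore:
            self.value += 1

def run(threads=4, increments_per_thread=1000):
    counter = Counter()

    def work():
        for _ in range(increments_per_thread):
            counter.increment()

    workers = [threading.Thread(target=work) for _ in range(threads)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return counter.value

print(run())
```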

Challenge: Transparency
Concealing the heterogeneous and distributed nature
of the system so that it appears to the user like one
system
Eight types (ANSA/ISO)
access, location, concurrency, replication, failure,
mobility, performance and scaling transparencies

•Access transparency:
•enables local and remote resources to be accessed using identical operations.
•For instance, from a user's point of view, access to a remote service such as a printer should be identical to access to a local printer.
•From a programmer's point of view, the method for accessing a remote object may be identical to that for accessing a local object of the same class.
•E.g., the same user interface and operations are offered for accessing either local or remote resources.
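
From the programmer's side, the printer example can be sketched as two classes that offer the same operation. Everything here is hypothetical: `fake_transport` merely stands in for a real network call, and the class names are invented for illustration.

```python
class LocalPrinter:
    """Printer attached to this machine."""

    def print_document(self, text: str) -> str:
        return f"printed locally: {text}"

class RemotePrinterProxy:
    """Stand-in for a remote printer; offers the same operation."""

    def __init__(self, transport):
        self._transport = transport  # callable that would send the request

    def print_document(self, text: str) -> str:
        return self._transport({"op": "print", "text": text})

def fake_transport(request: dict) -> str:
    """Hypothetical replacement for a network round trip."""
    return f"printed remotely: {request['text']}"

def submit(printer, text: str) -> str:
    # Caller code is identical whether the printer is local or remote.
    return printer.print_document(text)

print(submit(LocalPrinter(), "report"))
print(submit(RemotePrinterProxy(fake_transport), "report"))
```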

•Location transparency:
•enables resources to be accessed without knowledge of their location.
•The details of the topology of the system should be of no concern to the
user.
•The location of an object in the system may not be visible to the user or
programmer.
•This differs from access transparency in that both the naming and access
methods may be the same. Names may give no hint as to location.
•E.g., URLs or e-mail addresses, such as www.google.com: the name gives no hint of the server's IP address (its network location).
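
A name service is one way to picture location transparency. In this toy sketch the `NameService` class, the name `files.example`, and the addresses (taken from the IP range reserved for documentation) are all hypothetical. The client only ever uses the name, so the resource can change address without the client changing.

```python
class NameService:
    """Toy name table mapping location-independent names to addresses."""

    def __init__(self):
        self._table = {}

    def register(self, name: str, address: str) -> None:
        self._table[name] = address

    def lookup(self, name: str) -> str:
        return self._table[name]

def fetch(names: NameService, name: str) -> str:
    # The client refers to the resource by name only.
    return f"connecting to {names.lookup(name)}"

names = NameService()
names.register("files.example", "192.0.2.10")
print(fetch(names, "files.example"))

# The resource is given a new address; the client call is unchanged.
names.register("files.example", "192.0.2.99")
print(fetch(names, "files.example"))
```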

•Concurrency transparency:
•enables several processes to operate concurrently using shared resources
without interference between them.
•E.g., no conflict occurs when two or more users access the same system at the same time.

•Replication transparency:
•enables multiple instances of resources to be used to increase reliability
and performance without knowledge of the replicas by users.
•This kind of transparency matters mainly for distributed file systems, which replicate data at two or more sites for greater reliability. Clients generally should not be aware that replicated copies of the data exist, and should expect operations to return only one set of values.
•E.g., distributed DBMSs and mirroring of web pages.

•Failure transparency:
•enables the concealment of faults, allowing users and application
programs to complete their tasks despite the failure of hardware or
software components.
•E.g., retransmission of e-mail messages

•Mobility transparency:
•allows the movement of resources and clients within a system without
affecting the operation of users or programs.
•E.g., a caller and callee moving between different places during a phone call

•Performance transparency:
•allows the system to be reconfigured to improve performance as loads
vary.
•E.g., a Video on Demand (VOD) system

•Scaling transparency:
•allows the system and applications to expand in scale without change to
the system structure or the application algorithms.
•E.g., P2P applications

Transparency   Description
Access         Hide differences in data representation and how a resource is accessed
Location       Hide where a resource is located
Migration      Hide that a resource may move to another location
Relocation     Hide that a resource may be moved to another location while in use
Replication    Hide that a resource may be replicated
Concurrency    Hide that a resource may be shared by several competitive users
Failure        Hide the failure and recovery of a resource
Persistence    Hide whether a (software) resource is in memory or on disk

1. Name a program that uses distributed computing and is freely available to the public
2. Name one research field that relies heavily on distributed computing

Summary
“A distributed system is a collection of independent
and autonomous computers that appear to the
users of the system as a single computer.”
Based on the above definition, there are three
significant consequences:
Concurrency
No global clock
Independent failures
Several challenges need to be addressed in
designing and building a DS:
Heterogeneity, openness, security, scalability, failure
handling, concurrency and transparency.
