Chapter 1 - Introduction
Ayalew Belay (Dr.)
with thanks to Dr. Mulugeta Lebsie
1.1 Introduction and Definition
 Before the mid-80s, computers were
 Very expensive (hundreds of thousands or even millions of
dollars)
 Very slow (a few thousand instructions per second)
 Not connected among themselves
 After the mid-80s: two major developments
 Cheap and powerful microprocessor-based computers
appeared
 Computer networks
 LANs at speeds ranging from 10 to 1000 Mbps (now
even 10, 40, and 100 Gbps)
 WANs at speeds ranging from 64 Kbps to gigabits/sec
 Consequence
 Feasibility of using a large network of computers to work for
the same application; this is in contrast to the old centralized
systems where there was a single computer with its
peripherals
 Definition of a Distributed System
 A distributed system is:
A collection of independent computers that appears to its users
as a single coherent system - computer (Tanenbaum & Van
Steen)
 Other Definitions
A distributed system is a system designed to support the
development of applications and services which can exploit a
physical architecture consisting of multiple, autonomous
processing elements that do not share primary memory but
cooperate by sending asynchronous messages over a
communication network (Blair & Stefani)
 The definitions have two aspects:
1. Hardware: autonomous machines
2. Software: a single system view for the users
A distributed system is one that stops you getting any work
done when a machine you have never even heard of crashes
(Leslie Lamport)
 Why Distributed?
 Resource and Data Sharing
 Printers, databases, multimedia servers, ...
 Availability, Reliability
 The loss of some instances can be hidden
 Scalability, Extensibility
 The system grows with demand (e.g., extra servers)
 Performance
 Huge power (CPU, memory, ...) available
 Inherent distribution, communication
 Organizational distribution, e-mail, video
 Problems of Distribution
 Concurrency, Security
 Clients must not disturb each other
 Privacy
 e.g., when building a preference profile such as using
cookies
 Unwanted communication such as spam
 Partial failure
 We often do not know where the error is (e.g., RPC)
 Location, Migration, Relocation, Replication
 Clients must be able to find their servers
 Heterogeneity
 Hardware, platforms, languages, management
 Characteristics of Distributed Systems
 Differences between the computers and the ways they
communicate are hidden from users
 Users and applications can interact with a distributed system in
a consistent and uniform way regardless of location
 Distributed systems should be easy to expand and scale
 A distributed system is normally continuously available, even if
there may be partial failures
1.2 Goals of a Distributed System
 To support heterogeneous computers and networks and to
provide a single-system view, a distributed system is often
organized by means of a layer of software called middleware
that extends over multiple machines
A distributed system organized as middleware; note that the middleware
layer extends over multiple machines, and offers each application the
same interface
Ack: most diagrams in all slides are taken from the textbook
 A distributed system should
 Easily connect users with resources (printers, computers,
storage facilities, data, files, Web pages, ...)
 Some of the reasons
 Economics: sharing resources such as printers and high-
speed computers
 To collaborate and exchange information
 Groupware: software for collaborative editing,
teleconferencing, etc.
 e-commerce: buying and selling goods
 Be transparent: hide the fact that the resources and processes
are distributed across multiple computers
 Be open
 Be scalable
 Transparency in a Distributed System
 A distributed system that is able to present itself to users and
applications as if it were only a single computer system is said
to be transparent
 Different forms of transparency in a distributed system
Transparency   Description
Access         Hide differences in data representation (endianness, file naming, ...) and how a resource is accessed
Location       Hide where a resource is physically located; where is http://www.prenhall.com/index.html? (naming)
Migration      Hide that a resource may move to another location
Relocation     Hide that a resource may be moved to another location while in use; e.g., mobile users using their wireless laptops and moving from place to place
Replication    Hide that a resource is replicated (for availability and performance); all replicas have the same name
Concurrency    Hide that a resource may be shared by several competitive users; a resource must be left in a consistent state; through locking
Failure        Hide the failure and recovery of a resource
 But trying to achieve all distribution transparency may be
impossible or may not be a good idea
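As a small illustration of access transparency, the sketch below hides endianness behind two helper functions, so callers never see how an integer is laid out on the wire. The function names are hypothetical, and fixing the wire format to network (big-endian) byte order is an assumption made for the example.

```python
import struct

def encode_u32(value: int) -> bytes:
    """Encode an unsigned 32-bit integer in network (big-endian) byte order."""
    return struct.pack("!I", value)

def decode_u32(data: bytes) -> int:
    """Decode a 4-byte network-order buffer back into an integer."""
    (value,) = struct.unpack("!I", data)
    return value

# The same number has different byte layouts on different machines;
# the helpers above hide that difference from the caller.
little = struct.pack("<I", 1)  # layout on a little-endian machine
big = struct.pack(">I", 1)     # layout on a big-endian machine
```

A sender and a receiver that both go through `encode_u32` and `decode_u32` exchange the same value regardless of their native byte order.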
 Openness in a Distributed System
 A distributed system should be open
 We need well-defined interfaces
 Interoperability
 Components of different origin can communicate
 Portability
 Components work on different platforms
 Another goal of an open distributed system is that it should be
flexible and extensible; easy to configure the system out of
different components; easy to add new components, replace
existing ones; easier said than done
 An Open Distributed System is a system that offers services
according to standard rules that describe the syntax and
semantics of those services; e.g., protocols in networks
 Standards - a necessity
 Scalability in Distributed Systems
 A distributed system should be scalable; there are three
dimensions
 Size: adding more users and resources to the system
 Geographically: users and resources may be far apart
 Administratively: should be easy to manage even if it spans
many administrative organizations
 But a scalable system may exhibit performance problems
 In an open distributed system, services are often specified through
interfaces, commonly described using an Interface Definition
Language (IDL)
 Specify only syntax: the names of the functions, types of
parameters, return values, possible exceptions, ...
 Semantics given in an informal way by means of natural
languages
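To make the "syntax only" point concrete, here is a minimal sketch that uses a Python `Protocol` as a stand-in for an IDL. The `FileService` interface and the `InMemoryFileService` class are hypothetical names invented for this example. The interface fixes names, parameter types, and return types, while the semantics live only in informal comments.

```python
from typing import Protocol

class FileService(Protocol):
    """Interface only: names, parameter types, and return types."""

    def read(self, name: str, offset: int, length: int) -> bytes:
        """Informal semantics: return up to `length` bytes starting at `offset`."""
        ...

    def write(self, name: str, offset: int, data: bytes) -> int:
        """Informal semantics: store `data` at `offset`, return bytes written."""
        ...

class InMemoryFileService:
    """One possible implementation; any class with matching methods conforms."""

    def __init__(self) -> None:
        self._files = {}  # file name -> bytearray

    def read(self, name: str, offset: int, length: int) -> bytes:
        if name not in self._files:
            raise FileNotFoundError(name)
        return bytes(self._files[name][offset:offset + length])

    def write(self, name: str, offset: int, data: bytes) -> int:
        buf = self._files.setdefault(name, bytearray())
        if len(buf) < offset:
            buf.extend(b"\x00" * (offset - len(buf)))  # pad any gap with zeros
        buf[offset:offset + len(data)] = data
        return len(data)

def store_greeting(service: FileService) -> bytes:
    """Client code depends only on the interface, not on the implementation."""
    service.write("greeting.txt", 0, b"hello")
    return service.read("greeting.txt", 0, 5)
```

Two independently written implementations that satisfy `FileService` can be swapped without changing `store_greeting`, which is the interoperability and portability argument in miniature.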
Concept                  Example
Centralized services     A single server for all users (mostly for security reasons)
Centralized data         A single on-line telephone book
Centralized algorithms   Doing routing based on complete information
Examples of scalability limitations
 Scalability problems leading to low performance
 Scaling Techniques: how to solve scaling problems
 The problem is mainly performance, and arises as a result of
limitations in the capacity of servers and networks (for
geographical scalability with high latency and mostly unreliable
links)
 Three possible solutions: hiding communication latencies,
distribution, and replication
a. Hide Communication Latencies
 Try to avoid waiting for responses to remote service requests
 Let the requester do other useful work in the meantime
 i.e., construct requesting applications that use only
asynchronous communication instead of synchronous
communication; when a reply arrives the application is
interrupted
 Good for batch processing and parallel applications since
independent tasks can be scheduled while another task is
waiting for communication to complete; non-parallel programs can
use multithreading to the same end
 Hiding communication latencies is not in general applicable for
interactive applications
 For interactive applications, try to reduce communication; move
part of the job to the client to reduce communication; e.g., filling
a form to access a database and checking the entries
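The effect of asynchronous communication can be sketched with Python's `asyncio`. Here `remote_request` is a hypothetical stand-in for a remote service call, and the network delay is simulated with `asyncio.sleep`, so the timings are illustrative only. Because the requests overlap, the total waiting time is close to one latency rather than the sum of all of them.

```python
import asyncio
import time

async def remote_request(name: str, latency: float) -> str:
    """Hypothetical remote call; the sleep stands in for network delay."""
    await asyncio.sleep(latency)
    return f"reply:{name}"

async def fetch_all(names, latency):
    # Issue every request before waiting for any reply, so the waits overlap.
    return await asyncio.gather(*(remote_request(n, latency) for n in names))

def timed_fetch(names, latency=0.1):
    """Run the overlapping requests and report the elapsed wall-clock time."""
    start = time.perf_counter()
    replies = asyncio.run(fetch_all(names, latency))
    return replies, time.perf_counter() - start

if __name__ == "__main__":
    replies, elapsed = timed_fetch(["a", "b", "c", "d"])
    print(replies, f"{elapsed:.2f}s")
```

A synchronous version would wait for each reply in turn, so four requests at 0.1 s each would take about 0.4 s.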
(a) A server checking the correctness of field entries
(b) A client doing the job
 e.g., checking the completeness of mandatory fields
 Shipping code is now supported in Web applications using Java
Applets and ActiveX controls (with some security issues)
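Moving the check to the client can be sketched as follows. The field names and the `send` callback are hypothetical; the point is only that an incomplete form never reaches the network.

```python
# Hypothetical list of mandatory form fields.
MANDATORY = ("first_name", "last_name", "email")

def missing_fields(form, mandatory=MANDATORY):
    """Return the mandatory fields that are absent or blank."""
    return [f for f in mandatory if not str(form.get(f, "")).strip()]

def submit(form, send):
    """Check completeness locally; contact the server only for complete forms."""
    missing = missing_fields(form)
    if missing:
        return {"sent": False, "missing": missing}  # no round trip needed
    return {"sent": True, "reply": send(form)}
```

The server would normally repeat such checks on its side, since it cannot assume that every client ran them.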
b. Distribution
 Means splitting a component into smaller parts and spreading
those parts across the system
 e.g., DNS - Domain Name System (abebe@aau.edu.et)
 Divide the name space into nonoverlapping zones
 For details, see later in Chapter 5 - Naming
An example of dividing the DNS name space into zones
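The idea of nonoverlapping zones can be sketched with a toy model in which each zone stores only its own records and delegates sub-zones to other zones. This is not how a real DNS resolver is implemented: the `Zone` class, the `resolve` function, and the address used in the usage note are all hypothetical.

```python
class Zone:
    """A zone owns its own records and knows only its direct sub-zones."""

    def __init__(self, name):
        self.name = name
        self.records = {}      # fully qualified host name -> address
        self.delegations = {}  # child zone name -> Zone

    def delegate(self, child):
        self.delegations[child.name] = child
        return child

def resolve(root, fqdn):
    """Follow delegations from the root; return (responsible zone, address)."""
    zone = root
    labels = fqdn.split(".")
    # Try ever longer suffixes: "et", then "edu.et", then "aau.edu.et", ...
    for i in range(len(labels) - 1, -1, -1):
        suffix = ".".join(labels[i:])
        if suffix in zone.delegations:
            zone = zone.delegations[suffix]
    return zone.name, zone.records.get(fqdn)
```

For example, after `aau = root.delegate(Zone("et")).delegate(Zone("edu.et")).delegate(Zone("aau.edu.et"))` and `aau.records["www.aau.edu.et"] = "192.0.2.10"`, a lookup of `www.aau.edu.et` passes through three delegations and is answered by the `aau.edu.et` zone alone. No single zone has to hold the whole name space.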
c. Replication
 Replicate components across a distributed system to increase
availability and for load balancing, leading to better
performance
 Replication is decided by the owner of a resource
 Caching (a special form of replication) also reduces
communication latency; decided by the user
 But, caching and replication may lead to consistency problems
(see Chapter 7 - Consistency and Replication)
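The consistency problem can be seen in a very small sketch. The `Origin` and `CachingClient` classes are hypothetical, and explicit invalidation is just one of several possible remedies.

```python
class Origin:
    """The authoritative copy of the data, held by the resource owner."""

    def __init__(self):
        self.data = {}
        self.reads = 0  # counts how often the origin is actually contacted

    def read(self, key):
        self.reads += 1
        return self.data[key]

    def write(self, key, value):
        self.data[key] = value

class CachingClient:
    """A client that keeps a local copy to avoid repeated remote reads."""

    def __init__(self, origin):
        self.origin = origin
        self.cache = {}

    def read(self, key):
        if key not in self.cache:
            self.cache[key] = self.origin.read(key)  # remote access on a miss
        return self.cache[key]

    def invalidate(self, key):
        self.cache.pop(key, None)  # drop the local copy
```

After the origin is updated, a client that cached the old value keeps returning it until the entry is invalidated. The cache saves communication, but at the price of possibly stale data.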
 Pitfalls when Developing Distributed Systems
 These arise from false assumptions made by first-time developers of
distributed systems; the assumptions concern properties that are specific to
distributed systems and do not arise in nondistributed
applications
 The network is reliable (making it difficult to achieve failure
transparency)
 The network is secure
 The network is homogeneous
 The topology does not change
 Latency is zero
 Bandwidth is infinite
 Transport cost is zero
 There is one administrator
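Dropping the first assumption ("the network is reliable") means every remote call needs a failure path. The sketch below assumes a hypothetical `unreliable_send` that loses messages at a configurable rate, and wraps it in a bounded retry loop. All names are illustrative.

```python
import random

class NetworkError(Exception):
    """Raised when a message is lost or a peer cannot be reached."""

def unreliable_send(message, rng, loss_rate):
    """Hypothetical transport that drops messages with probability loss_rate."""
    if rng.random() < loss_rate:
        raise NetworkError("message lost")
    return f"ack:{message}"

def send_with_retry(message, rng, attempts=5, loss_rate=0.5):
    """Retry a bounded number of times; return (reply, attempts used)."""
    for attempt in range(1, attempts + 1):
        try:
            return unreliable_send(message, rng, loss_rate), attempt
        except NetworkError:
            continue  # a real client would usually back off before retrying
    raise NetworkError(f"gave up after {attempts} attempts")

if __name__ == "__main__":
    print(send_with_retry("hello", random.Random()))
```

Retrying is only safe when repeating the request does no harm, because the sender cannot tell a lost request from a lost reply. This is the partial-failure problem mentioned earlier for RPC.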
1.3 Types of Distributed Systems
 Three types: distributed computing systems, distributed
information systems, and distributed pervasive/embedded
systems
1. Distributed Computing Systems
 Used for high-performance computing tasks
 Two types: cluster computing and grid computing
 Cluster Computing
 A collection of similar workstations or PCs
(homogeneous), closely connected by means of a high-
speed LAN
 Each node runs the same operating system
 Used for parallel programming in which a single compute
intensive program is run in parallel on multiple machines
An example of a cluster computing system
 A master node runs a middleware (containing libraries for
parallel programs) and controls other compute nodes; it
 Allocates tasks
 Provides an interface to users
 etc.
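The master's role in a cluster can be sketched on a single machine, with worker threads standing in for compute nodes. This is an assumption made so the example runs anywhere: a real cluster would use a parallel-programming library across machines. The function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def compute_node(chunk):
    """Work done by one (simulated) compute node: a partial sum of squares."""
    return sum(x * x for x in chunk)

def master(data, n_nodes=4):
    """Split the job, allocate one chunk per node, and combine the results."""
    chunks = [data[i::n_nodes] for i in range(n_nodes)]
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        partials = list(pool.map(compute_node, chunks))
    return sum(partials)

if __name__ == "__main__":
    print(master(list(range(10)), n_nodes=3))
```

The user sees only `master`; how many nodes exist and which node processed which chunk stays hidden behind that interface.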
 Grid Computing
 “Resource sharing and coordinated problem solving in
dynamic, multi-institutional virtual organizations” (Ian Foster)
 High degree of heterogeneity: no assumptions are made
concerning hardware, operating systems, networks,
administrative domains, security policies, etc.
 Globus is a software system for Grid Computing; read about
the Globus Alliance at http://www.globus.org/
2. Distributed Information Systems
 Many networked applications
 Problem: interoperability
 At the lowest level: wrap a number of requests into a single
larger request and have it executed as a distributed
transaction; all or none of the requests would be executed
 How to let applications communicate directly with each other,
i.e., Enterprise Application Integration (EAI)
a. Transaction Processing Systems
 Consider database applications
 Special primitives are required to program transactions, supplied
either by the underlying distributed system or by the language
runtime system
 Exact list of primitives depends on the type of application;
procedure calls, ordinary statements, etc. can also be included
Primitive            Description
BEGIN_TRANSACTION    Mark the start of a transaction
END_TRANSACTION      Terminate the transaction and try to commit
ABORT_TRANSACTION    Kill the transaction and restore the old values
READ                 Read data from a file, a table, or otherwise
WRITE                Write data to a file, a table, or otherwise
 The Transaction Model
 The model for transactions comes from the world of business
 A supplier and a retailer negotiate on
 Price
 Delivery date
 Quality
 etc.
 Until the deal is concluded they can continue negotiating or one
of them can terminate
 But once they have reached an agreement they are bound by
law to carry out their part of the deal
 Transactions between processes are similar to this scenario
 e.g., assume the following banking operation
 Withdraw an amount x from account 1
 Deposit the amount x to account 2
 What happens if there is a problem after the first activity is
carried out?
 Group the two operations into one transaction; either both are
carried out or neither
 We need a way to roll back when a transaction is not
completed
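The banking example can be sketched with Python's built-in `sqlite3` module. Here a single local database stands in for the bank, so this shows the all-or-nothing behaviour rather than a truly distributed transaction. The table layout, account numbers, and function names are assumptions made for illustration.

```python
import sqlite3

def make_bank():
    """Create an in-memory bank with two accounts (illustrative data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account ("
        "id INTEGER PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
    conn.commit()
    return conn

def _update(conn, account_id, delta):
    cur = conn.execute(
        "UPDATE account SET balance = balance + ? WHERE id = ?",
        (delta, account_id),
    )
    if cur.rowcount != 1:
        raise LookupError(f"no such account: {account_id}")

def transfer(conn, src, dst, amount):
    """Withdraw and deposit as one transaction: both happen or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            _update(conn, src, -amount)  # withdraw x from the source account
            _update(conn, dst, amount)   # deposit x into the target account
        return True
    except (sqlite3.IntegrityError, LookupError):
        return False

def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))
```

If the deposit fails after the withdrawal has already been applied, for instance because the target account does not exist, the rollback restores the source balance. No money disappears.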
 e.g., reserving a seat from Manchester to Lalibella through
Heathrow and Addis Ababa Bole airports
BEGIN_TRANSACTION
    reserve Man -> Heathrow;
    reserve Heathrow -> Bole;
    reserve Bole -> Lalibella;
END_TRANSACTION
(a) Transaction to reserve three flights commits
BEGIN_TRANSACTION
    reserve Man -> Heathrow;
    reserve Heathrow -> Bole;
    reserve Bole -> Lalibella full =>
ABORT_TRANSACTION
(b) Transaction aborts when the third flight is unavailable
 Properties of transactions, often referred to as ACID
1. Atomic: to the outside world, the transaction happens
indivisibly; a transaction either happens completely or not at
all; intermediate states are not seen by other processes
2. Consistent: the transaction does not violate system
invariants; e.g., in an internal transfer in a bank, the amount
of money in the bank must be the same as it was before the
transfer (the law of conservation of money); this may be
violated for a brief period of time, but the violation is not visible to other
processes
3. Isolated or Serializable: concurrent transactions do not
interfere with each other; if two or more transactions are
running at the same time, the final result must look as
though all transactions ran sequentially in some order
4. Durable: once a transaction commits, the changes are
permanent; see later in Chapter 8 - Fault Tolerance
 Classification of Transactions
 A transaction could be flat, nested or distributed
 Flat Transaction
 Consists of a series of operations that satisfy the ACID
properties
 Simple and widely used but with some limitations
 Does not allow partial results to be committed or aborted
 i.e., atomicity is also partly a weakness
 In our airline reservation example, we may want to
accept the first two reservations and find an alternative
one for the last
 Some transactions may take too much time
 Nested Transaction
 Constructed from a number of subtransactions; it is logically
decomposed into a hierarchy of subtransactions; the flight
reservation can be split into three transactions, each
accessing a different database
 The top-level transaction forks off children that run in parallel,
on different machines; to gain performance or for
programming simplicity
 Each may also execute one or more subtransactions
 Permanence (durability) applies only to the top-level
transaction; if it aborts, commits by its children must be undone
 Distributed Transaction
 A flat transaction that operates on data that are distributed
across multiple machines
 Problem: separate algorithms are needed to handle the
locking of data and committing the entire transaction; see
later in Chapter 8 for distributed commit
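The rule that durability applies only to the top-level transaction can be sketched with private workspaces. This is one possible approach, chosen for the example and not prescribed by the slides, and the `Transaction` class is hypothetical. A child's commit only hands its writes to its parent; nothing reaches the shared store until the top level commits.

```python
class Transaction:
    """A transaction with a private workspace and optional parent."""

    def __init__(self, store, parent=None):
        self.store = store    # shared, durable data (a plain dict here)
        self.parent = parent
        self.writes = {}      # private workspace of tentative writes

    def begin_child(self):
        return Transaction(self.store, parent=self)

    def read(self, key):
        # Look in our own workspace first, then in the ancestors', then the store.
        t = self
        while t is not None:
            if key in t.writes:
                return t.writes[key]
            t = t.parent
        return self.store.get(key)

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        if self.parent is not None:
            self.parent.writes.update(self.writes)  # provisional: parent decides
        else:
            self.store.update(self.writes)          # top level: now permanent

    def abort(self):
        self.writes.clear()  # discards committed children's results as well
```

If the top-level transaction aborts after a child has committed, the child's writes vanish with the parent's workspace, which is how the child's commit gets undone.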
(a) A nested transaction
(b) A distributed transaction
b. Enterprise Application Integration
 How to integrate applications independent from their
databases
 Transaction systems rely on request/reply
 How can applications communicate with each other; by
means of a middleware
Middleware as a communication facilitator in enterprise application
integration
 There are different communication models
 RPC (Remote Procedure Call)
 RMI (Remote Method Invocation)
 MOM (Message-Oriented Middleware)
 Stream-Oriented Communication
 Multicast Communication
 See later in Chapter 4 - Communication
3. Distributed Pervasive Systems
 The distributed systems discussed so far are characterized
by their stability; fixed nodes having high-quality connection
to a network
 There are also mobile and embedded computing devices
which are small, battery-powered, mobile, and with a
wireless connection
 Three requirements for pervasive applications
 Embrace contextual changes: a device is aware that its
environment (location, identities of nearby people and
objects, time of the day, season, temperature, etc.) may
change all the time, e.g., by changing its network access
point; hence its operations and services must be adapted to
the current context
 Encourage ad hoc composition: devices are used in different
ways by different users
 Recognize sharing as the default: devices join a system to
access or provide information
 Examples of pervasive systems
 Home Systems that integrate consumer electronics
 Electronic Health Care Systems to monitor the well-being of
individuals
 Sensor Networks
 Read pages 26 - 30
 Different approaches to distribution - Lost in the forest of
distribution
 Distributed System
 N autonomous computers (sites): n administrators, n
data/control flows
 An interconnection network
 User view: one single (virtual) system
 (traditional) programmer view: client-server
 Parallel System
 1 computer, n nodes: one administrator, one scheduler,
one power source
 Memory: it depends
 Programmer view: one single machine executing parallel
codes; various programming models (message passing,
distributed shared memory, …)
 Cluster Computing
 Use of PCs interconnected by a (high performance) network
as a parallel (cheap) machine
 Network Computing
 From LAN (cluster) computing to WAN computing
 Set of machines distributed over a MAN/WAN that are used
to execute parallel loosely coupled codes
 Depending on the infrastructure, network computing comes
in many flavours: grid computing, P2P, Internet computing,
etc.
a. Grid Computing
 “Resource sharing and coordinated problem solving in
dynamic, multi-institutional virtual organizations” (Ian
Foster)
b. Peer-to-Peer Computing
 A site is both client and server
 Application: mostly file sharing, but also others like
Internet Telephony (Skype)
 Two approaches:
 Centralized management: Napster
 Distributed management: Gnutella, Kazaa
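The two management styles can be contrasted in a toy model. `CentralIndex` mimics the Napster-style directory and `flood_search` mimics Gnutella-style flooding with a hop limit. Both are simplified sketches with hypothetical names, not the real protocols.

```python
class CentralIndex:
    """Centralized management: one directory knows who holds which file."""

    def __init__(self):
        self.index = {}  # file name -> set of peers that hold it

    def publish(self, peer, filename):
        self.index.setdefault(filename, set()).add(peer)

    def lookup(self, filename):
        return self.index.get(filename, set())

def flood_search(neighbors, files, start, filename, ttl):
    """Distributed management: ask neighbors, who ask theirs, up to ttl hops."""
    found, seen, frontier = set(), {start}, [start]
    for _ in range(ttl + 1):  # hop 0 is the starting peer itself
        next_frontier = []
        for peer in frontier:
            if filename in files.get(peer, ()):
                found.add(peer)
            for n in neighbors.get(peer, ()):
                if n not in seen:  # never forward the query to a peer twice
                    seen.add(n)
                    next_frontier.append(n)
        frontier = next_frontier
    return found
```

The central index answers in one step but is a single point of failure and a scalability bottleneck. Flooding has no central component, but a file that lies beyond the hop limit is simply not found.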
c. Internet Computing
 Use of (idle) computers interconnected by Internet for
processing high-throughput applications
 Programmer view: a single master, n servants
 Cloud Computing
 The practice of using a network of remote servers hosted on the
Internet to store, manage, and process data, rather than a local
server or a personal computer
 A general term for anything that involves delivering hosted
services over the Internet
 A model for enabling convenient, on-demand network
access to a shared pool of configurable computing
resources (e.g., networks, servers, storage, applications,
and services) that can be rapidly provisioned and
released with minimal management effort or service
provider interaction
 Service models: Software as a Service (SaaS); Platform
as a Service (PaaS); Infrastructure as a Service (IaaS)
36
37

Más contenido relacionado

La actualidad más candente

Distributed Systems
Distributed SystemsDistributed Systems
Distributed Systems
Rupsee
 

La actualidad más candente (20)

Structure of shared memory space
Structure of shared memory spaceStructure of shared memory space
Structure of shared memory space
 
Input output in linux
Input output in linuxInput output in linux
Input output in linux
 
Distributed Coordination-Based Systems
Distributed Coordination-Based SystemsDistributed Coordination-Based Systems
Distributed Coordination-Based Systems
 
Distributed deadlock
Distributed deadlockDistributed deadlock
Distributed deadlock
 
Design Goals of Distributed System
Design Goals of Distributed SystemDesign Goals of Distributed System
Design Goals of Distributed System
 
Concurrency Control Techniques
Concurrency Control TechniquesConcurrency Control Techniques
Concurrency Control Techniques
 
File replication
File replicationFile replication
File replication
 
Uml in software engineering
Uml in software engineeringUml in software engineering
Uml in software engineering
 
Distributed Operating System_1
Distributed Operating System_1Distributed Operating System_1
Distributed Operating System_1
 
multi processors
multi processorsmulti processors
multi processors
 
Distributed Systems
Distributed SystemsDistributed Systems
Distributed Systems
 
Distributed system architecture
Distributed system architectureDistributed system architecture
Distributed system architecture
 
Distributed shred memory architecture
Distributed shred memory architectureDistributed shred memory architecture
Distributed shred memory architecture
 
Cloud Application architecture styles
Cloud Application architecture styles Cloud Application architecture styles
Cloud Application architecture styles
 
11. dfs
11. dfs11. dfs
11. dfs
 
Distributed Operating System
Distributed Operating SystemDistributed Operating System
Distributed Operating System
 
Distributed system Tanenbaum chapter 1,2,3,4 notes
Distributed system Tanenbaum chapter 1,2,3,4 notes Distributed system Tanenbaum chapter 1,2,3,4 notes
Distributed system Tanenbaum chapter 1,2,3,4 notes
 
system sequence diagram
system sequence diagramsystem sequence diagram
system sequence diagram
 
Distributed Transactions(flat and nested) and Atomic Commit Protocols
Distributed Transactions(flat and nested) and Atomic Commit ProtocolsDistributed Transactions(flat and nested) and Atomic Commit Protocols
Distributed Transactions(flat and nested) and Atomic Commit Protocols
 
Dynamic interconnection networks
Dynamic interconnection networksDynamic interconnection networks
Dynamic interconnection networks
 

Similar a Chapter 1-Introduction.ppt

Chapter 1 introduction
Chapter 1 introductionChapter 1 introduction
Chapter 1 introduction
Tamrat Amare
 
Chapter 1 -_characterization_of_distributed_systems
Chapter 1 -_characterization_of_distributed_systemsChapter 1 -_characterization_of_distributed_systems
Chapter 1 -_characterization_of_distributed_systems
Francelyno Murela
 
Chapter 1_NG_2020.ppt
Chapter 1_NG_2020.pptChapter 1_NG_2020.ppt
Chapter 1_NG_2020.ppt
MrVMNair
 

Similar a Chapter 1-Introduction.ppt (20)

Chapter 1 introduction
Chapter 1 introductionChapter 1 introduction
Chapter 1 introduction
 
Distributed Systems.pptx
Distributed Systems.pptxDistributed Systems.pptx
Distributed Systems.pptx
 
Chapter 1-Introduction.ppt
Chapter 1-Introduction.pptChapter 1-Introduction.ppt
Chapter 1-Introduction.ppt
 
Chapter One.ppt
Chapter One.pptChapter One.ppt
Chapter One.ppt
 
istributed system
istributed systemistributed system
istributed system
 
433672084-distributed-vs-parallel-computing-ppt.ppt
433672084-distributed-vs-parallel-computing-ppt.ppt433672084-distributed-vs-parallel-computing-ppt.ppt
433672084-distributed-vs-parallel-computing-ppt.ppt
 
Chapter 1 -_characterization_of_distributed_systems
Chapter 1 -_characterization_of_distributed_systemsChapter 1 -_characterization_of_distributed_systems
Chapter 1 -_characterization_of_distributed_systems
 
DISTRIBUTED SYSTEM.docx
DISTRIBUTED SYSTEM.docxDISTRIBUTED SYSTEM.docx
DISTRIBUTED SYSTEM.docx
 
Chapter-1-IntroDistributeddffsfdfsdf-1.pptx
Chapter-1-IntroDistributeddffsfdfsdf-1.pptxChapter-1-IntroDistributeddffsfdfsdf-1.pptx
Chapter-1-IntroDistributeddffsfdfsdf-1.pptx
 
CSI-503 - 11.Distributed Operating System
CSI-503 - 11.Distributed Operating SystemCSI-503 - 11.Distributed Operating System
CSI-503 - 11.Distributed Operating System
 
- Introduction - Distributed - System -
- Introduction - Distributed - System  -- Introduction - Distributed - System  -
- Introduction - Distributed - System -
 
Distributed System Unit 1 Notes by Dr. Nilam Choudhary, SKIT Jaipur
Distributed System Unit 1 Notes by Dr. Nilam Choudhary, SKIT JaipurDistributed System Unit 1 Notes by Dr. Nilam Choudhary, SKIT Jaipur
Distributed System Unit 1 Notes by Dr. Nilam Choudhary, SKIT Jaipur
 
Chapter 1_NG_2020.ppt
Chapter 1_NG_2020.pptChapter 1_NG_2020.ppt
Chapter 1_NG_2020.ppt
 
characteristicsofdistributedsystem-121004123308-phpapp02.ppt
characteristicsofdistributedsystem-121004123308-phpapp02.pptcharacteristicsofdistributedsystem-121004123308-phpapp02.ppt
characteristicsofdistributedsystem-121004123308-phpapp02.ppt
 
Distributed system
Distributed systemDistributed system
Distributed system
 
distributed system chapter one introduction to distribued system.pdf
distributed system chapter one introduction to distribued system.pdfdistributed system chapter one introduction to distribued system.pdf
distributed system chapter one introduction to distribued system.pdf
 
Introduction to Distributed System
Introduction to Distributed SystemIntroduction to Distributed System
Introduction to Distributed System
 
Lecture 1 distriubted computing
Lecture 1 distriubted computingLecture 1 distriubted computing
Lecture 1 distriubted computing
 
20IT703_PDS_PPT_Unit_I.ppt
20IT703_PDS_PPT_Unit_I.ppt20IT703_PDS_PPT_Unit_I.ppt
20IT703_PDS_PPT_Unit_I.ppt
 
Overview of Distributed Systems
Overview of Distributed SystemsOverview of Distributed Systems
Overview of Distributed Systems
 

Más de balewayalew

Más de balewayalew (20)

slides.06.pptx
slides.06.pptxslides.06.pptx
slides.06.pptx
 
slides.07.pptx
slides.07.pptxslides.07.pptx
slides.07.pptx
 
slides.08.pptx
slides.08.pptxslides.08.pptx
slides.08.pptx
 
Data Analytics.ppt
Data Analytics.pptData Analytics.ppt
Data Analytics.ppt
 
PE1 Module 4.ppt
PE1 Module 4.pptPE1 Module 4.ppt
PE1 Module 4.ppt
 
PE1 Module 3.ppt
PE1 Module 3.pptPE1 Module 3.ppt
PE1 Module 3.ppt
 
PE1 Module 2.ppt
PE1 Module 2.pptPE1 Module 2.ppt
PE1 Module 2.ppt
 
Chapter -6- Ethics and Professionalism of ET (2).pptx
Chapter -6- Ethics and Professionalism of ET (2).pptxChapter -6- Ethics and Professionalism of ET (2).pptx
Chapter -6- Ethics and Professionalism of ET (2).pptx
 
Chapter -5- Augumented Reality (AR).pptx
Chapter -5- Augumented Reality (AR).pptxChapter -5- Augumented Reality (AR).pptx
Chapter -5- Augumented Reality (AR).pptx
 
Chapter 8.ppt
Chapter 8.pptChapter 8.ppt
Chapter 8.ppt
 
PE1 Module 1.ppt
PE1 Module 1.pptPE1 Module 1.ppt
PE1 Module 1.ppt
 
chapter7.ppt
chapter7.pptchapter7.ppt
chapter7.ppt
 
chapter6.ppt
chapter6.pptchapter6.ppt
chapter6.ppt
 
chapter5.ppt
chapter5.pptchapter5.ppt
chapter5.ppt
 
chapter4.ppt
chapter4.pptchapter4.ppt
chapter4.ppt
 
chapter3.ppt
chapter3.pptchapter3.ppt
chapter3.ppt
 
chapter2.ppt
chapter2.pptchapter2.ppt
chapter2.ppt
 
chapter1.ppt
chapter1.pptchapter1.ppt
chapter1.ppt
 
Ch 1-Non-functional Requirements.ppt
Ch 1-Non-functional Requirements.pptCh 1-Non-functional Requirements.ppt
Ch 1-Non-functional Requirements.ppt
 
Ch 6 - Requirement Management.pptx
Ch 6 - Requirement Management.pptxCh 6 - Requirement Management.pptx
Ch 6 - Requirement Management.pptx
 

Último

Último (20)

Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 

Chapter 1-Introduction.ppt

  • 1. Chapter 1 - Introduction Ayalew Belay (Dr.) with thanks to Dr. Mulugeta Lebsie
  • 2. 2 1.1 Introduction and Definition  Before the mid-80s, computers were  Very expensive (hundred of thousands or even millions of dollars)  Very slow (a few thousand instructions per second)  Not connected among themselves  After the mid-80s: two major developments  Cheap and powerful microprocessor-based computers appeared  Computer networks  LANs at speeds ranging from 10 to 1000 Mbps (now even 10, 40, and 100Gbps)  WANs at speed ranging from 64 Kbps to gigabits/sec  Consequence  Feasibility of using a large network of computers to work for the same application; this is in contrast to the old centralized systems where there was a single computer with its peripherals
  • 3. Definition of a Distributed System
      • A distributed system is: a collection of independent computers that appears to its users as a single coherent system or computer (Tanenbaum & Van Steen)
      • Other definitions
          • A distributed system is a system designed to support the development of applications and services which can exploit a physical architecture consisting of multiple, autonomous processing elements that do not share primary memory but cooperate by sending asynchronous messages over a communication network (Blair & Stefani)
      • The definitions have two aspects:
          1. Hardware: autonomous machines
          2. Software: a single system view for the users
  • 4. "A distributed system is one that stops you getting any work done when a machine you have never even heard of crashes" (Leslie Lamport)
  • 5. Why Distributed?
      • Resource and data sharing
          • Printers, databases, multimedia servers, ...
      • Availability, reliability
          • The loss of some instances can be hidden
      • Scalability, extensibility
          • The system grows with demand (e.g., extra servers)
      • Performance
          • Huge power (CPU, memory, ...) available
      • Inherent distribution, communication
          • Organizational distribution, e-mail, video
  • 6. Problems of Distribution
      • Concurrency, security
          • Clients must not disturb each other
      • Privacy
          • e.g., when building a preference profile such as using cookies
          • Unwanted communication such as spam
      • Partial failure
          • We often do not know where the error is (e.g., RPC)
      • Location, migration, relocation, replication
          • Clients must be able to find their servers
      • Heterogeneity
          • Hardware, platforms, languages, management
  • 7. Characteristics of Distributed Systems
      • Differences between the computers and the ways they communicate are hidden from users
      • Users and applications can interact with a distributed system in a consistent and uniform way regardless of location
      • Distributed systems should be easy to expand and scale
      • A distributed system is normally continuously available, even if there may be partial failures
  • 8. 1.2 Goals of a Distributed System
      • To support heterogeneous computers and networks and to provide a single-system view, a distributed system is often organized by means of a layer of software called middleware that extends over multiple machines
      • Figure: a distributed system organized as middleware; note that the middleware layer extends over multiple machines, and offers each application the same interface
      • Ack: most diagrams in all slides are taken from the textbook
  • 9. A distributed system should
      • Easily connect users with resources (printers, computers, storage facilities, data, files, Web pages, ...)
          • Some of the reasons
              • Economics: sharing resources such as printers and high-speed computers
              • To collaborate and exchange information
                  • Groupware: software for collaborative editing, teleconferencing, etc.
                  • e-commerce: buying and selling goods
      • Be transparent: hide the fact that the resources and processes are distributed across multiple computers
      • Be open
      • Be scalable
    Transparency in a Distributed System
      • A distributed system that is able to present itself to users and applications as if it were only a single computer system is said to be transparent
  • 10. Different forms of transparency in a distributed system (transparency: description)
      • Access: hide differences in data representation (endianness, file naming, ...) and how a resource is accessed
      • Location: hide where a resource is physically located; where is http://www.prenhall.com/index.html? (naming)
      • Migration: hide that a resource may move to another location
      • Relocation: hide that a resource may be moved to another location while in use; e.g., mobile users using their wireless laptops and moving from place to place
      • Replication: hide that a resource is replicated (for availability and performance); all replicas have the same name
      • Concurrency: hide that a resource may be shared by several competitive users; a resource must be left in a consistent state, e.g., through locking
      • Failure: hide the failure and recovery of a resource
    But trying to achieve full distribution transparency may be impossible or may not be a good idea
  • 11. Openness in a Distributed System
      • A distributed system should be open
          • We need well-defined interfaces
          • Interoperability
              • Components of different origin can communicate
          • Portability
              • Components work on different platforms
      • Another goal of an open distributed system is that it should be flexible and extensible: easy to configure the system out of different components, and easy to add new components or replace existing ones; easier said than done
      • An open distributed system is a system that offers services according to standard rules that describe the syntax and semantics of those services; e.g., protocols in networks
      • Standards: a necessity
  • 12. Scalability in Distributed Systems
      • A distributed system should be scalable; there are three dimensions
          • Size: adding more users and resources to the system
          • Geographically: users and resources may be far apart
          • Administratively: should be easy to manage even if it spans many administrative organizations
      • But a scalable system may exhibit performance problems
      • In distributed systems, services are often specified through interfaces, which are often described using an Interface Definition Language (IDL)
          • An IDL specifies only syntax: the names of the functions, types of parameters, return values, possible exceptions, ...
          • Semantics are given informally, by means of natural language
  • 13. Examples of scalability limitations (concept: example)
      • Centralized services: a single server for all users, mostly for security reasons
      • Centralized data: a single on-line telephone book
      • Centralized algorithms: doing routing based on complete information
    Scalability problems leading to low performance
      • Scaling techniques: how to solve scaling problems
      • The problem is mainly performance, and arises as a result of limitations in the capacity of servers and networks (for geographical scalability, with high latency and mostly unreliable links)
      • Three possible solutions: hiding communication latencies, distribution, and replication
  • 14. a. Hide Communication Latencies
      • Try to avoid waiting for responses to remote service requests
      • Let the requester do other useful work
          • i.e., construct requesting applications that use only asynchronous communication instead of synchronous communication; when a reply arrives the application is interrupted
          • Good for batch processing and parallel applications, since independent tasks can be scheduled while another task is waiting for communication to complete; non-parallel programs can use multithreading
      • Hiding communication latencies is not in general applicable to interactive applications
      • For interactive applications, try to reduce communication by moving part of the job to the client; e.g., when filling in a form to access a database, let the client check the entries
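A minimal Python sketch of the asynchronous style described on this slide. The `remote_call` coroutine and the server names are hypothetical stand-ins (a sleep models network latency); nothing here comes from the original slides. The point is only that three requests issued together take roughly as long as one, instead of three times as long.

```python
import asyncio
import time


async def remote_call(name: str, delay: float) -> str:
    # Hypothetical stand-in for a remote service request;
    # the sleep models the time spent waiting on the network.
    await asyncio.sleep(delay)
    return f"reply from {name}"


async def main() -> list[str]:
    # Issue all three requests at once rather than waiting for
    # each reply before sending the next request.
    return await asyncio.gather(
        remote_call("server-a", 0.2),
        remote_call("server-b", 0.2),
        remote_call("server-c", 0.2),
    )


start = time.perf_counter()
replies = asyncio.run(main())
elapsed = time.perf_counter() - start
print(replies, f"{elapsed:.2f}s")
```

Issued synchronously, the same three calls would need about 0.6 s; here the waits overlap.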
  • 15. Figure: (a) a server checking the correctness of field entries; (b) a client doing the job
      • e.g., checking the completeness of mandatory fields
      • Shipping code is now supported in Web applications using Java Applets and ActiveX controls (with some security issues)
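To make case (b) concrete, here is a small Python sketch in which the client checks mandatory fields itself and contacts the server only when the form is complete. The field names, the `submit` helper, and the `send` callback are all assumptions made for illustration.

```python
from collections.abc import Callable

# Hypothetical mandatory fields for an illustrative form.
MANDATORY_FIELDS = ("name", "email")


def missing_fields(form: dict[str, str]) -> list[str]:
    # A field counts as missing when it is absent or only whitespace.
    return [f for f in MANDATORY_FIELDS if not form.get(f, "").strip()]


def submit(form: dict[str, str], send: Callable[[dict[str, str]], str]) -> str:
    # Check on the client first; no message is sent for an incomplete form.
    missing = missing_fields(form)
    if missing:
        return "please fill in: " + ", ".join(missing)
    # Only a complete form costs a round trip to the server.
    return send(form)
```

A server should still validate what it receives; the client-side check only saves round trips.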
  • 16. b. Distribution
      • Means splitting a component into smaller parts and spreading those parts across the system
      • e.g., DNS, the Domain Name System (abebe@aau.edu.et)
          • Divide the name space into nonoverlapping zones
          • For details, see later in Chapter 5 - Naming
      • Figure: an example of dividing the DNS name space into zones
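The zone idea can be sketched in Python as a toy lookup in which each zone knows only its own records and the child zones it delegates to. The zone table, the host name `www`, and the address (taken from the 192.0.2.0/24 documentation range) are invented for illustration; real DNS resolution involves much more than this.

```python
# Hypothetical zone data: no single zone holds the whole name space.
ZONES = {
    ".": {"delegations": {"et": "et."}},
    "et.": {"delegations": {"edu": "edu.et."}},
    "edu.et.": {"delegations": {"aau": "aau.edu.et."}},
    "aau.edu.et.": {"delegations": {}, "records": {"www": "192.0.2.10"}},
}


def resolve(name: str) -> tuple[str, list[str]]:
    """Walk from the root zone down, following delegations label by label."""
    labels = name.rstrip(".").split(".")
    zone = "."
    visited = [zone]
    while labels:
        child = ZONES[zone].get("delegations", {}).get(labels[-1])
        if child is None:
            break  # the current zone must answer for the remaining labels
        zone = child
        visited.append(zone)
        labels.pop()
    address = ZONES[zone].get("records", {}).get(".".join(labels))
    if address is None:
        raise KeyError(name)
    return address, visited


address, visited = resolve("www.aau.edu.et")
print(address, visited)
```

Each step of the walk could be served by a different machine, which is what spreads the load.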
  • 17. c. Replication
      • Replicate components across a distributed system to increase availability and for load balancing, leading to better performance
      • Replication is decided by the owner of a resource
      • Caching (a special form of replication) also reduces communication latency; it is decided by the user
      • But caching and replication may lead to consistency problems (see Chapter 7 - Consistency and Replication)
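The consistency problem mentioned here can be reproduced with a few lines of Python. `CachingClient` is a hypothetical class and a plain dictionary stands in for the remote resource; the sketch shows a stale read after the origin changes, and one simple remedy (explicit invalidation).

```python
class CachingClient:
    """Toy client-side cache; the origin dict stands in for a remote server."""

    def __init__(self, origin: dict[str, str]) -> None:
        self.origin = origin
        self.cache: dict[str, str] = {}

    def read(self, key: str) -> str:
        # Fetch from the origin only on a cache miss.
        if key not in self.cache:
            self.cache[key] = self.origin[key]
        return self.cache[key]

    def invalidate(self, key: str) -> None:
        # Drop the local copy so the next read goes back to the origin.
        self.cache.pop(key, None)


origin = {"page": "v1"}
client = CachingClient(origin)
first = client.read("page")   # fetched from the origin
origin["page"] = "v2"         # the owner updates the resource
stale = client.read("page")   # the cached copy is now out of date
client.invalidate("page")
fresh = client.read("page")   # refetched after invalidation
print(first, stale, fresh)
```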
  • 18. Pitfalls when Developing Distributed Systems
      • They arise from false assumptions made by first-time developers of distributed systems; the assumptions concern properties of distributed systems and do not come up in nondistributed applications
          • The network is reliable (making it difficult to achieve failure transparency)
          • The network is secure
          • The network is homogeneous
          • The topology does not change
          • Latency is zero
          • Bandwidth is infinite
          • Transport cost is zero
          • There is one administrator
  • 19. 1.3 Types of Distributed Systems
      • Three types: distributed computing systems, distributed information systems, and distributed pervasive/embedded systems
    1. Distributed Computing Systems
      • Used for high-performance computing tasks
      • Two types: cluster computing and grid computing
      • Cluster Computing
          • A collection of similar workstations or PCs (homogeneous), closely connected by means of a high-speed LAN
          • Each node runs the same operating system
          • Used for parallel programming in which a single compute-intensive program is run in parallel on multiple machines
  • 20. Figure: an example of a cluster computing system
      • A master node runs middleware (containing libraries for parallel programs) and controls the other compute nodes; it
          • Allocates tasks
          • Provides an interface to users
          • etc.
  • 21. Grid Computing
      • “Resource sharing and coordinated problem solving in dynamic, multi-institutional virtual organizations” (Ian Foster)
      • High degree of heterogeneity: no assumptions are made concerning hardware, operating systems, networks, administrative domains, security policies, etc.
      • Globus is a software system for Grid Computing; read about the Globus Alliance at http://www.globus.org/
    2. Distributed Information Systems
      • Many networked applications
      • Problem: interoperability
      • At the lowest level: wrap a number of requests into a single larger request and have it executed as a distributed transaction; all or none of the requests would be executed
      • How to let applications communicate directly with each other, i.e., Enterprise Application Integration (EAI)
  • 22. a. Transaction Processing Systems
      • Consider database applications
      • Special primitives are required to program transactions, supplied either by the underlying distributed system or by the language runtime system
      • The exact list of primitives depends on the type of application; procedure calls, ordinary statements, etc. can also be included
      • Primitives (primitive: description)
          • BEGIN_TRANSACTION: mark the start of a transaction
          • END_TRANSACTION: terminate the transaction and try to commit
          • ABORT_TRANSACTION: kill the transaction and restore the old values
          • READ: read data from a file, a table, or otherwise
          • WRITE: write data to a file, a table, or otherwise
  • 23. The Transaction Model
      • The model for transactions comes from the world of business
      • A supplier and a retailer negotiate on
          • Price
          • Delivery date
          • Quality
          • etc.
      • Until the deal is concluded they can continue negotiating, or one of them can terminate
      • But once they have reached an agreement they are bound by law to carry out their part of the deal
      • Transactions between processes are similar to this scenario
  • 24. e.g., assume the following banking operation
      • Withdraw an amount x from account 1
      • Deposit the amount x to account 2
      • What happens if there is a problem after the first activity is carried out?
      • Group the two operations into one transaction; either both are carried out or neither
      • We need a way to roll back when a transaction is not completed
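As a hedged illustration of the banking example, the sketch below uses Python's standard `sqlite3` module on a single in-memory database (so it shows atomicity and rollback, not distribution). The table layout, the `transfer` function, and the balances are assumptions chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn: sqlite3.Connection, src: int, dst: int, amount: int) -> bool:
    try:
        # "with conn" commits if the block finishes and rolls back
        # if an exception escapes, so both updates happen or neither.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
            cur = conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            if cur.rowcount != 1:
                # The withdrawal already ran; raising here undoes it.
                raise LookupError(f"no such account: {dst}")
        return True
    except (LookupError, sqlite3.IntegrityError):
        return False


def balances(conn: sqlite3.Connection) -> dict[int, int]:
    return dict(conn.execute("SELECT id, balance FROM account"))
```

Calling `transfer(conn, 1, 99, 30)` withdraws from account 1, fails on the missing account 99, and leaves both balances as they were.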
  • 25. e.g., reserving a seat from Manchester to Lalibela through Heathrow and AA Bole airports
      • (a) Transaction to reserve three flights commits
            BEGIN_TRANSACTION
              reserve Man -> Heathrow;
              reserve Heathrow -> Bole;
              reserve Bole -> Lalibela;
            END_TRANSACTION
      • (b) Transaction aborts when the third flight is unavailable
            BEGIN_TRANSACTION
              reserve Man -> Heathrow;
              reserve Heathrow -> Bole;
              reserve Bole -> Lalibela full => ABORT_TRANSACTION
  • 26. Properties of transactions, often referred to as ACID
      1. Atomic: to the outside world, the transaction happens indivisibly; a transaction either happens completely or not at all; intermediate states are not seen by other processes
      2. Consistent: the transaction does not violate system invariants; e.g., in an internal transfer in a bank, the amount of money in the bank must be the same as it was before the transfer (the law of conservation of money); this may be violated for a brief period of time, but the violation is not visible to other processes
      3. Isolated or Serializable: concurrent transactions do not interfere with each other; if two or more transactions are running at the same time, the final result must look as though all transactions ran sequentially in some order
      4. Durable: once a transaction commits, the changes are permanent; see later in Chapter 8 - Fault Tolerance
  • 27. Classification of Transactions
      • A transaction could be flat, nested, or distributed
      • Flat Transaction
          • Consists of a series of operations that satisfy the ACID properties
          • Simple and widely used, but with some limitations
              • Does not allow partial results to be committed or aborted, i.e., atomicity is also partly a weakness
              • In our airline reservation example, we may want to accept the first two reservations and find an alternative one for the last
              • Some transactions may take too much time
  • 28. Nested Transaction
      • Constructed from a number of subtransactions; it is logically decomposed into a hierarchy of subtransactions; e.g., the flight reservation can be split into three transactions, each accessing a different database
      • The top-level transaction forks off children that run in parallel on different machines, to gain performance or for programming simplicity
      • Each child may also execute one or more subtransactions
      • Permanence (durability) applies only to the top-level transaction; commits by children must be undone if the parent aborts
    Distributed Transaction
      • A flat transaction that operates on data that are distributed across multiple machines
      • Problem: separate algorithms are needed to handle the locking of data and committing the entire transaction; see later in Chapter 8 for distributed commit
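One way to get a feel for nested transactions is SQLite's SAVEPOINT mechanism, used below through Python's standard `sqlite3` module. This is only a single-database approximation under stated assumptions: the `reservation` table, the `reserve_trip` function, and the leg names are hypothetical, and real nested transactions would run the children on different machines. Each leg is a subtransaction that "commits" with RELEASE, yet all of them vanish if the top-level transaction aborts.

```python
import sqlite3

# isolation_level=None lets the code issue BEGIN/COMMIT/ROLLBACK itself.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE reservation (leg TEXT PRIMARY KEY)")


def reserve_trip(conn: sqlite3.Connection, legs: list[str], full: set[str]) -> bool:
    conn.execute("BEGIN")  # top-level transaction
    for i, leg in enumerate(legs):
        savepoint = f"leg_{i}"
        conn.execute(f"SAVEPOINT {savepoint}")  # start a subtransaction
        if leg in full:
            # Aborting the parent also undoes the children that
            # were already released (their "commits" are provisional).
            conn.execute("ROLLBACK")
            return False
        conn.execute("INSERT INTO reservation (leg) VALUES (?)", (leg,))
        conn.execute(f"RELEASE {savepoint}")  # child commits into the parent
    conn.execute("COMMIT")  # only now do the changes become durable
    return True


def reserved(conn: sqlite3.Connection) -> int:
    return conn.execute("SELECT COUNT(*) FROM reservation").fetchone()[0]


legs = ["Manchester-Heathrow", "Heathrow-Bole", "Bole-Lalibela"]
```

With `full={"Bole-Lalibela"}` the first two legs are released and then discarded by the top-level rollback; with an empty `full` set all three legs are committed together.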
  • 29. Figure: (a) a nested transaction; (b) a distributed transaction
  • 30. b. Enterprise Application Integration
      • How to integrate applications independently of their databases
      • Transaction systems rely on request/reply
      • How can applications communicate with each other? By means of middleware
      • Figure: middleware as a communication facilitator in enterprise application integration
  • 31. There are different communication models
      • RPC (Remote Procedure Call)
      • RMI (Remote Method Invocation)
      • MOM (Message-Oriented Middleware)
      • Stream-Oriented Communication
      • Multicast Communication
      • See later in Chapter 4 - Communication
    3. Distributed Pervasive Systems
      • The distributed systems discussed so far are characterized by their stability: fixed nodes with a high-quality connection to a network
      • There are also mobile and embedded computing devices, which are small, battery-powered, mobile, and have a wireless connection
  • 32. Three requirements for pervasive applications
      • Embrace contextual changes: a device is aware that its environment (location, identities of nearby people and objects, time of the day, season, temperature, etc.) may change all the time, e.g., by changing its network access point; hence its operations and services must be adapted to the current context
      • Encourage ad hoc composition: devices are used in different ways by different users
      • Recognize sharing as the default: devices join a system to access or provide information
    Examples of pervasive systems
      • Home systems that integrate consumer electronics
      • Electronic health care systems to monitor the well-being of individuals
      • Sensor networks
      • Read pages 26 - 30
  • 33. Different approaches to distribution - Lost in the forest of distribution
      • Distributed System
          • n autonomous computers (sites): n administrators, n data/control flows
          • An interconnection network
          • User view: one single (virtual) system
          • (Traditional) programmer view: client-server
      • Parallel System
          • 1 computer, n nodes: one administrator, one scheduler, one power source
          • Memory: it depends
          • Programmer view: one single machine executing parallel codes; various programming models (message passing, distributed shared memory, ...)
  • 34. Cluster Computing
      • Use of PCs interconnected by a (high-performance) network as a (cheap) parallel machine
    Network Computing
      • From LAN (cluster) computing to WAN computing
      • Set of machines distributed over a MAN/WAN that are used to execute loosely coupled parallel codes
      • Depending on the infrastructure, network computing comes in many flavours: grid computing, P2P, Internet computing, etc.
    a. Grid Computing
      • “Resource sharing and coordinated problem solving in dynamic, multi-institutional virtual organizations” (Ian Foster)
    b. Peer-to-Peer Computing
      • A site is both client and server
      • Application: mostly file sharing, but also others like Internet telephony (Skype)
  • 35. Two approaches (to peer-to-peer computing):
      • Centralized management: Napster
      • Distributed management: Gnutella, Kazaa
    c. Internet Computing
      • Use of (idle) computers interconnected by the Internet for processing high-throughput applications
      • Programmer view: a single master, n servants
    Cloud Computing
      • The practice of using a network of remote servers hosted on the Internet to store, manage, and process data, rather than a local server or a personal computer
      • A general term for anything that involves delivering hosted services over the Internet
  • 36. Cloud Computing (continued)
      • A model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction
      • Service models: Software as a Service (SaaS); Platform as a Service (PaaS); Infrastructure as a Service (IaaS)
