Interprocess Communication


    Chapter 4 Distributed Systems,
        Concepts and Design
Distributed Systems
• “A distributed system is a collection of
  independent computers that appears to
  its users as a single coherent system.”
  (Tanenbaum)
• Distributed systems are therefore built
  around communication. Actually, it
  could be argued that computers are
  used more as communication devices
  than computational devices.
Communications
• Because communications are critical to
  distributed systems, communications
  protocols tend to be well defined. A key form
  of communications is interprocess
  communications, based on low-level message
  passing over the network.
• Protocols are sets of rules that must be
  followed to enable standardized
  communications.
Overhead
• Overhead is a financial term that refers to indirect
  costs in a business. For example, a merchant cannot
  sell you a product for the price that he pays because
  he has additional costs beyond buying the
  merchandise such as rent and staff wages. Overhead
  always puts pressure on profits, so it must be kept to
  a minimum. Because corporations treat information
  technology as overhead, overhead is a major concern
  in this course. Activities that support work rather
  than doing work in IT are also costs and are referred
  to as overhead.
Communications Overhead
• In most communication systems, overhead is a key
  concern. Overhead is everything the system sends or
  does beyond the content of the messages themselves.
  Headers and trailers involve sending extra
  information, so they are overhead. In a phone system,
  overhead includes time spent setting up and tearing
  down the circuit path over which a phone call can
  take place. TCP is like a phone call, since it has to
  set up, tear down, and manage the connection in
  addition to “talk time.”
Headers and trailers
• Each protocol level treats the unit handed down by
  the level above as data and attaches its own header
  (and, at some levels, a trailer).

  [Figure: Headers | Message | Trailer]

Note that short messages are mostly overhead while long messages
involve a much higher proportion of actual work.
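The note above can be made concrete with a little arithmetic. This is a minimal sketch that assumes a 20-byte IPv4 header plus a 20-byte TCP header with no options; the function name and the two payload sizes are illustrative and not from the slides.

```python
# Assumed header sizes: 20-byte IPv4 header + 20-byte TCP header, no options.
HEADER_BYTES = 20 + 20


def overhead_fraction(payload_bytes, header_bytes=HEADER_BYTES):
    """Fraction of the bytes sent that are header rather than message."""
    return header_bytes / (header_bytes + payload_bytes)


short_msg = overhead_fraction(10)    # 40 of 50 bytes are header
long_msg = overhead_fraction(1460)   # 40 of 1500 bytes are header

print(f"10-byte message:   {short_msg:.1%} overhead")
print(f"1460-byte message: {long_msg:.1%} overhead")
```

Under these assumptions the short message is mostly header, while the long one spends only a few percent of its bytes on headers.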
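Normal Operation of TCP
Figure 2-4a in Tanenbaum et al.

  1. Client -> Server: SYN
  2. Server -> Client: SYN, ACK(SYN)
  3. Client -> Server: ACK(SYN)
  4. Client -> Server: request
  5. Client -> Server: FIN
  6. Server -> Client: ACK(req+FIN)
  7. Server -> Client: answer
  8. Server -> Client: FIN
  9. Client -> Server: ACK(FIN)

Steps 4 and 7 do the communication. All of the rest of the TCP
messages are overhead operations.

KEY: SYN = SYNchronize, ACK = ACKnowledge, FIN = FINished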
Transactional TCP
Figure 2-4b in Tanenbaum et al.

  1. Client -> Server: SYN, request, FIN
  2. Server -> Client: SYN, ACK(FIN), answer, FIN
  3. Client -> Server: ACK(FIN)

By sending the message and response with the overhead signals,
transactional TCP can speed up throughput and reduce overhead
time delays.
Classroom Exercise
• Calculate the percentage improvement in
  throughput of Transactional TCP (sending 3
  messages instead of 9) under the following
  assumptions:
• 1) Short packets, dominated by latency of 10 ms.
• 2) Ethernet LAN, 10 ms latency, 10Mbps bandwidth,
  maximum Ethernet packet size of 1500 bytes.
• 3) TCP/IP WAN, 20 ms latency, 500 Mbps bandwidth,
  maximum TCP packet size of 64KB. (Latency assumes
  multiple hops between routers)
• Thought exercise: When is Transactional TCP
  worthwhile?
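One way to approach the exercise is sketched below. The model is an assumption, not part of the slides: every message waits for the previous one, control messages (SYN, ACK, FIN) have negligible transmission time, the request and the answer each fill one maximum-size packet, and 64KB is taken as 64 * 1024 bytes. The function names are hypothetical.

```python
def exchange_time_ms(n_messages, latency_ms, payload_bytes=0,
                     bandwidth_bps=1.0, data_messages=2):
    """Total time for one request/answer exchange under the assumed model.

    Each message costs one latency; only the request and the answer
    (data_messages) also cost transmission time.
    """
    transmit_ms = payload_bytes * 8 / bandwidth_bps * 1000
    return n_messages * latency_ms + data_messages * transmit_ms


def improvement_percent(latency_ms, payload_bytes=0, bandwidth_bps=1.0):
    """Percentage gain in exchanges per second: 3 messages versus 9."""
    tcp = exchange_time_ms(9, latency_ms, payload_bytes, bandwidth_bps)
    ttcp = exchange_time_ms(3, latency_ms, payload_bytes, bandwidth_bps)
    return (tcp / ttcp - 1) * 100


case1 = improvement_percent(10)                      # latency only
case2 = improvement_percent(10, 1500, 10e6)          # Ethernet LAN
case3 = improvement_percent(20, 64 * 1024, 500e6)    # TCP/IP WAN

for label, value in (("1", case1), ("2", case2), ("3", case3)):
    print(f"Case {label}: about {value:.0f}% more exchanges per second")
```

In this model the gain shrinks as transmission time grows relative to latency, which is a useful starting point for the thought exercise.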
Ethernet Jumbo Frames
• Ethernet Jumbo Frames of 9KB are possible
  if supported end to end. A 9KB Ethernet
  frame can hold an 8 KB application datagram
  (the NFS standard size) plus packet header
  overhead. Ethernet cannot use 64KB frames
  because its 32-bit CRC, which is used for
  error detection, loses effectiveness above
  about 12KB, which is hard to change.
  [P. Dykstra]
Upper Bound of TCP
• Dykstra’s article (see References) is a good discussion
  of frame (packet, datagram) size.
• Dykstra quotes an article by Matt Mathis et al. which
  sets this limit on TCP WAN performance:
• Throughput <= ~0.7 * MSS / (rtt * sqrt(packet_loss))
• MSS = Maximum Segment Size = packet size minus TCP/IP
  headers
• rtt = round trip time (about 40 ms NYC to LA)
• packet_loss = fraction of packets lost, so 0.1% is
  0.001 (wide variation; 0.1% is a typical value)
Importance of Mathis Formula

• If you examine the formula:
• Throughput <= ~0.7 * MSS / (rtt * sqrt(packet_loss))
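• You will see that the bound is proportional to the
  maximum segment size and inversely proportional to
  the round trip time, while the loss rate has only an
  inverse square root effect. In general, doubling the
  MSS doubles the bound, whereas the loss rate must
  fall by a factor of four to achieve the same gain.
• Remember that maximum segment size, packet size,
  datagram size and frame size differ only by the
  header bytes added at each level, so for this
  discussion they mean approximately the same thing.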
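The bound can be evaluated directly. The sketch below assumes an MSS of 1460 bytes (a 1500-byte Ethernet packet minus 40 bytes of TCP/IP headers), the 40 ms round trip mentioned in the slides, and a loss rate of 0.1% written as the fraction 0.001. The function name is illustrative.

```python
import math


def mathis_bound_bps(mss_bytes, rtt_s, loss_fraction):
    """Approximate upper bound on TCP throughput, in bits per second."""
    return 0.7 * mss_bytes * 8 / (rtt_s * math.sqrt(loss_fraction))


base = mathis_bound_bps(1460, 0.040, 0.001)
double_mss = mathis_bound_bps(2 * 1460, 0.040, 0.001)
quarter_loss = mathis_bound_bps(1460, 0.040, 0.001 / 4)

print(f"Baseline bound:       {base / 1e6:.2f} Mbps")
print(f"Doubling the MSS:     x{double_mss / base:.2f}")
print(f"Quartering the loss:  x{quarter_loss / base:.2f}")
```

With these inputs the bound comes out at roughly 6.5 Mbps, and doubling the MSS has the same effect as cutting the loss rate to a quarter.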
Storing Data
• Data stored in digital format is composed of binary
  sequences that have a combination of logical and
  arbitrary meanings attached to them. Most binary
  formats for numbers are logical, although there are
  a lot of differences in storage sizes and handling
  negative numbers and exponents. While it is
  somewhat logical that 0101 represents 5 as a short
  integer, it is somewhat less logical that 01000001
  represents A and 01100001 represents a in the ASCII
  code or that 11000001 represents A and 10000001
  represents a in the EBCDIC code.
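The character codes can be checked with Python's built-in codecs. This sketch uses the `cp037` codec as a representative EBCDIC code page, which is an assumption; other EBCDIC code pages exist, and the helper name `bits` is illustrative.

```python
def bits(char, codec):
    """Return the 8-bit pattern that a codec assigns to one character."""
    return format(char.encode(codec)[0], "08b")


ascii_upper = bits("A", "ascii")
ascii_lower = bits("a", "ascii")
ebcdic_upper = bits("A", "cp037")   # cp037 is one common EBCDIC code page
ebcdic_lower = bits("a", "cp037")

print("ASCII :", ascii_upper, ascii_lower)
print("EBCDIC:", ebcdic_upper, ebcdic_lower)
```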
Numeric formats
• Computers store multi-byte values in memory in
  different byte orders, so the 16-bit value
  11110000 00001111 might be stored with the 11110000
  byte in the lowest memory location on one computer
  (big-endian) and the 00001111 byte there on another
  (little-endian). The same bit pattern also has
  different meanings as an unsigned integer or as a
  signed integer in two’s complement notation. There
  are different formats for storing floating point
  numbers. Computers have different register sizes,
  making default word sizes of 8, 16, 32, 36 or 64
  bits most practical in different CPUs.
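Byte order and signedness can both be seen with Python's `struct` module. This is a minimal sketch; the 16-bit value 0xF00F (bits 11110000 00001111) and the variable names are chosen for illustration.

```python
import struct

value = 0xF00F  # bits: 11110000 00001111

# The same 16-bit value laid out in the two byte orders.
big_endian = struct.pack(">H", value)      # high byte stored first
little_endian = struct.pack("<H", value)   # low byte stored first

# The same 8 bits (11110000) read as unsigned and as two's complement.
as_unsigned = struct.unpack("B", b"\xf0")[0]
as_signed = struct.unpack("b", b"\xf0")[0]

print(big_endian.hex(), little_endian.hex())
print(as_unsigned, as_signed)
```

A receiver that assumes the wrong byte order would read 0x0FF0 instead of 0xF00F, which is why a transfer format has to fix the order explicitly.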
Transferring Data
• With different coding schemes, memory storage
  order, word sizes and numeric formats, generic
  attempts to transfer information between systems
  must carefully define formats for the transferred
  data and have ways to convert data to the data
  transfer format and back to another format. Such a
  scheme must understand the format at both ends of
  the transaction. The agreed intermediate format is
  called an External Data Representation (XDR), and
  the notation used to describe the data types and
  operations being exchanged is called an Interface
  Definition Language (IDL).
External Data Representation

   • There are three different common approaches
     to XDR:
   • CORBA’s common data representation, which
     can be used by a variety of languages.
   • Java’s object serialization, which can even
     pass complex objects across a network, but is
     limited to Java only.
   • Extensible Markup Language (XML), which
     can represent even structured data as ASCII
     text.
Marshalling and
Unmarshalling

  • Converting information to a network
    transportable form (XDR) following
    the specifications of an IDL is called
    marshalling. Converting it back to an
    application readable format is called
    unmarshalling.
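Marshalling and unmarshalling can be sketched for a small record. The wire format here is an assumption made for illustration only: each string is sent as a 4-byte big-endian length followed by UTF-8 bytes, and the year as a 4-byte big-endian unsigned integer. It is not CORBA CDR or any other standard format, and the function names and sample values are hypothetical.

```python
import struct


def marshal_person(name, place, year):
    """Flatten (name, place, year) into bytes using the assumed format."""
    out = b""
    for text in (name, place):
        data = text.encode("utf-8")
        out += struct.pack(">I", len(data)) + data  # length prefix, then bytes
    out += struct.pack(">I", year)
    return out


def unmarshal_person(buf):
    """Rebuild (name, place, year) from bytes made by marshal_person."""
    fields = []
    offset = 0
    for _ in range(2):
        (length,) = struct.unpack_from(">I", buf, offset)
        offset += 4
        fields.append(buf[offset:offset + length].decode("utf-8"))
        offset += length
    (year,) = struct.unpack_from(">I", buf, offset)
    return fields[0], fields[1], year


wire = marshal_person("Smith", "London", 1984)
restored = unmarshal_person(wire)
print(len(wire), restored)
```

Both ends must agree on the field order, the length prefix and the byte order; that agreement is the role the external data representation plays.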
Java Object Serialization
• Serialization transforms an object into a sequence of
  bytes. This allows objects to be saved to files or
  transferred across a network, and is a key feature of
  Java. Since objects can have attributes that are also
  objects, and those objects can have object attributes,
  serialization allows a very complex structure to be
  transferred across a network or stored in a file.
• Classes that need to be stored in files or transferred
  over a network should implement the
  java.io.Serializable interface.
Reflection
• Java supports reflection: the ability to
  enquire about the properties of a class,
  including the names and types of its instance
  variables. A class can be looked up by its
  name, and a constructor with specified
  argument types can be used to create an
  instance of that class. Reflection makes
  serialization and deserialization possible
  and allows an object to be instantiated by a
  Java Virtual Machine after transfer across a
  network.
The Document is the Object
• XML (eXtensible Markup Language)
  - Describes the structure of a document
  - Defines new tags
  - Specifies metadata that lets programs discover
    document structure
• DOM (Document Object Model)
  - Allows programmatic access to the structure
    and content of XML documents
• XSL (eXtensible Stylesheet Language)
  - The XML version of style sheets
What is XML?
• XML stands for eXtensible Markup Language.
• XML specification defines a syntax and
  document organization for data, represented by
  tag/value pairs.
• XML Elements have data surrounded by
  matching start and end tags.
• XML Attributes are optional name="value" pairs
  that appear inside a start tag.
• There is a well defined syntax that can be
  parsed.
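Elements and attributes can be seen by parsing a small document with Python's standard `xml.etree.ElementTree` module. This is a minimal sketch; the person record and its values are illustrative.

```python
import xml.etree.ElementTree as ET

# A hypothetical document: one element with an attribute and three children.
document = (
    '<person id="123456789">'
    "<name>Smith</name>"
    "<place>London</place>"
    "<year>1984</year>"
    "</person>"
)

person = ET.fromstring(document)

person_id = person.get("id")              # attribute in the start tag
name = person.find("name").text           # data between start and end tags
year = int(person.find("year").text)      # element text is always a string
child_tags = [child.tag for child in person]

print(person.tag, person_id, name, year, child_tags)
```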
XML Namespaces
• An XML namespace is a collection of names,
  identified by a URI reference, which are used in
  XML documents as element types and attribute
  names. XML namespaces have internal structure
  and are not, mathematically speaking, sets.
• The URI that identifies the namespace is
  declared with an attribute named xmlns, like this:
  xmlns:pers = "http://www.cdk4.net/person"
• See http://www.w3.org/XML/ for specifications.
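The effect of a namespace declaration can be observed with `xml.etree.ElementTree`, which expands each prefixed name to the namespace URI in braces followed by the local name. This is a sketch; the sample document is hypothetical and reuses the URI from the slide.

```python
import xml.etree.ElementTree as ET

NS = "http://www.cdk4.net/person"

# Hypothetical document that declares the prefix "pers" for the namespace.
document = (
    f'<person pers:id="123456789" xmlns:pers="{NS}">'
    "<pers:name>Smith</pers:name>"
    "</person>"
)

root = ET.fromstring(document)

qualified_tag = root[0].tag                           # expanded element name
name = root.find("pers:name", {"pers": NS}).text      # lookup via prefix map
person_id = root.get(f"{{{NS}}}id")                   # expanded attribute name

print(qualified_tag, name, person_id)
```

The prefix itself is only shorthand; two documents may use different prefixes for the same URI and still refer to the same names.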
XML Schemas
• An XML Schema defines the elements
  and attributes that can be used in a
  document, how they can be nested, the
  order and number of the elements, and
  whether an element is empty or can
  include text. Default values and types
  are defined. An example is Coulouris
  figure 4.12 shown on the next slide.
Figure 4.12 An XML schema
for the Person structure
<xsd:schema xmlns:xsd = URL of XML schema definitions >
    <xsd:element name="person" type="personType" />
    <xsd:complexType name="personType">
        <xsd:sequence>
            <xsd:element name="name" type="xsd:string"/>
            <xsd:element name="place" type="xsd:string"/>
            <xsd:element name="year" type="xsd:positiveInteger"/>
        </xsd:sequence>
        <xsd:attribute name="id" type="xsd:positiveInteger"/>
    </xsd:complexType>
</xsd:schema>
XML: Structured Data in a
 Text File

• Spreadsheets, address books,
  configuration parameters, financial
  transactions, product catalogs…
• XML defines a set of rules and
  conventions for designing text formats for
  such data
• Easy to generate and read by computer
• Extensible
Role of XML
• Applications built on different
  technologies can communicate via XML.
• New integration tools and integration
  servers capitalize on emergence of XML as
  an integration technology.
• Many .NET and J2EE technologies, such as
  SOAP, XML Web Services, JXTA, XML-RPC,
  and EJB use or are based on XML.
Client/Server Communication

  • Communication in Client/Server
    systems uses a variety of well specified
    request/reply mechanisms with send
    and receive protocols defined by TCP,
    RPC, Java RMI, CORBA and other
    formats.
Figure 4.14
Request-reply communication

  Client                                  Server
  doOperation     -- Request message -->  getRequest
  (wait)                                  select object
                                          execute method
  (continuation)  <-- Reply message ---   sendReply
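The exchange in Figure 4.14 can be sketched with UDP sockets on the local machine. This is only an illustration under several assumptions: the function names mirror the figure but are not a real library API, the server handles a single request, "select object, execute method" is replaced by upper-casing the request, and there is no retry or duplicate handling.

```python
import socket
import threading


def get_request(sock):
    """Server: block until a request datagram arrives."""
    return sock.recvfrom(1024)            # (request bytes, client address)


def send_reply(sock, reply, client_addr):
    """Server: send the reply back to the requesting client."""
    sock.sendto(reply, client_addr)


def serve_once(sock):
    request, client_addr = get_request(sock)
    reply = request.upper()               # stand-in for executing a method
    send_reply(sock, reply, client_addr)


def do_operation(server_addr, request, timeout=2.0):
    """Client: send the request, wait for the reply, then continue."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(request, server_addr)
        reply, _ = sock.recvfrom(1024)
        return reply


def demo():
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server_sock.settimeout(2.0)
    server_sock.bind(("127.0.0.1", 0))    # port 0: let the OS pick a free port
    server = threading.Thread(target=serve_once, args=(server_sock,))
    server.start()
    try:
        return do_operation(server_sock.getsockname(), b"hello")
    finally:
        server.join()
        server_sock.close()


if __name__ == "__main__":
    print(demo())
```

The client blocks inside `do_operation` between sending the request and receiving the reply, which corresponds to the "(wait)" step in the figure.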
Message Oriented
Communication

  • Remote procedure calls and remote object
    invocation are not always sufficient or
    appropriate for all communications in
    distributed systems. They tend to be
    optimized for immediate connections
    between two systems, and may be inadequate
    for operations that persist over time or
    involve multiple connections requiring
    synchronization. For this, message oriented
    protocols such as mail protocols have been
    developed.
Persistent Communication
• In persistent communication, a
  message may be stored until it can be
  passed on to a recipient. Compare this
  to the distinction between a simple
  telephone and an answering machine.
  Without the answering machine, you
  must be present when the phone rings
  to get a message.
Message Oriented
Middleware
• In MOM, applications communicate by inserting
  messages in specific queues. As the queues are
  processed, messages are forwarded to other
  computers. There may be several intermediates.
  At the destination queue, individual messages may
  be accepted and acted upon, and responses sent
  back through the system. Only passing to the
  receiver’s queue is guaranteed by the system.
  Accepting, reading or acting upon the message is
  up to the receiver.
MOM
• Messages can contain any data, but must be
  properly addressed. Usually, there is a systemwide
  unique name for the receiving queue. This allows a
  very simple interface. Queues are managed by
  queue managers, which may also act as relays to
  forward messages to other queues. Messages of
  different types can be interconnected by
  specialized applications called message brokers,
  which apply a set of rules to convert a message to
  a different type.
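The queue, relay and broker roles described above can be sketched in memory. This is a toy model under stated assumptions: the class and function names are hypothetical, queues live in one process rather than on separate computers, and nothing is persisted to disk as a real queue manager would do.

```python
from collections import deque


class QueueManager:
    """Holds named queues; messages wait here until someone takes them."""

    def __init__(self):
        self.queues = {}

    def put(self, queue_name, message):
        self.queues.setdefault(queue_name, deque()).append(message)

    def get(self, queue_name):
        """Return the oldest waiting message, or None if the queue is empty."""
        queue = self.queues.get(queue_name)
        return queue.popleft() if queue else None


def relay(source, source_queue, dest, dest_queue, convert=None):
    """Forward every waiting message; convert plays the message broker role."""
    moved = 0
    while True:
        message = source.get(source_queue)
        if message is None:
            return moved
        dest.put(dest_queue, convert(message) if convert else message)
        moved += 1


sender, receiver = QueueManager(), QueueManager()
sender.put("outbox", {"type": "order", "qty": "3"})      # stored until relayed

# Broker rule (illustrative): the receiver expects qty as an integer.
moved = relay(sender, "outbox", receiver, "orders",
              convert=lambda m: {**m, "qty": int(m["qty"])})

delivered = receiver.get("orders")   # accepting the message is up to the receiver
print(moved, delivered)
```

The sender returns as soon as `put` completes, so the receiver does not need to be running at that moment; the message simply waits in a queue.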
IBM’s MQ Series
• IBM’s MQ Series is a popular mainframe
  message oriented middleware system
  that has also been integrated into
  IBM’s WebSphere Web Server.
• Details can be found at the
  IBM Web Site.
• The text gives a brief summary of the
  functionality and operation of MQ
  Series.
Data Streams
• There are a variety of approaches to stream
  oriented communications, which consist of
  ways to pass timing dependent information
  over persistent connections that are
  established for the purpose. The sockets
  exercise gives a good practical understanding
  of TCP streams. Other mechanisms include
  pipes and compiler based stream libraries.
References

• George Coulouris, Jean Dollimore and Tim Kindberg,
  Distributed Systems, Concepts and Design, Addison
  Wesley, Fourth Edition, 2005
• Figures from the Coulouris text are from the
  instructor’s guide and are copyrighted by Pearson
  Education 2005
• Andrew Tanenbaum and Maarten van Steen, Distributed
  Systems, Principles and Paradigms, Prentice Hall, 2002
• Phil Dykstra, Gigabit Ethernet Jumbo Frames
  http://sd.wareonearth.com/~phil/jumbo.html

Communications Overhead in Distributed Systems

  • 1. Interprocess Communication Chapter 4 Distributed Systems, Concepts and Design
  • 2. Distributed Systems • “A distributed system is a collection of independent computers that appear to its users as a single system.” (Tanenbaum) • Distributed systems are therefore built around communication. Actually, it could be argued that computers are used more as communication devices than computational devices.
  • 3. Communications • Because communications are critical to distributed systems, communications protocols tend to be well defined. A key form of communications is interprocess communications, based on low-level message passing over the network. • Protocols are sets of rules that must be followed to enable standardized communications.
  • 4. Overhead • Overhead is a financial term that refers to indirect costs in a business. For example, a merchant cannot sell you a product for the price that he pays because he has additional costs beyond buying the merchandise such as rent and staff wages. Overhead always puts pressure on profits, so it must be kept to a minimum. Because corporations treat information technology as overhead, overhead is a major concern in this course. Activities that support work rather than doing work in IT are also costs and are referred to as overhead.
  • 5. Communications Overhead • In most communication systems, overhead is a key concern. Overhead activities are background operations that do not directly involve sending and receiving messages. Headers and footers involve sending extra information, so they are overhead. In a phone system, overhead includes time spent setting up and tearing down the circuit path over which a phone call can take place. TCP is like a phone call, since it has to set up, tear down, and manage operations in addition to “talk time.”
  • 6. Headers and trailers • Each protocol level hands its packet to the next level as data, and that level attaches its own header. [Figure: a message preceded by headers and followed by a trailer.] • Note that short messages are mostly overhead while long messages involve a much higher proportion of actual work.
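The effect of message length on overhead can be sketched numerically. The snippet below is a minimal illustration, not taken from the slides: it assumes a fixed 58 bytes of headers and trailers per packet (20 bytes TCP, 20 bytes IPv4, 14 bytes Ethernet header, 4 bytes Ethernet CRC trailer, no options), and the function name `overhead_fraction` is invented for this example.

```python
# Assumed per-packet overhead in bytes (illustrative, not from the slides):
# 20 (TCP) + 20 (IPv4) + 14 (Ethernet header) + 4 (Ethernet CRC trailer).
OVERHEAD_BYTES = 58


def overhead_fraction(payload_bytes: int, overhead_bytes: int = OVERHEAD_BYTES) -> float:
    """Return the share of transmitted bytes that is header/trailer overhead."""
    if payload_bytes < 0:
        raise ValueError("payload size cannot be negative")
    return overhead_bytes / (payload_bytes + overhead_bytes)


if __name__ == "__main__":
    # Compare a very short, a medium and a near-maximum Ethernet payload.
    for payload in (10, 100, 1442):
        print(f"{payload:5d}-byte message: {overhead_fraction(payload):.1%} overhead")
```

Under these assumptions the overhead share falls steadily as the payload grows, which is the point the slide makes about short versus long messages.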
  • 7. Normal Operation of TCP (Figure 2-4a in Tanenbaum et al.) • Message sequence: 1 SYN; 2 SYN, ACK(SYN); 3 ACK(SYN); 4 request; 5 FIN; 6 ACK(req+FIN); 7 answer; 8 FIN; 9 ACK(FIN). • Steps 4 and 7 do the communication. All of the rest of the TCP messages are overhead operations. • Key: SYN = SYNchronize, ACK = ACKnowledge, FIN = FINished.
  • 8. Transactional TCP (Figure 2-4b in Tanenbaum et al.) • Message sequence: 1 SYN, request, FIN; 2 SYN, ACK(FIN), answer, FIN; 3 ACK(FIN). • By sending the message and response together with the overhead signals, transactional TCP can speed up throughput and reduce overhead time delays.
  • 9. Classroom Exercise • Calculate the percentage improvement in throughput of Transactional TCP (sending 3 messages instead of 9) under the following assumptions: • 1) Short packets, dominated by latency of 10 ms. • 2) Ethernet LAN, 10 ms latency, 10 Mbps bandwidth, maximum Ethernet packet size of 1500 bytes. • 3) TCP/IP WAN, 20 ms latency, 500 Mbps bandwidth, maximum TCP packet size of 64 KB. (Latency assumes multiple hops between routers.) • Thought exercise: When is Transactional TCP worthwhile?
  • 10. Ethernet Jumbo Frames • Ethernet Jumbo Frames of 9 KB are possible if supported end to end. A 9 KB Ethernet frame can hold an 8 KB TCP/IP datagram (the NFS standard size) plus packet overhead. Ethernet cannot use 64 KB packets because it relies on a 32-bit CRC for error detection, and that CRC stops being effective at frame sizes above roughly 12 KB, which is hard to change. [P. Dykstra]
  • 11. Upper Bound of TCP • Dykstra’s article (see References) is a good discussion of frame (packet, datagram) size. • Dykstra quotes an article by Matt Mathis et al. which sets this limit on TCP WAN performance: • Throughput <= ~0.7 * MSS / (rtt * sqrt(packet_loss)) • MSS = Max Segment Size = packet size minus TCP/IP headers • rtt = round trip time (about 40 ms NYC to LA) • packet_loss = fraction of packets lost (wide variation; 0.1%, that is 0.001, is a typical value).
  • 12. Importance of Mathis Formula • If you examine the formula: • Throughput <= ~0.7 * MSS / (rtt * sqrt(packet_loss)) • You will see that the bound is directly proportional to the maximum segment size, while the packet loss rate enters only through its inverse square root. Halving the loss rate raises the bound by a factor of about 1.4, whereas doubling the MSS doubles it. • Remember that maximum segment size, packet size, datagram size and frame size all mean approximately the same thing.
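The Mathis bound is easy to evaluate directly. The following is a minimal sketch under stated assumptions: the function name `mathis_bound` is made up for this example, the MSS values of 1460 and 8960 bytes (standard and 9000-byte jumbo frames minus 40 bytes of TCP/IP headers) are assumptions, and `packet_loss` is passed as a fraction rather than a percentage.

```python
import math


def mathis_bound(mss_bytes: float, rtt_s: float, loss_fraction: float) -> float:
    """Upper bound on TCP throughput in bytes per second (Mathis et al.).

    loss_fraction is a probability between 0 and 1 (0.1% is 0.001).
    """
    if not 0 < loss_fraction <= 1:
        raise ValueError("loss_fraction must be in (0, 1]")
    return 0.7 * mss_bytes / (rtt_s * math.sqrt(loss_fraction))


if __name__ == "__main__":
    # Assumed inputs: 40 ms round trip time and 0.1% packet loss.
    standard = mathis_bound(1460, 0.040, 0.001)
    jumbo = mathis_bound(8960, 0.040, 0.001)
    print(f"standard frames: {standard * 8 / 1e6:.1f} Mbps")
    print(f"jumbo frames:    {jumbo * 8 / 1e6:.1f} Mbps")
```

Because the MSS appears linearly and the loss rate under a square root, doubling the MSS doubles the result, while the loss rate has to fall by a factor of four to achieve the same gain.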
  • 13. Storing Data • Data stored in digital format is composed of binary sequences that have a combination of logical and arbitrary meanings attached to them. Most binary formats for numbers are logical, although there are a lot of differences in storage sizes and in the handling of negative numbers and exponents. While it is somewhat logical that 0101 represents 5 as a short integer, it is somewhat less logical that 01000001 represents A and 01100001 represents a in the ASCII code, or that 11000001 represents A and 10000001 represents a in the EBCDIC code.
  • 14. Numeric formats • Computers store multi-byte values in memory in different byte orders, so that a 16-bit value of 11110000 00001111 might be stored with the 11110000 byte in the lowest memory location on one computer (big-endian) and the 00001111 byte in the lowest location on another (little-endian). The same binary pattern would also have different meanings as an unsigned integer or as a signed integer in two’s complement notation. There are different formats for storing floating point numbers. Computers have different register sizes, making default word sizes of 8, 16, 32, 36 or 64 bits most practical in different CPUs.
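Both effects can be seen with Python's standard `struct` module. This is a small illustrative sketch; the 32-bit value `0x11223344` is an arbitrary choice, and the bit pattern 11110000 is the one used in the slide.

```python
import struct

value = 0x11223344  # arbitrary 32-bit example value

big_endian = struct.pack(">I", value)     # most significant byte first
little_endian = struct.pack("<I", value)  # least significant byte first

# The same four bytes read with the wrong byte order give a different number.
misread = struct.unpack("<I", big_endian)[0]

# The bit pattern 11110000 as an unsigned byte and as a two's complement signed byte.
pattern = bytes([0b11110000])
as_unsigned = struct.unpack("B", pattern)[0]
as_signed = struct.unpack("b", pattern)[0]

if __name__ == "__main__":
    print(big_endian.hex(), little_endian.hex())
    print(hex(misread))
    print(as_unsigned, as_signed)
```

A receiver that does not know which byte order and which signedness the sender used cannot recover the original number, which is why transfer formats fix these choices.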
  • 15. Transferring Data • With different coding schemes, memory storage orders, word sizes and numeric formats, any general scheme for transferring information between systems must carefully define formats for the transferred data and provide ways to convert data to the transfer format and back into the local format at the other end. Such a scheme must understand the format at both ends of the transaction. The intermediate format is called an External Data Representation (XDR), and a language for specifying the interfaces and data types to be transferred is called an Interface Definition Language (IDL).
  • 16. External Data Representation • There are three different common approaches to XDR: • CORBA’s common data representation, which can be used by a variety of languages. • Java’s object serialization, which can even pass complex objects across a network, but is limited to Java only. • Extensible Markup Language (XML), which can represent even structured data as ASCII text.
  • 17. Marshalling and Unmarshalling • Converting information to a network transportable form (XDR) following the specifications of an IDL is called marshalling. Converting it back to an application readable format is called unmarshalling.
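Marshalling and unmarshalling can be sketched for a simple record. The example below is a hypothetical wire format invented for illustration (big-endian 4-byte lengths followed by UTF-8 text, then a 4-byte unsigned year); it is not CORBA CDR, Sun XDR or Java serialization, and the function names are assumptions.

```python
import struct


def marshal_person(name: str, place: str, year: int) -> bytes:
    """Marshal a person record into a made-up external representation.

    Layout (all integers big-endian): 4-byte length + UTF-8 name,
    4-byte length + UTF-8 place, 4-byte unsigned year.
    """
    out = bytearray()
    for text in (name, place):
        encoded = text.encode("utf-8")
        out += struct.pack(">I", len(encoded)) + encoded
    out += struct.pack(">I", year)
    return bytes(out)


def unmarshal_person(data: bytes) -> tuple[str, str, int]:
    """Rebuild (name, place, year) from the bytes produced by marshal_person."""
    offset = 0
    fields = []
    for _ in range(2):
        (length,) = struct.unpack_from(">I", data, offset)
        offset += 4
        fields.append(data[offset:offset + length].decode("utf-8"))
        offset += length
    (year,) = struct.unpack_from(">I", data, offset)
    return fields[0], fields[1], year
```

Because both sides agree on the byte order, the length prefixes and the text encoding, the record survives the round trip regardless of how either machine stores integers or characters internally.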
  • 18. Java Object Serialization • Serialization transforms an object into a sequence of bytes. This allows objects to be saved to files or transferred across a network, and is a key feature of Java. Since objects can have attributes that are also objects, and those objects can have object attributes, serialization allows a very complex structure to be transferred across a network or stored in a file. • Classes that need to be stored in files or transferred over a network should implement the java.io.Serializable interface.
  • 19. Reflection • Java supports reflection, the ability to enquire about the properties of a class, including the names and types of its instance variables. Classes can be created from their names, and a constructor with specified argument types can be obtained to create instances of a class. Reflection makes serialization and deserialization possible and allows a class to be instantiated by a Java Virtual Machine after transfer across a network.
  • 20. The Document is the Object • XML (eXtensible Markup Language): describes the structure of a document, defines new tags, and specifies metadata that lets programs discover document structure. • DOM (Document Object Model): allows programmatic access to the structure and content of XML documents. • XSL (eXtensible Style Language): the XML version of style sheets.
  • 21. What is XML? • XML stands for eXtensible Markup Language. • XML specification defines a syntax and document organization for data, represented by tag/value pairs. • XML Elements have data surrounded by matching start and end tags. • XML Attributes are optional in some start tags and have an identifier with an = sign. • There is a well defined syntax that can be parsed.
  • 22. XML Namespaces • An XML namespace is a collection of names, identified by a URI reference, which are used in XML documents as element types and attribute names. XML namespaces have internal structure and are not, mathematically speaking, sets. • The URI that identifies the namespace can be specified as an attribute called xmlns like this: xmlns:pers = "http://www.cdk4.net/person" • See http://www.w3.org/XML/ for specifications.
  • 23. XML Schemas • An XML Schema defines the elements and attributes that can be used in a document, how they can be nested, the order and number of the elements, and whether an element is empty or can include text. Default values and types are defined. An example is Coulouris figure 4.12 shown on the next slide.
  • 24. Figure 4.12 An XML schema for the Person structure
      <xsd:schema xmlns:xsd = URL of XML schema definitions >
        <xsd:element name="person" type="personType" />
        <xsd:complexType name="personType">
          <xsd:sequence>
            <xsd:element name="name" type="xsd:string"/>
            <xsd:element name="place" type="xsd:string"/>
            <xsd:element name="year" type="xsd:positiveInteger"/>
          </xsd:sequence>
          <xsd:attribute name="id" type="xsd:positiveInteger"/>
        </xsd:complexType>
      </xsd:schema>
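A document that fits the Person structure can be read with Python's standard `xml.etree.ElementTree` parser. This is a sketch only: the field values are made up, the helper name `parse_person` is an assumption, and ElementTree parses the document without validating it against the schema, so the type conversions are done by hand.

```python
import xml.etree.ElementTree as ET

# Illustrative instance document; the field values are made up.
PERSON_XML = """
<person id="123456789">
  <name>Smith</name>
  <place>London</place>
  <year>1984</year>
</person>
"""


def parse_person(xml_text: str) -> dict:
    """Extract the id attribute and the child elements of a <person> document."""
    root = ET.fromstring(xml_text.strip())
    return {
        "id": int(root.attrib["id"]),
        "name": root.findtext("name"),
        "place": root.findtext("place"),
        "year": int(root.findtext("year")),
    }


if __name__ == "__main__":
    print(parse_person(PERSON_XML))
```

The `id` value travels as an attribute of the start tag, while `name`, `place` and `year` are elements in the order the schema's sequence prescribes.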
  • 25. XML: Structured Data in a Text File • Spreadsheets, address books, configuration parameters, financial transactions, product catalogs… • XML defines a set of rules and conventions for designing text formats for such data • Easy to generate and read by computer • Extensible
  • 26. Role of XML • Applications built on different technologies can communicate via XML. • New integration tools and integration servers capitalize on the emergence of XML as an integration technology. • Many .NET and J2EE technologies, such as SOAP, XML Web Services, JXTA, XML-RPC, and EJB, use or are based on XML.
  • 27. Client/Server Communication • Communication in Client/Server systems uses a variety of well specified request/reply mechanisms with send and receive protocols defined by TCP, RPC, Java RMI, CORBA and other formats.
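  • 28. Figure 4.14 Request-reply communication • Client: doOperation sends the Request message, then waits; when the Reply message arrives, the client continues (continuation). • Server: getRequest receives the request, the server selects the object and executes the method, and sendReply returns the Reply message.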
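The request-reply exchange can be simulated inside one process. The sketch below is an illustration under stated assumptions: two in-memory queues stand in for the network, a thread plays the server, and the names `do_operation`, `serve_one_request`, `call` and the `OPERATIONS` table are hypothetical stand-ins for doOperation, getRequest and sendReply rather than any real RPC library.

```python
import queue
import threading

requests: queue.Queue = queue.Queue()  # carries client -> server messages
replies: queue.Queue = queue.Queue()   # carries server -> client messages

# Hypothetical remote object: operation names mapped to methods.
OPERATIONS = {"add": lambda a, b: a + b, "upper": lambda s: s.upper()}


def do_operation(operation: str, *args):
    """Client side: send the request message, then block until the reply arrives."""
    requests.put((operation, args))
    return replies.get(timeout=5)  # (wait), then (continuation)


def serve_one_request() -> None:
    """Server side: getRequest, select the method, execute it, sendReply."""
    operation, args = requests.get(timeout=5)  # getRequest
    method = OPERATIONS[operation]             # select object/method
    result = method(*args)                     # execute method
    replies.put(result)                        # sendReply


def call(operation: str, *args):
    """Run one request-reply exchange with the server in its own thread."""
    server = threading.Thread(target=serve_one_request)
    server.start()
    try:
        return do_operation(operation, *args)
    finally:
        server.join()


if __name__ == "__main__":
    print(call("add", 2, 3))
    print(call("upper", "ping"))
```

The client is blocked for the whole time the server works, which is the synchronous behaviour the figure shows between sending the request and the continuation.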
  • 29. Message Oriented Communication • Remote procedure calls and remote object invocation are not always sufficient or appropriate for all communications in distributed systems. They tend to be optimized for immediate connections between two systems, and may be inadequate for operations that persist over time or involve multiple connections requiring synchronization. For this, message oriented protocols such as mail protocols have been developed.
  • 30. Persistent Communication • In persistent communication, a message may be stored until it can be passed on to a recipient. Compare this to the distinction between a simple telephone and an answering machine. Without the answering machine, you must be present when the phone rings to get a message.
  • 31. Message Oriented Middleware • In MOM, applications communicate by inserting messages in specific queues. As the queues are processed, messages are forwarded to other computers. There may be several intermediates. At the destination queue, individual messages may be accepted and acted upon, and responses sent back through the system. Only passing to the receiver’s queue is guaranteed by the system. Accepting, reading or acting upon the message is up to the receiver.
  • 32. MOM • Messages can contain any data, but must be properly addressed. Usually, there is a systemwide unique name for the receiving queue. This allows a very simple interface. Queues are managed by queue managers, which may also act as relays to forward messages to other queues. Messages of different types can be interconnected by specialized applications called message brokers, which apply a set of rules to convert a message to a different type.
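The queueing behaviour described for persistent communication and MOM can be sketched with a toy queue manager. This is an in-memory illustration only; the class `QueueManager`, its methods and the queue name `orders.in` are invented for the example and do not correspond to any real middleware API.

```python
from collections import defaultdict, deque


class QueueManager:
    """Toy queue manager: holds messages in named queues until they are fetched."""

    def __init__(self) -> None:
        self._queues: dict[str, deque] = defaultdict(deque)

    def put(self, queue_name: str, message: object) -> None:
        """Store a message; the receiver does not need to be running."""
        self._queues[queue_name].append(message)

    def get(self, queue_name: str):
        """Return the oldest message in the queue, or None if it is empty."""
        q = self._queues[queue_name]
        return q.popleft() if q else None

    def depth(self, queue_name: str) -> int:
        """Number of messages still waiting in the queue."""
        return len(self._queues[queue_name])


if __name__ == "__main__":
    manager = QueueManager()
    manager.put("orders.in", "order 1")   # sender runs first
    manager.put("orders.in", "order 2")
    print(manager.depth("orders.in"))     # messages wait until they are fetched
    print(manager.get("orders.in"))       # receiver picks them up later, in order
```

The sender only needs the queue name, and delivery to the queue is all the manager promises; whether and when the receiver calls `get` is up to the receiver, as the slides describe.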
  • 33. IBM’s MQ Series • IBM’s MQ Series is a popular mainframe message oriented middleware system that has also been integrated into IBM’s WebSphere Web Server. • Details can be found at the IBM Web Site. • The text gives a brief summary of the functionality and operation of MQ Series.
  • 34. Data Streams • There are a variety of approaches to stream oriented communications, which consist of ways to pass timing dependent information over persistent connections that are established for the purpose. The sockets exercise gives a good practical understanding of TCP streams. Other mechanisms include pipes and compiler based stream libraries.
  • 35. References • George Coulouris, Jean Dollimore and Tim Kindberg, Distributed Systems, Concepts and Design, Fourth Edition, Addison Wesley, 2005. • Figures from the Coulouris text are from the instructor’s guide and are copyrighted by Pearson Education 2005. • Andrew Tanenbaum and Maarten van Steen, Distributed Systems, Principles and Paradigms, Prentice Hall, 2002. • Phil Dykstra, Gigabit Ethernet Jumbo Frames, http://sd.wareonearth.com/~phil/jumbo.html