ᔕᕈᒪᐱᓭ
Distributed Systems Made Simple

Pascal Felber, Raluca Halalai, Lorenzo Leonini,
Etienne Rivière, Valerio Schiavoni, José Valerio
Université de Neuchâtel, Switzerland
www.splay-project.org




Motivations
• Developing, testing, and tuning distributed applications is hard
• In Computer Science research, bridging the gap between a pseudocode description and a working implementation is hard
• Using worldwide testbeds is hard

Testbeds
• Set of machines for testing a distributed application/protocol
• Several different testbeds!

(Diagram: your machine, a cluster @UniNE, networks of idle workstations.)








What is PlanetLab?
• Machines contributed by universities, companies, etc.
  • 1098 nodes at 531 sites (02/09/2011)
  • Shared resources, no privileged access
• University-quality Internet links
• High resource contention
• Faults, churn, and packet loss are the norm
• Challenging conditions



Daily Job With Distributed Systems

1 code → 2 debug → 3 deploy → 4 get logs → 5 plots

1 • Write (testbed-specific) code
    • Local tests, in-house cluster, PlanetLab...
2 • Debug (in this context, a nightmare)
3 • Deploy, with testbed-specific scripts
4 • Get logs, with testbed-specific scripts
5 • Produce plots, hopefully




ᔕᕈᒪᐱᓭ at a Glance

code → debug → deploy → get logs → plots (gnuplot)

• Supports the development, evaluation, testing, and tuning of distributed applications on any testbed:
  • In-house cluster, shared testbeds, emulated environments...
• Provides an easy-to-use pseudocode-like language based on Lua
• Open-source: http://www.splay-project.org





The Big Picture

(Diagram, shown in three build steps: splayd daemons run on the testbed machines; the Splay Controller, backed by a SQL DB, coordinates them and pushes the application to the splayd daemons for execution.)




SPLAY architecture
• Portable daemons (ANSI C) on the testbed(s)
  • Testbed-agnostic for the user
  • Lightweight (memory & CPU)
• Results are retrieved by remote logging


SPLAY language
• Based on Lua
  • High-level scripting language, simple interaction with C, bytecode-based, garbage collection, ...
• Close to pseudocode (focus on the algorithm)
• Favors natural ways of expressing algorithms (e.g., RPCs)
• SPLAY applications can be run locally, without the deployment infrastructure (for debugging & testing)
• Comprehensive set of libraries (extensible)

(Background: an excerpt from the SPLAY paper. It notes that applications are usually submitted as Lua source code, though bytecode is also supported, e.g., for intellectual-property protection; that isolation and sandboxing build on Lua's first-class functions with lexical scoping and closures, which let SPLAY restrict access to I/O and networking libraries; and that Lua's coroutines are at the core of SPLAY's event-based model. Figure 6 of the paper shows the main SPLAY libraries — luasocket, io (fs), stdlib, crypto, llenc, json, misc, log, rpc, events/threads, splay::app — with sandboxed wrappers sb_socket, sb_fs, sb_stdlib.)
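Since coroutines carry SPLAY's event model, here is a plain-Lua sketch of the underlying mechanism (standard Lua only, no SPLAY code involved; the toy scheduler is our own illustration):

-- Each logical thread is a coroutine that yields control back to the
-- event loop instead of blocking.
local co = coroutine.create(function()
  for i = 1, 3 do
    print("step", i)
    coroutine.yield()          -- hand control back to the scheduler
  end
end)

-- A toy scheduler: resume the coroutine until it finishes.
while coroutine.resume(co) and coroutine.status(co) ~= "dead" do end

Because yielding is explicit, thousands of such threads can share one OS process, which is what lets SPLAY pack many nodes onto a single testbed host.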




Why Lua?
• Light & fast
  • (Very) close to equivalent code in C
• Concise
  • Allows developers to focus on ideas more than on implementation details
  • Key for researchers and students
• Sandboxing, thanks to the ability to easily redefine (even built-in) functions



Concise

(Side-by-side comparison: pseudocode as published in the original paper vs. executable code using SPLAY libraries.)





                         http://www.splay-project.org




Sandbox: Motivations
• Experiments should access only their own resources
• Required for non-dedicated testbeds
  • Idle workstations in universities or companies
  • Must protect the regular users from ill-behaved code (e.g., full disk, full memory, etc.)
• Memory allocation, filesystem, and network resources must be restricted




Sandboxing with Lua
• In Lua, all functions can be redefined transparently
• Resource control is done in the redefined function, before calling the original function (see the sketch below)
• No code modification is required for the user

Sandboxing layers:
  SPLAY Application
  SPLAY Lua libraries (include all stdlib and system calls)
  Operating system
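A minimal sketch of the redefinition idea in plain Lua (our own illustration, not SPLAY's actual sandbox code; the jail path is hypothetical): the original io.open survives only inside a closure, and the replacement enforces the restriction before delegating.

local real_open = io.open            -- private reference, kept in a closure
local jail = "/tmp/splay-job/"       -- hypothetical per-job directory

io.open = function(path, mode)
  -- resource control happens here, before calling the original function
  if path:sub(1, #jail) ~= jail then
    return nil, "sandbox: access outside the job directory denied"
  end
  return real_open(path, mode)
end

Because real_open is lexically scoped, the sandboxed application has no way to reach the unrestricted function.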







for _, module in pairs(splay) do ... end
(Side note: highlighted modules are novelties introduced for distributed systems.)

luasocket    events        io
crypto       llenc/benc    json
misc         log           rpc

Modules are sandboxed to prevent access to sensitive resources.




splay.events
• Distributed protocols can use the message-passing paradigm to communicate
• Nodes react to events
  • Local, incoming, and outgoing messages
• The core of the SPLAY runtime
  • Libraries splay.socket & splay.io provide a non-blocking operational mode
  • Based on Lua's coroutines (a minimal sketch follows)
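A minimal sketch of the resulting programming style, using only calls that appear in the demo code later in this deck (events.loop, events.thread, events.sleep, log.print):

require"splay.base"                  -- provides events, log, ...

events.loop(function()
  events.thread(function()           -- spawn a cooperative thread
    while events.sleep(5) do         -- wake up every 5 seconds
      log.print("periodic task")
    end
  end)
  log.print("event loop started")
end)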





splay.rpc
• Default mechanism to communicate between nodes (a minimal sketch follows)
• Support for UDP/TCP
• Efficient BitTorrent-like encoding
• Experimental binary encoding
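A minimal sketch of an RPC round-trip, following the conventions of the demo code later in this deck (rpc.server on job.me, rpc.call with a {function-name, args...} list and a timeout); the hello function is our own illustrative addition:

require"splay.base"
rpc = require"splay.urpc"            -- UDP RPC; the slide notes TCP is also supported

function hello(name)                 -- a global function, callable remotely
  return "hello, "..name
end

events.loop(function()
  rpc.server(job.me)                 -- start answering RPCs on this node
  local peer = misc.random_pick_one(job.nodes)
  local r = rpc.call(peer, {"hello", "world"}, 15)   -- r is nil on timeout
  if r then log.print(r) end
end)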

Life Before ᔕᕈᒪᐱᓭ
• Time spent on developing testbed-specific protocols
• Or fall back to simulations...
• The focus should be on ideas
• Researchers usually have no time to produce industrial-quality code

(Background: a wall of Java simulator code handling lookup responses, acknowledgements, and PONG retransmission suppression, much of it commented out... and there is more :-( )
Life With ᔕᕈᒪᐱᓭ
All implementations, including epidemic diffusion on Erdős–Rényi random graphs and various types of distribution trees (n-ary trees, parallel trees), are extremely concise in terms of lines of code (LOC):

  Chord         59 (base) + 17 (FT) + 26 (leafset) = 101
  Pastry        265
  Scribe        Pastry + 79
  SplitStream   Pastry + Scribe + 58
  BitTorrent    420
  Cyclon        93
  Epidemic      35
  Trees         47

• Lines of pseudocode ~== lines of executable code
(The paper concedes that LOC is only a rough indicator of a system's expressiveness, and notes in a footnote that Splay daemons had been running continuously on Neuchâtel hosts for more than one year.)
Churn management
• Churn is key for testing large-scale applications
  • "Natural churn" on PlanetLab is not reproducible
  • Hard to compare algorithms under the same conditions
• SPLAY allows to finely control churn
  • Traces (e.g., from file-sharing systems)
  • Synthetic descriptions: dedicated language

Example of a synthetic churn description (Figure 5 of the paper):
  1  at 30s join 10
  2  from 5m to 10m inc 10
  3  from 10m to 15m const churn 50%
  4  at 15m leave 50%
  5  from 15m to 20m inc 10 churn 150%
  6  at 20m stop
(The accompanying plot shows the binned number of joins and leaves per minute, and the resulting node population, over the 20-minute scenario.)

(Background from the paper: churn scenarios can be generated from synthetic descriptions issued from analytical studies, or replayed from publicly available traces of real networks, ranging from highly churned file-sharing systems to high-performance computing clusters. The churn manager can also maintain a fixed-size node population for long-running applications, bootstrapping new nodes as testbed faults occur, which relieves the need for fault-injection systems such as Loki.)
Synthetic churn
• Routing performance in a Pastry DHT
• Massive failure: half of the Pastry nodes fail at once (50% of the network)

(Figure 11 of the paper: a typical massive-failure experiment driven by the synthetic description. The plot shows routing-delay percentiles (5th to 90th) and the route-failure rate over 10 minutes; after 50% of the network fails, the overlay self-repairs and routing stabilizes. The paper adds that using churn is as simple as launching a regular SPLAY application with a trace file as an extra argument, and that SPLAY provides tools to generate and process trace files: one can, for instance, speed up a trace or increase its churn amplitude whilst keeping its statistical properties, or generate a trace from a synthetic description.)
Trace-driven churn
• Traces from the OverNet file-sharing network [IPTPS03]
  • Speed-up of 2, 5, 10 (1 minute = 30, 12, 6 seconds)
• Evolution of hit ratio & delays of Pastry on PlanetLab

(Figure from the paper: for churn ×2, ×5, and ×10, the panels plot route failures (%), delay (seconds), and leaves/joins per minute together with the node population. Churn is derived from the trace of the Overnet file-sharing system and sped up for increasing volatility.)




Live Demo

www.splay-project.org

Live demo: Gossip-based dissemination

• Also called epidemic dissemination
  • A message spreads like the flu among a population
  • A robust way to spread a message in a random network
• Two actions: push and pull
  • Push: a node chooses f other random nodes to infect
    • A newly infected node infects f others in turn
    • Done for a limited number of steps (TTL; a coverage bound is worked out below)
  • Pull: a node tries to fetch the message from a random node
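A quick sanity check on these parameters, using the demo settings that appear in the code later in this deck (f = 3, TTL = 3): a pure push wave reaches at most f + f² + f³ = 3 + 9 + 27 = 39 distinct nodes, and typically fewer, since random picks overlap and duplicates are dropped. This is why the periodic pull matters: it lets nodes missed by the push wave still obtain the message with high probability.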




SOURCE

Every node knows a random set of other nodes.
A node wants to send a message to all nodes.





An initial push, in which nodes that receive the message for the first time forward it to f random neighbors, up to TTL times.







Periodically, nodes pull for new messages from one random neighbor. The message eventually reaches all peers with high probability.

Algorithm 1: A simple push-pull dissemination protocol, on node P.

Constants
  f: Fanout (push: P forwards a new message to f random peers).
  TTL: Time To Live (push: a message is forwarded to f peers at most TTL times).
  Δ: Pull period (pull: P asks for new messages every Δ seconds).

Variables
  R: set of received message IDs
  N: set of node identifiers

/* Invoked when a message is pushed to node P by node Q */
function Push(Message m, hops)
   if m received for the first time then
       /* Store the message locally */
       R ← R ∪ {m}
       /* Propagate the message if required */
       if hops > 0 then
           invoke Push(m, hops−1) on f random peers from N

/* Periodic pull */
thread PeriodicPull()
   every Δ seconds do
      invoke Pull(R, P) on a random node from N

/* Invoked when node Q requests messages from node P */
function Pull(R_Q, Q)
   foreach m such that m ∈ R ∧ m ∉ R_Q do
       invoke Push(m, 0) on Q





 require"splay.base"
 rpc = require"splay.urpc"                                                               Loading Base and UDP RPC Libraries.
 --[[ SETTINGS ]]--

 fanout = 3
 ttl = 3
 pull_period = 3

 rpc_timeout = 15                                                                            Set the timeout for RPC operations
                                                                                      (i.e., an RPC will return nil after 15 seconds).
 --[[ VARS ]]--

 msgs = {}

 function push(q, m)
   if not msgs[m.id] then
     msgs[m.id] = m                                                                  Logging facilities: all logs are available at the
     log.print("NEW: "..m.id.." (ttl:"..m.t..") from "..q.ip..":"..q.port)
     m.t = m.t - 1
                                                                                     controllers, where they are timestamped and
     if m.t > 0 then                                                                                stored in the DB.
       for _, n in pairs(misc.random_pick(job.nodes, fanout)) do
         log.print("FORWARD: "..m.id.." (ttl:"..m.t..") to
 "..n.ip..":"..n.port)                                                                The RPC call itself is embedded in an inner
         events.thread(function()
           rpc.call(n, {"push", job.me, m}, rpc_timeout)
                                                                                       anonymous function in a new anonymous
         end)                                                                           thread: we do not care about its success
       end                                                                            (epidemics are naturally robust to message
     end
   else                                                                              loss). Testing for success would be easy: the
     log.print("DUP: "..m.id.." (ttl:"..m.t..") from "..q.ip..":"..q.port)            second return value is nil in case of failure.
   end
 end








-- SPLAY provides a library for cooperative multithreading, events,
-- scheduling and synchronization, plus various commodity libraries.
function periodic_pull()
  while events.sleep(pull_period) do
    local q = misc.random_pick_one(job.nodes)
    log.print("PULLING: "..q.ip..":"..q.port)
    local ids = {}
    for id, _ in pairs(msgs) do
      ids[id] = true             -- in Lua, arrays and maps are the same type
    end
    local r = rpc.call(q, {"pull", ids}, rpc_timeout)
    if r then                    -- r is nil if the RPC times out
      for id, m in pairs(r) do
        if not msgs[id] then
          msgs[id] = m
          log.print("NEW REPLY: "..m.id.." from "..q.ip..":"..q.port)
        else
          log.print("DUP REPLY: "..m.id.." from "..q.ip..":"..q.port)
        end
      end
    end
  end
end

-- There is no difference between a local function and one called by a
-- distant node; even variables can be accessed by a distant node via RPC.
-- SPLAY does all the work of serialization, network calls, etc. while
-- preserving a low footprint and high performance.
function pull(ids)
  local r = {}
  for id, m in pairs(msgs) do
    if not ids[id] then
      r[id] = m
    end
  end
  return r
end








function source()
  local m = {id = 1, t = ttl, data = "data"}
  for _, n in pairs(misc.random_pick(job.nodes, fanout)) do
    log.print("SOURCE: "..m.id.." to "..n.ip..":"..n.port)
    events.thread(function()
      rpc.call(n, {"push", job.me, m}, rpc_timeout)   -- call a remote function
    end)
  end
end

events.loop(function()
  rpc.server(job.me)    -- start the RPC server: it is up and running
  log.print("ME", job.me.ip, job.me.port, job.position)
  events.sleep(15)
  events.thread(periodic_pull)   -- run a function in a separate thread
  if job.position == 1 then      -- only the first node acts as the source
    source()
  end
end)







Take-away Slide

• Distributed systems raise a number of issues for their evaluation
• Hard to implement, debug, deploy, tune
• ᔕᕈᒪᐱᓭ leverages Lua and a centralized controller to produce an easy-to-use yet powerful working environment



 
DEF CON 24 - Clarence Chio - machine duping 101
DEF CON 24 - Clarence Chio - machine duping 101DEF CON 24 - Clarence Chio - machine duping 101
DEF CON 24 - Clarence Chio - machine duping 101Felipe Prado
 
Adsa lab manual
Adsa lab manualAdsa lab manual
Adsa lab manualRaja Ch
 
CERN Data Centre Evolution
CERN Data Centre EvolutionCERN Data Centre Evolution
CERN Data Centre EvolutionGavin McCance
 
Testing Plug-in Architectures
Testing Plug-in ArchitecturesTesting Plug-in Architectures
Testing Plug-in ArchitecturesArie van Deursen
 
Unit one ppt of deeep learning which includes Ann cnn
Unit one ppt of  deeep learning which includes Ann cnnUnit one ppt of  deeep learning which includes Ann cnn
Unit one ppt of deeep learning which includes Ann cnnkartikaursang53
 

Similar a SLPAY: Distributed Systems Made Simple (20)

SPLAY: Distributed Systems Made Simple
SPLAY: Distributed Systems Made SimpleSPLAY: Distributed Systems Made Simple
SPLAY: Distributed Systems Made Simple
 
Orientation - Java
Orientation - JavaOrientation - Java
Orientation - Java
 
Research Issues in P2P Netwroks
Research Issues in P2P NetwroksResearch Issues in P2P Netwroks
Research Issues in P2P Netwroks
 
Continuous Deployment at Disqus (Pylons Minicon)
Continuous Deployment at Disqus (Pylons Minicon)Continuous Deployment at Disqus (Pylons Minicon)
Continuous Deployment at Disqus (Pylons Minicon)
 
Automated Testing of NASA Software
Automated Testing of NASA SoftwareAutomated Testing of NASA Software
Automated Testing of NASA Software
 
Big Data Malaysia - A Primer on Deep Learning
Big Data Malaysia - A Primer on Deep LearningBig Data Malaysia - A Primer on Deep Learning
Big Data Malaysia - A Primer on Deep Learning
 
OAI7 Research Objects
OAI7 Research ObjectsOAI7 Research Objects
OAI7 Research Objects
 
An Introduction to Deep Learning
An Introduction to Deep LearningAn Introduction to Deep Learning
An Introduction to Deep Learning
 
DOE Magellan OpenStack user story
DOE Magellan OpenStack user storyDOE Magellan OpenStack user story
DOE Magellan OpenStack user story
 
NICS Puppet Case Study
NICS Puppet Case StudyNICS Puppet Case Study
NICS Puppet Case Study
 
The Science of Cyber Security Experimentation: The DETER Project
The Science of Cyber Security Experimentation: The DETER ProjectThe Science of Cyber Security Experimentation: The DETER Project
The Science of Cyber Security Experimentation: The DETER Project
 
Towards Lensfield
Towards LensfieldTowards Lensfield
Towards Lensfield
 
Puppet buero20 presentation
Puppet buero20 presentationPuppet buero20 presentation
Puppet buero20 presentation
 
Sun Microsystems Puppet Case Study
Sun Microsystems Puppet Case StudySun Microsystems Puppet Case Study
Sun Microsystems Puppet Case Study
 
Governing services, data, rules, processes and more
Governing services, data, rules, processes and moreGoverning services, data, rules, processes and more
Governing services, data, rules, processes and more
 
DEF CON 24 - Clarence Chio - machine duping 101
DEF CON 24 - Clarence Chio - machine duping 101DEF CON 24 - Clarence Chio - machine duping 101
DEF CON 24 - Clarence Chio - machine duping 101
 
Adsa lab manual
Adsa lab manualAdsa lab manual
Adsa lab manual
 
CERN Data Centre Evolution
CERN Data Centre EvolutionCERN Data Centre Evolution
CERN Data Centre Evolution
 
Testing Plug-in Architectures
Testing Plug-in ArchitecturesTesting Plug-in Architectures
Testing Plug-in Architectures
 
Unit one ppt of deeep learning which includes Ann cnn
Unit one ppt of  deeep learning which includes Ann cnnUnit one ppt of  deeep learning which includes Ann cnn
Unit one ppt of deeep learning which includes Ann cnn
 

Más de Förderverein Technische Fakultät

The Digital Transformation of Education: A Hyper-Disruptive Era through Block...
The Digital Transformation of Education: A Hyper-Disruptive Era through Block...The Digital Transformation of Education: A Hyper-Disruptive Era through Block...
The Digital Transformation of Education: A Hyper-Disruptive Era through Block...Förderverein Technische Fakultät
 
Engineering Serverless Workflow Applications in Federated FaaS.pdf
Engineering Serverless Workflow Applications in Federated FaaS.pdfEngineering Serverless Workflow Applications in Federated FaaS.pdf
Engineering Serverless Workflow Applications in Federated FaaS.pdfFörderverein Technische Fakultät
 
The Role of Machine Learning in Fluid Network Control and Data Planes.pdf
The Role of Machine Learning in Fluid Network Control and Data Planes.pdfThe Role of Machine Learning in Fluid Network Control and Data Planes.pdf
The Role of Machine Learning in Fluid Network Control and Data Planes.pdfFörderverein Technische Fakultät
 
Nonequilibrium Network Dynamics_Inference, Fluctuation-Respones & Tipping Poi...
Nonequilibrium Network Dynamics_Inference, Fluctuation-Respones & Tipping Poi...Nonequilibrium Network Dynamics_Inference, Fluctuation-Respones & Tipping Poi...
Nonequilibrium Network Dynamics_Inference, Fluctuation-Respones & Tipping Poi...Förderverein Technische Fakultät
 
East-west oriented photovoltaic power systems: model, benefits and technical ...
East-west oriented photovoltaic power systems: model, benefits and technical ...East-west oriented photovoltaic power systems: model, benefits and technical ...
East-west oriented photovoltaic power systems: model, benefits and technical ...Förderverein Technische Fakultät
 
Advances in Visual Quality Restoration with Generative Adversarial Networks
Advances in Visual Quality Restoration with Generative Adversarial NetworksAdvances in Visual Quality Restoration with Generative Adversarial Networks
Advances in Visual Quality Restoration with Generative Adversarial NetworksFörderverein Technische Fakultät
 
Industriepraktikum_ Unterstützung bei Projekten in der Automatisierung.pdf
Industriepraktikum_ Unterstützung bei Projekten in der Automatisierung.pdfIndustriepraktikum_ Unterstützung bei Projekten in der Automatisierung.pdf
Industriepraktikum_ Unterstützung bei Projekten in der Automatisierung.pdfFörderverein Technische Fakultät
 

Más de Förderverein Technische Fakultät (20)

Supervisory control of business processes
Supervisory control of business processesSupervisory control of business processes
Supervisory control of business processes
 
The Digital Transformation of Education: A Hyper-Disruptive Era through Block...
The Digital Transformation of Education: A Hyper-Disruptive Era through Block...The Digital Transformation of Education: A Hyper-Disruptive Era through Block...
The Digital Transformation of Education: A Hyper-Disruptive Era through Block...
 
A Game of Chess is Like a Swordfight.pdf
A Game of Chess is Like a Swordfight.pdfA Game of Chess is Like a Swordfight.pdf
A Game of Chess is Like a Swordfight.pdf
 
From Mind to Meta.pdf
From Mind to Meta.pdfFrom Mind to Meta.pdf
From Mind to Meta.pdf
 
Miniatures Design for Tabletop Games.pdf
Miniatures Design for Tabletop Games.pdfMiniatures Design for Tabletop Games.pdf
Miniatures Design for Tabletop Games.pdf
 
Distributed Systems in the Post-Moore Era.pptx
Distributed Systems in the Post-Moore Era.pptxDistributed Systems in the Post-Moore Era.pptx
Distributed Systems in the Post-Moore Era.pptx
 
Don't Treat the Symptom, Find the Cause!.pptx
Don't Treat the Symptom, Find the Cause!.pptxDon't Treat the Symptom, Find the Cause!.pptx
Don't Treat the Symptom, Find the Cause!.pptx
 
Engineering Serverless Workflow Applications in Federated FaaS.pdf
Engineering Serverless Workflow Applications in Federated FaaS.pdfEngineering Serverless Workflow Applications in Federated FaaS.pdf
Engineering Serverless Workflow Applications in Federated FaaS.pdf
 
The Role of Machine Learning in Fluid Network Control and Data Planes.pdf
The Role of Machine Learning in Fluid Network Control and Data Planes.pdfThe Role of Machine Learning in Fluid Network Control and Data Planes.pdf
The Role of Machine Learning in Fluid Network Control and Data Planes.pdf
 
Nonequilibrium Network Dynamics_Inference, Fluctuation-Respones & Tipping Poi...
Nonequilibrium Network Dynamics_Inference, Fluctuation-Respones & Tipping Poi...Nonequilibrium Network Dynamics_Inference, Fluctuation-Respones & Tipping Poi...
Nonequilibrium Network Dynamics_Inference, Fluctuation-Respones & Tipping Poi...
 
Towards a data driven identification of teaching patterns.pdf
Towards a data driven identification of teaching patterns.pdfTowards a data driven identification of teaching patterns.pdf
Towards a data driven identification of teaching patterns.pdf
 
Förderverein Technische Fakultät.pptx
Förderverein Technische Fakultät.pptxFörderverein Technische Fakultät.pptx
Förderverein Technische Fakultät.pptx
 
The Computing Continuum.pdf
The Computing Continuum.pdfThe Computing Continuum.pdf
The Computing Continuum.pdf
 
East-west oriented photovoltaic power systems: model, benefits and technical ...
East-west oriented photovoltaic power systems: model, benefits and technical ...East-west oriented photovoltaic power systems: model, benefits and technical ...
East-west oriented photovoltaic power systems: model, benefits and technical ...
 
Machine Learning in Finance via Randomization
Machine Learning in Finance via RandomizationMachine Learning in Finance via Randomization
Machine Learning in Finance via Randomization
 
IT does not stop
IT does not stopIT does not stop
IT does not stop
 
Advances in Visual Quality Restoration with Generative Adversarial Networks
Advances in Visual Quality Restoration with Generative Adversarial NetworksAdvances in Visual Quality Restoration with Generative Adversarial Networks
Advances in Visual Quality Restoration with Generative Adversarial Networks
 
Recent Trends in Personalization at Netflix
Recent Trends in Personalization at NetflixRecent Trends in Personalization at Netflix
Recent Trends in Personalization at Netflix
 
Industriepraktikum_ Unterstützung bei Projekten in der Automatisierung.pdf
Industriepraktikum_ Unterstützung bei Projekten in der Automatisierung.pdfIndustriepraktikum_ Unterstützung bei Projekten in der Automatisierung.pdf
Industriepraktikum_ Unterstützung bei Projekten in der Automatisierung.pdf
 
Introduction to 5G from radio perspective
Introduction to 5G from radio perspectiveIntroduction to 5G from radio perspective
Introduction to 5G from radio perspective
 

Último

[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 

Último (20)

[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 

SPLAY: Distributed Systems Made Simple

  • 1. ᔕᕈᒪᐱᓭ Distributed Systems Made Simple • Pascal Felber, Raluca Halalai, Lorenzo Leonini, Etienne Rivière, Valerio Schiavoni, José Valerio • Université de Neuchâtel, Switzerland • www.splay-project.org
  • 2. Motivations • Developing, testing, and tuning distributed applications is hard • In Computer Science research, closing the simplicity gap between a pseudocode description and an implementation is hard • Using worldwide testbeds is hard
  • 3. Testbeds • A set of machines for testing a distributed application/protocol • Several different testbeds: networks of idle workstations, a cluster @UniNE, your machine
  • 4. What is PlanetLab?
  • 5. What is PlanetLab? • Machines contributed by universities, companies, etc. • 1098 nodes at 531 sites (02/09/2011) • Shared resources, no privileged access • University-quality Internet links • High resource contention • Faults, churn, and packet loss are the norm • Challenging conditions
  • 6. Daily Job With Distributed Systems • 1. Write (testbed-specific) code • Local tests, in-house cluster, PlanetLab...
  • 7. Daily Job With Distributed Systems • 2. Debug (in this context, a nightmare)
  • 8. Daily Job With Distributed Systems • 3. Deploy, with testbed-specific scripts
  • 9. Daily Job With Distributed Systems • 4. Get logs, with testbed-specific scripts
  • 10. Daily Job With Distributed Systems • 5. Produce plots, hopefully
  • 11. ᔕᕈᒪᐱᓭ at a Glance • code, debug, deploy, get logs, plots (gnuplot, ...) • Supports the development, evaluation, testing, and tuning of distributed applications on any testbed: • In-house cluster, shared testbeds, emulated environments... • Provides an easy-to-use pseudocode-like language based on Lua • Open-source: http://www.splay-project.org
  • 12. The Big Picture • application • splayd, splayd, splayd, splayd • Splay Controller • SQL DB
  • 13.-14. The Big Picture (animation steps of the previous slide)
  • 15. SPLAY architecture • Portable daemons (ANSI C) on the testbed(s) • Testbed-agnostic for the user • Lightweight (memory & CPU) • Retrieve results by remote logging
  • 16. Lua, the SPLAY language • Based on Lua, a highly efficient scripting language • High-level, simple interaction with C, bytecode-based, garbage collection, ... • Close to pseudocode (focus on the algorithm) • Favors natural ways of expressing algorithms (e.g., RPCs) • SPLAY applications can be run locally without the deployment infrastructure (for debugging & testing) • Comprehensive set of libraries (extensible): luasocket*, io (fs)*, stdlib*, events, crypto*, llenc, json*, misc, log, rpc (*: third-party and Lua libraries) • [Slide shows an excerpt of the SPLAY paper and its Figure 6: Overview of the main SPLAY libraries]
  • 17. Why Lua? • Light & fast • (Very) close to equivalent code in C • Concise • Allows developers to focus on ideas rather than implementation details • Key for researchers and students • Sandboxing is enabled by the ability to easily redefine (even built-in) functions
  • 18. Concise • Pseudocode as published in the original paper • Executable code using the SPLAY libraries
  • 19. http://www.splay-project.org
  • 20. Sandbox: Motivations • Experiments should access only their own resources • Required for non-dedicated testbeds • Idle workstations in universities or companies • Must protect regular users from ill-behaved code (e.g., full disk, full memory, etc.) • Memory allocation, filesystem, and network resources must be restricted
  • 21. Sandboxing with Lua • In Lua, all functions can be redefined transparently • Resource control is done in the redefined functions before calling the original function • No code modification is required for the user • Sandboxing layers: SPLAY Application, on top of the SPLAY Lua libraries (which include all stdlib and system calls), on top of the operating system (see the sketch below)
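To make the redefinition idea concrete, here is a minimal, self-contained Lua sketch of the technique; it is not SPLAY's actual sandbox code, and the job directory name, the path policy, and the open-files limit are illustrative assumptions.

  -- A minimal sketch of sandboxing by function redefinition (illustrative
  -- only, not SPLAY's sandbox: directory name and limits are made up).
  local real_open = io.open          -- keep a private reference to the original
  local open_files, MAX_FILES = 0, 32

  io.open = function(path, mode)
    if open_files >= MAX_FILES then
      return nil, "sandbox: too many open files"
    end
    -- confine all file access to the job's own directory
    if path:match("%.%.") or path:sub(1, 1) == "/" then
      return nil, "sandbox: access outside the job directory denied"
    end
    local f, err = real_open("job-1234/" .. path, mode)
    if f then open_files = open_files + 1 end   -- (close() accounting omitted)
    return f, err
  end

  -- Application code is unchanged; it transparently gets the sandboxed version:
  print(io.open("/etc/passwd", "r"))  --> nil  sandbox: access outside ... denied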
  • 22. for _, module in pairs(splay) • luasocket • events • io • crypto • llenc/benc • json • misc • log • rpc • Modules are sandboxed to prevent access to sensitive resources • The following slides present the novelties introduced for distributed systems
  • 23. splay.events • Distributed protocols can use the message-passing paradigm to communicate • Nodes react to events • Local, incoming, and outgoing messages • The core of the Splay runtime • The splay.socket & splay.io libraries provide a non-blocking operational mode • Based on Lua's coroutines (see the sketch below)
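As background, the toy scheduler below shows the cooperative-multitasking style that an event-based runtime like splay.events builds on; it uses only standard Lua coroutines, and the spawn/step/loop names are illustrative, not the splay.events API.

  -- A toy cooperative scheduler on plain Lua coroutines (not splay.events).
  local ready = {}                              -- run queue of coroutines

  local function spawn(fn)
    ready[#ready + 1] = coroutine.create(fn)
  end

  local function step()                         -- let other "threads" run
    coroutine.yield()
  end

  local function loop()
    while #ready > 0 do
      local co = table.remove(ready, 1)         -- round-robin scheduling
      local ok, err = coroutine.resume(co)
      if not ok then
        print("thread error: " .. tostring(err))
      elseif coroutine.status(co) ~= "dead" then
        ready[#ready + 1] = co                  -- it yielded: run it again later
      end
    end
  end

  spawn(function() for i = 1, 3 do print("ping", i); step() end end)
  spawn(function() for i = 1, 3 do print("pong", i); step() end end)
  loop()   -- prints interleaved ping/pong lines: cooperative, not preemptive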
  • 24. splay.rpc • The default mechanism to communicate between nodes • Support for UDP/TCP • Efficient BitTorrent-like encoding • Experimental binary encoding (a usage sketch follows below)
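A minimal usage sketch, modeled on the rpc.server / rpc.call pattern that appears in the demo code later in this deck; the echo function, the choice of peer, and the 15-second timeout are illustrative assumptions, not prescribed by SPLAY.

  -- RPC sketch modeled on the demo code later in this deck.
  require"splay.base"
  rpc = require"splay.urpc"            -- UDP RPC, as in the demo

  function echo(sender, text)          -- globals are callable by remote nodes
    log.print("echo request from " .. sender.ip .. ":" .. sender.port)
    return "echo: " .. text
  end

  events.loop(function()
    rpc.server(job.me)                 -- start serving incoming RPCs

    local peer = job.nodes[1]          -- some node of the job (illustrative)
    -- invoke echo() on the peer; the result is nil if the RPC times out
    local reply = rpc.call(peer, {"echo", job.me, "hello"}, 15)
    log.print(reply or "RPC timed out")
  end)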
  • 25. Life Before ᔕᕈᒪᐱᓭ • Time spent on developing testbed-specific protocols • Or fall back to simulations... • The focus should be on ideas • Researchers usually have no time to produce industrial-quality code • ...and there is more :-( • [Slide background: a page of verbose Java message-handling code, illustrating the point]
  • 26. Life With ᔕᕈᒪᐱᓭ • Implementations are extremely concise in lines of code (LOC):
      Chord                           59 (base) + 17 (FT) + 26 (leafset) = 101
      Pastry                          265
      Scribe (on Pastry)              79
      SplitStream (on Pastry, Scribe) 58
      BitTorrent                      420
      Cyclon                          93
      Epidemic diffusion (on Erdős-Rényi random graphs) 35
      Distribution trees (n-ary, parallel)              47
    • Although the number of lines is clearly just a rough indicator of expressiveness, lines of pseudocode ~= lines of executable code • Splay daemons have been continuously running on these hosts for more than one year
  • 27. Churn management • "Natural churn" on PlanetLab is not reproducible • Hard to compare algorithms under the same conditions • SPLAY allows one to finely control churn • Traces of real networks (e.g., from file-sharing systems) • Synthetic descriptions in a dedicated language, e.g. (Figure 5: example of a synthetic churn description):
      at 30s join 10
      from 5m to 10m inc 10
      from 10m to 15m const churn 50%
      at 15m leave 50%
      from 15m to 20m inc 10 churn 150%
      at 20m stop
  • 28. Synthetic churn • Using churn is as simple as launching a regular SPLAY application with a trace file as an extra argument • SPLAY provides a set of tools to generate and process trace files: one can, for instance, speed up a trace, increase the churn amplitude while keeping its statistical properties, or generate a trace from a synthetic description (a toy illustration of trace speed-up follows below) • [Figure 11: using churn management to reproduce massive churn conditions for the SPLAY Pastry implementation; after 50% of the network fails, the plots show routing performance (delays, route failures) and self-repair]
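As a toy illustration of what speeding up a trace means, the sketch below compresses the timeline of a churn trace by a constant factor; the triple-based trace format is an assumption made for illustration, not SPLAY's actual trace format or tooling.

  -- Speeding up a churn trace: same events, same order, compressed timeline.
  -- Assumed format: { time_in_seconds, node_id, "join" | "leave" } triples.
  local trace = {
    { 30,  1, "join"  },
    { 300, 2, "join"  },
    { 600, 1, "leave" },
  }

  local function speedup(t, factor)
    local out = {}
    for i, ev in ipairs(t) do
      out[i] = { ev[1] / factor, ev[2], ev[3] }  -- divide every timestamp
    end
    return out
  end

  for _, ev in ipairs(speedup(trace, 10)) do
    print(string.format("%6.1fs node %d %s", ev[1], ev[2], ev[3]))
  end
  --> at 10x, 1 minute of trace time becomes 6 seconds, as on the next slide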
  • 29. Trace-driven churn • Traces from the OverNet file-sharing network [IPTPS03] • Speed-ups of 2, 5, 10 (1 minute = 30, 12, 6 seconds) • Evolution of hit ratio & delays of Pastry on PlanetLab • [Figure: study of the effect of churn on Pastry deployed on PlanetLab; churn is derived from the trace of the Overnet file-sharing system and sped up for increasing volatility] • http://www.splay-project.org
  • 30. Live Demo • www.splay-project.org
  • 31. Live demo: Gossip-based dissemination • Also called epidemic dissemination • A message spreads like the flu in a population • A robust way to spread a message in a random network • Two actions: push and pull • Push: a node chooses f other random nodes to infect • A newly infected node infects f others in turn • Done for a limited number of steps (TTL) • Pull: a node tries to fetch messages from a random node
  • 32. SOURCE • Every node knows a random set of other nodes. A node wants to send a message to all nodes.
  • 33. A first initial push, in which nodes that receive the message for the first time forward it to f random neighbors, up to TTL times.
  • 34. Periodically, nodes pull for new messages from one random neighbor. The message eventually reaches all peers with high probability.
  • 35. Algorithm 1: A simple push-pull dissemination protocol, on node P.
    Constants:
      f: fanout (push: P forwards a new message to f random peers)
      TTL: time to live (push: a message is forwarded to f peers at most TTL times)
      Δ: pull period (pull: P asks for new messages every Δ seconds)
    Variables:
      R: set of received message IDs
      N: set of node identifiers
    /* Invoked when a message is pushed to node P by node Q */
    function Push(Message m, hops)
      if m received for the first time then
        R ← R ∪ {m}                      /* store the message locally */
        if hops > 0 then                 /* propagate the message if required */
          invoke Push(m, hops - 1) on f random peers from N
    /* Periodic pull */
    thread PeriodicPull()
      every Δ seconds do
        invoke Pull(R, P) on a random node from N
    /* Invoked when a node Q requests messages from node P */
    function Pull(RQ, Q)
      foreach m | m ∈ R ∧ m ∉ RQ do
        invoke Push(m, 0) on Q
  • 36. The push function, with the slide's callout annotations inlined as comments:

    require"splay.base"
    rpc = require"splay.urpc"    -- loading the base and UDP RPC libraries

    --[[ SETTINGS ]]--
    fanout = 3
    ttl = 3
    pull_period = 3
    rpc_timeout = 15             -- an RPC returns nil after 15 seconds

    --[[ VARS ]]--
    msgs = {}

    function push(q, m)
      if not msgs[m.id] then
        msgs[m.id] = m
        -- Logging facilities: all logs are available at the controller,
        -- where they are timestamped and stored in the DB.
        log.print("NEW: "..m.id.." (ttl:"..m.t..") from "..q.ip..":"..q.port)
        m.t = m.t - 1
        if m.t > 0 then
          for _, n in pairs(misc.random_pick(job.nodes, fanout)) do
            log.print("FORWARD: "..m.id.." (ttl:"..m.t..") to "..n.ip..":"..n.port)
            -- The RPC call is embedded in an anonymous function run in a new
            -- anonymous thread: we do not care about its success (epidemics
            -- are naturally robust to message loss). Testing for success would
            -- be easy: the second return value is nil in case of failure.
            events.thread(function()
              rpc.call(n, {"push", job.me, m}, rpc_timeout)
            end)
          end
        end
      else
        log.print("DUP: "..m.id.." (ttl:"..m.t..") from "..q.ip..":"..q.port)
      end
    end
  • 37. The pull functions:

    -- SPLAY provides a library for cooperative multithreading, events,
    -- scheduling and synchronization, plus various commodity libraries.
    function periodic_pull()
      while events.sleep(pull_period) do
        local q = misc.random_pick_one(job.nodes)
        log.print("PULLING: "..q.ip..":"..q.port)
        local ids = {}
        for id, _ in pairs(msgs) do
          ids[id] = true             -- in Lua, arrays and maps are the same type
        end
        local r = rpc.call(q, {"pull", ids}, rpc_timeout)
        if r then                    -- r is nil if the RPC times out
          for id, m in pairs(r) do
            if not msgs[id] then
              msgs[id] = m
              log.print("NEW REPLY: "..m.id.." from "..q.ip..":"..q.port)
            else
              log.print("DUP REPLY: "..m.id.." from "..q.ip..":"..q.port)
            end
          end
        end
      end
    end

    -- There is no difference between a local function and one called by a
    -- distant node; even variables can be accessed by a distant node via RPC.
    -- SPLAY does all the work of serialization, network calls, etc., while
    -- preserving a low footprint and high performance.
    function pull(ids)
      local r = {}
      for id, m in pairs(msgs) do
        if not ids[id] then
          r[id] = m
        end
      end
      return r
    end
  • 38. The source and the main loop:

    function source()
      local m = {id = 1, t = ttl, data = "data"}
      for _, n in pairs(misc.random_pick(job.nodes, fanout)) do
        log.print("SOURCE: "..m.id.." to "..n.ip..":"..n.port)
        events.thread(function()
          rpc.call(n, {"push", job.me, m}, rpc_timeout)  -- calling a remote function
        end)
      end
    end

    events.loop(function()
      rpc.server(job.me)              -- start the RPC server: it is up and running
      log.print("ME", job.me.ip, job.me.port, job.position)
      events.sleep(15)
      events.thread(periodic_pull)    -- run a function in a separate thread
      if job.position == 1 then
        source()
      end
    end)
  • 39. Take-away slide • Distributed systems raise a number of issues for their evaluation • Hard to implement, debug, deploy, tune • ᔕᕈᒪᐱᓭ leverages Lua and a centralized controller to provide an easy-to-use yet powerful working environment