SlideShare una empresa de Scribd logo
1 de 57
Descargar para leer sin conexión
EFL	
  Munich	
  
                                              February	
  2013	
  




Copyright	
  Push	
  Technology	
  2012	
  
•  Distributed	
  Systems	
  /	
  HPC	
  guy.	
  
                                                 	
  
                                              •  Chief	
  Scien*st	
  :-­‐	
  at	
  Push	
  Technology	
  

                                              •  Alumnus	
  of	
  :-­‐	
  	
  
                                                 	
  
                                                 Motorola,	
  IONA,	
  BeOair,	
  JPMC,	
  StreamBase.	
  

                                              •  School:	
  Trinity	
  College	
  Dublin.	
  	
  
                                                 -­‐	
  BA	
  (Mod).	
  Comp.	
  Sci.+	
  	
  
                                                 -­‐	
  M.Sc.	
  Networks	
  &	
  Distributed	
  Systems	
  	
  
                                                 	
  
                                              •  Responds	
  to:	
  Guinness,	
  Whisky	
  



                                                                                 About	
  me?	
  
Copyright	
  Push	
  Technology	
  2012	
  
                                                          Darach@PushTechnology.com	
  
Conversa[onal	
  
                                                Big	
  Data	
  
                                                  with	
  	
  




Copyright	
  Push	
  Technology	
  2012	
  
Big	
  Data.	
  The	
  most	
  important	
  V	
  is	
  
                                       Value.	
  
•  4Vs:	
  
•       Volume	
  
                                              •  “It’s	
  the	
  connec[ons	
  
•       Velocity	
                               that	
  ma`er	
  most”	
  	
  
•       Variety	
                                -­‐	
  David	
  Evans,	
  Cisco	
  
•       Variability	
  




Copyright	
  Push	
  Technology	
  2012	
  
Connect	
  all	
  the	
  things	
  




Copyright	
  Push	
  Technology	
  2012	
  
An	
  Internet	
  Minute	
  
                                                             10k,	
  1ms	
     1k,	
  100us	
  




Copyright	
  Push	
  Technology	
  2012	
  
The	
  Last	
  Mile	
  is	
  HARD	
  
                              Buffer	
  
                              Bloat	
                                         Network	
                                   Ba`ery!	
  
                                                                              Satura[on	
  


                                                                                        Download	
  Bandwidth	
  
   30.00	
  
   20.00	
  
                                                                                                                                                                                                         Expected	
  (Mbps)	
  
   10.00	
  
                                                                                                                                                                                                         Download	
  (Mbps)	
  
    0.00	
  
         Sep-­‐06	
  
  -­‐10.00	
                      Oct-­‐06	
                 Oct-­‐06	
           Nov-­‐06	
             Dec-­‐06	
          Dec-­‐06	
             Jan-­‐07	
                    Feb-­‐07	
  

                                                                                                                    Latency	
  
1.00E+00	
  
1.00E+01	
  
                                                                                                                                                           y	
  =	
  -­‐1.1667x	
  +	
  45703	
                           Latency	
  (ms)	
  
1.00E+02	
                                                                                                                                                             R²	
  =	
  0.02496	
                               Expected	
  (ms)	
  
1.00E+03	
  
                                                                                                                                                                                                                          Linear	
  (Latency	
  (ms))	
  
1.00E+04	
  
          Sep-­‐06	
                   Oct-­‐06	
                  Oct-­‐06	
             Nov-­‐06	
               Dec-­‐06	
               Dec-­‐06	
                    Jan-­‐07	
                Feb-­‐07	
  

               Copyright	
  Push	
  Technology	
  2012	
  
Conversa[ons	
  


M2M	
                                                                        M2H	
  
       Tradi[onal	
  Messaging	
                                                           Bidirec[onal	
  Real	
  
       MQTT,	
  AMQP	
                                                                     Time	
  Data	
  Distribu[on	
  


                                              WebSockets	
  


                                                               W3C	
  Server	
  Sent	
  
                                                               Events	
  

Copyright	
  Push	
  Technology	
  2012	
  
Tradi[onal	
  Messaging	
  
                                              A

                                              B
                                                      ba

                                                      bb



                                          Producers        ?                                  Consumers




Pros	
                                                         Cons	
  
•  Loosely	
  coupled.	
                                       •  No	
  data	
  model.	
  Slinging	
  blobs	
  
•  All	
  you	
  can	
  eat	
  messaging	
  pa`erns	
          •  Fast	
  producer,	
  slow	
  consumer?	
  Ouch.	
  
•  Familiar	
                                                  •  No	
  data	
  ‘smarts’.	
  A	
  blob	
  is	
  a	
  blob.	
  



Copyright	
  Push	
  Technology	
  2012	
  
Invented	
  yonks	
  ago…	
  
Before	
  the	
  InterWebs	
  
For	
  ‘reliable’	
  networks	
  
For	
  machine	
  to	
  machine	
  
Remember	
  DEC	
  Message	
  Queues?	
  
-­‐	
  That	
  basically.	
  Vomit!	
  




Copyright	
  Push	
  Technology	
  2012	
  
When	
  fallacies	
  were	
  simple	
  

              - The	
  network	
  is	
  reliable	
  
              - Latency	
  is	
  zero	
  	
  
              - Bandwidth	
  is	
  infinite	
  
              - There	
  is	
  one	
  administrator	
  
              - Topology	
  does	
  not	
  change	
  
              - The	
  network	
  is	
  secure	
  
              - Transport	
  cost	
  is	
  zero	
  
              - The	
  network	
  is	
  homogeneous	
  
	
  



 Copyright	
  Push	
  Technology	
  2012	
  
Then	
  in	
  1992,	
  this	
  happened:	
  
The	
  phrase	
  ‘surfing	
  the	
  internet’	
  was	
  coined	
  by	
  Jean	
  Poly.	
  
	
  
First	
  SMS	
  sent	
  
	
  
First	
  base.	
  




Copyright	
  Push	
  Technology	
  2012	
  
It	
  grew,	
  and	
  it	
  grew	
  




Copyright	
  Push	
  Technology	
  2012	
  
Then	
  in	
  2007,	
  this	
  happened:	
  

The	
        god	
  phone:	
  
	
  
Surfing	
  died.	
  Touching	
  happened.	
  
	
  
Second	
  base	
  unlocked.	
  




Copyright	
  Push	
  Technology	
  2012	
  
Then	
  in	
  2007,	
  this	
  happened:	
  
So	
  we	
  took	
  all	
  the	
  things	
  and	
  put	
  them	
  in	
  the	
  internet:	
  
	
  
Cloud	
  happened.	
  
	
  
So	
  we	
  could	
  touch	
  all	
  the	
  things.	
  
                                                                                   Messaging	
  
	
                                                                                 Apps	
  
They	
  grew	
  and	
  they	
  grew	
  like	
  all	
  the	
                        Hardware	
  
     good	
  things	
  do.	
                                                       Virtualize	
  all	
  the	
  things	
  
                                                                                    Services	
  
                                                                                    Skills,	
  
                                                                                    Special[es	
  




Copyright	
  Push	
  Technology	
  2012	
  
Then	
  in	
  2007,	
  this	
  happened:	
  
And	
  it	
  grew	
  and	
  it	
  grew.	
  
	
  
Like	
  all	
  the	
  good	
  things	
  do.	
  


                                                  Messaging	
  
                                                  Apps	
  
                                                  Hardware	
  
                                                  Virtualize	
  all	
  the	
  things	
  
                                                  Services	
  
                                                  Skills,	
  
                                                  Special[es	
  




Copyright	
  Push	
  Technology	
  2012	
  
So	
  we	
  scaled	
  all	
  the	
  things	
  
Big	
  data	
  is	
  fundamentally	
  about	
  extrac[ng	
  value,	
  understanding	
  
     implica[ons,	
  gaining	
  insight	
  so	
  we	
  can	
  
     learn	
  and	
  improve	
  con[nuously.	
  
	
  
It’s	
  nothing	
  new.	
  
                                                               big	
  


                                                               DATA	
  
Copyright	
  Push	
  Technology	
  2012	
  
But,	
  problem:	
  The	
  bird,	
  basically.	
  

             Immediately	
  Inconsistent.	
  
             But,	
  Eventually	
  Consistent	
  
             …	
  Maybe.	
  
             	
  
             Humans	
  
             Are	
  
             grumpy	
  


                                              IIBECM
Copyright	
  Push	
  Technology	
  2012	
  
Stop.	
  
             - The	
  network	
  is	
  not	
  reliable	
  	
  
                            nor	
  is	
  it	
  cost	
  free.	
  
             - Latency	
  is	
  not	
  zero	
  
                            nor	
  is	
  it	
  a	
  democracy.	
  
             - Bandwidth	
  is	
  not	
  infinite	
  
                            nor	
  predictable	
  especially	
  the	
  last	
  mile!	
  
             - There	
  is	
  not	
  only	
  one	
  administrator	
  
                  trust,	
  rela[onships	
  are	
  key	
  
             - Topology	
  does	
  change	
  
                            It	
  should,	
  however,	
  converge	
  eventually	
  
             - The	
  network	
  is	
  not	
  secure	
  
                            nor	
  is	
  the	
  data	
  that	
  flows	
  through	
  it	
  
             - Transport	
  cost	
  is	
  not	
  zero	
  
                            but	
  what	
  you	
  don’t	
  do	
  is	
  free	
  
             - The	
  network	
  is	
  not	
  homogeneous	
  
                            nor	
  is	
  it	
  smart	
  




Copyright	
  Push	
  Technology	
  2012	
  
Look.	
  

             - What	
  and	
  How	
  are	
  what	
  geeks	
  do.	
  
             - Why	
  gets	
  you	
  paid	
  
             - Business	
  Value	
  and	
  Trust	
  dictate	
  What	
  and	
  How	
  
             - 	
  Policies,	
  Events	
  and	
  Content	
  implements	
  Business	
  Value	
  

             - Science	
  basically.	
  But	
  think	
  like	
  a	
  carpenter:	
  

             - Measure	
  twice.	
  Cut	
  once.	
  




Copyright	
  Push	
  Technology	
  2012	
  
Listen.	
  

             - 	
  Every	
  nuance	
  comes	
  with	
  a	
  set	
  of	
  tradeoffs.	
  
             - 	
  Choosing	
  the	
  right	
  ones	
  can	
  be	
  hard,	
  but	
  it	
  pays	
  off.	
  
             - 	
  Context,	
  Environment	
  are	
  cri[cal	
  
             - 	
  Break	
  all	
  the	
  rules,	
  one	
  benchmark	
  at	
  a	
  [me.	
  
             	
  


             - 	
  Benchmark	
  Driven	
  Development	
  


Copyright	
  Push	
  Technology	
  2012	
  
Ac[on:	
  Data	
  Distribu[on	
  
Messaging	
  remixed	
  around:	
  
	
  
Relevance 	
  -­‐ 	
  Queue	
  depth	
  for	
  conflatable	
  data	
  should	
  be	
  0	
  or	
  1.	
  No	
  more	
  
     	
  
Responsiveness	
  -­‐	
  Use	
  HTTP/REST	
  for	
  things.	
  Stream	
  the	
  li`le	
  things	
  
	
  
Timeliness 	
  -­‐ 	
  It’s	
  rela[ve.	
  M2M	
  !=	
  M2H.	
  
     	
  
Context 	
     	
  -­‐ 	
  Packed	
  binary,	
  deltas	
  mostly,	
  snapshot	
  on	
  subscribe.	
  
     	
  
Environment-­‐ 	
  Don’t	
  send	
  1M	
  1K	
  events	
  to	
  a	
  mobile	
  phone	
  with	
  0.5mbps.	
  
               	
  



Copyright	
  Push	
  Technology	
  2012	
  
Ac[on.	
  Virtualize	
  Client	
  Queues	
  
                                              A

                                              B
                                                  ba

                                                  bb


                                                        Buffer
                                      Producers         Bloat
                                                                                       Consumers




       Nuance:	
  Client	
  telemetry.	
  Tradeoff:	
  Durable	
  subscrip[ons	
  harder	
  

Copyright	
  Push	
  Technology	
  2012	
  
Ac[on.	
  Add	
  data	
  caching	
  
                                              A

                                              B
                                                  ba   x


                                                  bb   x




                                      Producers
                                                             Is it a   Consumers
                                                            cache?




       One	
  hop	
  closer	
  to	
  the	
  edge	
  …	
  

Copyright	
  Push	
  Technology	
  2012	
  
Ac[on.	
  Exploit	
  data	
  structure	
  
                                              A

                                              B            Snapshot     Delta
                                                  ba   x


                                                  bb   x




                                      Producers                State!                   Consumers




       Snapshot	
  recovery.	
  Deltas	
  or	
  Changes	
  mostly.	
  Conserves	
  bandwidth	
  

Copyright	
  Push	
  Technology	
  2012	
  
Ac[on.	
  Behaviors	
  
                                              A

                                              B
                                                  ba   x


                                                  bb   x

                   X                                           The
                                      Producers
                                                             topic is                               Consumers
                                                                the
                                                              cloud!




       Extensible.	
  Nuance?	
  Roll	
  your	
  own	
  protocols.	
  Tradeoff?	
  3rd	
  party	
  code	
  in	
  the	
  engine	
  :/	
  

Copyright	
  Push	
  Technology	
  2012	
  
Ac[on.	
  Structural	
  Confla[on	
  
                                              A

                                              B
                                                  ba   x


                                                  bb   x

                   X
                                      Producers
                                                             Current                              Consumers
                                                              data!



                         +




       Ensures	
  only	
  current	
  +	
  consistent	
  data	
  is	
  distributed.	
  Ac[vely	
  soaks	
  up	
  bloat.	
  Extensible!	
  

Copyright	
  Push	
  Technology	
  2012	
  
Ac[on.	
  Structural	
  Confla[on	
  [EEP]	
  
                                         C2       A2   C1   B1   A1   'op'             Cop Bop Aop            ...   ...




                                         C2       A2   C1   B1   A1     R              C2      B1    A2                          Current	
  
                                              5   4    3    2    1                       5      1      4
                                                                                                                                 Current	
  
                                         C2       A2   C1   B1   A1    M+              C2      B1    A+                             +	
  	
  
                                              5   4    3    2    1                       8    1        5
                                                                                                                                Consistent	
  
                                                                                         =             =
                                                                                      C2 + C1       A2 + A1
                                                                                         =             =
                                                                                       5+3           4+1
Replace	
                                                             Merge	
  
•  Replace/Overwrite	
  ‘A1’	
  with	
  ‘A2’	
                        •  Merge	
  A1,	
  A2	
  under	
  some	
  opera[on.	
  
•  ‘Current	
  data	
  right	
  now’	
                                     •  Opera[on	
  is	
  pluggable	
  
•  Fast	
                                                                  •  Tunable	
  consistency	
  
                                                                           •  Performance	
  f(opera[on)	
  

Copyright	
  Push	
  Technology	
  2012	
  
Implement	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  Client	
  J	
  




Copyright	
  Push	
  Technology	
  2012	
  
Example	
  Diffusion	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  Client	
  




     Backend	
  to	
  Basho’s	
  Lager	
  logging	
  –	
  Stream	
  log	
  events	
  for	
  near	
  real-­‐[me	
  analysis	
  …	
  

Copyright	
  Push	
  Technology	
  2012	
  
Benchmark	
  Driven	
  
                                                Development	
  
                                                     	
  
                                              ‘Measuring	
  Value’	
  




Copyright	
  Push	
  Technology	
  2012	
  
Performance	
  is	
  not	
  a	
  number	
  




Copyright	
  Push	
  Technology	
  2012	
  
Benchmark	
  Models	
  




                                              Box A                              Box B


            Throughput	
  (Worst	
  case)	
                             Latency	
  (Best	
  case)	
  
           •    Ramp	
  clients	
  con[nuously	
                        •  Really	
  simple.	
  1	
  client	
  
           •    100	
  messages	
  per	
  second	
  per	
  client	
     •  Ping	
  –	
  Pong.	
  Measure	
  RTT	
  [me.	
  
           •    Payload:	
  125	
  ..	
  2000	
  bytes	
                •  Payload:	
  125	
  ..	
  2000	
  bytes	
  
           •    Message	
  style	
  vs	
  with	
  confla[on	
  


Copyright	
  Push	
  Technology	
  2012	
  
Throughput.	
  Non-­‐conflated	
  
                                                •    Cold	
  start	
  server	
  

                                                •    Ramp	
  750	
  clients	
  ‘simultaneously’	
  at	
  5	
  
                                                     second	
  intervals	
  
                                                •    5	
  minute	
  benchmark	
  dura[on	
  

                                                •     Clients	
  onboarded	
  linearly	
  un[l	
  IO	
  
                                                     (here)	
  or	
  compute	
  satura[on	
  occurs.	
  
                                                     	
  
                                                •     What	
  the	
  ‘server’	
  sees	
  




Copyright	
  Push	
  Technology	
  2012	
  
Throughput.	
  Non-­‐conflated	
  
                                                •     What	
  the	
  ‘client’	
  sees	
  
                                                     	
  
                                                •     At	
  and	
  beyond	
  satura[on	
  of	
  some	
  
                                                     resource?	
  

                                                •     Things	
  break!	
  
                                                     	
  
                                                •     New	
  connec[ons	
  fail.	
  Good.	
  
                                                •     Long	
  established	
  connec[ons	
  ok.	
  
                                                •     Recent	
  connec[ons	
  [meout	
  and	
  client	
  
                                                     connec[ons	
  are	
  dropped.	
  Good.	
  

                                                •     Diffusion	
  handles	
  smaller	
  message	
  
                                                     sizes	
  more	
  gracefully	
  
                                                     	
  
                                                •     Back-­‐pressure	
  ‘waveform’	
  can	
  be	
  
                                                     tuned	
  out.	
  Or,	
  you	
  could	
  use	
  structural	
  
                                                     confla[on!	
  




Copyright	
  Push	
  Technology	
  2012	
  
Throughput.	
  Replace-­‐conflated	
  
                                              •     Cold	
  start	
  server	
  
                                                   	
  
                                              •     Ramp	
  750	
  clients	
  ‘simultaneously’	
  at	
  5	
  
                                                   second	
  intervals	
  
                                                   	
  
                                              •     5	
  minute	
  benchmark	
  dura[on	
  

                                              •     Clients	
  onboarded	
  linearly	
  un[l	
  IO	
  
                                                   (here)	
  or	
  compute	
  satura[on	
  occurs.	
  
                                                   	
  
                                              •     Takes	
  longer	
  to	
  saturate	
  than	
  non-­‐
                                                   conflated	
  case	
  
                                                   	
  
                                              •     Handles	
  more	
  concurrent	
  connec[ons	
  
                                                   	
  
                                              •     Again,	
  ‘server’	
  view	
  




Copyright	
  Push	
  Technology	
  2012	
  
Throughput.	
  Replace-­‐Conflated	
  
                                              •     What	
  the	
  ‘client’	
  sees	
  
                                                   	
  
                                              •     Once	
  satura[on	
  occurs	
  Diffusion	
  
                                                   adapts	
  ac[vely	
  by	
  degrading	
  messages	
  
                                                   per	
  second	
  per	
  client	
  
                                                   	
  
                                              •     This	
  is	
  good.	
  Soak	
  up	
  the	
  peaks	
  
                                                   through	
  fairer	
  distribu[on	
  of	
  data.	
  
                                                   	
  
                                              •     Handles	
  spikes	
  in	
  number	
  of	
  
                                                   connec[ons,	
  volume	
  of	
  data	
  or	
  
                                                   satura[on	
  of	
  other	
  resources	
  using	
  a	
  
                                                   single	
  load	
  adap[ve	
  mechanism	
  




Copyright	
  Push	
  Technology	
  2012	
  
Throughput.	
  Stats	
  
•  Data	
  /	
  (Data	
  +	
  Overhead)?	
  
   •  1.09	
  GB/Sec	
  payload	
  at	
  satura[on.	
  	
  
   •  10Ge	
  offers	
  theore[c	
  of	
  1.25GB/Sec.	
  
   •  Ballpark	
  87.2%	
  of	
  traffic	
  represents	
  business	
  data.	
  

•  Benefits?	
  
        •  Op[mize	
  for	
  business	
  value	
  of	
  distributed	
  data.	
  
        •  Most	
  real-­‐world	
  high-­‐frequency	
  real-­‐[me	
  data	
  is	
  
           recoverable	
  
                      •  Stock	
  prices,	
  Gaming	
  odds	
  
        •  Don’t	
  use	
  confla[on	
  for	
  transac[onal	
  data,	
  natch!	
  

Copyright	
  Push	
  Technology	
  2012	
  
Latency	
  (Best	
  case)	
  
                                                                  •     Cold	
  start	
  server	
  
                                                                       	
  
                                                                  •     1	
  solitary	
  client	
  
                                                                       	
  
                                                                  •     5	
  minute	
  benchmark	
  dura[on	
  

                                                                  •     Measure	
  ping-­‐pong	
  round	
  trip	
  [me	
  
                                                                       	
  
                                                                  •     Results	
  vary	
  by	
  network	
  card	
  make,	
  
                                                                       model,	
  OS	
  and	
  Diffusion	
  tuning.	
  




Copyright	
  Push	
  Technology	
  2012	
  
Latency.	
  Round	
  Trip	
  Stats	
  
                                                                    •    Sub-­‐millisecond	
  at	
  
                                                                         the	
  99.99%	
  
                                                                         percen[le	
  
                                                                    •    As	
  low	
  as	
  15	
  
                                                                         microseconds	
  
                                                                    •    On	
  average	
  
                                                                         significantly	
  sub	
  
                                                                         100	
  microseconds	
  
                                                                         for	
  small	
  data	
  




Copyright	
  Push	
  Technology	
  2012	
  
Latency.	
  Single	
  Hop	
  Stats	
  
                                                                      •    Sub	
  500	
  
                                                                           microseconds	
  at	
  
                                                                           the	
  99.99%	
  
                                                                           percen[le	
  
                                                                      •    As	
  low	
  as	
  8	
  
                                                                           microseconds	
  
                                                                      •    On	
  average	
  
                                                                           significantly	
  sub	
  50	
  
                                                                           microseconds	
  for	
  
                                                                           small	
  data	
  




Copyright	
  Push	
  Technology	
  2012	
  
Latency	
  in	
  perspec[ve	
  
                                                                                         ~                         ~
                                                                                        125                      1000
                                                                                       bytes                     bytes
                                                                                       (avg)                     (avg)




                      2.4 us                      5.0 us   8.0 us                                   50.0 us



•  2.4	
  us.	
  A	
  tuned	
  benchmark	
  in	
  C	
  with	
  low	
  latency	
  10Ge	
  NIC,	
  with	
  kernel	
  
   bypass,	
  with	
  FPGA	
  accelera[on	
  
•  5.0	
  us.	
  A	
  basic	
  java	
  benchmark	
  –	
  as	
  good	
  as	
  it	
  gets	
  in	
  java	
  
•  Diffusion	
  is	
  measurably	
  ‘moving	
  to	
  the	
  le{’	
  release	
  on	
  release	
  
•  We’ve	
  been	
  ac[vely	
  tracking	
  and	
  con[nuously	
  improving	
  since	
  4.0	
  

Copyright	
  Push	
  Technology	
  2012	
  
Embedded	
  
                                                Event	
  
                                              Processing	
  




Copyright	
  Push	
  Technology	
  2012	
  
CEP,	
  basically	
  



                                                                w w               S
                                                                                               C              Q
                                                                w w




                                              What	
  is	
  eep.erl?	
  
                                              •    Add	
  aggregate	
  func[ons	
  and	
  window	
  opera[ons	
  to	
  Erlang	
  
                                              •    Separate	
  context	
  (window)	
  from	
  computa[on	
  (aggregate	
  fn)	
  
                                              •    4	
  window	
  types:	
  tumbling,	
  sliding,	
  periodic,	
  monotonic	
  
                                              •    Process	
  oriented.	
  
                                              •    Fast	
  enough.	
  J	
  

Copyright	
  Push	
  Technology	
  2012	
  
Aggregate	
  Func[ons	
  




                                              What	
  is	
  an	
  aggregate	
  func*on?	
  
                                              •    A	
  func[on	
  that	
  computes	
  values	
  over	
  events.	
  
                                              •    The	
  cost	
  of	
  calcula[ons	
  are	
  ammor[zed	
  per	
  event	
  
                                              •    Just	
  follow	
  the	
  above	
  recipe	
  
                                              •    Example:	
  Aggregate	
  2M	
  events	
  (equity	
  prices),	
  send	
  to	
  GPU	
  
                                                   on	
  emit,	
  receive	
  2M	
  op[ons	
  put/call	
  prices	
  as	
  a	
  result.	
  

Copyright	
  Push	
  Technology	
  2012	
  
Aggregate	
  Func[on	
  Example	
  




Copyright	
  Push	
  Technology	
  2012	
  
Tumbling	
  Windows	
  
                         x()             x()           x()         x()
                                                                                         emit()


                                                                                x()       x()        x()       x()                  emit()
                       1                2              3              4
                                                                                                                             x()       x()        x()       x()
                                                                                                                                                                        emit()
                                                                               2          3          4          5
init()


                                                                                                                            2          3          4          5
                                                             init()


                                                                                                           init()


                       t0               t1             t2           t3         t4         t5         t6         t7         t8          t9        t10        t11   ...

                                                             What	
  is	
  a	
  tumbling	
  window?	
  
                                                             •    Every	
  N	
  events,	
  give	
  me	
  an	
  average	
  of	
  the	
  last	
  N	
  events	
  
                                                             •    Does	
  not	
  overlap	
  windows	
  
                                                             •    ‘Closing’	
  a	
  window,	
  ‘Emits’	
  a	
  result	
  (the	
  average)	
  
                                                             •    Closing	
  a	
  window,	
  Opens	
  a	
  new	
  window	
  

         Copyright	
  Push	
  Technology	
  2012	
  
Tumbling	
  Windows	
  




Copyright	
  Push	
  Technology	
  2012	
  
Sliding	
  Windows	
  -­‐	
  O(Long	
  Story)	
  
init()


                   1                 2                 3           4         5                   ..        ..         ..        ..

            x()                      1                 2           3         4                   ..        ..         ..        ..

                              x()                      1           2         3                   ..        ..         ..        ..

                                                 x()               1         2                   ..        ..         ..        ..


                 t0               t1               t2            t3         t4        t5        t6        t7         t8        t9        t10       t11       ...
                                                           What	
  is	
  a	
  sliding	
  window?	
  
                                                           •    Like	
  tumbling,	
  except	
  can	
  overlap.	
  But	
  O(N2),	
  Keep	
  N	
  small.	
  
                                                           •    Every	
  event	
  opens	
  a	
  new	
  window.	
  
                                                           •    A{er	
  N	
  events,	
  every	
  subsequent	
  event	
  emits	
  a	
  result.	
  
                                                           •    Like	
  all	
  windows,	
  cost	
  of	
  calcula[on	
  amor[zed	
  over	
  events	
  

   Copyright	
  Push	
  Technology	
  2012	
  
Compensa[ng	
  Aggregates	
  
init()


                    1                 2               3    4    5                                ..    ..     ..     ..

             x()                      1               2    3    4                                ..    ..     ..     ..
                                                                         do {
                               x()                    1    2    3         …                      ..    ..     ..     ..
                                                                         } while (…)
                                                x()        1    2                                ..    ..     ..     ..


                                                               1                  compensate()




                  t0               t1             t2      t3   t4   t5       t6     t7   t8       t9    t10    t11    ...




  Copyright	
  Push	
  Technology	
  2012	
  
Sliding	
  Windows.	
  




Copyright	
  Push	
  Technology	
  2012	
  
Periodic	
  Windows	
  

                         x()             x()     x()        x()
                                                                               emit()


                                                                         x()    x()      x()       x()                  emit()
                       1                 2       3              4
                                                                                                                x()       x()    x()   x()
                                                                                                                                                        emit()
                                                                         2     3          4         5
init()


                                                                                                              2          3       4     5
                                                       init()


                                                                                               init()


                t0                                                  t1                                   t2                                  t3   ...
                                                  What	
  is	
  a	
  periodic	
  window?	
  
                                                  •  Driven	
  by	
  ‘wall	
  clock	
  [me’	
  in	
  milliseconds	
  
                                                  •  Not	
  monotonic,	
  natch.	
  Beware	
  of	
  NTP	
  

   Copyright	
  Push	
  Technology	
  2012	
  
Periodic	
  Windows	
  




Copyright	
  Push	
  Technology	
  2012	
  
Monotonic	
  Windows	
  
                                                                          my                                     my                                          my


                         x()             x()           x()        x()
                                                                                      emit()


                                                                                x()    x()       x()       x()                emit()
                       1                2              3              4
                                                                                                                        x()      x()       x()         x()
                                                                                                                                                                        emit()
                                                                                2      3         4          5
init()


                                                                                                                       2         3         4           5
                                                             init()


                                                                                                       init()


                t0                                                         t1                                     t2                                         t3   ...
                                                             What	
  is	
  a	
  monotonic	
  window?	
  
                                                             •  Driven	
  mad	
  by	
  ‘wall	
  clock	
  [me’?	
  Need	
  a	
  logical	
  clock?	
  
                                                             •  No	
  worries.	
  Provide	
  your	
  own	
  clock!	
  Eg.	
  Vector	
  clock	
  

         Copyright	
  Push	
  Technology	
  2012	
  
Monotonic	
  Windows	
  




Copyright	
  Push	
  Technology	
  2012	
  
EEP	
  looking	
  forward?	
  
        •         Migrate	
  windows	
  from	
  simple	
  processes	
  to	
  gen_server.	
  
        •         Extract	
  library	
  (no	
  process	
  /	
  message	
  passing	
  overhead).	
  
        •         Consider	
  NIFs	
  for	
  performance	
  cri[cal	
  sec[ons.	
  
        •         Consider	
  na[ve	
  aggregate	
  func[ons.	
  
        •         Consider	
  alterna[ve	
  to	
  gen_event	
  for	
  interfacing	
  
        •         Streams	
  /	
  Pipes?	
  Hey,	
  shoot	
  me,	
  but	
  I	
  like	
  Node.js	
  Streams/Pipes.	
  
        •         Add	
  more	
  func[ons.	
  
        •         Add	
  more	
  languages	
  
        •         Node.js	
  EEP	
  –	
  h`ps://github.com/darach/eep-­‐js	
  	
  
        •         React/PHP	
  EEP	
  -­‐	
  h`ps://github.com/ianbarber/eep-­‐php	
  	
  


Copyright	
  Push	
  Technology	
  2012	
  
•  Thank	
  you	
  for	
  listening	
  to,	
  having	
  me	
  
                                                	
  
                                                •  Le	
  twi`er:	
  @darachennis	
  
                                                     	
  
                                                •  EEP	
  in	
  Erlang	
  /	
  PHP	
  /	
  Node.js:	
  	
  
                                                     	
  
                                                     h`ps://github.com/darach/eep-­‐erl	
  
                                                •  h`ps://github.com/ianbarber/eep-­‐php	
  	
  
                                                     h`ps://github.com/darach/eep-­‐js	
  	
  

                                                •  Push	
  Technology’s	
  Diffusion	
  product	
  now	
  
                                                   supports	
  Erlang	
  Clients	
  too.	
  	
  
                                                   [Alpha,	
  ping	
  me	
  if	
  interested,	
  email	
  below]	
  
                                                   	
  

                           PROMO	
  CODE:	
  
                           ENNI100	
                                              Ques[ons?	
  
Copyright	
  Push	
  Technology	
  2012	
  
                                                            Darach@PushTechnology.com	
  

Más contenido relacionado

Similar a EFL Munich Big Data Conversational

Data distribution in the cloud with Node.js
Data distribution in the cloud with Node.jsData distribution in the cloud with Node.js
Data distribution in the cloud with Node.jsdarach
 
Moving Global Content Exchange to the Cloud
Moving Global Content Exchange to the CloudMoving Global Content Exchange to the Cloud
Moving Global Content Exchange to the CloudSigniantMarketing
 
The Hybrid Windows Azure Application
The Hybrid Windows Azure ApplicationThe Hybrid Windows Azure Application
The Hybrid Windows Azure ApplicationMichael Collier
 
Cloud computing NIC 2012
Cloud computing NIC 2012Cloud computing NIC 2012
Cloud computing NIC 2012Kristian Nese
 
Din mobile framtid, john strand
Din mobile framtid, john strandDin mobile framtid, john strand
Din mobile framtid, john strandErgoGroup
 
Achieving genuine elastic multitenancy with the Waratek Cloud VM for Java : J...
Achieving genuine elastic multitenancy with the Waratek Cloud VM for Java : J...Achieving genuine elastic multitenancy with the Waratek Cloud VM for Java : J...
Achieving genuine elastic multitenancy with the Waratek Cloud VM for Java : J...JAX London
 
Sql azure database under the hood
Sql azure database under the hoodSql azure database under the hood
Sql azure database under the hoodguest2dd056
 
Sql azure database under the hood
Sql azure database under the hoodSql azure database under the hood
Sql azure database under the hoodEduardo Castro
 
Real Life WebSocket Case Studies and Demos
Real Life WebSocket Case Studies and DemosReal Life WebSocket Case Studies and Demos
Real Life WebSocket Case Studies and DemosPeter Moskovits
 
Cloud Computing Service & Market Trends in Korea
Cloud Computing Service & Market Trends in KoreaCloud Computing Service & Market Trends in Korea
Cloud Computing Service & Market Trends in KoreaFanny Lee
 
Modern Carrier Strategies for Traffic Engineering
Modern Carrier Strategies for Traffic EngineeringModern Carrier Strategies for Traffic Engineering
Modern Carrier Strategies for Traffic EngineeringVishal Sharma, Ph.D.
 
3D-IC Designs require 3D tools
3D-IC Designs require 3D tools3D-IC Designs require 3D tools
3D-IC Designs require 3D toolschiportal
 
Palestra "Technology Trends To Watch In 2012 and beyond"
Palestra "Technology Trends To Watch In 2012 and beyond"Palestra "Technology Trends To Watch In 2012 and beyond"
Palestra "Technology Trends To Watch In 2012 and beyond"Dígitro Tecnologia
 
Oasis cloud-law-ics-unofficial
Oasis cloud-law-ics-unofficialOasis cloud-law-ics-unofficial
Oasis cloud-law-ics-unofficialJamie Clark
 
The DevOps PaaS Infusion - May meetup
The DevOps PaaS Infusion - May meetupThe DevOps PaaS Infusion - May meetup
The DevOps PaaS Infusion - May meetupNorm Leitman
 
Cloudify summit2012 pub
Cloudify summit2012 pubCloudify summit2012 pub
Cloudify summit2012 pubGary Berger
 

Similar a EFL Munich Big Data Conversational (20)

Data distribution in the cloud with Node.js
Data distribution in the cloud with Node.jsData distribution in the cloud with Node.js
Data distribution in the cloud with Node.js
 
Moving Global Content Exchange to the Cloud
Moving Global Content Exchange to the CloudMoving Global Content Exchange to the Cloud
Moving Global Content Exchange to the Cloud
 
The Hybrid Windows Azure Application
The Hybrid Windows Azure ApplicationThe Hybrid Windows Azure Application
The Hybrid Windows Azure Application
 
Cloud computing NIC 2012
Cloud computing NIC 2012Cloud computing NIC 2012
Cloud computing NIC 2012
 
Lte asia 2011 s niri
Lte asia 2011 s niriLte asia 2011 s niri
Lte asia 2011 s niri
 
Din mobile framtid, john strand
Din mobile framtid, john strandDin mobile framtid, john strand
Din mobile framtid, john strand
 
10 fn key2
10 fn key210 fn key2
10 fn key2
 
Achieving genuine elastic multitenancy with the Waratek Cloud VM for Java : J...
Achieving genuine elastic multitenancy with the Waratek Cloud VM for Java : J...Achieving genuine elastic multitenancy with the Waratek Cloud VM for Java : J...
Achieving genuine elastic multitenancy with the Waratek Cloud VM for Java : J...
 
Sql azure database under the hood
Sql azure database under the hoodSql azure database under the hood
Sql azure database under the hood
 
Sql azure database under the hood
Sql azure database under the hoodSql azure database under the hood
Sql azure database under the hood
 
Real Life WebSocket Case Studies and Demos
Real Life WebSocket Case Studies and DemosReal Life WebSocket Case Studies and Demos
Real Life WebSocket Case Studies and Demos
 
Cloud Computing Service & Market Trends in Korea
Cloud Computing Service & Market Trends in KoreaCloud Computing Service & Market Trends in Korea
Cloud Computing Service & Market Trends in Korea
 
Modern Carrier Strategies for Traffic Engineering
Modern Carrier Strategies for Traffic EngineeringModern Carrier Strategies for Traffic Engineering
Modern Carrier Strategies for Traffic Engineering
 
3D-IC Designs require 3D tools
3D-IC Designs require 3D tools3D-IC Designs require 3D tools
3D-IC Designs require 3D tools
 
Palestra "Technology Trends To Watch In 2012 and beyond"
Palestra "Technology Trends To Watch In 2012 and beyond"Palestra "Technology Trends To Watch In 2012 and beyond"
Palestra "Technology Trends To Watch In 2012 and beyond"
 
Oasis cloud-law-ics-unofficial
Oasis cloud-law-ics-unofficialOasis cloud-law-ics-unofficial
Oasis cloud-law-ics-unofficial
 
Cloud computing
Cloud computingCloud computing
Cloud computing
 
Data centers presentation
Data centers presentationData centers presentation
Data centers presentation
 
The DevOps PaaS Infusion - May meetup
The DevOps PaaS Infusion - May meetupThe DevOps PaaS Infusion - May meetup
The DevOps PaaS Infusion - May meetup
 
Cloudify summit2012 pub
Cloudify summit2012 pubCloudify summit2012 pub
Cloudify summit2012 pub
 

Más de darach

Thing. An unexpected journey. Devoxx UK 2014
Thing. An unexpected journey. Devoxx UK 2014Thing. An unexpected journey. Devoxx UK 2014
Thing. An unexpected journey. Devoxx UK 2014darach
 
FunctionalJS - May 2014 - Streams
FunctionalJS - May 2014 - StreamsFunctionalJS - May 2014 - Streams
FunctionalJS - May 2014 - Streamsdarach
 
Deconstructing Lambda
Deconstructing LambdaDeconstructing Lambda
Deconstructing Lambdadarach
 
Streams and Things
Streams and ThingsStreams and Things
Streams and Thingsdarach
 
Big Data, Mob Scale.
Big Data, Mob Scale.Big Data, Mob Scale.
Big Data, Mob Scale.darach
 
Meta Programming with Streams and Pipes
Meta Programming with Streams and PipesMeta Programming with Streams and Pipes
Meta Programming with Streams and Pipesdarach
 
Erlang/Sapiens
Erlang/SapiensErlang/Sapiens
Erlang/Sapiensdarach
 
Streamy, Pipy, Analyticy
Streamy, Pipy, AnalyticyStreamy, Pipy, Analyticy
Streamy, Pipy, Analyticydarach
 
Complex Er[jl]ang Processing with StreamBase
Complex Er[jl]ang Processing with StreamBaseComplex Er[jl]ang Processing with StreamBase
Complex Er[jl]ang Processing with StreamBasedarach
 
StreamBase - Embedded Erjang - Erlang User Group London - 20th April 2011
StreamBase - Embedded Erjang - Erlang User Group London - 20th April 2011StreamBase - Embedded Erjang - Erlang User Group London - 20th April 2011
StreamBase - Embedded Erjang - Erlang User Group London - 20th April 2011darach
 

Más de darach (10)

Thing. An unexpected journey. Devoxx UK 2014
Thing. An unexpected journey. Devoxx UK 2014Thing. An unexpected journey. Devoxx UK 2014
Thing. An unexpected journey. Devoxx UK 2014
 
FunctionalJS - May 2014 - Streams
FunctionalJS - May 2014 - StreamsFunctionalJS - May 2014 - Streams
FunctionalJS - May 2014 - Streams
 
Deconstructing Lambda
Deconstructing LambdaDeconstructing Lambda
Deconstructing Lambda
 
Streams and Things
Streams and ThingsStreams and Things
Streams and Things
 
Big Data, Mob Scale.
Big Data, Mob Scale.Big Data, Mob Scale.
Big Data, Mob Scale.
 
Meta Programming with Streams and Pipes
Meta Programming with Streams and PipesMeta Programming with Streams and Pipes
Meta Programming with Streams and Pipes
 
Erlang/Sapiens
Erlang/SapiensErlang/Sapiens
Erlang/Sapiens
 
Streamy, Pipy, Analyticy
Streamy, Pipy, AnalyticyStreamy, Pipy, Analyticy
Streamy, Pipy, Analyticy
 
Complex Er[jl]ang Processing with StreamBase
Complex Er[jl]ang Processing with StreamBaseComplex Er[jl]ang Processing with StreamBase
Complex Er[jl]ang Processing with StreamBase
 
StreamBase - Embedded Erjang - Erlang User Group London - 20th April 2011
StreamBase - Embedded Erjang - Erlang User Group London - 20th April 2011StreamBase - Embedded Erjang - Erlang User Group London - 20th April 2011
StreamBase - Embedded Erjang - Erlang User Group London - 20th April 2011
 

Último

Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityIES VE
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical InfrastructureVarsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructureitnewsafrica
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Kaya Weers
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxfnnc6jmgwh
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterMydbops
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsRavi Sanghani
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentPim van der Noll
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Alkin Tezuysal
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observabilityitnewsafrica
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Farhan Tariq
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integrationmarketing932765
 

Último (20)

Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical InfrastructureVarsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
 

EFL Munich Big Data Conversational

  • 1. EFL  Munich   February  2013   Copyright  Push  Technology  2012  
  • 2. •  Distributed  Systems  /  HPC  guy.     •  Chief  Scien*st  :-­‐  at  Push  Technology   •  Alumnus  of  :-­‐       Motorola,  IONA,  BeOair,  JPMC,  StreamBase.   •  School:  Trinity  College  Dublin.     -­‐  BA  (Mod).  Comp.  Sci.+     -­‐  M.Sc.  Networks  &  Distributed  Systems       •  Responds  to:  Guinness,  Whisky   About  me?   Copyright  Push  Technology  2012   Darach@PushTechnology.com  
  • 3. Conversa[onal   Big  Data   with     Copyright  Push  Technology  2012  
  • 4. Big  Data.  The  most  important  V  is   Value.   •  4Vs:   •  Volume   •  “It’s  the  connec[ons   •  Velocity   that  ma`er  most”     •  Variety   -­‐  David  Evans,  Cisco   •  Variability   Copyright  Push  Technology  2012  
  • 5. Connect  all  the  things   Copyright  Push  Technology  2012  
  • 6. An  Internet  Minute   10k,  1ms   1k,  100us   Copyright  Push  Technology  2012  
  • 7. The  Last  Mile  is  HARD   Buffer   Bloat   Network   Ba`ery!   Satura[on   Download  Bandwidth   30.00   20.00   Expected  (Mbps)   10.00   Download  (Mbps)   0.00   Sep-­‐06   -­‐10.00   Oct-­‐06   Oct-­‐06   Nov-­‐06   Dec-­‐06   Dec-­‐06   Jan-­‐07   Feb-­‐07   Latency   1.00E+00   1.00E+01   y  =  -­‐1.1667x  +  45703   Latency  (ms)   1.00E+02   R²  =  0.02496   Expected  (ms)   1.00E+03   Linear  (Latency  (ms))   1.00E+04   Sep-­‐06   Oct-­‐06   Oct-­‐06   Nov-­‐06   Dec-­‐06   Dec-­‐06   Jan-­‐07   Feb-­‐07   Copyright  Push  Technology  2012  
  • 8. Conversa[ons   M2M   M2H   Tradi[onal  Messaging   Bidirec[onal  Real   MQTT,  AMQP   Time  Data  Distribu[on   WebSockets   W3C  Server  Sent   Events   Copyright  Push  Technology  2012  
  • 9. Tradi[onal  Messaging   A B ba bb Producers ? Consumers Pros   Cons   •  Loosely  coupled.   •  No  data  model.  Slinging  blobs   •  All  you  can  eat  messaging  pa`erns   •  Fast  producer,  slow  consumer?  Ouch.   •  Familiar   •  No  data  ‘smarts’.  A  blob  is  a  blob.   Copyright  Push  Technology  2012  
  • 10. Invented  yonks  ago…   Before  the  InterWebs   For  ‘reliable’  networks   For  machine  to  machine   Remember  DEC  Message  Queues?   -­‐  That  basically.  Vomit!   Copyright  Push  Technology  2012  
  • 11. When  fallacies  were  simple   - The  network  is  reliable   - Latency  is  zero     - Bandwidth  is  infinite   - There  is  one  administrator   - Topology  does  not  change   - The  network  is  secure   - Transport  cost  is  zero   - The  network  is  homogeneous     Copyright  Push  Technology  2012  
  • 12. Then  in  1992,  this  happened:   The  phrase  ‘surfing  the  internet’  was  coined  by  Jean  Poly.     First  SMS  sent     First  base.   Copyright  Push  Technology  2012  
  • 13. It  grew,  and  it  grew   Copyright  Push  Technology  2012  
  • 14. Then  in  2007,  this  happened:   The   god  phone:     Surfing  died.  Touching  happened.     Second  base  unlocked.   Copyright  Push  Technology  2012  
  • 15. Then  in  2007,  this  happened:   So  we  took  all  the  things  and  put  them  in  the  internet:     Cloud  happened.     So  we  could  touch  all  the  things.   Messaging     Apps   They  grew  and  they  grew  like  all  the   Hardware   good  things  do.   Virtualize  all  the  things   Services   Skills,   Special[es   Copyright  Push  Technology  2012  
  • 16. Then  in  2007,  this  happened:   And  it  grew  and  it  grew.     Like  all  the  good  things  do.   Messaging   Apps   Hardware   Virtualize  all  the  things   Services   Skills,   Special[es   Copyright  Push  Technology  2012  
  • 17. So  we  scaled  all  the  things   Big  data  is  fundamentally  about  extrac[ng  value,  understanding   implica[ons,  gaining  insight  so  we  can   learn  and  improve  con[nuously.     It’s  nothing  new.   big   DATA   Copyright  Push  Technology  2012  
  • 18. But,  problem:  The  bird,  basically.   Immediately  Inconsistent.   But,  Eventually  Consistent   …  Maybe.     Humans   Are   grumpy   IIBECM Copyright  Push  Technology  2012  
  • 19. Stop.   - The  network  is  not  reliable     nor  is  it  cost  free.   - Latency  is  not  zero   nor  is  it  a  democracy.   - Bandwidth  is  not  infinite   nor  predictable  especially  the  last  mile!   - There  is  not  only  one  administrator   trust,  rela[onships  are  key   - Topology  does  change   It  should,  however,  converge  eventually   - The  network  is  not  secure   nor  is  the  data  that  flows  through  it   - Transport  cost  is  not  zero   but  what  you  don’t  do  is  free   - The  network  is  not  homogeneous   nor  is  it  smart   Copyright  Push  Technology  2012  
  • 20. Look.   - What  and  How  are  what  geeks  do.   - Why  gets  you  paid   - Business  Value  and  Trust  dictate  What  and  How   -   Policies,  Events  and  Content  implements  Business  Value   - Science  basically.  But  think  like  a  carpenter:   - Measure  twice.  Cut  once.   Copyright  Push  Technology  2012  
  • 21. Listen.   -   Every  nuance  comes  with  a  set  of  tradeoffs.   -   Choosing  the  right  ones  can  be  hard,  but  it  pays  off.   -   Context,  Environment  are  cri[cal   -   Break  all  the  rules,  one  benchmark  at  a  [me.     -   Benchmark  Driven  Development   Copyright  Push  Technology  2012  
  • 22. Ac[on:  Data  Distribu[on   Messaging  remixed  around:     Relevance  -­‐  Queue  depth  for  conflatable  data  should  be  0  or  1.  No  more     Responsiveness  -­‐  Use  HTTP/REST  for  things.  Stream  the  li`le  things     Timeliness  -­‐  It’s  rela[ve.  M2M  !=  M2H.     Context    -­‐  Packed  binary,  deltas  mostly,  snapshot  on  subscribe.     Environment-­‐  Don’t  send  1M  1K  events  to  a  mobile  phone  with  0.5mbps.     Copyright  Push  Technology  2012  
  • 23. Ac[on.  Virtualize  Client  Queues   A B ba bb Buffer Producers Bloat Consumers Nuance:  Client  telemetry.  Tradeoff:  Durable  subscrip[ons  harder   Copyright  Push  Technology  2012  
  • 24. Ac[on.  Add  data  caching   A B ba x bb x Producers Is it a Consumers cache? One  hop  closer  to  the  edge  …   Copyright  Push  Technology  2012  
  • 25. Ac[on.  Exploit  data  structure   A B Snapshot Delta ba x bb x Producers State! Consumers Snapshot  recovery.  Deltas  or  Changes  mostly.  Conserves  bandwidth   Copyright  Push  Technology  2012  
  • 26. Ac[on.  Behaviors   A B ba x bb x X The Producers topic is Consumers the cloud! Extensible.  Nuance?  Roll  your  own  protocols.  Tradeoff?  3rd  party  code  in  the  engine  :/   Copyright  Push  Technology  2012  
  • 27. Ac[on.  Structural  Confla[on   A B ba x bb x X Producers Current Consumers data! + Ensures  only  current  +  consistent  data  is  distributed.  Ac[vely  soaks  up  bloat.  Extensible!   Copyright  Push  Technology  2012  
  • 28. Ac[on.  Structural  Confla[on  [EEP]   C2 A2 C1 B1 A1 'op' Cop Bop Aop ... ... C2 A2 C1 B1 A1 R C2 B1 A2 Current   5 4 3 2 1 5 1 4 Current   C2 A2 C1 B1 A1 M+ C2 B1 A+ +     5 4 3 2 1 8 1 5 Consistent   = = C2 + C1 A2 + A1 = = 5+3 4+1 Replace   Merge   •  Replace/Overwrite  ‘A1’  with  ‘A2’   •  Merge  A1,  A2  under  some  opera[on.   •  ‘Current  data  right  now’   •  Opera[on  is  pluggable   •  Fast   •  Tunable  consistency   •  Performance  f(opera[on)   Copyright  Push  Technology  2012  
  • 29. Implement                                Client  J   Copyright  Push  Technology  2012  
  • 30. Example  Diffusion                              Client   Backend  to  Basho’s  Lager  logging  –  Stream  log  events  for  near  real-­‐[me  analysis  …   Copyright  Push  Technology  2012  
  • 31. Benchmark  Driven   Development     ‘Measuring  Value’   Copyright  Push  Technology  2012  
  • 32. Performance  is  not  a  number   Copyright  Push  Technology  2012  
  • 33. Benchmark  Models   Box A Box B Throughput  (Worst  case)   Latency  (Best  case)   •  Ramp  clients  con[nuously   •  Really  simple.  1  client   •  100  messages  per  second  per  client   •  Ping  –  Pong.  Measure  RTT  [me.   •  Payload:  125  ..  2000  bytes   •  Payload:  125  ..  2000  bytes   •  Message  style  vs  with  confla[on   Copyright  Push  Technology  2012  
  • 34. Throughput.  Non-­‐conflated   •  Cold  start  server   •  Ramp  750  clients  ‘simultaneously’  at  5   second  intervals   •  5  minute  benchmark  dura[on   •  Clients  onboarded  linearly  un[l  IO   (here)  or  compute  satura[on  occurs.     •  What  the  ‘server’  sees   Copyright  Push  Technology  2012  
  • 35. Throughput.  Non-­‐conflated   •  What  the  ‘client’  sees     •  At  and  beyond  satura[on  of  some   resource?   •  Things  break!     •  New  connec[ons  fail.  Good.   •  Long  established  connec[ons  ok.   •  Recent  connec[ons  [meout  and  client   connec[ons  are  dropped.  Good.   •  Diffusion  handles  smaller  message   sizes  more  gracefully     •  Back-­‐pressure  ‘waveform’  can  be   tuned  out.  Or,  you  could  use  structural   confla[on!   Copyright  Push  Technology  2012  
  • 36. Throughput.  Replace-­‐conflated   •  Cold  start  server     •  Ramp  750  clients  ‘simultaneously’  at  5   second  intervals     •  5  minute  benchmark  dura[on   •  Clients  onboarded  linearly  un[l  IO   (here)  or  compute  satura[on  occurs.     •  Takes  longer  to  saturate  than  non-­‐ conflated  case     •  Handles  more  concurrent  connec[ons     •  Again,  ‘server’  view   Copyright  Push  Technology  2012  
  • 37. Throughput.  Replace-­‐Conflated   •  What  the  ‘client’  sees     •  Once  satura[on  occurs  Diffusion   adapts  ac[vely  by  degrading  messages   per  second  per  client     •  This  is  good.  Soak  up  the  peaks   through  fairer  distribu[on  of  data.     •  Handles  spikes  in  number  of   connec[ons,  volume  of  data  or   satura[on  of  other  resources  using  a   single  load  adap[ve  mechanism   Copyright  Push  Technology  2012  
  • 38. Throughput.  Stats   •  Data  /  (Data  +  Overhead)?   •  1.09  GB/Sec  payload  at  satura[on.     •  10Ge  offers  theore[c  of  1.25GB/Sec.   •  Ballpark  87.2%  of  traffic  represents  business  data.   •  Benefits?   •  Op[mize  for  business  value  of  distributed  data.   •  Most  real-­‐world  high-­‐frequency  real-­‐[me  data  is   recoverable   •  Stock  prices,  Gaming  odds   •  Don’t  use  confla[on  for  transac[onal  data,  natch!   Copyright  Push  Technology  2012  
  • 39. Latency  (Best  case)   •  Cold  start  server     •  1  solitary  client     •  5  minute  benchmark  dura[on   •  Measure  ping-­‐pong  round  trip  [me     •  Results  vary  by  network  card  make,   model,  OS  and  Diffusion  tuning.   Copyright  Push  Technology  2012  
  • 40. Latency.  Round  Trip  Stats   •  Sub-­‐millisecond  at   the  99.99%   percen[le   •  As  low  as  15   microseconds   •  On  average   significantly  sub   100  microseconds   for  small  data   Copyright  Push  Technology  2012  
  • 41. Latency.  Single  Hop  Stats   •  Sub  500   microseconds  at   the  99.99%   percen[le   •  As  low  as  8   microseconds   •  On  average   significantly  sub  50   microseconds  for   small  data   Copyright  Push  Technology  2012  
  • 42. Latency  in  perspec[ve   ~ ~ 125 1000 bytes bytes (avg) (avg) 2.4 us 5.0 us 8.0 us 50.0 us •  2.4  us.  A  tuned  benchmark  in  C  with  low  latency  10Ge  NIC,  with  kernel   bypass,  with  FPGA  accelera[on   •  5.0  us.  A  basic  java  benchmark  –  as  good  as  it  gets  in  java   •  Diffusion  is  measurably  ‘moving  to  the  le{’  release  on  release   •  We’ve  been  ac[vely  tracking  and  con[nuously  improving  since  4.0   Copyright  Push  Technology  2012  
  • 43. Embedded   Event   Processing   Copyright  Push  Technology  2012  
  • 44. CEP,  basically   w w S C Q w w What  is  eep.erl?   •  Add  aggregate  func[ons  and  window  opera[ons  to  Erlang   •  Separate  context  (window)  from  computa[on  (aggregate  fn)   •  4  window  types:  tumbling,  sliding,  periodic,  monotonic   •  Process  oriented.   •  Fast  enough.  J   Copyright  Push  Technology  2012  
  • 45. Aggregate  Func[ons   What  is  an  aggregate  func*on?   •  A  func[on  that  computes  values  over  events.   •  The  cost  of  calcula[ons  are  ammor[zed  per  event   •  Just  follow  the  above  recipe   •  Example:  Aggregate  2M  events  (equity  prices),  send  to  GPU   on  emit,  receive  2M  op[ons  put/call  prices  as  a  result.   Copyright  Push  Technology  2012  
  • 46. Aggregate  Func[on  Example   Copyright  Push  Technology  2012  
  • 47. Tumbling  Windows   x() x() x() x() emit() x() x() x() x() emit() 1 2 3 4 x() x() x() x() emit() 2 3 4 5 init() 2 3 4 5 init() init() t0 t1 t2 t3 t4 t5 t6 t7 t8 t9 t10 t11 ... What  is  a  tumbling  window?   •  Every  N  events,  give  me  an  average  of  the  last  N  events   •  Does  not  overlap  windows   •  ‘Closing’  a  window,  ‘Emits’  a  result  (the  average)   •  Closing  a  window,  Opens  a  new  window   Copyright  Push  Technology  2012  
  • 48. Tumbling  Windows   Copyright  Push  Technology  2012  
  • 49. Sliding  Windows  -­‐  O(Long  Story)   init() 1 2 3 4 5 .. .. .. .. x() 1 2 3 4 .. .. .. .. x() 1 2 3 .. .. .. .. x() 1 2 .. .. .. .. t0 t1 t2 t3 t4 t5 t6 t7 t8 t9 t10 t11 ... What  is  a  sliding  window?   •  Like  tumbling,  except  can  overlap.  But  O(N2),  Keep  N  small.   •  Every  event  opens  a  new  window.   •  A{er  N  events,  every  subsequent  event  emits  a  result.   •  Like  all  windows,  cost  of  calcula[on  amor[zed  over  events   Copyright  Push  Technology  2012  
  • 50. Compensa[ng  Aggregates   init() 1 2 3 4 5 .. .. .. .. x() 1 2 3 4 .. .. .. .. do { x() 1 2 3 … .. .. .. .. } while (…) x() 1 2 .. .. .. .. 1 compensate() t0 t1 t2 t3 t4 t5 t6 t7 t8 t9 t10 t11 ... Copyright  Push  Technology  2012  
  • 51. Sliding  Windows.   Copyright  Push  Technology  2012  
  • 52. Periodic  Windows   x() x() x() x() emit() x() x() x() x() emit() 1 2 3 4 x() x() x() x() emit() 2 3 4 5 init() 2 3 4 5 init() init() t0 t1 t2 t3 ... What  is  a  periodic  window?   •  Driven  by  ‘wall  clock  [me’  in  milliseconds   •  Not  monotonic,  natch.  Beware  of  NTP   Copyright  Push  Technology  2012  
  • 53. Periodic  Windows   Copyright  Push  Technology  2012  
  • 54. Monotonic  Windows   my my my x() x() x() x() emit() x() x() x() x() emit() 1 2 3 4 x() x() x() x() emit() 2 3 4 5 init() 2 3 4 5 init() init() t0 t1 t2 t3 ... What  is  a  monotonic  window?   •  Driven  mad  by  ‘wall  clock  [me’?  Need  a  logical  clock?   •  No  worries.  Provide  your  own  clock!  Eg.  Vector  clock   Copyright  Push  Technology  2012  
  • 55. Monotonic  Windows   Copyright  Push  Technology  2012  
  • 56. EEP  looking  forward?   •  Migrate  windows  from  simple  processes  to  gen_server.   •  Extract  library  (no  process  /  message  passing  overhead).   •  Consider  NIFs  for  performance  cri[cal  sec[ons.   •  Consider  na[ve  aggregate  func[ons.   •  Consider  alterna[ve  to  gen_event  for  interfacing   •  Streams  /  Pipes?  Hey,  shoot  me,  but  I  like  Node.js  Streams/Pipes.   •  Add  more  func[ons.   •  Add  more  languages   •  Node.js  EEP  –  h`ps://github.com/darach/eep-­‐js     •  React/PHP  EEP  -­‐  h`ps://github.com/ianbarber/eep-­‐php     Copyright  Push  Technology  2012  
  • 57. •  Thank  you  for  listening  to,  having  me     •  Le  twi`er:  @darachennis     •  EEP  in  Erlang  /  PHP  /  Node.js:       h`ps://github.com/darach/eep-­‐erl   •  h`ps://github.com/ianbarber/eep-­‐php     h`ps://github.com/darach/eep-­‐js     •  Push  Technology’s  Diffusion  product  now   supports  Erlang  Clients  too.     [Alpha,  ping  me  if  interested,  email  below]     PROMO  CODE:   ENNI100   Ques[ons?   Copyright  Push  Technology  2012   Darach@PushTechnology.com