2. Application Development Model
▪ A Stream is a sequence of data tuples
▪ A typical Operator takes one or more input streams, performs computations, and emits one or more output streams
• Each Operator is your custom business logic in Java, or a built-in operator from our open source library
• An Operator can have many instances that run in parallel, and each instance is single-threaded
▪ A Directed Acyclic Graph (DAG) is made up of operators and streams
[Figure: Directed Acyclic Graph (DAG) showing Operators connected by Streams of Tuples, with an Output Stream]
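The model above can be sketched in plain Java without the platform. This is a minimal illustration, not the Apex API: `MapOperator`, `connect`, and `DagSketch` are hypothetical names, a "stream" is just a direct method call between operators, and everything runs on one thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical stand-in for an operator: one input port, any number of output streams.
final class MapOperator<I, O> implements Consumer<I> {
    private final Function<I, O> logic;                 // the custom business logic
    private final List<Consumer<O>> downstream = new ArrayList<>();

    MapOperator(Function<I, O> logic) { this.logic = logic; }

    // Adds a stream from this operator's output to another operator's input.
    void connect(Consumer<O> next) { downstream.add(next); }

    // Called once per incoming tuple; emits the result on every outgoing stream.
    @Override
    public void accept(I tuple) {
        O result = logic.apply(tuple);
        for (Consumer<O> next : downstream) {
            next.accept(result);
        }
    }
}

final class DagSketch {
    // Builds the DAG parse -> square -> collect and pushes the input tuples through it.
    static List<Integer> run(List<String> input) {
        List<Integer> collected = new ArrayList<>();
        MapOperator<String, Integer> parse = new MapOperator<>(Integer::parseInt);
        MapOperator<Integer, Integer> square = new MapOperator<>(x -> x * x);
        parse.connect(square);
        square.connect(collected::add);
        input.forEach(parse);
        return collected;
    }
}
```

Calling `DagSketch.run` with the tuples "1", "2", "3" sends each tuple through both operators in order and collects the squared values.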
4. DAG Types
• Logical Plan
● Logical representation of computation
● Defines operators, streams and dataflow
• Physical Plan
● Deployable plan on cluster
● Contains partition information of operators
● Has ready-to-deploy serialized operator instances
[Figure: Logical DAG of operators O1 to O5]
[Figure: Physical DAG with O1 and O2 each split into partitions P1, P2, P3, plus a unifier (U) and O3, O4, O5]
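The step from logical to physical plan can be sketched as a simple expansion. This is an illustration under stated assumptions, not how the platform builds its plan: `PlanSketch` and `toPhysical` are hypothetical, the logical DAG is treated as a linear chain, and a unifier is added only where the partition count changes between neighbouring operators.

```java
import java.util.ArrayList;
import java.util.List;

final class PlanSketch {
    // Expands a linear logical plan into physical instance names.
    // names[i] is a logical operator and partitions[i] is its partition count.
    static List<String> toPhysical(String[] names, int[] partitions) {
        List<String> physical = new ArrayList<>();
        for (int i = 0; i < names.length; i++) {
            int n = partitions[i];
            if (n <= 1) {
                physical.add(names[i]);          // unpartitioned: a single instance
                continue;
            }
            for (int p = 1; p <= n; p++) {
                physical.add(names[i] + ".P" + p);
            }
            // Assumption of this sketch: merge partitions only when the next
            // operator in the chain runs with a different partition count.
            if (i + 1 < names.length && partitions[i + 1] != n) {
                physical.add("U(" + names[i] + ")");
            }
        }
        return physical;
    }
}
```

With operators O1 to O5 and partition counts 3, 3, 1, 1, 1, the sketch yields three instances each of O1 and O2, one unifier after O2, and single instances of O3, O4, and O5.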
5. Operator Lifecycle
➔ All operators in the DAG go through this life-cycle
➔ Managed by the Apex Platform
➔ Governed by control tuples
6. Operator Lifecycle (contd...)
➔ setup
◆ Start of operator lifecycle
◆ Do any initialization here
➔ beginWindow
◆ Marks the start of a window
➔ endWindow
◆ Marks the end of a window
➔ teardown
◆ Do any finalization here
◆ End of operator lifecycle
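The order of the four callbacks can be sketched with a small driver that plays the platform's role. The types below are hypothetical stand-ins with simplified parameters, not the Apex interfaces, and the driver uses plain method calls where the platform relies on control tuples.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the lifecycle callbacks; signatures are simplified.
interface LifecycleOperator {
    void setup();
    void beginWindow(long windowId);
    void endWindow();
    void teardown();
}

// Records each callback so the invocation order can be inspected.
final class TracingOperator implements LifecycleOperator {
    final List<String> trace = new ArrayList<>();

    @Override public void setup() { trace.add("setup"); }
    @Override public void beginWindow(long windowId) { trace.add("beginWindow " + windowId); }
    @Override public void endWindow() { trace.add("endWindow"); }
    @Override public void teardown() { trace.add("teardown"); }
}

final class LifecycleDriver {
    // One setup, one beginWindow/endWindow pair per window, one teardown.
    static void run(LifecycleOperator op, int windows) {
        op.setup();
        try {
            for (long id = 1; id <= windows; id++) {
                op.beginWindow(id);
                // Tuples belonging to this window would be delivered here.
                op.endWindow();
            }
        } finally {
            op.teardown();   // finalization runs even if a window fails
        }
    }
}
```

Running `LifecycleDriver.run(new TracingOperator(), 2)` records setup first, then two begin/end pairs, then teardown.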
7. Operator Lifecycle (contd...)
➔ emitTuples
◆ Called for Input Adapters
◆ Called in an infinite while loop by the platform
➔ process
◆ Called for Generic Operators and Output Adapters
◆ Associated with a port
◆ Called for every incoming tuple
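The split between emitTuples and process can be sketched as follows. All class names are hypothetical, the input adapter calls the next operator directly instead of going through a port, and the driver loop is bounded so that the example terminates, whereas the platform loop is described above as infinite.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Queue;

// Hypothetical input adapter: it has no input port and pulls data from a source.
final class QueueInput {
    private final Queue<String> source;
    private final UpperCaseOperator next;

    QueueInput(List<String> lines, UpperCaseOperator next) {
        this.source = new ArrayDeque<>(lines);
        this.next = next;
    }

    // Emits at most one tuple per call so control returns to the driver quickly.
    void emitTuples() {
        String line = source.poll();
        if (line != null) {
            next.process(line);
        }
    }

    boolean exhausted() { return source.isEmpty(); }
}

// Hypothetical generic operator with a single input port.
final class UpperCaseOperator {
    final List<String> seen = new ArrayList<>();

    // Invoked once for every tuple arriving on the port.
    void process(String tuple) { seen.add(tuple.toUpperCase(Locale.ROOT)); }
}

final class EmitLoop {
    // Stands in for the platform loop that keeps calling emitTuples().
    static List<String> run(List<String> lines) {
        UpperCaseOperator upper = new UpperCaseOperator();
        QueueInput input = new QueueInput(lines, upper);
        while (!input.exhausted()) {
            input.emitTuples();
        }
        return upper.seen;
    }
}
```

Keeping each `emitTuples` call short is a design choice of this sketch: the driver regains control after every call, which is where window boundaries could be inserted.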
8. Operator Lifecycle (contd...)
➔ OutputPort::emit
◆ Special method, not part of the operator lifecycle
◆ To be called by operator code
◆ Emits tuples to the next operator
◆ Bound by the window
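A sketch of an operator calling emit inside a window: it sums the tuples of each window and emits one total from endWindow. `SketchOutputPort`, `WindowSum`, and `WindowSumDemo` are hypothetical, and the guard that rejects emit outside a window is an assumption made here to illustrate "bound by window", not a description of what the platform does.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical output port; the window guard is a choice made for this sketch.
final class SketchOutputPort<T> {
    private final List<T> sink;
    private boolean windowOpen;

    SketchOutputPort(List<T> sink) { this.sink = sink; }

    void open() { windowOpen = true; }
    void close() { windowOpen = false; }

    // Hands a tuple to the downstream sink; rejected outside a window.
    void emit(T tuple) {
        if (!windowOpen) {
            throw new IllegalStateException("emit called outside a window");
        }
        sink.add(tuple);
    }
}

// Sums the tuples of each window and emits one total per window.
final class WindowSum {
    private final SketchOutputPort<Integer> output;
    private int sum;

    WindowSum(SketchOutputPort<Integer> output) { this.output = output; }

    void beginWindow() { sum = 0; }
    void process(int tuple) { sum += tuple; }
    void endWindow() { output.emit(sum); }   // operator code decides when to emit
}

final class WindowSumDemo {
    // The driver opens and closes the port around each window.
    static List<Integer> run(int[][] windows) {
        List<Integer> downstream = new ArrayList<>();
        SketchOutputPort<Integer> port = new SketchOutputPort<>(downstream);
        WindowSum op = new WindowSum(port);
        for (int[] window : windows) {
            port.open();
            op.beginWindow();
            for (int tuple : window) {
                op.process(tuple);
            }
            op.endWindow();
            port.close();
        }
        return downstream;
    }
}
```

For two windows holding the tuples 1, 2, 3 and 4, 5, the downstream list receives one total per window.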