Slides for our talk at Reactive Summit 2018, Montreal, Canada - "Actor based architecture for world's largest telescope"
By - @skvithalani and @unmeshjoshi
2. We work with people and organizations who have ambitious missions.
We are focused on helping our industry improve.
We are strong believers in the power of software and technology as tools for social change.
4500 Employees
15 Countries
42 Offices
3. ● World's largest optical telescope
● Operational in 2027!
● Mauna Kea, Hawaii
● Five countries collaborating
○ USA
○ Canada
○ China
○ Japan
○ India
https://www.tmt.org/
5. The Program Summary
● A program running for more than 20 years
○ Started around 2007
○ Going to be operational in 2027 as per current plans
○ Telescope will be operational for 50 years from the first light date.
● Software development is primarily happening in India.
○ ThoughtWorks working on ‘common software’ frameworks.
○ Built prototypes using Akka since 2014.
6. Overview of ‘Common Software’
● A Mechanism For Service Discovery
● Telescope Configuration Management Service.
● Centralized Logging
● Event and Telemetry data capture mechanism
● A Timer API with PTP setup.
● A Framework For Telescope control system development.
○ Using Typed Actors.
9. Example Scenario
[Diagram: TCS Assembly, Probe Arm Assembly, three Probe HCDs, two multi-axis motor controllers and one motor controller. Coordinate labels: (20, 35), (10, 25), (20, 35), (10, 10), (10, 10).]
10. Key Characteristics
● Peer to Peer system
○ Components like Assemblies, HCDs
■ Discover other components
■ Send commands
■ Subscribe to other components’ events
● Asynchronous Message Passing
● Components are stateful
○ They maintain device state or subsystem state etc.
Concurrency and Safety are important concerns
11. Actors!
● Message passing communication
● State management
○ Without synchronization hassles
○ Safety with supervision strategies
● Location transparency
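The "state management without synchronization hassles" point can be illustrated with a minimal sketch. This is not Akka and not the TMT framework: it is a plain Python stand-in, assuming a hypothetical `CounterActor` whose state is only ever touched by its own mailbox-processing thread. Senders never take a lock; they only enqueue messages.

```python
import queue
import threading


class CounterActor:
    """Hypothetical actor: private state, one mailbox, one processing thread."""

    def __init__(self):
        self._mailbox = queue.Queue()
        self._count = 0  # only the actor's own thread reads or writes this
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def tell(self, message):
        """Asynchronous send: enqueue the message and return immediately."""
        self._mailbox.put(message)

    def _run(self):
        # Messages are handled one at a time, so no locking is needed here.
        while True:
            kind, payload = self._mailbox.get()
            if kind == "stop":
                return
            if kind == "add":
                self._count += payload
            elif kind == "get":
                payload.put(self._count)  # reply via the caller's queue


def run_demo(senders=4, messages_each=1000):
    """Several threads send concurrently; the actor still counts correctly."""
    actor = CounterActor()

    def send_many():
        for _ in range(messages_each):
            actor.tell(("add", 1))

    threads = [threading.Thread(target=send_many) for _ in range(senders)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    reply = queue.Queue()
    actor.tell(("get", reply))
    total = reply.get(timeout=5)
    actor.tell(("stop", None))
    return total
```

Calling `run_demo()` sends 4000 increments from four threads, and the actor reports the full total because every update is serialized through the mailbox.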
12. Deployment and Discovery
[Diagram: Component Actors deployed on Machine 1 and Machine 2; Service Discovery on an Akka Cluster with CRDTs; HTTP and SSE.]
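To make the idea of CRDT-backed discovery concrete, here is a small sketch of one possible CRDT, a last-writer-wins map. The slides do not say which CRDT is used, so this choice is an assumption, and the names `register`, `merge` and `resolve`, the component names and the addresses are all hypothetical. Each machine keeps its own replica, and replicas converge because `merge` is commutative and idempotent.

```python
def merge(a, b):
    """Merge two replicas; each maps component name -> (timestamp, address).

    For every name the entry with the larger (timestamp, address) tuple wins,
    so merging in any order gives the same result.
    """
    merged = dict(a)
    for name, entry in b.items():
        if name not in merged or entry > merged[name]:
            merged[name] = entry
    return merged


def register(replica, name, address, timestamp):
    """Return a new replica with one registration added or updated."""
    return merge(replica, {name: (timestamp, address)})


def resolve(replica, name):
    """Look up the address of a component, or None if it is unknown."""
    entry = replica.get(name)
    return entry[1] if entry else None


# Two machines register components independently (hypothetical addresses).
machine1 = register({}, "probe-hcd", "machine1:2552", timestamp=1)
machine2 = register({}, "probe-hcd", "machine2:2552", timestamp=2)
machine2 = register(machine2, "tcs-assembly", "machine2:2553", timestamp=1)

# After exchanging state, both sides hold the same registry.
converged = merge(machine1, machine2)
```

A real deployment would also need removals (for example tombstones) and a trustworthy notion of time; both are left out of this sketch.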
13. Anatomy of a TMT Component
[Diagram: Supervisor, Top Level Actor, Handlers, Workers, Pub-Sub Manager, Command Response Manager. Arrow labels: spawn(), initialize(), publish(), complete(). Legend: "Provided by TMT Framework" and "To be implemented by component writers".]
14. Supervisor
● Registers itself with Akka CRDT
● Any communication to a component goes via supervisor
○ Filter and validate commands received
● Manages lifecycle of the component
○ Helps restart the component in case of exception encountered during execution
○ Provide admin interface to restart, shutdown, change log level, etc. for a component
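The supervisor's two roles listed above (validating incoming commands, and restarting the component on failure) can be sketched as follows. This is a simplified, synchronous illustration rather than the framework's implementation; `ProbeComponent`, `Supervisor.submit`, the command names and the result strings are all hypothetical.

```python
class ProbeComponent:
    """Hypothetical component; its state is lost whenever it is recreated."""

    def __init__(self):
        self.position = 0

    def handle(self, command):
        name, value = command
        if value < 0:
            raise ValueError("simulated hardware fault")
        self.position = value
        return "completed"


class Supervisor:
    """Stable front door of a component: callers never hold the child itself."""

    ALLOWED = {"move"}  # assumed set of commands this component accepts

    def __init__(self, factory):
        self._factory = factory
        self._child = factory()
        self.restarts = 0

    def submit(self, command):
        name, _ = command
        if name not in self.ALLOWED:
            return "invalid"  # filtered out before reaching the component
        try:
            return self._child.handle(command)
        except Exception:
            # Restart strategy: replace the failed child with a clean one.
            self._child = self._factory()
            self.restarts += 1
            return "error"


supervisor = Supervisor(ProbeComponent)
results = [
    supervisor.submit(("park", 0)),    # rejected by validation
    supervisor.submit(("move", 10)),   # handled normally
    supervisor.submit(("move", -1)),   # fails, child is restarted
    supervisor.submit(("move", 5)),    # the fresh child handles it
]
```

Because callers only ever talk to the supervisor, the child can be replaced after a failure without anyone holding a stale reference.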
15. Role of Typed Actors in TMT
● Helps in defining communication protocols between components
● Makes the communication protocol explicit at compile time
○ Currently using mutable typed actors
○ Actor refs are stored in Akka CRDT
16. Learnings
● Sealed messages for typed actors turned out to be quite rigid
○ Union type support in the language could help
● We used mutable behaviour because of familiarity
○ Eventually realized that immutable actors can do everything we wanted
● Type information of actor refs not preserved during serialization
○ No error is thrown when a deserialized actorRef is cast to incorrect type
○ Carrying type information with the serialized actorRef could be useful
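The last learning suggests carrying type information along with a serialized actor ref. One way that might look is sketched below. It is an illustration of the idea only, not an Akka feature: `TypedRef`, its `protocol` field, the JSON wire format, the path and the protocol names are all assumptions.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class TypedRef:
    """Hypothetical actor reference that remembers which protocol it accepts."""

    path: str
    protocol: str  # name of the message protocol, e.g. "ComponentMessage"


def serialize(ref: TypedRef) -> str:
    """Write the protocol name next to the address."""
    return json.dumps({"path": ref.path, "protocol": ref.protocol})


def deserialize(data: str, expected_protocol: str) -> TypedRef:
    """Fail loudly if the stored protocol is not the one the caller expects."""
    fields = json.loads(data)
    if fields["protocol"] != expected_protocol:
        raise TypeError(
            f"ref at {fields['path']} speaks {fields['protocol']}, "
            f"not {expected_protocol}"
        )
    return TypedRef(fields["path"], fields["protocol"])


# Hypothetical path and protocol names, for illustration only.
wire = serialize(TypedRef("tmt/probe-hcd", "ComponentMessage"))
same = deserialize(wire, "ComponentMessage")
```

Asking for a different protocol, for example `deserialize(wire, "LoggingMessage")`, raises `TypeError` instead of silently producing a mistyped reference.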
A program more than 20 years long. Started around 2007. Going to be live in 2027 as per current plans.
The primary mirror is 30 meters in diameter. It's not possible to have a single mirror this big, so it's made up of about 500 mirror segments.
Huge structure, like a 5 storey building. Has hundreds of hardware and software components coordinating the telescope.
Different countries and industrial partners are contributing to different parts of the work. India is responsible for most of the common software development.
Let's take a scenario. You want to schedule an observation. Point the telescope to a particular part of the sky. Calibrate the telescope. Take pictures. Continue for the next 8 hours, continuously adjusting the telescope.
Moving the telescope needs several subsystems to coordinate. The Sequencer sends the command to the Telescope Control System (TCS) assembly, which calculates the next position. It communicates with a probe arm assembly, which talks to the Hardware Control Daemons (HCDs), which communicate with the motors to rotate.
State management makes concurrency an important aspect.
Safety in failure scenarios is another important aspect
CRDT based service discovery
Component - Seq, Assembly, Hcd
Assembly needs to discover HCD, send commands and handle response
These are pretty common actions in the telescope,
So we have extracted the common protocol in a framework
Developed by ThoughtWorks India
Different teams will use our framework
Supervisor - actor provided by the framework; first thing it does is spawn the Top Level Actor (TLA)
TLA - initializes handlers
TLA - handlers follow the template pattern
Code display
Action in handlers
Execution sequence decided by TLA
Handlers delegate to workers to execute a command
Workers publish events while executing
Workers mark the command complete after execution
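The handler template pattern described in these notes can be sketched roughly as below. `initialize()`, `publish()` and `complete()` appear on the slides; everything else (`on_submit`, `move_worker`, the event strings, the synchronous style) is a hypothetical simplification and not the framework's actual API.

```python
class Handlers:
    """Template that component writers fill in; the framework calls it."""

    def initialize(self):
        pass

    def on_submit(self, command, publish, complete):
        raise NotImplementedError


class TopLevelActor:
    """Framework side: decides when each handler runs."""

    def __init__(self, handlers):
        self.events = []      # stand-in for the Pub-Sub Manager
        self.completed = []   # stand-in for the Command Response Manager
        self._handlers = handlers
        self._handlers.initialize()  # always runs before any command

    def submit(self, command):
        self._handlers.on_submit(
            command, self.events.append, self.completed.append
        )


def move_worker(command, publish, complete):
    """Worker: publishes events while executing, then marks completion."""
    publish(f"{command}: started")
    publish(f"{command}: in position")
    complete(command)


class ProbeHandlers(Handlers):
    """Component writer's code: only the hooks, no lifecycle logic."""

    def initialize(self):
        self.initialized = True

    def on_submit(self, command, publish, complete):
        move_worker(command, publish, complete)  # delegate to a worker


tla = TopLevelActor(ProbeHandlers())
tla.submit("move")
```

The component writer never decides the calling order; the Top Level Actor does, which is the point of the template pattern.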
Front facing of component
An exception in initialize bubbles up from the TLA to the Supervisor
Supervisor applies the restart strategy
Brings the component back to a clean state
So, why is the Supervisor exposed as the component's address instead of the TLA? Because the TLA's ref changes when it restarts, and exposing a short-lived address is not a good idea
Explain Typed actors
All framework actors are typed actors