Apache Airavata is software that provides services for managing scientific applications on a wide range of remote computing resources. Airavata can be used both by individual scientists to run scientific workflows and by communities of scientists through Web browser interfaces. It is a challenge to bring all of Airavata’s capabilities together in the single API layer that is our prerequisite for a 1.0 release. To support our diverse use cases, we have developed a rich data model and messaging format that we need to expose to client developers working in many programming languages. We do not believe this is a good match for REST-style services. In this presentation, we describe our use and evaluation of Apache Thrift as an interface and data model definition tool, its use internally in Airavata, and its use to deliver and distribute client development kits.
2. 40% discount coupon on any Manning Publications title
Coupon code - ********
(listen through the end for the decryption key)
A free ebook: The Programmer’s Guide to Apache Thrift, Manning Publications Co.
(listen through the end and impress us to win)
Red Alert:
FREE Stuff … FREE as in
* Image source: http://blog.law.cornell.edu/voxpop/files/2012/05/VOX.Chicken.Crossing_FreeBeer.jpg
3. Outline
— Introduction
— Apache Thrift
— Apache Airavata
— Motivation for Airavata to explore Thrift
— Airavata API
— Road to Airavata 1.0
— Airavata Thrift based architecture
— Experience & lessons learned
— Discussion, Q & A
4. — Modern distributed applications are rarely composed of modules written in a single language
— Weaving together innovations made in a range of languages is a core competency of successful enterprises
— Cross-language communications are a necessity, not a luxury
Polyglotism
** Figure source: The Programmer’s Guide to Apache Thrift, Manning Publications Co.
* Slide source: Randy Abernethy.
5. — A high-performance, scalable, cross-language serialization and RPC framework
— What?
— Full RPC Implementation - Apache Thrift supplies a complete RPC solution: clients, servers, everything but your business logic
— Modularity - Apache Thrift supports plug-in serialization protocols: binary, compact, JSON, or build your own
— Multiple End Points – Plug-in transports for network, disk and memory end points, making Thrift easy to integrate with other communications and storage solutions like AMQP messaging and HDFS
— Performance - Apache Thrift is fast and efficient, with solutions for minimal parsing overhead and minimal size
— Reach - Apache Thrift supports a wide range of languages and platforms: Linux, OS X, Windows, Embedded Systems, Mobile, Browser, C++, Go, PHP, Erlang, Haskell, Ruby, Node.js, C#, Java, C, OCaml, Objective-C, D, Perl, Python, Smalltalk, …
— Flexibility - Apache Thrift supports interface evolution; that is, CI environments can roll out new interface features incrementally without breaking existing infrastructure.
Apache Thrift
* Slide source: Randy Abernethy.
6. 1. Define the service interface in IDL
2. Compile the IDL to generate client/server RPC stubs in the desired languages
3. Call the remote functions as if they were local using the client stubs
4. Connect the server stubs to the desired implementation
5. Choose a prebuilt Apache Thrift server to host your service
Thrift IDL
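The five steps above can be sketched without the Thrift toolchain. The following is a minimal in-process simulation, not Thrift-generated code: the `Calculator` service, its `add` method, and all class names are hypothetical, and JSON stands in for a real Thrift protocol. It only illustrates how a client stub makes a remote call look local and how a server stub dispatches to a handler.

```python
import json


class CalculatorHandler:
    """Step 4: the user-written implementation behind the server stub."""

    def add(self, a, b):
        return a + b


class CalculatorProcessor:
    """Stand-in for a generated server stub (hypothetical, hand-written here).

    Decodes a request, dispatches to the handler, and encodes the reply.
    """

    def __init__(self, handler):
        self._handler = handler

    def process(self, request_bytes):
        request = json.loads(request_bytes.decode("utf-8"))
        method = getattr(self._handler, request["method"])
        result = method(*request["args"])
        return json.dumps({"result": result}).encode("utf-8")


class CalculatorClient:
    """Stand-in for a generated client stub (hypothetical, hand-written here).

    Looks like a local object, but serializes each call and hands the bytes
    to a transport-like callable.
    """

    def __init__(self, send):
        self._send = send  # callable: request bytes -> reply bytes

    def add(self, a, b):
        request = json.dumps({"method": "add", "args": [a, b]}).encode("utf-8")
        reply = self._send(request)
        return json.loads(reply.decode("utf-8"))["result"]


# Step 5 stand-in: the "server" is simply the processor's entry point.
processor = CalculatorProcessor(CalculatorHandler())
client = CalculatorClient(send=processor.process)

# Step 3: the call site reads like an ordinary local function call.
total = client.add(2, 3)
```

In a real Thrift project the processor and client classes would come from the IDL compiler, and `send` would be replaced by a network transport.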
7. — Streaming – Communications characterized by an ongoing flow of bytes from a server to one or more clients.
— Example: An internet radio broadcast where the client receives bytes over time, transmitted by the server in an ongoing sequence of small packets.
— Messaging – Message passing involves one-way, asynchronous, often queued communications, producing loosely coupled systems.
— Example: Sending an email message where you may get a response or you may not, and if you do get a response you don’t know exactly when you will get it.
— RPC – Remote Procedure Call systems allow function calls to be made between processes on different computers.
— Example: An iPhone app calling a service on the Internet which returns the weather forecast.
Comm schemes
Apache Thrift is an efficient cross-platform serialization solution for streaming interfaces
Apache Thrift provides a complete RPC framework
** Figure source: The Programmer’s Guide to Apache Thrift, Manning Publications Co.
* Slide source: Randy Abernethy.
8. — User Code
— client code calls RPC methods and/or [de]serializes objects
— service handlers implement RPC service behavior
— Generated Code
— RPC stubs supply client-side proxies and server-side processors
— type serialization code provides serialization for IDL-defined types
— Library Code
— servers host user-defined services, managing connections and concurrency
— protocols perform serialization
— transports move bytes from here to there
Architecture
** Figure source: The Programmer’s Guide to Apache Thrift, Manning Publications Co.
* Slide source: Randy Abernethy.
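The split between protocols (serialization) and transports (moving bytes) in the library layer can be illustrated with a small sketch. This is not the Thrift library: `MemoryTransport`, `JsonPointProtocol`, and `BinaryPointProtocol` are hypothetical names, and the "struct" being serialized is just a pair of integers. The point is that the user code in `round_trip` stays the same whichever protocol is plugged in.

```python
import io
import json
import struct


class MemoryTransport:
    """Transport layer: moves bytes, here into an in-memory buffer."""

    def __init__(self):
        self._buffer = io.BytesIO()

    def write(self, data):
        self._buffer.write(data)

    def getvalue(self):
        return self._buffer.getvalue()


class JsonPointProtocol:
    """Protocol layer, text flavour: an (x, y) pair encoded as JSON."""

    def encode(self, x, y):
        return json.dumps({"x": x, "y": y}).encode("utf-8")

    def decode(self, data):
        obj = json.loads(data.decode("utf-8"))
        return obj["x"], obj["y"]


class BinaryPointProtocol:
    """Protocol layer, binary flavour: two big-endian 32-bit signed ints."""

    def encode(self, x, y):
        return struct.pack(">ii", x, y)

    def decode(self, data):
        return struct.unpack(">ii", data)


def round_trip(protocol, x, y):
    """User code: identical regardless of the protocol that is plugged in."""
    transport = MemoryTransport()
    transport.write(protocol.encode(x, y))
    return tuple(protocol.decode(transport.getvalue()))


json_result = round_trip(JsonPointProtocol(), 3, -7)
binary_result = round_trip(BinaryPointProtocol(), 3, -7)
```

Swapping `MemoryTransport` for a socket- or file-backed class would change where the bytes go without touching either protocol, which is the modularity the slide describes.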
9. — The Thrift framework was originally developed at Facebook and released as open source in 2007. The project became an Apache Software Foundation incubator project in 2008, after which four early versions were released.
— 0.9.1 released 2013-07-16
— 0.9.2 planned 2014-06-01
— 1.0.0 planned 2015-01-01
Thrift Roadmap
"It is difficult to make predictions, particularly about the future." -- Mark Twain
Open Source
Community Developed
Apache License Version 2.0
** Figure source: The Programmer’s Guide to Apache Thrift, Manning Publications Co.
* Slide source: Randy Abernethy.
10. — Web
— thrift.apache.org
— github.com/apache/thrift
— Mail
— Users: user-subscribe@thrift.apache.org
— Developers: dev-subscribe@thrift.apache.org
— Chat
— #thrift
— Book
— Abernethy (2014), The Programmer’s Guide to Apache Thrift, Manning Publications Co. [http://www.manning.com/abernethy/]
Thrift Resources
Chapter 1 is free
* Slide source: Randy Abernethy.
13. Airavata API
[Figure: Apache Airavata architecture diagram. Labels recovered from the figure:]
— API capabilities: Security; Execute & Manage Computations; Launch & Manage Jobs; Data Ingest & Discovery; Data Movement; Save/Load configurations and provenance data; Notify progress of job or workflow execution; Real-Time Monitoring
— Subsystems: Data & Provenance Subsystem; Job & Workflow Subsystem; Messaging Subsystem; Information Subsystem
— Gateways: CIPRES, Neuro Science, Ultrascan, BioVLAB, GAAMP, ParamChem
— Computing resources: Academic & Commercial Clouds; International Grids
18. Airavata API in a nutshell
— Airavata provides an API to science gateway developers
— Science gateways => diverse requirements
— Airavata API needs to be
— Easy to understand and simple
— Generic enough to support different gateways
— Smooth to integrate with existing science gateways
— Flexible enough to accommodate changes arising from the different requirements of science gateways
20. Road to Airavata 1.0
— First Approach - SOAP / WS
— Pros
— Ability to separate the context from the payload
— Proven tools and techniques (XML, WSDL, WS-Security, etc.)
— Clients can easily generate stubs from WSDLs
— Cons
— Heavyweight
— Data schema changes cannot be handled easily
— Could not support a broad range of clients
21. Contd.. Road to Airavata 1.0
— Second Approach - REST
— Pros
— Flexible data representation (JSON or XML)
— Lightweight
— Better performance
— Cons
— Multiple object layers
— No standard way to describe the service to the client
22. Key issues with earlier Airavata API
— Only Java clients were developed
— Users had to write their own clients for other languages
— API design was a hybrid model
— This made it more complicated internally
— Complex data objects are overwhelming to work with in REST
— A lot of marshalling / unmarshalling of data objects
— Unnecessary need for a servlet container
— Overwhelming number of methods, many of them overloaded
23. Contd.. Road to Airavata 1.0
— Third Approach - RPC-style tools (Apache Thrift, Protocol Buffers, Apache Avro)
— Ability to communicate easily across different programming languages
— Easy way to generate server and client code
24. Why Apache Thrift
— Apache project
— For Airavata users, a client library that can consume the API is more important than an open API like REST
— Lightweight framework
— No multiple dependencies or servlet containers
— Easy learning curve to get started
— If carefully crafted, the IDLs and framework support backward and forward compatibility
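The compatibility point rests on fields being identified on the wire by numeric IDs, so that a reader can skip IDs it does not know and fall back to defaults for IDs that are absent. The sketch below is a simplified illustration of that idea, not the Thrift wire format: every value is a 32-bit integer, an ID of 0 marks the end of the struct, and the field numbers and their meanings are hypothetical. Real Thrift protocols also carry a type tag so that unknown fields of any type can be skipped.

```python
import struct


def encode_fields(fields):
    """Encode {field_id: int_value} as (id: u16, value: i32) pairs.

    A trailing id of 0 marks the end of the struct (a simplification).
    """
    out = b""
    for field_id, value in sorted(fields.items()):
        out += struct.pack(">Hi", field_id, value)
    return out + struct.pack(">H", 0)


def decode_fields(data, known_ids, defaults):
    """Decode, skipping unknown ids and defaulting missing ones."""
    result = dict(defaults)
    offset = 0
    while True:
        (field_id,) = struct.unpack_from(">H", data, offset)
        offset += 2
        if field_id == 0:
            break
        (value,) = struct.unpack_from(">i", data, offset)
        offset += 4
        if field_id in known_ids:
            result[field_id] = value
        # Unknown ids fall through: the value was read and is simply dropped.
    return result


# Hypothetical schema versions: v2 adds field 3 with a default of 1.
V1_IDS = {1, 2}
V2_IDS = {1, 2, 3}
V2_DEFAULTS = {3: 1}

new_message = encode_fields({1: 16, 2: 30, 3: 4})  # written by a v2 peer
old_message = encode_fields({1: 8, 2: 15})         # written by a v1 peer

old_reader_view = decode_fields(new_message, V1_IDS, {})           # forward compatible
new_reader_view = decode_fields(old_message, V2_IDS, V2_DEFAULTS)  # backward compatible
```

The "carefully crafted" caveat on the slide corresponds to rules this sketch takes for granted: existing field IDs are never reused or renumbered, and newly added fields have sensible defaults.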
25. Clean way to define IDLs with richer data structures
26. Apache Thrift Integration with Airavata
● The Airavata API server and Orchestrator are Thrift services at the moment
● Other components will also become Thrift services in the future
28. Experience: What we gain
— Able to support clients written in different languages
— Clean design
— A choice of server implementations
— TSimpleServer - simple single-threaded server
— TThreadPoolServer - uses Java's built-in thread pool management
— TNonblockingServer - non-blocking TServer implementation
— THsHaServer - extension of TNonblockingServer to a Half-Sync/Half-Async server
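The difference between the first two server styles is the threading model. The sketch below uses only the Python standard library, not Thrift's server classes: `handle`, `serve_simple`, and `serve_thread_pool` are hypothetical names, and requests are plain strings rather than RPC calls. It contrasts handling requests one after another on a single thread with handing them to a fixed pool of workers. The non-blocking and half-sync/half-async variants are not covered.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def handle(request):
    """Hypothetical handler: returns the reply and the serving thread's name."""
    return request.upper(), threading.current_thread().name


def serve_simple(requests):
    """Single-threaded style: one thread handles requests sequentially."""
    return [handle(r) for r in requests]


def serve_thread_pool(requests, workers=4):
    """Thread-pool style: a fixed pool of worker threads handles requests."""
    with ThreadPoolExecutor(max_workers=workers,
                            thread_name_prefix="worker") as pool:
        # map() preserves request order in its results.
        return list(pool.map(handle, requests))


requests = ["ping", "status", "submit"]
simple_replies = serve_simple(requests)
pooled_replies = serve_thread_pool(requests)
```

Both functions return the same replies; only the thread that produced each reply differs, which is why the server type can be chosen late without changing handler code.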
29. Experience Contd.. What we gain
— No need for marshalling / unmarshalling - the data models are used internally
— Server is very robust - able to handle a large number of concurrent requests
— Easy to modify the models, since the code is auto-generated
— Convenient way to achieve backward compatibility
30. Lessons Learned
— Modularize the data models to ease maintenance
— No Maven plugin at the moment, hence you have to commit auto-generated code.
32. Lessons Learned Contd..
— Thrift has limited support for handling null values.
— The Experiment model is a complex model built from a lot of other structs.
— Enums cannot be passed as null over the wire, even if they are specified as optional in the struct
— Thrift has limited documentation and samples
— Airavata can be used as a reference
— https://github.com/apache/airavata
33. Future Directions
— March towards Airavata 1.0 with a stable API
— Exploring next-generation information models and messaging systems
— Kafka, Flume, Fluentd, Scribe, Hedwig, …
— Refactor the Registry implementation
— Airavata architecture to support multi-tenanted PaaS
— Motivated by a downstream open source project, SciGaP – Science Gateways Platform as a Service
34. How you can participate/follow
— Join the party - Architecture mailing list
— BYOX
— X = opinions, software, ideas, code, yourself ..
35. Take Home
— 2 ebooks on Thrift
— Awareness of Thrift and Airavata
— Lessons learned
— How to get involved
— We are under the same umbrella for a reason; let's party together
— Join our Architecture List and help
— Other means??