By Misha Tselman
PyData New York City 2017
Athena is the largest Python-based ecosystem in J.P. Morgan Chase and among the largest in the world. Maintaining consistency and stability while enabling creativity, research, and speed of development is critical for supporting our clients and staying ahead. Can we fit everything into the same platform?
Managing Python at scale without breaking the bank
1. Managing Python at scale without breaking the bank
Michael (Misha) Tselman
PyData NY 2017
2. Agenda
• J.P. Morgan and Athena
• Objectives
• Continuous delivery
• Under the hood
• Challenges
• Conclusions
• Q&A
3. J.P. Morgan
• One of the world’s biggest banks
• $2.5 trillion assets
• $95 billion revenue
• Processing $5 trillion payments every day
• 230,000+ employees globally
• One of the world’s biggest tech companies
• 44,000+ employees in Technology
• $9.5 billion annual investment in technology and innovation
4. Athena
• Python-based Pricing, Trading, Risk Management, and Analytics platform with tools for Data Science and Machine Learning
• Thousands of users across multiple business lines
• 1500+ Python developers use and contribute to the platform
• 150,000 Python modules, 35 million lines of Python code
• 500+ open source Python packages
• Rapid development and deployment model that puts developers and quants at the heart of the business
5. Athena Foundation
• Hydra (globally replicated object database)
• Reactive Athena (C++/Python reactive dataflow framework)
• Pixie Graph (directed acyclic dependency graph)
• Athena Application framework based on Qt
• Athena Web (Tornado, HTML5, WebSockets, JavaScript, WebAssembly)
• Job scheduling (~270,000 jobs daily, which kick off ~1M processes)
• Integration with Compute Grid (tens of thousands of cores + GPUs)
6. Objectives
• Keep end-users and clients happy
• Ensure robustness and stability of our production systems
• Keep developers productive and efficient
• Provide quants and data scientists with the best research tools
• Encourage sharing and global consistency across business lines
7. Approach
• Conceptually:
  • Continuous delivery: 10,000-15,000 production changes every week
  • Full visibility of the entire code base. Anyone can contribute.
  • Instant global deployment
• Under the hood:
  • Globally replicated object databases for code (and data)
  • Monorepo (monolithic code base)
  • Extensive automated testing
8. Continuous delivery
Write code & tests → Test → Commit → Ask for a bless → Push → Run
PROS
• Time to market
• User satisfaction
• Developer productivity
CONS
• Fear of change / stability
• High reliance on automation
• Tricky in distributed systems
10,000 - 15,000 modules pushed to production every week
10. Alternatives to filesystem-based source
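Replicated databases: DB-LDN, DB-NYC, DB-TKO
Keys (module name, optionally with a timestamp) and the source stored under them:
"lib.foo":
    def hello():
        print('world')
"lib.bar":
    def hello():
        print('pydata')
"lib.bar @ 2017-10-01 12:33":
    def hello():
        print('jpmorgan')
"lib.bar @ 2017-09-21 10:16":
    ...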
• Use globally replicated database
• Customize the importer
• SourceMarkers - Take advantage of transactions & timestamps
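The three bullets above can be sketched with the standard `importlib` hooks. This is a minimal illustration and not Athena's implementation: `SourceStore`, `StoreImporter`, and `import_as_of` are hypothetical names, an in-memory dictionary stands in for the replicated database, the marker is modelled as a plain timestamp, and the second commit time is invented for the example. It is written for Python 3.

```python
import bisect
import importlib
import importlib.abc
import importlib.util
import sys
from datetime import datetime


class SourceStore:
    """In-memory stand-in for one replicated source database (hypothetical)."""

    def __init__(self):
        self._versions = {}  # module name -> sorted [(commit time, source)]

    def commit(self, name, when, source):
        bisect.insort(self._versions.setdefault(name, []), (when, source))

    def has_package(self, name):
        return any(key.startswith(name + ".") for key in self._versions)

    def source_as_of(self, name, marker=None):
        """Latest source committed at or before `marker` (None = newest)."""
        versions = self._versions.get(name, [])
        if marker is not None:
            times = [when for when, _ in versions]
            versions = versions[:bisect.bisect_right(times, marker)]
        return versions[-1][1] if versions else None


class StoreImporter(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Finder and loader that serve modules from a SourceStore."""

    def __init__(self, store, marker=None):
        self.store = store
        self.marker = marker

    def find_spec(self, fullname, path=None, target=None):
        if self.store.source_as_of(fullname, self.marker) is not None:
            return importlib.util.spec_from_loader(fullname, self)
        if self.store.has_package(fullname):
            # "lib" itself has no source; expose it as an empty package.
            return importlib.util.spec_from_loader(fullname, self, is_package=True)
        return None  # not in the store: fall through to the other finders

    def exec_module(self, module):
        source = self.store.source_as_of(module.__name__, self.marker)
        if source is not None:
            code = compile(source, "<store:%s>" % module.__name__, "exec")
            exec(code, module.__dict__)


def import_as_of(store, name, marker=None):
    """Import `name` fresh from the store, as of `marker` (None = newest)."""
    importer = StoreImporter(store, marker)
    sys.meta_path.insert(0, importer)
    try:
        # Drop cached copies so the requested version is really loaded.
        stale = [m for m in sys.modules if m == name or name.startswith(m + ".")]
        for cached in stale:
            del sys.modules[cached]
        return importlib.import_module(name)
    finally:
        sys.meta_path.remove(importer)


store = SourceStore()
store.commit("lib.bar", datetime(2017, 10, 1, 12, 33),
             "def hello():\n    return 'jpmorgan'\n")
store.commit("lib.bar", datetime(2017, 11, 1, 9, 0),  # invented commit time
             "def hello():\n    return 'pydata'\n")

print(import_as_of(store, "lib.bar").hello())                          # newest
print(import_as_of(store, "lib.bar", datetime(2017, 10, 15)).hello())  # pinned
```

Pinning the marker to a timestamp makes the import resolve against the code as it stood at that moment, which is one way a transactional store could give every process a consistent view of the code base.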
11. Python and Binary Runtime
[Diagram: a Python Source layer above a C++ & 3rd party layer with three releases: prod.old, prod, prod.new]
12. Some Challenges
• Open source package upgrades
  • API changes
  • Change of pickled/stored representation
  • Numerical changes
  • Runtime/binary dependencies
• Limited branching
  • Streamlines production
  • Does not fit some research/experimental workflows
• Full reproducibility requires “freezing” all code including the binary train
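To make the "change of pickled/stored representation" challenge concrete, here is a small sketch of one common mitigation: writing a version number into the pickled state and upgrading old state in `__setstate__`. The `Position` class and its fields are hypothetical, and the talk does not say how Athena handles this. The class is deliberately defined twice under the same name to simulate an old and a new release.

```python
import pickle


class Position:
    """Old release: stores a single price (hypothetical layout)."""

    def __init__(self, symbol, price):
        self.symbol = symbol
        self.price = price


# Bytes written by the old release and kept in storage.
stored = pickle.dumps(Position("XYZ", 101.5))


class Position:
    """New release: same class name, but the layout now has bid and ask."""

    STATE_VERSION = 2

    def __init__(self, symbol, bid, ask):
        self.symbol = symbol
        self.bid = bid
        self.ask = ask

    def __getstate__(self):
        state = dict(self.__dict__)
        state["_version"] = self.STATE_VERSION  # tag everything we write
        return state

    def __setstate__(self, state):
        state = dict(state)
        version = state.pop("_version", 1)  # untagged state is version 1
        if version == 1:
            # Upgrade: the old single price becomes both bid and ask.
            price = state.pop("price")
            state["bid"] = price
            state["ask"] = price
        self.__dict__.update(state)


restored = pickle.loads(stored)  # old bytes, new class
print(restored.symbol, restored.bid, restored.ask)
```

Without the upgrade step the old bytes would still load, but the resulting object would carry a `price` attribute and no `bid` or `ask`, so the failure would surface later and far from its cause.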
13. Conclusions
• Python’s flexibility makes things easier
• Good integration tests ensure compatibility and consistency
• Modules don’t have to be loaded from a filesystem
• Production stability does not imply slow delivery and deployment
• Open source does not imply free
• Shared platform does not imply shared knowledge
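As an illustration of the kind of integration test that can catch numerical changes after a package upgrade, the sketch below compares freshly computed values with a stored baseline under a tolerance. The function name, the metric names, and the tolerances are assumptions made for the example and do not come from the talk.

```python
import math


def compare_to_baseline(baseline, candidate, rel_tol=1e-9, abs_tol=1e-12):
    """Return {name: (expected, actual)} for values that drifted or are missing."""
    diffs = {}
    for name, expected in baseline.items():
        actual = candidate.get(name)
        if actual is None or not math.isclose(
            expected, actual, rel_tol=rel_tol, abs_tol=abs_tol
        ):
            diffs[name] = (expected, actual)
    return diffs


# Hypothetical results before and after upgrading a numerical library.
baseline = {"pv": 1250.75, "delta": 0.5}
candidate = {"pv": 1250.75 * (1 + 1e-12), "delta": 0.5001}

print(compare_to_baseline(baseline, candidate))
```

Here the tiny relative change in `pv` is accepted as floating-point noise, while the shift in `delta` is reported for a human to review.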
At the bottom is a more traditional release train of binaries and 3rd party packages. prod.new changes continuously until lockdown, which allows seamless testing of new features against the Python baseline.
The code is globally distributed and instantly available for import in any region for any user.
It is not just a repo, but a fully deployed codebase at the same time.