1. Advantages and Disadvantages of Using Python’s Asynchronous Frameworks for Web Services
Ryan C. Johnson
https://www.linkedin.com/in/ryanjohnson5
2. Key Asynchronous Concepts
● The best introduction to the asynchronous model is by my friend and former
colleague Dave Peticolas (highly recommended):
http://krondo.com/an-introduction-to-asynchronous-programming-and-twisted/
● “The fundamental idea behind the asynchronous model is that an
asynchronous program, when faced with a task that would normally block in a
synchronous program, will instead execute some other task that can still make
progress.”
3. Key Asynchronous Concepts (cont.)
● Event-based
● Locus-of-control is the event loop
● I/O calls are non-blocking (rely on events to signal when data is
ready)
● Conditions for async outperforming sync (Dave Peticolas):
○ “The tasks perform lots of I/O, causing a synchronous program to
waste lots of time blocking when other tasks could be running.”
○ “There are a large number of tasks so there is likely always at least
one task that can make progress.”
○ “The tasks are largely independent from one another so there is little
need for inter-task communication (and thus for one task to wait upon
another).”
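The three conditions above can be illustrated with a minimal sketch on asyncio's event loop (Python 3.7+ is assumed for asyncio.run). Here fake_io_call is a hypothetical stand-in for a database or HTTP call, simulated with asyncio.sleep; the task count and delay are arbitrary choices for illustration.

```python
import asyncio
import time


async def fake_io_call(task_id, delay=0.05):
    # Hypothetical stand-in for a non-blocking database or HTTP call.
    # Awaiting hands control back to the event loop until the "data" is ready.
    await asyncio.sleep(delay)
    return task_id


async def run_tasks(count=100):
    # Many independent, I/O-bound tasks: while one waits, others make progress.
    return await asyncio.gather(*(fake_io_call(i) for i in range(count)))


def main():
    start = time.perf_counter()
    results = asyncio.run(run_tasks())
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = main()
    print("completed {} tasks in {:.2f}s".format(len(results), elapsed))
```

Run one after another, 100 calls of 0.05 s each would take about 5 s. Because the waits overlap on a single event loop, the total here should be much closer to the duration of a single call.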
5. The Typical Web Service Provides Ideal Conditions
● HTTP requests are handled independently from each other
● The handling of HTTP requests typically involves one or more I/O calls to
one or more databases or other web services, and in most real-world
cases, the majority of time is spent waiting on these I/O calls
● The service is constantly accepting requests or should be available for
accepting requests
6. Key advantages
● Efficiency
○ Handle an equivalent number of requests with fewer/smaller servers
compared to sync
○ Scalability limited by the number of open socket connections within a
single process vs. the number of concurrent threads/processes for
sync web frameworks (thousands to tens-of-thousands for async vs.
tens to hundreds for sync)
○ A small server (in terms of CPU and memory) running an async web
service in a single process will match and often outperform a larger
server running a sync web service using tens to hundreds of
threads/processes
7. Key advantages (cont.)
● Able to handle large numbers of concurrent, long-lived requests
○ Only burn a socket, not a thread/process
○ This can be the determining factor in choosing async over sync
○ Allows efficient “push” functionality via web sockets, EventSource or
other long-lived connections
○ Gavin Roy at PyCon 2012 (then CTO of MyYearBook.com): “We do
more traffic and volume through this [Tornado] than the rest of our site
infrastructure combined...8 servers as opposed to 400-500.”
(http://pyvideo.org/video/720/more-than-just-a-pretty-web-framework-the-tornad)
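The "only burn a socket, not a thread/process" point can be sketched with asyncio (Python 3.7+ assumed). In this sketch long_lived_connection is a hypothetical stand-in for an open web socket or EventSource stream, and the count of 5000 is an arbitrary illustration, not a benchmark.

```python
import asyncio
import threading


async def long_lived_connection(conn_id, release):
    # Hypothetical stand-in for an open web socket or EventSource stream:
    # the coroutine parks on the event loop until there is something to push.
    await release.wait()
    return conn_id


async def hold_connections(count=5000):
    threads_before = threading.active_count()
    release = asyncio.Event()
    tasks = [asyncio.ensure_future(long_lived_connection(i, release))
             for i in range(count)]
    await asyncio.sleep(0)  # let every "connection" start and begin waiting
    extra_threads = threading.active_count() - threads_before
    release.set()  # "push" to all waiting connections at once
    served = await asyncio.gather(*tasks)
    return len(served), extra_threads


if __name__ == "__main__":
    served, extra_threads = asyncio.run(hold_connections())
    print("served={} extra_threads={}".format(served, extra_threads))
```

Each waiting "connection" costs a small coroutine object rather than an operating-system thread, which is why the number of threads does not grow with the number of connections held open.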
8. Key disadvantage
A single async process has a more complex model for thinking about
shared state and how it can change than a single sync process
● Must keep in mind that shared state can change between the moments of
yielding control to the event loop and returning control back to your code
9. Simple example
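from twisted.internet.defer import inlineCallbacks

shared = []

@inlineCallbacks
def get(self, id):
    shared.append(id)
    print('pre yield for get({id}): shared={shared}'.format(id=id, shared=shared))
    # Control returns to the event loop here; async_get_from_db is assumed
    # to return a Deferred
    obj = yield async_get_from_db(id)
    print('post yield for get({id}): shared={shared}'.format(id=id, shared=shared))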
Possible sequence of events:
1. An incoming GET request is handled, calling get with id=1
2. Print pre yield for get(1): shared=[1]
3. Yield to the event loop after calling async_get_from_db(1)
4. While waiting for the result of async_get_from_db(1), the event loop handles the next request, calling get
with id=2
5. Print pre yield for get(2): shared=[1, 2]
6. Yield to the event loop after calling async_get_from_db(2)
7. While waiting for the result of async_get_from_db(2), the event loop sees that the result from
async_get_from_db(1) is ready, and returns control back to the “paused” execution of get(1)
8. Print post yield for get(1): shared=[1, 2] ← within the call to get(1), the shared state has
changed between the yield to the event loop and the return of the result
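The sequence of events above can be reproduced with a self-contained sketch that uses asyncio in place of Twisted (Python 3.7+ assumed). Here async_get_from_db is a hypothetical stand-in that only simulates latency, and the messages are collected in a list called log (an addition for this sketch) so that the ordering can be inspected.

```python
import asyncio

shared = []
log = []


async def async_get_from_db(id):
    # Hypothetical stand-in for the database call; it only simulates I/O latency.
    await asyncio.sleep(0.01)
    return {"id": id}


async def get(id):
    shared.append(id)
    log.append("pre yield for get({}): shared={}".format(id, shared))
    obj = await async_get_from_db(id)  # control returns to the event loop here
    log.append("post yield for get({}): shared={}".format(id, shared))
    return obj


async def handle_two_requests():
    # Two "requests" arrive back to back and are handled concurrently.
    await asyncio.gather(get(1), get(2))


def run_demo():
    # Reset the module-level state so the demo can be run repeatedly.
    del shared[:]
    del log[:]
    asyncio.run(handle_two_requests())
    return list(log)


if __name__ == "__main__":
    print("\n".join(run_demo()))
```

When get(1) resumes after its await, shared already contains the id appended by get(2), which is the state change described in step 8.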
10. Asynchronous Frameworks
● Implicit (yields to the event loop occur implicitly when an I/O call is made)
○ gevent
● Explicit (yields to the event loop controlled by the programmer)
○ Twisted
■ Cyclone (Tornado API on Twisted’s event loop)
■ Klein (Flask-like API on Twisted Web)
■ Tornado (in Twisted-compatibility mode)
○ Tornado
○ asyncio (Python 3.4+)
■ aiohttp
■ Tornado (running on the asyncio event loop)
11. Implicit Style - Advantages
● Coding syntax and style are the same as for synchronous code (when an I/O call is
made, control implicitly returns to the event loop to work on another request or event)
● Compatible with the huge ecosystem of popular synchronous Python
packages that perform I/O (e.g., SQLAlchemy)
○ This is a huge advantage over the explicit style
○ Assumes that the socket module is used for I/O, so when it is
monkey-patched (using gevent), you will no longer block on I/O calls but
instead yield control to the event loop
○ Python packages that perform I/O but don’t use the socket module can
still be used, but they will block on I/O
12. Implicit Style - Disadvantages
● Lack of explicit yielding syntax fails to indicate the points in the code where
shared state may change:
https://glyph.twistedmatrix.com/2014/02/unyielding.html
● Lack of control over yielding to the event loop makes it impossible to launch
multiple I/O calls before yielding (independent I/O tasks cannot be launched
in parallel)
○ In my opinion, this is the biggest disadvantage, but only if multiple and
independent I/O tasks could be performed
○ For example, the following is impossible to do using the implicit style:
@inlineCallbacks
def get(...):
    # Both I/O calls are launched before the first yield
    deferred1 = io_call_1(...)
    deferred2 = io_call_2(...)
    result1 = yield deferred1
    result2 = yield deferred2
13. Explicit Style - Advantages
● Explicit yielding syntax indicates points at which shared state may change
● Complete control over yielding to the event loop makes it possible to launch
multiple I/O calls before yielding (parallelism of independent I/O tasks)
○ In my opinion, this is the biggest advantage, but only if multiple and
independent I/O tasks could be performed
○ For example:
@inlineCallbacks
def get(...):
    # Both I/O calls are launched before the first yield
    deferred1 = io_call_1(...)
    deferred2 = io_call_2(...)
    result1 = yield deferred1
    result2 = yield deferred2
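The same launch-then-wait pattern can be sketched in a runnable form with asyncio (Python 3.7+ assumed) instead of Twisted. Here io_call is a hypothetical I/O call simulated with asyncio.sleep, and the events list is an addition for this sketch that records when each call starts and finishes.

```python
import asyncio

events = []


async def io_call(name, delay):
    # Hypothetical I/O call; asyncio.sleep simulates waiting on a remote service.
    events.append("start " + name)
    await asyncio.sleep(delay)
    events.append("done " + name)
    return name


async def get():
    # Launch both calls first; create_task schedules them on the event loop.
    task1 = asyncio.create_task(io_call("io_call_1", 0.02))
    task2 = asyncio.create_task(io_call("io_call_2", 0.01))
    # Only now yield to the event loop; both calls are in flight while we wait.
    result1 = await task1
    result2 = await task2
    return result1, result2


def run_demo():
    # Reset the recorded events so the demo can be run repeatedly.
    del events[:]
    results = asyncio.run(get())
    return results, list(events)


if __name__ == "__main__":
    results, recorded = run_demo()
    print(results)
    print(recorded)
```

Both "start" events are recorded before either "done" event, so the total wait should be close to the slower of the two calls rather than their sum.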
14. Explicit Style - Disadvantages
● Different syntax and coding style than synchronous code
● The ecosystem of usable Python packages is much smaller and sometimes less
mature
○ This is a huge disadvantage
○ For example, it precludes the use of SQLAlchemy
● Without either the async and await keywords of Python 3.5+ or a
generator-style decorator like Twisted’s @inlineCallbacks or
Tornado’s @gen.coroutine, any significant amount of code becomes an
unreadable mess of callbacks
15. Generator-Based Decorators are Essential
def func():
    deferred = io_call()

    def on_result(result):
        print(result)

    deferred.addCallback(on_result)
    return deferred
becomes readable and very similar to the synchronous style:
@inlineCallbacks
def func():
    result = yield io_call()
    print(result)
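For comparison, here is a sketch of the same contrast using asyncio and the async and await keywords (Python 3.7+ assumed for asyncio.run). Here io_call is a hypothetical I/O call, and results are appended to a list called output (an addition for this sketch) instead of being printed.

```python
import asyncio


async def io_call():
    # Hypothetical I/O call that resolves after a short simulated wait.
    await asyncio.sleep(0.01)
    return "result"


def func_with_callback(output):
    # Callback style: the continuation lives in a separate function.
    task = asyncio.ensure_future(io_call())

    def on_result(done_task):
        output.append(done_task.result())

    task.add_done_callback(on_result)
    return task


async def func_with_await(output):
    # Coroutine style: reads top to bottom, like the synchronous version.
    result = await io_call()
    output.append(result)


async def run_both():
    output = []
    await func_with_callback(output)
    await asyncio.sleep(0)  # give the done-callback a chance to run
    await func_with_await(output)
    return output


if __name__ == "__main__":
    print(asyncio.run(run_both()))
```

The coroutine version keeps the result handling in the same function body, which is the readability gain that the generator-based decorators provide for Twisted and Tornado.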
16. Conclusions
● Large (10x to 100x) performance/efficiency gains for I/O bound web
services
● Implicit style recommended for most cases (assuming the monkey-patch of
the socket module does the trick), as there is very little required to reap
the benefits (for example, simple configuration of uWSGI or gunicorn to
use gevent)
● Explicit style only recommended if the gains from launching multiple I/O
tasks in parallel outweigh the losses due to a smaller and sometimes
less-mature ecosystem of Python packages and a more complex coding
style
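As a rough illustration of the "simple configuration" mentioned for the implicit style, a gunicorn configuration file might look like the sketch below. worker_class, workers, and worker_connections are gunicorn settings; the values are illustrative assumptions, and gevent has to be installed separately.

```python
# gunicorn.conf.py (values are illustrative assumptions, not recommendations)

# Use gunicorn's gevent worker so that socket I/O in the application
# yields to gevent's event loop instead of blocking the worker.
worker_class = "gevent"

# Number of worker processes; each one runs its own event loop.
workers = 2

# Maximum number of simultaneous clients handled by each gevent worker.
worker_connections = 1000
```

The file would be passed to gunicorn with its -c option, for example gunicorn -c gunicorn.conf.py myapp:app, where myapp:app is a hypothetical WSGI application.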