This document discusses how PyPy makes Python code fast while maintaining its simplicity and readability. PyPy is a just-in-time compiling implementation of Python that can achieve speeds up to 1000x faster than CPython in some cases. While high-performance Python libraries and rewriting code in C can improve performance, they lose Python's simplicity and elegance. PyPy avoids these issues by making Pythonic code fast without compromising the language's qualities. The document also addresses some current limitations of PyPy like its lag behind CPython and issues with C extensions but outlines plans to improve the situation going forward.
7. Workaround #1: Using high-performance libraries
8. Good
Feels like Python
Also includes domain knowledge
Actually the default option
Bad
Somewhat slow
Limited customisation options
Can only use what's implemented
9. Workaround #2: Rewrite in C
Cython
C + cffi
Write a C extension 😱
Numba
...
10. Good
Fast
Developer has full control
Bad
Different, less convenient, language
Unoptimised parts can become significant again
Python/C boundary is a bottleneck
11. C implementations cause bad APIs
In [ ]: import numpy
        class myarray(numpy.ndarray):
            pass
In [ ]: import pandas
        help(pandas.read_csv)
13. PyPy
Fast and compliant implementation of Python
PyPy v7.2 released on 14 October
Supports 2.7 and 3.6
JIT compiler
GC tuned for Python
14. PyPy - Demo
In [2]: %%time
        class Quantity:
            def __init__(self, value, unit):
                self.value = value
                self.unit = unit

            def __add__(self, other):
                if isinstance(other, Quantity):
                    if other.unit != self.unit:
                        raise ValueError("units must match")
                    else:
                        return Quantity(self.value + other.value, self.unit)
                else:
                    return NotImplemented

            def __str__(self):
                return f"{self.value} {self.unit}"

        def compute(n):
            total = Quantity(0, 'm')
            increment = Quantity(1., 'm')
            for i in range(n):
                total += increment
            return total

        N = 1_000_000_000
        print(compute(N))
3.6.9 (default, Jul 3 2019, 15:36:16)
[GCC 5.4.0 20160609]
1000000000.0 m
CPU times: user 29min 4s, sys: 112 ms, total: 29min 4s
Wall time: 29min 4s
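To compare interpreters on this demo, the same file can be run under both CPython and PyPy. The following is a minimal sketch of such a harness, not the presenter's setup: the `timed` helper and the reduced iteration count are assumptions made so that it finishes quickly, and no timings are claimed here.

```python
import platform
import time


class Quantity:
    """Same toy class as in the demo: a value tagged with a unit."""

    def __init__(self, value, unit):
        self.value = value
        self.unit = unit

    def __add__(self, other):
        if not isinstance(other, Quantity):
            return NotImplemented
        if other.unit != self.unit:
            raise ValueError("units must match")
        return Quantity(self.value + other.value, self.unit)

    def __str__(self):
        return f"{self.value} {self.unit}"


def compute(n):
    total = Quantity(0, "m")
    increment = Quantity(1.0, "m")
    for _ in range(n):
        total += increment
    return total


def timed(func, *args):
    """Hypothetical helper: return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


# Assumption: N is far smaller than the slide's 1_000_000_000 so the
# sketch completes in well under a second on either interpreter.
N = 200_000
result, elapsed = timed(compute, N)
print(platform.python_implementation(), platform.python_version())
print(f"{result} in {elapsed:.3f} s")
```

Running the file once with `python3` and once with `pypy3` prints the interpreter name next to the elapsed time, which makes the two runs easy to compare.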
15. How fast is PyPy?
It depends
Up to 1000x faster than CPython in extreme cases
Up to 50x in more realistic cases
2-5x on "typical" applications
Can be slower (short scripts, tests, ...)
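One likely reason short scripts and tests can be slower is that a JIT has to observe the code running before it compiles it. A rough way to see this is to time the same workload several times in one process. The sketch below assumes a made-up `work` function and batch sizes; under a JIT the first batch would typically be the slowest, while under CPython the batches should be roughly equal.

```python
import time


def work(n):
    """Hypothetical pure-Python workload: a simple numeric loop."""
    total = 0.0
    for i in range(n):
        total += i * 0.5
    return total


def batch_timings(func, n, repeats):
    """Call func(n) `repeats` times and return each call's wall time."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(n)
        timings.append(time.perf_counter() - start)
    return timings


# Assumption: 5 batches of 100_000 iterations, chosen only to keep this quick.
timings = batch_timings(work, 100_000, 5)
for index, seconds in enumerate(timings):
    print(f"batch {index}: {seconds * 1000:.2f} ms")
```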
16. C extension support
cpyext = "Python.h for PyPy"
numpy, scipy, pandas, scikit-learn, lxml, ...
Cython + most extensions written in Cython
'pip install' works
Wheels available at https://github.com/antocuni/pypy-wheels
18. PyPy issues
Lags behind CPython
Implementation details are different
Need to understand GC and JIT to get best performance
No conda
Small community
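One well-known implementation difference is object lifetime: CPython's reference counting usually frees an object as soon as the last reference goes away, whereas PyPy's garbage collector may do so later. Code that releases resources explicitly behaves the same on both. A minimal sketch, using a hypothetical `Resource` class:

```python
class Resource:
    """Hypothetical resource that must be released deterministically."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Release explicitly instead of relying on __del__ timing,
        # which differs between refcounting and a tracing GC.
        self.close()
        return False


def use_resource():
    with Resource() as res:
        pass  # work with the resource here
    return res


res = use_resource()
print(res.closed)
```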
20. HPy
New C API
No refcounting
Replace PyObject* with opaque handles
Don't expose implementation details
https://github.com/pyhandle/hpy
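The idea of opaque handles can be illustrated with a toy model in Python. This is not HPy's API; `HandleTable` and its methods are invented for illustration. The point is that extension code only ever sees small integer handles, so the interpreter stays free to move or represent objects however it likes.

```python
class HandleTable:
    """Toy model of opaque handles (hypothetical, not HPy's real API)."""

    def __init__(self):
        self._objects = {}
        self._next_handle = 1

    def new(self, obj):
        """Hand out a fresh handle that refers to obj."""
        handle = self._next_handle
        self._next_handle += 1
        self._objects[handle] = obj
        return handle

    def dup(self, handle):
        """Return a second, independent handle to the same object."""
        return self.new(self._objects[handle])

    def deref(self, handle):
        """Interpreter-side lookup; extension code would not call this."""
        return self._objects[handle]

    def close(self, handle):
        """Invalidate one handle; other handles to the object stay valid."""
        del self._objects[handle]


table = HandleTable()
h1 = table.new([1, 2])
h2 = table.dup(h1)
table.close(h1)
print(table.deref(h2))
```

Each handle is closed exactly once by whoever received it, so no shared reference count is exposed to the extension.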
21. HPy: plans
In one year
CPython: implemented on top of existing API
PyPy: much more efficient than cpyext
Used by Cython
Long-term
Replaces current C-API
Used by all relevant implementations
CPython free to experiment and modify internals
22. Summary
PyPy makes pure-Python fast
Compatible with PyData ecosystem
Computing in Python becomes viable
Complements traditional approaches to performance