what’s eating python performance
Piotr Przymus
Nicolaus Copernicus University
1
about me
• Piotr Przymus PhD
• work @ Nicolaus Copernicus University in Toruń
• Interests: data mining and machine learning, databases,
GPGPU computing, high-performance computing.
• 9 years of Python experience.
2
introduction
3
introduction
Programmers waste enormous amounts of time thinking about, or
worrying about the speed of noncritical parts of their programs,
and these attempts at efficiency actually have a strong negative
impact when debugging and maintenance are considered. We
should forget about small efficiencies, say about 97% of the time:
premature optimisation is the root of all evil.
Donald Knuth, “Structured Programming With Go To
Statements”, 1974.
Yet we should not pass up our opportunities in that critical 3%.
4
premature optimisation
Premature optimisation may be stated as optimising code before
knowing whether we need to.
This may be bad as it impacts:
• your productivity,
• readability of the code,
• ease of maintenance and debugging,
• and it may contradict The Zen of Python ;).
Learn how to do proper assessment of your code in terms of
optimisation needs!
Remember that a strong feeling that your code falls into the
remaining 3% does not count!
5
think before doing (think before coding)
Going for higher performance without a deeper reason may be just
a waste of your time. So start with:
• stating your reasons (Why do you need higher performance?),
• defining your goals (What would be an acceptable speed of
your code?),
• estimating time and resources you are willing to spend to
achieve these goals.
Re-evaluate all the pros and cons.
6
why do you need higher performance?
Good reasons:
• Computation cost reduction
• Significantly better user experience
• Significantly faster results
7
what would be an acceptable speed of your code?
This is an important question, and a difficult one to answer!
• Computation cost reduction
  • Large projects with lots of computations
  • They may benefit from improvements of just a few percent.
• Significantly better user experience of a web/desktop
  application.
  • Note that user experience is subjective; the user may:
    • not notice the difference,
    • or not care about the change.
  • The User is Always Right
• Significantly faster results
  • Scientific computing, data mining, machine learning
  • Processing of large data sets
  • Example: going from weeks to one day makes a huge
    difference.
8
amdahl’s law
Amdahl’s law is used to find the maximum expected improvement
to an overall system when only part of the system is improved.
(wiki)
• Often used in parallel computing to predict the theoretical
maximum speedup.
• Assumes that the problem size remains the same!
Maximum expected improvement of a system, when only part of
the computation is improved:
$$\text{improvement} = \frac{1}{(1 - P) + \frac{P}{S}}$$
where:
• P is the proportion of improved computations,
• S is the improvement ratio.
9
amdahl’s law – example
Figure 1: Amdahl’s law example
10
amdahl’s law – example
If we improve:
• 30% of computations,
• so that they run twice as fast,
then P = 0.3 and S = 2, and the overall system improvement is
only
$$\frac{1}{(1 - 0.3) + \frac{0.3}{2}} = \frac{1}{0.85} \approx 1.1765.$$
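The arithmetic above can be checked with a few lines of Python. This is only a sketch of the formula; the function name amdahl_speedup is an illustrative choice, not something from the talk.

```python
def amdahl_speedup(p: float, s: float) -> float:
    """Overall speedup when a fraction p of the work runs s times faster."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a fraction between 0 and 1")
    if s <= 0:
        raise ValueError("s must be positive")
    return 1.0 / ((1.0 - p) + p / s)


# The example from the slide: 30% of the work made twice as fast.
print(round(amdahl_speedup(0.3, 2), 4))     # 1.1765

# Even a practically unbounded speedup of that 30% is capped near 1 / (1 - P).
print(round(amdahl_speedup(0.3, 1e12), 4))  # 1.4286
```

The second call shows why the improved fraction P matters more than the ratio S: with P = 0.3 the whole program can never get more than about 1.43 times faster.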
11
test, measure, track down bottlenecks
12
test, measure, track down bottlenecks
The starting point for optimisation is running code that gives
correct results.
• Prepare a regression test suite!
The rest of the optimisation process may then be summarized as:
1. Test if the code works correctly.
2. Measure execution time:
  • if the code is not fast enough, use a profiler to identify the
    bottlenecks,
  • else you're done!
3. Fix performance problems.
4. Start from the beginning.
13
regression test suite
Before you start, prepare a regression test suite that:
• will guard the correctness of your code during the
optimisation,
• is comprehensive yet quick to run.
Tests will be run very often – a reasonable execution time is a must!
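One possible shape for such a suite is to keep the original implementation as a trusted reference and compare every optimised candidate against it on small, quick inputs. The two functions below are hypothetical stand-ins, not code from the talk.

```python
def sum_of_squares_reference(n):
    """Straightforward version, trusted to be correct."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def sum_of_squares_fast(n):
    """Candidate optimisation using the closed-form formula."""
    if n <= 0:
        return 0
    return (n - 1) * n * (2 * n - 1) // 6


def check_regressions():
    """Compare the candidate against the reference on small, quick inputs."""
    for n in (0, 1, 2, 3, 10, 100, 1000):
        expected = sum_of_squares_reference(n)
        actual = sum_of_squares_fast(n)
        assert actual == expected, f"n={n}: {actual} != {expected}"
    return True


check_regressions()
```

Because the inputs are tiny, the check finishes in milliseconds and can be rerun after every change.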
14
measuring execution time
Measure the execution time of your code. This is important because:
• it shows whether you are making any progress,
• it shows how far you are from the desired execution time (a.k.a.
acceptable speed),
• it allows you to compare various versions of optimisations.
15
measuring execution time
There are various tools to do that, among them:
• a custom-made timer,
• Python's timeit module,
• Unix time (use /usr/bin/time, as time is also a common shell
built-in).
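A custom-made timer can be as small as a context manager around time.perf_counter. The sketch below is one way to write it; the name timer and the optional results dictionary are illustrative assumptions.

```python
import time
from contextlib import contextmanager


@contextmanager
def timer(label, results=None):
    """Time the enclosed block with a monotonic high-resolution clock."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        if results is not None:
            results[label] = elapsed  # keep the number for later comparison
        print(f"{label}: {elapsed:.6f} s")


timings = {}
with timer("join 100 numbers", timings):
    "-".join(str(n) for n in range(100))
```

Collecting the timings in a dictionary makes it easy to compare several versions of the same code in one run.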
16
timeit
The timeit module provides a simple way to time small bits of Python
code. It has:
• a command-line interface
    $ python -m timeit '"-".join([str(n) for n in range(100)])'
    10000 loops, best of 3: 33.4 usec per loop
    $ python -m timeit '"-".join(map(str, range(100)))'
    10000 loops, best of 3: 25.2 usec per loop
• a Python interface
    >>> import timeit
    >>> timeit.timeit('"-".join([str(n) for n in range(100)])', number=10000)
    0.7288308143615723
    >>> timeit.timeit('"-".join(map(str, range(100)))', number=10000)
17
/usr/bin/time -v – simple but useful
    Command being timed: "python universe-new.py"
    User time (seconds): 0.38
    System time (seconds): 1.61
    Percent of CPU this job got: 26%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:07.46
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 22900
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 64
    Minor (reclaiming a frame) page faults: 6370
    Voluntary context switches: 3398
    Involuntary context switches: 123
    Swaps: 0
    File system inputs: 25656
    File system outputs: 0
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 0
18
measuring execution time
Notes on measuring:
• Try to measure multiple independent repetitions of your code.
• Establish the lower bound of your execution time!
• Prepare a testing environment that will allow you to get
comparable results.
• Consider writing a micro benchmark to check various
alternative solutions of some algorithm.
• Be careful measuring speed using artificial data.
• Re-validate using real data.
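For the first two notes, timeit.repeat runs the same measurement several times, and the minimum of the runs is a reasonable estimate of the lower bound, since slower runs mostly reflect interference from other processes. The repeat and number values below are arbitrary choices for illustration.

```python
import timeit

# Each entry in `runs` is the total time for `number` executions.
runs = timeit.repeat(
    stmt='"-".join(map(str, range(100)))',
    repeat=5,
    number=10_000,
)

# The minimum is the least disturbed run, so it approximates the lower bound.
best_per_call = min(runs) / 10_000
print(f"best of {len(runs)}: {best_per_call * 1e6:.2f} usec per call")
```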
19
tracking down the bottlenecks
Profiling tools will give you a more in-depth view of your code's
performance.
Look at your program's internals in terms of:
• execution time
• and used memory.
20
tracking down the bottlenecks
There are various possible tools, like:
• vmprof – see next talk for details!
• cProfile – a profiling module available in Python standard
library,
• line_profiler – an external line-by-line profiler,
• tools for visualizing profiling results such as runsnakerun.
21
output of cprofile
cProfile provides deterministic profiling of Python programs.
• command-line interface
    python -m cProfile [-o output_file] [-s sort_order] myscript.py
• Python interface
    import cProfile
    import re
    cProfile.run('re.compile("foo|bar")')
22
output of cprofile
    197 function calls (192 primitive calls) in 0.002 seconds

    Ordered by: standard name

    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
         1    0.000    0.000    0.001    0.001 <string>:1(<module>)
         1    0.000    0.000    0.001    0.001 re.py:212(compile)
         1    0.000    0.000    0.001    0.001 re.py:268(_compile)
         1    0.000    0.000    0.000    0.000 sre_compile.py:172(_compile_charset)
         1    0.000    0.000    0.000    0.000 sre_compile.py:201(_optimize_charset)
         4    0.000    0.000    0.000    0.000 sre_compile.py:25(_identityfunction)
       3/1    0.000    0.000    0.000    0.000 sre_compile.py:33(_compile)
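On a real program the default listing, ordered by name, quickly becomes too long to read. A common approach is to sort by cumulative time and print only the top entries using the standard pstats module. The sketch below profiles the same re.compile call; the limit of five rows is an arbitrary choice.

```python
import cProfile
import io
import pstats
import re

profiler = cProfile.Profile()
profiler.enable()
re.compile("foo|bar")
profiler.disable()

# Sort by cumulative time and keep only the five most expensive entries.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

The exact rows will differ from the listing above, depending on the Python version and on whether the pattern is already in the re cache.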
23
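usage of line_profiler
    # kernprof injects @profile into builtins, so no import is needed
    @profile
    def do_stuff(numbers):
        print numbers  # Python 2 print statement (the talk used Python 2.7)

    numbers = 2
    do_stuff(numbers)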
24
output of line_profiler
    > python "C:\Python27\Scripts\kernprof.py" -l -v example.py
    2
    Wrote profile results to example.py.lprof
    Timer unit: 3.2079e-07 s

    File: example.py
    Function: do_stuff at line 2
    Total time: 0.00185256 s

    Line #      Hits         Time  Per Hit   % Time  Line Contents
    ==============================================================
         1                                           @profile
         2                                           def do_stuff(numbers):
         3         1         5775   5775.0    100.0      print numbers
25
runsnakerun
Figure 2: Runsnakerun
26
io bound vs compute bound
Learn how to classify types of performance bounds.
• Compute bound – a large number of instructions is
making your code slow,
• I/O bound – your code is slow because of various I/O
operations, like:
  • disk access, network delays, other I/O.
Depending on the type of the bound, different optimisation
strategies will apply.
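A rough way to tell the two apart is to compare CPU time with wall-clock time, much like the "Percent of CPU this job got" line in the /usr/bin/time output. The sketch below does this inside Python; the classify helper and its 0.5 threshold are illustrative assumptions, not an established rule.

```python
import time


def classify(func, threshold=0.5):
    """Guess whether func is compute bound or I/O bound.

    Compares CPU time used by this process with elapsed wall-clock time.
    The 0.5 threshold is an arbitrary illustrative choice.
    """
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    func()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    ratio = cpu / wall if wall > 0 else 1.0
    return "compute bound" if ratio >= threshold else "I/O bound"


def busy():
    # Pure computation: the CPU is busy the whole time.
    sum(i * i for i in range(2_000_000))


def waiting():
    # time.sleep stands in for waiting on disk or network.
    time.sleep(0.2)


print(classify(busy))
print(classify(waiting))
```

On an otherwise idle machine the first call should report a compute bound and the second an I/O bound, but a heavily loaded system can blur the ratio.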
27
fixing the cause: performance tips
28
algorithms and data structures
Improving your algorithm's time complexity is probably the best
thing you could do to optimise your code!
• Micro-optimisation tricks will not bring you anywhere near
the speed boost you could get from improving the time complexity
of an algorithm.
The big O notation matters!
• Check the data structures used in your algorithms!
• Check out Time complexity @ Python’s Wiki
29
algorithms and data structures – example
Innocent lookup code placed in a large loop may generate a
performance issue.
    def sanitize_1(user_input, stop_words):
        """Sanitize using standard lists: iterate over user_input,
        check each word against the stop_words list, build new_list."""
        new_list = []
        for w in user_input:  # longer list
            if w not in stop_words:  # shorter list
                new_list.append(w)
        return new_list
• Real data (Project Gutenberg, extended English stop list)
• Execution time:
    'pg11.txt': 2.4460400000000035,
    'pg1342.txt': 9.896383000000007,
    'pg76.txt': 9.086391999999998
30
algorithms and data structures – example
Innocent lookup code placed in a large loop may generate a
performance issue.
    def sanitize_1d(user_input, stop_words):
        """Sanitize using a list comprehension: iterate over
        user_input, check each word against the stop_words list."""
        return [w for w in user_input if w not in stop_words]
• Real data (Project Gutenberg, extended English stop list)
• Execution time:
    'pg11.txt': 2.4180460000000052,
    'pg1342.txt': 9.796099999999987,
    'pg76.txt': 8.98378300000001
31
algorithms and data structures – example
Often a trivial change, like changing a list to a set, may be the key
to solving the problem.
    def sanitize_2d(user_input, stop_words):
        """Sanitize using a list comprehension and a set."""
        # even better if stop_words is already a set
        stop_words = set(stop_words)
        return [w for w in user_input if w not in stop_words]
• Real data (Project Gutenberg, extended English stop list)
• Execution time:
    'pg11.txt': 0.02787999999999835,
    'pg1342.txt': 0.1341930000000058,
    'pg76.txt': 0.1227470000000066
Nearly two orders of magnitude faster (roughly 70 to 90 times)!
32
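The list-versus-set comparison can be repeated without the Project Gutenberg files. The micro benchmark below uses synthetic words, so the sizes are assumptions and the absolute timings will not match the numbers from the talk; only the relative gap is of interest.

```python
import random
import timeit

random.seed(0)
# Synthetic stand-ins for the talk's data: all sizes here are assumptions.
vocabulary = [f"w{i}" for i in range(5_000)]
stop_words = vocabulary[:300]
user_input = [random.choice(vocabulary) for _ in range(10_000)]


def sanitize_list(words, stops):
    return [w for w in words if w not in stops]  # O(len(stops)) per lookup


def sanitize_set(words, stops):
    stops = set(stops)  # built once; average O(1) per lookup afterwards
    return [w for w in words if w not in stops]


# Correctness first: both variants must agree before timing means anything.
assert sanitize_list(user_input, stop_words) == sanitize_set(user_input, stop_words)

t_list = min(timeit.repeat(lambda: sanitize_list(user_input, stop_words),
                           repeat=3, number=1))
t_set = min(timeit.repeat(lambda: sanitize_set(user_input, stop_words),
                          repeat=3, number=1))
print(f"list: {t_list:.4f} s, set: {t_set:.4f} s")
```

As the notes on measuring advise, a result obtained on artificial data like this should be re-validated on real input.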
algorithms and data structures – in the wild
See the excellent “A Python Optimization Anecdote” written by Pavel
Panchekha from Dropbox.
33
memory and i/o bounds
Some performance issues may be memory related, so check
memory utilization! Typical symptoms that indicate that your code
may have memory problems:
• your program never releases memory,
• or your program allocates way too much memory.
Also check if your code uses memory efficiently.
See my previous talk and the references included therein.
• “Everything You Always Wanted to Know About Memory in
Python But Were Afraid to Ask”
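A quick first check of memory utilization is possible with the standard tracemalloc module (available since Python 3.4). The workload below is a hypothetical placeholder; in practice the code under investigation goes between start() and the snapshot.

```python
import tracemalloc

tracemalloc.start()

# Hypothetical workload: a list of one hundred thousand small strings.
data = [str(i) for i in range(100_000)]

current, peak = tracemalloc.get_traced_memory()
print(f"current: {current / 1024:.0f} KiB, peak: {peak / 1024:.0f} KiB")

# Show the three source lines responsible for the most allocated memory.
snapshot = tracemalloc.take_snapshot()
for stat in snapshot.statistics("lineno")[:3]:
    print(stat)

tracemalloc.stop()
```

A current value that keeps growing between snapshots is the "never releases memory" symptom; a peak far above what the data should need is the "allocates way too much" one.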
34
memory and i/o bounds
I/O bounds may require more effort to deal with. Depending on
the problem there may be various solutions, consider using:
• asynchronous I/O with Python,
• probabilistic and heuristic data structures instead of real data,
  • like Bloom filters,
  • which are used to test whether an element is a member of a
    set,
  • false positive matches are possible, but false negatives are not,
• compressed data structures and lightweight compression
  algorithms.
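To make the Bloom filter bullet concrete, here is a minimal sketch. The class, its default sizes and the SHA-256-based hashing are illustrative assumptions; a production filter would size the bit array from the expected number of items and the acceptable false-positive rate.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: k hash positions over an m-bit array."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive k positions by salting one SHA-256 hash with the index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


seen = BloomFilter()
for word in ("alice", "rabbit", "queen"):
    seen.add(word)

print("alice" in seen)   # True: added items are always reported
print("walrus" in seen)  # usually False, but a false positive is possible
```

The filter stores 1024 bits regardless of how long the words are, which is the point: a cheap in-memory test can save an expensive disk or network lookup.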
35
lightweight compression
Lightweight compression algorithms – a family of algorithms that are
primarily intended for real-time applications.
Lightweight compression algorithms favour compression and
decompression speed over compression ratio.
• Improved data transfer
• Lower memory footprint
• In some cases – improved internal memory access
Figure 3: Lightweight compression idea (bar chart of total time in seconds, split into processing time and data transfer, with and without compression)
36
lightweight compression
Lightweight compression algorithms in Python:
• bindings to Snappy, lz4, others.
• write your own compression scheme.
Cassandra example:
Depending on the data characteristics of the table, compressing its
data can result in:
• 2x-4x reduction in data size
• 25-35% performance improvement on reads
• 5-10% performance improvement on writes
Cassandra supports both Snappy and lz4.
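The speed-versus-ratio trade-off can be tried out with the standard library alone. Note the assumption: zlib is not a lightweight scheme like Snappy or lz4; its fastest and strongest levels are used here only as a stand-in to show how the trade-off is measured.

```python
import time
import zlib

# Repetitive synthetic payload; real data will compress differently.
payload = b"timestamp=1400000000;sensor=42;value=3.14\n" * 50_000

for level in (1, 9):
    start = time.perf_counter()
    packed = zlib.compress(payload, level)
    elapsed = time.perf_counter() - start
    assert zlib.decompress(packed) == payload  # round trip must be lossless
    print(f"level {level}: {len(payload) / len(packed):.1f}x smaller "
          f"in {elapsed * 1000:.1f} ms")
```

Whether compression pays off depends on whether the time saved on data transfer exceeds the time spent compressing and decompressing, so measure both on real data.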
37
iteration independent calculations
Bring iteration-independent calculations outside of the loop.
This is common sense and good practice.
• Fix loops containing computations that do not
change within the loop.
Beware that such operations may be hidden in a class method or in
a free function.
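A small illustration of the idea, using a hypothetical normalisation routine: the vector norm does not depend on the loop variable, so it can be computed once before the loop.

```python
import math


def normalise_slow(values):
    result = []
    for v in values:
        # The norm does not depend on v, yet it is recomputed every iteration.
        norm = math.sqrt(sum(x * x for x in values))
        result.append(v / norm)
    return result


def normalise_fast(values):
    # Hoisted: the invariant is computed once, before the loop.
    norm = math.sqrt(sum(x * x for x in values))
    return [v / norm for v in values]


print(normalise_fast([3.0, 4.0]))  # [0.6, 0.8]
```

Hoisting turns the quadratic loop into a linear one, so this particular case is really an improvement in time complexity.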
38
branching in large loops.
Try to avoid conditional branching in large loops.
Check whether, instead of having if/else statements in the loop
body, it is possible to:
• do the conditional check outside the loop,
• unroll the branch in the loop,
• have separate loops for different branches.
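The first and third options can be combined when the condition does not change during the loop. The function names and the use_rounding flag below are hypothetical and serve only to show the transformation.

```python
def scale_branch_inside(values, factor, use_rounding):
    result = []
    for v in values:
        if use_rounding:  # re-evaluated on every iteration
            result.append(round(v * factor))
        else:
            result.append(v * factor)
    return result


def scale_branch_outside(values, factor, use_rounding):
    # The flag is loop-invariant, so test it once and run a dedicated loop.
    if use_rounding:
        return [round(v * factor) for v in values]
    return [v * factor for v in values]


sample = [1.2, 2.5, 3.7]
assert scale_branch_inside(sample, 2, True) == scale_branch_outside(sample, 2, True)
assert scale_branch_inside(sample, 2, False) == scale_branch_outside(sample, 2, False)
```

As with any micro-optimisation, measure before and after: the gain from removing one cheap test per iteration is usually small.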
39
function inlining
Python introduces relatively high overhead for function/method
calls.
In some cases it may be worth considering code inlining to avoid
the overhead,
• but this comes at a cost of code maintenance and readability.
40
function inlining
    import math

    def sigmoid(x):
        return math.tanh(x)

    class BPNN:
        def update(self, inputs):
            ...
            for i in range(self.ni - 1):
                self.ai[i] = sigmoid(inputs[i])
            ...
41
function inlining
    import math

    class BPNN:
        def update(self, inputs):
            ...
            for i in range(self.ni - 1):
                self.ai[i] = math.tanh(inputs[i])
            ...
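Whether inlining is worth the loss of readability should be measured rather than assumed. The micro benchmark below isolates the call overhead of the sigmoid wrapper; the input size and repetition counts are arbitrary choices.

```python
import math
import timeit


def sigmoid(x):
    return math.tanh(x)


def with_call(inputs):
    return [sigmoid(x) for x in inputs]


def inlined(inputs):
    return [math.tanh(x) for x in inputs]


inputs = [i / 1000 for i in range(1000)]
assert with_call(inputs) == inlined(inputs)  # same results either way

t_call = min(timeit.repeat(lambda: with_call(inputs), repeat=5, number=200))
t_inline = min(timeit.repeat(lambda: inlined(inputs), repeat=5, number=200))
print(f"with call: {t_call:.4f} s, inlined: {t_inline:.4f} s")
```

The size of the gap varies between interpreter versions, so the numbers are only meaningful for the Python you actually deploy on.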
42
other
• Use high-performance datatypes – the collections module
• Loop unrolling
• Preallocation
• String interning (intern() in Python 2, sys.intern() in Python 3)
• Using locals instead of globals
• Improving lookup time of
function/method/variable/attribute
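Two of these tips fit in a short sketch: binding frequently used functions and methods to local names, and picking a container from collections that matches the access pattern. The function names are illustrative only.

```python
from collections import deque
import math


def roots_global_lookup(values):
    result = []
    for v in values:
        # Each iteration looks up math, then sqrt, then result.append.
        result.append(math.sqrt(v))
    return result


def roots_local_lookup(values):
    sqrt = math.sqrt          # bind the function to a local name once
    result = []
    append = result.append    # same for the bound method
    for v in values:
        append(sqrt(v))
    return result


# deque gives O(1) pops from the left; list.pop(0) is O(n).
queue = deque([1, 2, 3])
first = queue.popleft()
```

Local-name binding trades readability for a modest gain, so it is best kept to loops that a profiler has identified as hot.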
43
notes on the special cases
Use the right tools:
• When your code involves numerics – use numpy, scipy and
other specialized scientific libraries.
• These are highly optimised routines (usually based on external
scientific libraries).
• Consider pushing performance-critical code into C.
Remember to check your code with PyPy, you may be pleasantly
surprised.
44
notes on the special cases
Some problems may just need more computing power, so it may be
a good idea to:
• write code that utilizes multi-core architectures
(multiprocessing),
• or scale your code to multiple machines (task queues, Spark,
grid-like environments),
• or use hardware accelerators (PyOpenCL, PyCUDA, pyMIC,
etc.).
45
final notes
• Optimize only when it is justified.
• Measure, profile and test.
• Optimization takes experimenting.
• Knowledge of what is going on behind the scenes may help.
• Value your time. Performance tuning takes time, and your
time is expensive.
• judging by conference hotel - our time is expensive ;)
46
references
1. A Python Optimization Anecdote, Pavel Panchekha, 2011,
Dropbox.
2. Code optimization and its effects on Python, Karl-Oskar
Masing, 2013.
3. PythonSpeed, https://wiki.python.org
4. PythonSpeed / Performance Tips, https://wiki.python.org
5. Time complexity, https://wiki.python.org
6. PythonSpeed / Profiling Python Programs, https://wiki.python.org
7. Performance, http://pypy.org
8. Everything You Always Wanted to Know About Memory in
Python But Were Afraid to Ask, http://przymus.org
47

 
Computer Architecture and Organization
Computer Architecture and OrganizationComputer Architecture and Organization
Computer Architecture and Organization
 
Trends in Systems and How to Get Efficient Performance
Trends in Systems and How to Get Efficient PerformanceTrends in Systems and How to Get Efficient Performance
Trends in Systems and How to Get Efficient Performance
 
High Performance Erlang - Pitfalls and Solutions
High Performance Erlang - Pitfalls and SolutionsHigh Performance Erlang - Pitfalls and Solutions
High Performance Erlang - Pitfalls and Solutions
 
04 performance
04 performance04 performance
04 performance
 
L-2 (Computer Performance).ppt
L-2 (Computer Performance).pptL-2 (Computer Performance).ppt
L-2 (Computer Performance).ppt
 
Silicon Valley Code Camp 2015 - Advanced MongoDB - The Sequel
Silicon Valley Code Camp 2015 - Advanced MongoDB - The SequelSilicon Valley Code Camp 2015 - Advanced MongoDB - The Sequel
Silicon Valley Code Camp 2015 - Advanced MongoDB - The Sequel
 
Peddle the Pedal to the Metal
Peddle the Pedal to the MetalPeddle the Pedal to the Metal
Peddle the Pedal to the Metal
 
SAST, CWE, SEI CERT and other smart words from the information security world
SAST, CWE, SEI CERT and other smart words from the information security worldSAST, CWE, SEI CERT and other smart words from the information security world
SAST, CWE, SEI CERT and other smart words from the information security world
 
Parallel Computing - Lec 6
Parallel Computing - Lec 6Parallel Computing - Lec 6
Parallel Computing - Lec 6
 
Building source code level profiler for C++.pdf
Building source code level profiler for C++.pdfBuilding source code level profiler for C++.pdf
Building source code level profiler for C++.pdf
 
Cse viii-advanced-computer-architectures-06cs81-solution
Cse viii-advanced-computer-architectures-06cs81-solutionCse viii-advanced-computer-architectures-06cs81-solution
Cse viii-advanced-computer-architectures-06cs81-solution
 
Ruby3x3: How are we going to measure 3x
Ruby3x3: How are we going to measure 3xRuby3x3: How are we going to measure 3x
Ruby3x3: How are we going to measure 3x
 
Fundamentals.pptx
Fundamentals.pptxFundamentals.pptx
Fundamentals.pptx
 
Embedded System-design technology
Embedded System-design technologyEmbedded System-design technology
Embedded System-design technology
 
L1
L1L1
L1
 
Linux Systems Performance 2016
Linux Systems Performance 2016Linux Systems Performance 2016
Linux Systems Performance 2016
 

Último

What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfhans926745
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 

Último (20)

What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 

What’s eating python performance

  • 1. what’s eating python performance Piotr Przymus Nicolaus Copernicus University 1
  • 2. about me • Piotr Przymus PhD • work @ Nicolaus Copernicus University in Toruń • Interests: data mining and machine learning, databases, GPGPU computing, high-performance computing. • 9 years of Python experience. 2
  • 4. introduction Programmers waste enormous amounts of time thinking about, or worrying about the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimisation is the root of all evil. Donald Knuth, “Structured Programming With Go To Statements”, 1974. Yet we should not pass up our opportunities in that critical 3%. 4
  • 5. premature optimisation Premature optimisation may be stated as optimising code before knowing whether we need to. This may be bad as it impacts: • your productivity, • readability of the code, • ease of maintenance and debugging, • and it may contradict The Zen of Python ;). Learn how to do proper assessment of your code in terms of optimisation needs! Remember that a strong feeling that your code falls into the remaining 3% does not count! 5
  • 6. think before doing (think before coding) Going for higher performance without a deeper reason may be just a waste of your time. So start with: • stating your reasons (Why do you need higher performance?), • defining your goals (What would be an acceptable speed of your code?), • estimating time and resources you are willing to spend to achieve these goals. Re-evaluate all the pros and cons. 6
  • 7. why do you need higher performance? Good reasons: • Computation cost reduction • Significantly better user experience • Significantly faster results 7
  • 8. what would be an acceptable speed of your code? This is an important and difficult question to answer! • Computation cost reduction • Large projects with lots of computations • They may benefit from improvements of just a few percent. • Significantly better user experience of a web/desktop application. • Note that user experience is subjective; the user may: • not notice the difference, • or may not care about the change. • The User is Always Right • Significantly faster results • Scientific computing, data mining, machine learning • Processing of large data sets • Example: going from weeks to one day makes a huge difference. 8
  • 9. amdahl’s law Amdahl’s law is used to find the maximum expected improvement to an overall system when only part of the system is improved. (wiki) • Often used in parallel computing to predict the theoretical maximum speedup. • Assumes that the problem size remains the same! Maximum expected improvement of a system when only part of the computation is improved: $\text{improvement} = \frac{1}{(1 - P) + \frac{P}{S}}$, where: • P is the proportion of improved computations, • S is the improvement ratio. 9
  • 10. amdahl’s law – example [Figure 1: Amdahl’s law example] 10
  • 11. amdahl’s law – example If we improve: • 30% of computations, • so that they run twice as fast, then P = 0.3 and S = 2, and the overall system improvement is only $\frac{1}{(1 - 0.3) + \frac{0.3}{2}} = 1.1765$. 11
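The formula is easy to check numerically. The following is a minimal sketch; the function name `amdahl_speedup` and its argument checks are illustrative assumptions, not part of the talk.

```python
def amdahl_speedup(p: float, s: float) -> float:
    """Overall speedup when a fraction p of the work runs s times faster."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if s <= 0:
        raise ValueError("s must be positive")
    return 1.0 / ((1.0 - p) + p / s)


# The slide's example: 30% of the computation made twice as fast.
print(round(amdahl_speedup(0.3, 2), 4))  # 1.1765

# Even a huge speedup of that 30% is capped by the untouched 70%.
print(round(amdahl_speedup(0.3, 1e12), 4))
```

The second call illustrates why it pays to measure first: the part you do not improve sets the ceiling.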
  • 12. test, measure, track down bottlenecks 12
  • 13. test, measure, track down bottlenecks A starting point for optimisation is running code that gives correct results. • Prepare a regression test suite! The rest of the optimisation process may then be summarised as: 1. Test if the code works correctly. 2. Measure execution time: • if the code is not fast enough, use a profiler to identify the bottlenecks, • else you’re done! 3. Fix performance problems. 4. Start from the beginning. 13
  • 14. regression test suite Before you start, prepare a regression test suite that: • will guard the correctness of your code during the optimisation, • is comprehensive yet quick to run. The tests will be run very often – a reasonable execution time is a must! 14
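One possible shape for such a suite is to keep the trusted, slow implementation around and compare every optimised candidate against it. The sketch below uses the standard `unittest` module; the two `dedupe_*` functions are hypothetical stand-ins for your own code.

```python
import unittest


def dedupe_reference(items):
    """Trusted, slow version: order-preserving de-duplication in O(n^2)."""
    result = []
    for item in items:
        if item not in result:
            result.append(item)
    return result


def dedupe_optimised(items):
    """Candidate replacement: relies on dicts keeping insertion order (Python 3.7+)."""
    return list(dict.fromkeys(items))


class DedupeRegressionTest(unittest.TestCase):
    # Small, fast cases so the suite can be run after every change.
    CASES = [[], [1], [1, 1, 1], [3, 1, 3, 2, 1], list("abracadabra")]

    def test_matches_reference(self):
        for case in self.CASES:
            with self.subTest(case=case):
                self.assertEqual(dedupe_optimised(case), dedupe_reference(case))
```

Saved in a test module, this can be run with `python -m unittest` after each optimisation step.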
  • 15. measuring execution time Measure the execution time of your code. This is important because: • it shows whether you are making any progress, • it shows how far you are from the desired execution time (a.k.a. acceptable speed), • it allows you to compare various versions of your optimisations. 15
  • 16. measuring execution time There are various tools to do that, among them: • a custom-made timer, • Python’s timeit module, • Unix time (use /usr/bin/time, as time is also a common shell built-in). 16
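A custom-made timer can be as small as a context manager around `time.perf_counter`. This is only a sketch; the name `timer` and the optional `results` dictionary are assumptions made for illustration.

```python
import time
from contextlib import contextmanager


@contextmanager
def timer(label, results=None):
    """Print (and optionally record) the wall-clock time of the enclosed block."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        if results is not None:
            results[label] = elapsed
        print(f"{label}: {elapsed:.6f} s")


timings = {}
with timer("join", timings):
    "-".join(str(n) for n in range(100_000))
```

A single measurement like this is noisy, so treat it as a rough indicator rather than a benchmark.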
  • 17. timeit A module that provides a simple way to time small bits of Python code. It has:
    • a command-line interface

        $ python -m timeit '"-".join([str(n) for n in range(100)])'
        10000 loops, best of 3: 33.4 usec per loop
        $ python -m timeit '"-".join(map(str, range(100)))'
        10000 loops, best of 3: 25.2 usec per loop

    • a Python interface

        >>> import timeit
        >>> timeit.timeit('"-".join([str(n) for n in range(100)])', number=10000)
        0.7288308143615723
        >>> timeit.timeit('"-".join(map(str, range(100)))', number=10000)
  • 18. /usr/bin/time -v – simple but useful

        Command being timed: "python universe-new.py"
        User time (seconds): 0.38
        System time (seconds): 1.61
        Percent of CPU this job got: 26%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:07.46
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 22900
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 64
        Minor (reclaiming a frame) page faults: 6370
        Voluntary context switches: 3398
        Involuntary context switches: 123
        Swaps: 0
        File system inputs: 25656
        File system outputs: 0
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0
  • 19. measuring execution time Notes on measuring: • Try to measure multiple independent repetitions of your code. • Establish the lower bound of your execution time! • Prepare a testing environment that will allow you to get comparable results. • Consider writing a micro benchmark to check various alternative solutions of some algorithm. • Be careful measuring speed using artificial data. • Re-validate using real data. 19
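One way to combine "multiple independent repetitions" with "establish the lower bound" is `timeit.repeat`, taking the minimum over the runs. The helper name `lower_bound` and the chosen repeat counts are assumptions for illustration.

```python
import timeit


def lower_bound(stmt, setup="pass", repeat=5, number=1000):
    """Best per-call time in seconds over several independent repetitions."""
    runs = timeit.repeat(stmt, setup=setup, repeat=repeat, number=number)
    # The minimum is the run least disturbed by other activity on the machine.
    return min(runs) / number


candidates = {
    "list comprehension": '"-".join([str(n) for n in range(100)])',
    "map": '"-".join(map(str, range(100)))',
}
for name, stmt in candidates.items():
    print(f"{name}: {lower_bound(stmt) * 1e6:.2f} usec per call")
```

The numbers depend on the machine and interpreter version, so compare candidates only within the same environment.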
  • 20. tracking down the bottlenecks Profiling tools will give you a more in-depth view of your code’s performance. Look at your program’s internals in terms of • execution time • and memory used. 20
  • 21. tracking down the bottlenecks There are various possible tools, like: • vmprof – see next talk for details! • cProfile – a profiling module available in the Python standard library, • line_profiler – an external line-by-line profiler, • tools for visualizing profiling results, such as runsnakerun. 21
  • 22. output of cprofile cProfile provides deterministic profiling of Python programs.
    • command-line interface

        python -m cProfile [-o output_file] [-s sort_order] myscript.py

    • Python interface

        import cProfile
        import re
        cProfile.run('re.compile("foo|bar")')
  • 23. output of cprofile

                 197 function calls (192 primitive calls) in 0.002 seconds

           Ordered by: standard name

           ncalls  tottime  percall  cumtime  percall filename:lineno(function)
                1    0.000    0.000    0.001    0.001 <string>:1(<module>)
                1    0.000    0.000    0.001    0.001 re.py:212(compile)
                1    0.000    0.000    0.001    0.001 re.py:268(_compile)
                1    0.000    0.000    0.000    0.000 sre_compile.py:172(_compile_charset)
                1    0.000    0.000    0.000    0.000 sre_compile.py:201(_optimize_charset)
                4    0.000    0.000    0.000    0.000 sre_compile.py:25(_identityfunction)
              3/1    0.000    0.000    0.000    0.000 sre_compile.py:33(_compile)
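The default ordering by name is rarely the most useful one. A common alternative, sketched below with the standard `cProfile` and `pstats` modules, sorts by cumulative time and prints only the top entries; the wrapper name `profile_report` is an assumption for illustration.

```python
import cProfile
import io
import pstats
import re


def profile_report(func, *args, limit=5):
    """Profile one call of func and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        func(*args)
    finally:
        profiler.disable()
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(limit)
    return stream.getvalue()


print(profile_report(re.compile, "foo|bar"))
```

The same ordering is available from the command line through the `-s` option shown on the earlier slide.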
  • 24. usage of line_profiler

        @profile  # injected by kernprof at run time; no import needed
        def do_stuff(numbers):
            print numbers  # Python 2 print statement, as on the original slide

        numbers = 2
        do_stuff(numbers)
  • 25. output of line_profiler

        > python "C:\Python27\Scripts\kernprof.py" -l -v example.py
        2
        Wrote profile results to example.py.lprof
        Timer unit: 3.2079e-07 s

        File: example.py
        Function: do_stuff at line 2
        Total time: 0.00185256 s

        Line #      Hits         Time  Per Hit   % Time  Line Contents
        ==============================================================
             1                                           @profile
             2                                           def do_stuff(numbers):
             3         1         5775   5775.0    100.0      print numbers
  • 27. io bound vs compute bound Learn how to classify types of performance bounds. • Compute bound – a large number of instructions makes your code slow, • I/O bound – your code is slow because of various I/O operations, like: • disk access, network delays, other I/O. Depending on the type of the bound, different optimisation strategies will apply. 27
  • 28. fixing the cause: performance tips 28
  • 29. algorithms and data structures Improving your algorithm’s time complexity is probably the best thing you could do to optimise your code! • Micro-optimisation tricks will not bring you anywhere near the speed boost you could get from improving the time complexity of an algorithm. Big O notation matters! • Check the data structures used in your algorithms! • Check out Time complexity @ Python’s Wiki 29
  • 30. algorithms and data structures – example Innocent lookup code placed in a large loop may generate a performance issue.

        def sanitize_1(user_input, stop_words):
            """Sanitize using standard lists: iterate over user_input, check in the stop_words list, append to new_list."""
            new_list = []
            for w in user_input:  # longer list
                if w not in stop_words:  # shorter list
                    new_list.append(w)
            return new_list

    • Real data (Project Gutenberg, extended English stop list)
    • Execution time: 'pg11.txt': 2.4460400000000035, 'pg1342.txt': 9.896383000000007, 'pg76.txt': 9.086391999999998
  • 31. algorithms and data structures – example Innocent lookup code placed in a large loop may generate a performance issue.

        def sanitize_1d(user_input, stop_words):
            """Sanitize using a list comprehension: iterate over user_input, check in the stop_words list."""
            return [w for w in user_input if w not in stop_words]

    • Real data (Project Gutenberg, extended English stop list)
    • Execution time: 'pg11.txt': 2.4180460000000052, 'pg1342.txt': 9.796099999999987, 'pg76.txt': 8.98378300000001
  • 32. algorithms and data structures – example Often a trivial change, like changing a list to a set, may be the key to solving the problem.

        def sanitize_2d(user_input, stop_words):
            """Sanitize using a list comprehension and a set."""
            # even better if stop_words is already a set
            stop_words = set(stop_words)
            return [w for w in user_input if w not in stop_words]

    • Real data (Project Gutenberg, extended English stop list)
    • Execution time: 'pg11.txt': 0.02787999999999835, 'pg1342.txt': 0.1341930000000058, 'pg76.txt': 0.1227470000000066

    Almost two orders of magnitude faster!
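The effect can be reproduced without the Project Gutenberg files. The sketch below uses synthetic word lists, which are an assumption made here for illustration and not the data behind the timings quoted on the slides, so the absolute numbers will differ.

```python
import timeit


def sanitize_list(user_input, stop_words):
    # Each "not in" scans the stop list: O(len(stop_words)) per word.
    return [w for w in user_input if w not in stop_words]


def sanitize_set(user_input, stop_words):
    # Set membership is O(1) on average, so the whole pass is roughly linear.
    stop_words = set(stop_words)
    return [w for w in user_input if w not in stop_words]


# Synthetic stand-ins for a text and a stop list.
stop_words = [f"stop{i}" for i in range(500)]
user_input = [f"word{i % 1000}" for i in range(5_000)] + stop_words

# Both versions must agree before their speed is compared.
assert sanitize_list(user_input, stop_words) == sanitize_set(user_input, stop_words)

for func in (sanitize_list, sanitize_set):
    seconds = timeit.timeit(lambda: func(user_input, stop_words), number=3)
    print(f"{func.__name__}: {seconds:.4f} s")
```

The gap widens as the stop list grows, because only the list version pays for its length on every lookup.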
  • 33. algorithms and data structures – in the wild See the excellent “A Python Optimization Anecdote” written by Pavel Panchekha from Dropbox. 33
  • 34. memory and i/o bounds Some performance issues may be memory related, so check memory utilization! Typical symptoms that indicate that your code may have memory problems: • your program never releases memory, • or your program allocates way too much memory. Also check if your code uses memory efficiently. See my previous talk and the references included therein. • “Everything You Always Wanted to Know About Memory in Python But Were Afraid to Ask” 34
  • 35. memory and i/o bounds I/O bounds may require more effort to deal with. Depending on the problem there may be various solutions, consider using: • asynchronous I/O with Python • probabilistic and heuristic data structures instead of real data • like Bloom filters, • which are used to test whether an element is a member of a set, • false positive matches are possible, but false negatives are not. • compressed data structures and lightweight compression algorithms 35
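To make the Bloom filter idea concrete, here is a minimal sketch built on `hashlib.blake2b` from the standard library. The class, its default sizes, and the salting scheme are assumptions chosen for brevity; a production filter would size the bit array from the expected item count and target error rate.

```python
import hashlib


class BloomFilter:
    """Probabilistic set: may report false positives, never false negatives."""

    def __init__(self, size_bits=8192, num_hashes=4):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item):
        data = item.encode("utf-8")
        for seed in range(self.num_hashes):
            # A different salt per seed gives independent hash functions.
            salt = seed.to_bytes(8, "little")
            digest = hashlib.blake2b(data, digest_size=8, salt=salt).digest()
            yield int.from_bytes(digest, "little") % self.size_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


seen = BloomFilter()
for word in ("alice", "rabbit", "queen"):
    seen.add(word)

print("rabbit" in seen)  # True: added items are always found
```

A negative answer is definitive, so the filter can cheaply rule out an expensive disk or network lookup.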
  • 36. lightweight compression Lightweight compression algorithms – a family of algorithms that are primarily intended for real-time applications. Lightweight compression algorithms favour compression and decompression speed over compression ratio. • Improved data transfer • Lower memory footprint • In some cases – improved internal memory access [Figure 3: Lightweight compression idea. Bar chart of time in seconds (0 s to 16 s), with and without compression, split into processing time and data transfer.] 36
  • 37. lightweight compression Lightweight compression algorithms in Python: • bindings to Snappy, lz4, others. • write your own compression scheme. Cassandra example: Depending on the data characteristics of the table, compressing its data can result in: • 2x-4x reduction in data size • 25-35% performance improvement on reads • 5-10% performance improvement on writes Cassandra supports both Snappy and lz4. 37
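The Snappy and lz4 bindings are third-party packages, so the sketch below swaps in `zlib` from the standard library at its fastest compression level as a stand-in. It is not a lightweight algorithm in the strict sense, but it shows the same trade of a little CPU time for fewer bytes to move. The helper names and the sample payload are assumptions for illustration.

```python
import zlib


def pack(payload: bytes, level: int = 1) -> bytes:
    """Compress with the fastest zlib level, favouring speed over ratio."""
    return zlib.compress(payload, level)


def unpack(blob: bytes) -> bytes:
    """Restore the original bytes."""
    return zlib.decompress(blob)


# Repetitive, table-like data compresses well even at the fastest level.
payload = b"timestamp,sensor,value\n" + b"2015-01-01,42,0.5\n" * 10_000
blob = pack(payload)

assert unpack(blob) == payload
print(f"{len(payload)} bytes -> {len(blob)} bytes")
```

Whether this pays off depends on the data and on how expensive the transfer is, so measure both variants.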
  • 38. iteration independent calculations Bring iteration-independent calculations outside of the loop. This is common sense and good practice. • Fix loops that repeat computations whose result does not change within the loop. Beware that such operations may be hidden in a class method or in a free function. 38
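A small before-and-after sketch of hoisting a loop-invariant computation; the function names are hypothetical.

```python
import math


def scale_slow(values, scale):
    result = []
    for v in values:
        # math.sqrt(scale) does not depend on v, yet it runs every iteration.
        result.append(v * math.sqrt(scale))
    return result


def scale_fast(values, scale):
    # Computed once, outside the loop. Note that it is now evaluated even
    # when values is empty, which matters if the expression can raise.
    factor = math.sqrt(scale)
    return [v * factor for v in values]


data = [0.5, 1.0, 2.0]
assert scale_slow(data, 4.0) == scale_fast(data, 4.0)
print(scale_fast(data, 4.0))  # [1.0, 2.0, 4.0]
```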
  • 39. branching in large loops Try to avoid conditional branching in large loops. Check whether, instead of having if/else statements in the loop body: • it is possible to do the conditional check outside the loop, • unroll the branch in the loop, • have separate loops for different branches. 39
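When the condition does not change between iterations, the check can move outside and each branch gets its own loop. The sketch below uses hypothetical function names.

```python
def scale_branch_inside(values, factor, clip):
    out = []
    for v in values:
        if clip:  # loop-invariant flag, yet tested on every iteration
            out.append(min(v * factor, 1.0))
        else:
            out.append(v * factor)
    return out


def scale_branch_outside(values, factor, clip):
    # The flag is tested once; each branch has its own tight loop.
    if clip:
        return [min(v * factor, 1.0) for v in values]
    return [v * factor for v in values]


data = [0.1, 0.4, 0.9]
for clip in (True, False):
    assert scale_branch_inside(data, 2.0, clip) == scale_branch_outside(data, 2.0, clip)
```

The cost is some duplicated loop code, so this is worth doing only where profiling shows the loop is hot.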
  • 40. function inlining Python introduces relatively high overhead for function/method calls. In some cases it may be worth considering code inlining to avoid the overhead • but this comes at a cost in code maintenance and readability. 40
  • 41. function inlining

        import math

        def sigmoid(x):
            return math.tanh(x)

        class BPNN:
            def update(self, inputs):
                ...
                for i in range(self.ni - 1):
                    # one extra Python-level call per element
                    self.ai[i] = sigmoid(inputs[i])
                ...
  • 42. function inlining

        import math

        class BPNN:
            def update(self, inputs):
                ...
                for i in range(self.ni - 1):
                    # sigmoid() inlined: math.tanh is called directly
                    self.ai[i] = math.tanh(inputs[i])
                ...
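The call overhead can be measured in isolation. This is a minimal sketch with hypothetical helper names that mirrors the `sigmoid` example; the size of the difference depends on the interpreter version.

```python
import math
import timeit


def sigmoid(x):
    return math.tanh(x)


def activate_with_call(inputs):
    # One extra Python-level function call per element.
    return [sigmoid(x) for x in inputs]


def activate_inlined(inputs):
    # The wrapper is inlined by hand.
    return [math.tanh(x) for x in inputs]


inputs = [i / 1000.0 for i in range(10_000)]
assert activate_with_call(inputs) == activate_inlined(inputs)

for func in (activate_with_call, activate_inlined):
    seconds = timeit.timeit(lambda: func(inputs), number=100)
    print(f"{func.__name__}: {seconds:.4f} s")
```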
  • 43. other • Use high-performance datatypes – the collections module • Loop unrolling • Preallocation • String interning (intern) • Using locals instead of globals • Improving lookup time of function/method/variable/attribute 43
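As an illustration of the last two items, the sketch below binds a module attribute and a bound method to local names before a loop. The function names are hypothetical, and the gain varies between Python versions, so measure before adopting it.

```python
import math
import timeit


def norms_attribute_lookup(points):
    out = []
    for x, y in points:
        # math.sqrt and out.append are looked up again on every iteration.
        out.append(math.sqrt(x * x + y * y))
    return out


def norms_local_names(points):
    sqrt = math.sqrt      # global + attribute lookup done once
    out = []
    append = out.append   # bound method looked up once
    for x, y in points:
        append(sqrt(x * x + y * y))
    return out


points = [(float(i), float(i + 1)) for i in range(10_000)]
assert norms_attribute_lookup(points) == norms_local_names(points)

for func in (norms_attribute_lookup, norms_local_names):
    seconds = timeit.timeit(lambda: func(points), number=50)
    print(f"{func.__name__}: {seconds:.4f} s")
```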
  • 44. notes on the special cases Use the right tools: • When your code involves numerics – use numpy, scipy and other specialized scientific libraries. • These are highly optimised routines (usually based on external scientific libraries). • Consider pushing performance-critical code into C. Remember to check your code with PyPy; you may be pleasantly surprised. 44
  • 45. notes on the special cases Some problems may just need more computing power, so it may be a good idea to: • write code that utilizes multi-core architecture (multiprocessing), • or scale your code to multiple machines (task queues, Spark, grid-like environments), • or use hardware accelerators (pyOpenCL, pyCuda, pyMIC, etc.) 45
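For the multi-core option, a CPU-bound task can be split into chunks and handed to a `multiprocessing.Pool`. The prime-counting workload, the function names, and the chunk counts below are assumptions chosen only to keep the sketch self-contained.

```python
import math
from multiprocessing import Pool


def count_primes(bounds):
    """Count primes in the half-open range [lo, hi) by trial division."""
    lo, hi = bounds
    return sum(
        1
        for n in range(max(lo, 2), hi)
        if all(n % d for d in range(2, math.isqrt(n) + 1))
    )


def count_primes_parallel(limit, workers=4, chunks=8):
    """Split [0, limit) into chunks and count primes in worker processes."""
    step = -(-limit // chunks)  # ceiling division
    ranges = [(lo, min(lo + step, limit)) for lo in range(0, limit, step)]
    with Pool(processes=workers) as pool:
        return sum(pool.map(count_primes, ranges))


if __name__ == "__main__":
    # The guard is required on platforms that start workers by re-importing
    # this module.
    print(count_primes_parallel(20_000))
```

Starting processes and shipping data between them has a cost, so this pays off only when each chunk carries enough work.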
  • 46. final notes • Optimize only when it is justified. • Measure, profile and test. • Optimization takes experimenting. • Knowledge of what is going on behind the scenes may help. • Value your time. Performance tuning takes time, and your time is expensive. • judging by the conference hotel, our time is expensive ;) 46
  • 47. references
    1. A Python Optimization Anecdote, Pavel Panchekha, 2011, Dropbox.
    2. Code optimization and its effects on Python, Karl-Oskar Masing, 2013.
    3. PythonSpeed, https://wiki.python.org
    4. PythonSpeed / Performance Tips, https://wiki.python.org
    5. Time complexity, https://wiki.python.org
    6. PythonSpeed / Profiling Python Programs, https://wiki.python.org
    7. Performance, http://pypy.org
    8. Everything You Always Wanted to Know About Memory in Python But Were Afraid to Ask, http://przymus.org