Do more than one thing at the same time, the Python way


The document discusses doing more than one thing at a time in Python using threads and processes. It describes how to create threads using the threading module and processes using the multiprocessing module. While threads are easier to use, the Global Interpreter Lock (GIL) in Python prevents true parallelism. Processes can better utilize multiple CPUs but require more work for communication. Asynchronous programming is recommended for I/O-bound tasks while processes are better for CPU-bound work. The talk cautions that threading should be used carefully in Python due to the GIL.


DO MORE THAN ONE
THING AT THE SAME TIME
the Python way!

Jaime Buelta

SLICE A PROBLEM TO
SOLVE IT USING MORE
RESOURCES

THREADS IN PYTHON

module threading

module thread

THREAD EXAMPLE
import threading

ITERATIONS = 1000000


class Example1(threading.Thread):

    def __init__(self, num):
        super().__init__()
        self.num = num

    def run(self):
        # Executed in the new thread once start() is called
        for i in range(ITERATIONS):
            pass


def main():
    # Launch 10 threads, each running the same busy loop
    for j in range(10):
        t = Example1(j)
        t.start()


if __name__ == '__main__':
    main()

TIMERS
from threading import Timer

DELAYED_TIME = 10.5  # seconds


def delayed():
    print('This call is delayed')


t = Timer(DELAYED_TIME, delayed)
t.start()

t.cancel()  # Cancels the execution if the timer has not fired yet

MULTIPROCESS MODULE
import multiprocessing

ITERATIONS = 1000000


class Example1(multiprocessing.Process):

    def __init__(self, num):
        super().__init__()
        self.num = num

    def run(self):
        # Executed in the child process once start() is called
        for i in range(ITERATIONS):
            pass


def main():
    # Launch 10 processes, each running the same busy loop
    for j in range(10):
        t = Example1(j)
        t.start()


if __name__ == '__main__':
    main()

PROCESS COMMUNICATION
NEEDS TO BE STRUCTURED

but that is not necessarily a bad thing
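
One way to read "structured" here is message passing through queues. The following is only a sketch under assumptions: it uses the standard multiprocessing module with the Unix-only "fork" start method, and the names square_worker and run_demo are hypothetical, not from the talk.

```python
import multiprocessing


def square_worker(inbox, outbox):
    # Read numbers until the None sentinel arrives, reply with (n, n squared)
    for number in iter(inbox.get, None):
        outbox.put((number, number * number))


def run_demo(numbers):
    # Assumption: a Unix-like system where the "fork" start method exists
    ctx = multiprocessing.get_context("fork")
    inbox, outbox = ctx.Queue(), ctx.Queue()
    worker = ctx.Process(target=square_worker, args=(inbox, outbox))
    worker.start()
    for number in numbers:
        inbox.put(number)
    inbox.put(None)  # sentinel: tells the worker to stop
    # Collect one reply per request before joining the worker
    results = dict(outbox.get() for _ in numbers)
    worker.join()
    return results
```

Calling run_demo([1, 2, 3]) sends three messages to the child and gathers the replies. The two processes share no state; everything they exchange goes through the queues.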

[Diagram: Task 1, Task 2 and Task 3, labelled with the states waiting, ready! and done!, plus a Callback]

The tasks release control once they are blocked waiting for input from IO

EVENTLET
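import eventlet
from eventlet.green.urllib import request  # green (cooperative) urllib.request

NUM_URLS = 1000
URL = 'http://www.some_address.com/'

urls = [URL] * NUM_URLS


def fetch(url):
    # urlopen yields to the other green threads while it waits on the network
    return request.urlopen(url).read()


pool = eventlet.GreenPool()
for body in pool.imap(fetch, urls):
    do_something_with_result(body)  # placeholder, not defined on the slide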

Asynchronous
programming is great
when the tasks are
IO - Bound

So the CPU is
basically waiting...

Asynchronous
programming is not
good when tasks are
CPU - Bound

If one task enters
an infinite loop, the
whole system is
blocked

YESTERDAY THERE WAS A
TALK ABOUT ASYNC PYTHON
PROGRAMMING
Hope you attended, I did.
If you didn’t, you can watch it online later

The GIL doesn’t allow two threads to run at the same time, even if the
OS would schedule them.

Only one thread runs. The rest are blocked.

[Chart: time (0 s to 40 s) to run 100,000,000 iterations on a 4 core machine, against the number of threads (1 to 10000)]

I WANT TO WATCH THIS YOUTUBE

BUT I’M ALREADY LISTENING TO MUSIC
AT THE SAME TIME

EEH, NOT AS BIG AS IT LOOKS

GIL MAKES CONCURRENT
PROGRAMMING MUCH EASIER

And the problems are quite limited in practice

BUT MAYBE YOUR PROGRAM
IS ONE OF THE FEW

Avoid problems by using processes instead of threads

[Chart: time (1 ms to 100,000 ms) against number of iterations (1000 to 10000000) for threading, multiprocess and sequential]

Great, detailed talk
“Understanding GIL”
by David Beazley
http://www.dabeaz.com/python/UnderstandingGIL.pdf

ALL PYTHON OPERATIONS
ARE ATOMIC
Hey, that’s what the GIL is for

LOCKING TO
BE USED WITH
EXTREME
CAUTION
If you need to set exclusive
sections, you are probably
doing it wrong

BUT WHEN I
DO, I USE WITH

from threading import Lock

my_lock = Lock()


def some_function(args):
    # The with block acquires the lock and always releases it on exit
    with my_lock:
        protected_section()

[Diagram: Main periodic thread, repeated on each period, with task A, task B, task C and task D]

but that’s probably not the best use of Python

QUESTIONS?
THANKS
FOR
YOUR
ATTENTION
@jaimebuelta
wrongsideofmemphis.wordpress.com


1. DO MORE THAN ONE THING AT THE SAME TIME the Python way! Jaime Buelta

2. WHY?

3. SLICE A PROBLEM TO SOLVE IT USING MORE RESOURCES

4. SAME THING FOR DIFFERENT ACTORS

5. DIFFERENT THINGS FOR THE SAME ACTOR

6. DOING MORE THAN ONE THING IS TOUGH

7. CHOOSE WISELY

8. THREADS

9. THREADS IN PYTHON module threading module thread

10. THREAD EXAMPLE

11. TIMERS

12. PROCESSES

13. YE OLDE FORK

14. MULTIPROCESS MODULE

15. OS ARE GREAT AT MULTITASKING

16. PROCESS COMMUNICATION NEEDS TO BE STRUCTURED but that is not necessarily a bad thing

17. ASYNCHRONOUS PROGRAMMING

18. Thread 1 Thread 2

19.

20. [Diagram: Task 1, Task 2 and Task 3, labelled with the states waiting, ready! and done!, plus a Callback] The tasks release control once they are blocked waiting for input from IO

21. death by callback

22. EVENTLET

23. Asynchronous programming is great when the tasks are IO - Bound. So the CPU is basically waiting...

24. Asynchronous programming is not good when tasks are CPU - Bound. If one task enters an infinite loop, the whole system is blocked

25. YESTERDAY THERE WAS A TALK ABOUT ASYNC PYTHON PROGRAMMING Hope you attended, I did. If you didn’t, you can watch it online later

26. THE INFAMOUS GIL

27. The GIL doesn’t allow two threads to run at the same time, even if the OS would schedule them. Only one thread runs. The rest are blocked.

28. 2 core machine Thread A Thread B

29. [Chart: time (0 s to 40 s) to run 100,000,000 iterations on a 4 core machine, against the number of threads (1 to 10000)]

30. IS IT REALLY A PROBLEM?

31. I WANT TO WATCH THIS YOUTUBE BUT I’M ALREADY LISTENING TO MUSIC AT THE SAME TIME EEH, NOT AS BIG AS IT LOOKS

32. GIL MAKES CONCURRENT PROGRAMMING MUCH EASIER And the problems are quite limited in practice

33. BUT MAYBE YOUR PROGRAM IS ONE OF THE FEW Avoid problems by using processes instead of threads

34. [Chart: time (1 ms to 100,000 ms) against number of iterations (1000 to 10000000) for threading, multiprocess and sequential]

35. Great, detailed talk “Understanding GIL” by David Beazley http://www.dabeaz.com/python/UnderstandingGIL.pdf

36. AVOID PROBLEMS

37. SIMPLE ARCHITECTURE

38. ALL PYTHON OPERATIONS ARE ATOMIC Hey, that’s what the GIL is for

39. LOCKING TO BE USED WITH EXTREME CAUTION If you need to set exclusive sections, you are probably doing it wrong

40. BUT WHEN I DO, I USE WITH

41. USE QUEUES (AND PIPES)

42. PROCESS THE TASK WITH WORKERS

43. LIMIT THE NUMBERS!!!

44. THREAD COORDINATION IS HELL

45. [Diagram: Main periodic thread, repeated on each period, with task A, task B, task C and task D]

46. but that’s probably not the best use of Python

47. QUESTIONS? THANKS FOR YOUR ATTENTION @jaimebuelta wrongsideofmemphis.wordpress.com

Editor's notes

Slide 2: The first question is why we should complicate the code by trying to do more than one thing at the same time. The answer is to make better use of the resources, which means we need spare resources to start with.

Three kinds of application typically need these techniques: dividing a big task across several computers or cores, doing the same thing for different actors, and doing different things for the same actor.

Slide 3: The typical example is number crunching: rendering a movie, data mining, etc. It can usually be treated as the next case (doing the same thing more than once) with an algorithm that works on parts of the data.

Slide 4: Doing the same thing for different actors:
  Web server
  Different tabs in a web browser

Slide 5: Doing different things for the same actor (some coordination is usually needed, more on that later):
  Game (AI of each enemy, image rendering, sound)

Slide 6: Debugging can be very tricky, you have been warned. If coordination is needed, that can be tricky too. If only one thing is done (several times), the complexity is greatly reduced thanks to help from the operating system.

Slide 7: There are several ways of executing code concurrently. We'll discuss threads, processes and asynchronous programming.

Slide 9:
  module thread
    Low level; in general it is better to avoid it (it was renamed _thread in Python 3).
  module threading
    Higher level, with more functionality.
Slide 13: A long, long time ago, this was the way of dealing with multiprocessing. It is only available on Unix machines. You can fork using os.fork() in Python! Fork is not in common use anymore due to its problems (fork bombs, etc).
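
As a rough illustration of the os.fork() call mentioned in this note, here is a minimal sketch for Unix-like systems only. The helper name run_in_child is hypothetical; the child does nothing except exit with a chosen code, which the parent then reads back.

```python
import os


def run_in_child(exit_code):
    pid = os.fork()  # returns 0 in the child, the child's pid in the parent
    if pid == 0:
        # Child process: leave immediately, skipping the parent's cleanup handlers
        os._exit(exit_code)
    # Parent process: wait for the child and decode its exit status
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)
```

For example, run_in_child(7) forks, lets the child exit with code 7, and returns that code in the parent. Real programs would do actual work in the child branch before exiting.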
Slide 15: Operating systems are very good at multitasking (and have been for some time). Communication can be achieved through ports, files, pipes, external queues, etc. Be radical, use more than one program.
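
The "use more than one program" idea can be sketched with the standard subprocess module, talking to a second program through pipes. This is an illustration under assumptions: the child program here is just a one-line Python script (CHILD_CODE), and shout_via_child is a hypothetical helper name.

```python
import subprocess
import sys

# Hypothetical child program: reads its standard input and writes it back in upper case
CHILD_CODE = "import sys; sys.stdout.write(sys.stdin.read().upper())"


def shout_via_child(text):
    # The text travels to the child through a pipe, and the answer comes back through another
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        input=text,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

The OS schedules both programs; the parent only has to agree with the child on what goes through the pipes.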
Slide 16: Threads share all the memory, so they can access anything. That is good (no communication overhead), but it can lead to abuse and instability.

Slide 17: It is the new cool technology, and it has seen some recent uses, like Node.js.

Slide 18: In a thread model, you have a supervisor (the OS) that stops one thread and executes another, transparently to the program.

Slide 19: In asynchronous programming, every task waits for the others to end their dance before getting on the dance floor.

Slide 20: Or until they voluntarily release control (sleep, yield, etc). The typical way of calling the next block is to add callbacks to the code (when this result is available, use it as input for this code).
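
The idea of tasks voluntarily releasing control can be sketched with plain generators and a tiny round-robin loop. This is a toy scheduler written for illustration, not how eventlet or any real framework is implemented; task and run_round_robin are hypothetical names.

```python
from collections import deque


def task(name, steps, log):
    for step in range(steps):
        log.append(f"{name} step {step}")
        yield  # voluntarily release control to the scheduler
    log.append(f"{name} done!")


def run_round_robin(tasks):
    ready = deque(tasks)
    while ready:
        current = ready.popleft()
        try:
            next(current)  # run the task until its next yield
        except StopIteration:
            continue  # the task finished, do not reschedule it
        ready.append(current)  # back of the line, let the others run
```

With log = [] and run_round_robin([task("A", 2, log), task("B", 1, log)]), the log ends up as A step 0, B step 0, A step 1, B done!, A done!. No task is ever interrupted: each one runs until it decides to yield.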
Slide 21: Callbacks make the code hard to follow and debug.

Slide 22: No threads! All the fetch functions start, and there is no need for callbacks: the urlopen call yields and resumes at the same point.

Slide 23: The typical example is a web app, which normally spends most of its time getting data from a DB and only a small amount of time composing the response. You can keep a huge number of connections open, like in chat servers, etc.
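
To make the IO-bound case concrete, here is a small sketch that uses the standard asyncio module, which is a swapped-in technique and not one of the libraries named in the talk. The waiting on a database is only simulated with asyncio.sleep, and fake_request, serve_all and run are hypothetical names.

```python
import asyncio


async def fake_request(request_id, delay):
    # Stand-in for waiting on a DB or the network; the task is idle meanwhile
    await asyncio.sleep(delay)
    return request_id


async def serve_all(num_requests, delay):
    # All the requests wait at the same time inside a single thread
    return await asyncio.gather(
        *(fake_request(i, delay) for i in range(num_requests))
    )


def run(num_requests=100, delay=0.01):
    return asyncio.run(serve_all(num_requests, delay))
```

Because every request spends its time waiting, the waits overlap, so run(100, 0.01) should take roughly one delay rather than one hundred of them.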
Slide 24: Number crunching, 3D calculations, etc. In a threaded model, the rest of the threads would still be executed from time to time.

Slide 25: Use of Twisted and Tornado, and some Node.js.

Slide 27: This only happens on multicore processors. There is some overhead in the blocking. It also makes UNIX signals act weird (the CTRL + C example).

Slide 28: The OS could (and probably does, depending on the load) set a thread to run on each core.

Slide 29: The number of iterations is constant, with different numbers of threads. Increasing the number of threads magnifies the problem, as it adds more overhead. Adding threads adds MORE time instead of reducing it.
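
A measurement along these lines could be sketched as follows. This is not the benchmark used for the chart: the function names are hypothetical, and the numbers depend on the machine and on the Python build (the effect described here is for CPython with the GIL).

```python
import threading
import time


def count_down(n):
    # Pure Python, CPU-bound loop
    while n > 0:
        n -= 1


def time_sequential(total):
    start = time.perf_counter()
    count_down(total)
    return time.perf_counter() - start


def time_threaded(total, num_threads):
    # Same total work, split evenly between the threads
    # (any remainder of the division is dropped for simplicity)
    share = total // num_threads
    threads = [
        threading.Thread(target=count_down, args=(share,))
        for _ in range(num_threads)
    ]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start
```

Comparing time_sequential(10_000_000) with time_threaded(10_000_000, 10) gives a feel for the overhead; the threaded run is not expected to be faster under the GIL.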
Slide 31: It is a very “first world problem”. It only causes problems for a subset of programs; in general the effects are limited.

Slide 32: You only need to worry if a CPU-bound, multithreaded program is running on a multicore machine. Even one CPU-bound thread can have an effect due to the overhead, and it could be more efficient to use sequential programming.

Slide 34: Time vs iterations for 10 threads/processes.

Slide 35: He has a couple more articles on his web page, www.dabeaz.com

Slide 36: In general, for concurrent programming.

Slide 37: Probably true for all software development. Keep the tasks separated and reduce communication between elements.

Slide 38: That also includes C extensions. You don't need a lock for reads, only for writes that are not a single Python operation.
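
A write that is "not a single Python operation" can be as small as value += 1, which reads, adds and stores. A minimal sketch of protecting such a write, with hypothetical names (SafeCounter, count_with_threads):

```python
import threading


class SafeCounter:
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # += is a read, an add and a write, so it is guarded as one unit
        with self._lock:
            self.value += 1


def count_with_threads(num_threads, increments):
    counter = SafeCounter()

    def work():
        for _ in range(increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value
```

With the lock in place, count_with_threads(4, 1000) returns exactly 4000. Reading counter.value afterwards needs no lock, in line with the note above.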
Slide 39: Python should be enough for most situations. Think carefully about whether you need a lock, mutex or semaphore.

Slide 41: There are different kinds of queues, like LIFO, FIFO, priority, etc. The multiprocessing module also has queues. External queues can also be useful (RabbitMQ, AMQP, etc). A pipe is a queue with one input and one output.

Slide 43: Limit the number of workers. Unlimited workers are unsafe and can block the system. A lot of threads can be worse than fewer threads. Do some tests to find the sweet spot.
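
The last three notes (queues, workers, a limited number of them) fit together in a small sketch using the standard queue and threading modules. The names worker and process_all, the squaring "work" and the default of 4 workers are all assumptions made for illustration.

```python
import queue
import threading

NUM_WORKERS = 4  # assumed limit; the right number should come from testing


def worker(tasks, results):
    while True:
        item = tasks.get()
        if item is None:  # sentinel: no more work for this worker
            break
        results.put(item * item)  # stand-in for the real task


def process_all(items, num_workers=NUM_WORKERS):
    items = list(items)
    tasks, results = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(tasks, results))
        for _ in range(num_workers)
    ]
    for thread in threads:
        thread.start()
    for item in items:
        tasks.put(item)
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    for thread in threads:
        thread.join()
    # Results arrive in completion order, so sort them for a stable answer
    return sorted(results.get() for _ in items)
```

However many items are submitted, only num_workers threads ever exist, and the workers never touch each other's state: everything goes through the two queues.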
Slide 44: Thread coordination is very difficult. Try to create and destroy short-lived threads instead.

Slide 45: This structure is typical of games (and real-time systems). Each interval (1 second), you evaluate the needs and launch the needed threads (or execute sequentially). On the next interval, the main periodic thread can cancel threads that haven't finished their task. Kill unused threads.
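
A rough sketch of such a periodic main loop, under stated assumptions: Python threads cannot be killed from outside, so instead of cancelling an unfinished task this version simply skips launching it again on the next tick. The function name run_periodic and its parameters are hypothetical.

```python
import threading
import time


def run_periodic(tasks, ticks, interval):
    """tasks maps a name to a callable; returns how often each task was started."""
    started = {name: 0 for name in tasks}
    running = {}  # name -> thread launched on an earlier tick
    for _ in range(ticks):
        tick_start = time.monotonic()
        for name, func in tasks.items():
            previous = running.get(name)
            if previous is not None and previous.is_alive():
                continue  # still busy from an earlier tick: skip it this time
            thread = threading.Thread(target=func, daemon=True)
            running[name] = thread
            started[name] += 1
            thread.start()
        # Sleep for whatever is left of this interval
        remaining = interval - (time.monotonic() - tick_start)
        if remaining > 0:
            time.sleep(remaining)
    for thread in running.values():
        thread.join()
    return started
```

A game loop might call run_periodic({"ai": update_ai, "sound": mix_sound}, ticks=60, interval=1.0), where update_ai and mix_sound are placeholder callables. Each thread is short-lived and the only coordination is the check at the start of each tick.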
Slide 46: So consider using other languages...
