The document discusses doing more than one thing at a time in Python using threads and processes. It describes how to create threads using the threading module and processes using the multiprocessing module. While threads are easier to use, the Global Interpreter Lock (GIL) in Python prevents true parallelism. Processes can better utilize multiple CPUs but require more work for communication. Asynchronous programming is recommended for I/O-bound tasks while processes are better for CPU-bound work. The talk cautions that threading should be used carefully in Python due to the GIL.
10. THREAD EXAMPLE
import threading

ITERATIONS = 1000000


class Example1(threading.Thread):
    def __init__(self, num):
        super().__init__()
        self.num = num

    def run(self):
        # Busy loop standing in for CPU-bound work
        for i in range(ITERATIONS):
            pass


def main():
    # Start 10 threads, each running the same loop
    for j in range(10):
        t = Example1(j)
        t.start()


if __name__ == '__main__':
    main()
11. TIMERS
from threading import Timer

DELAYED_TIME = 10.5  # seconds


def delayed():
    print('This call is delayed')


t = Timer(DELAYED_TIME, delayed)
t.start()
t.cancel()  # Cancels the execution
14. MULTIPROCESSING MODULE
import multiprocessing

ITERATIONS = 1000000


class Example1(multiprocessing.Process):
    def __init__(self, num):
        super().__init__()
        self.num = num

    def run(self):
        # Busy loop standing in for CPU-bound work
        for i in range(ITERATIONS):
            pass


def main():
    # Start 10 processes, each running the same loop
    for j in range(10):
        t = Example1(j)
        t.start()


if __name__ == '__main__':
    main()
20. [Diagram: Task 1, Task 2 and Task 3 moving through the states "waiting", "ready!" and "done!". Labels: "The task will release control once they are blocked waiting for an input from IO" and "Callback".]
47. QUESTIONS?
THANKS FOR YOUR ATTENTION
@jaimebuelta
wrongsideofmemphis.wordpress.com
Editor's notes
The first point is why we should complicate the code by trying to do more than one thing at the same time. The answer is to make better use of resources, so we need spare resources to start with. Three kinds of applications typically need these techniques: dividing a big task among several computers or cores, doing the same thing for different actors, and doing different things for the same actor.
The typical example is number crunching: rendering a movie, data mining, etc. It can usually be treated like the next case (doing the same thing more than once) with an algorithm that works on parts of the data.
Do the same thing for different actors: a web server, different tabs in a web browser.
Do different things for the same actor (some coordination is usually needed, more on that later): a game (the AI of each enemy, image rendering, sound).
Debugging can be very tricky; you have been warned. If coordination is needed, that can be tricky too. If only one thing is done (several times), the complexity is greatly reduced thanks to help from the operating system.
There are several ways to execute code concurrently; we'll discuss threads, processes and asynchronous programming.
Module thread (renamed _thread in Python 3): low level; in general it is better to avoid it. Module threading: higher level, with more functionality.
A long, long time ago, forking was the way of dealing with multiprocessing. It is only available on Unix machines. You can fork using os.fork() in Python! Fork is not in common use anymore due to its problems (fork bombs, etc.).
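As a rough illustration of the os.fork() call mentioned above, here is a minimal sketch that forks a child and reads one message back through a pipe. It only runs on Unix-like systems; the function name fork_and_greet and the message text are made up for this example.

```python
import os


def fork_and_greet():
    """Fork a child that sends one message to the parent over a pipe."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child process: write the message and exit immediately,
        # without running any of the parent's remaining code.
        os.close(read_fd)
        os.write(write_fd, b"hello from child")
        os.close(write_fd)
        os._exit(0)
    # Parent process: close the unused write end, then read until EOF.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        message = reader.read()
    os.waitpid(pid, 0)  # reap the child so it does not linger as a zombie
    return message.decode()
```

Calling fork_and_greet() in the parent returns the text written by the child.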
Operating systems are very good at multitasking (and have been for some time). Communication can be achieved through ports, files, pipes, external queues, etc. Be radical, use more than one program.
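One possible way to "use more than one program" is to start a second interpreter and talk to it through pipes. This is only a sketch: the child program (CHILD_CODE) and the helper shout_in_child are hypothetical names invented here, and the child simply upper-cases whatever it receives on standard input.

```python
import subprocess
import sys

# Hypothetical child program: reads stdin and writes it back upper-cased.
CHILD_CODE = "import sys; print(sys.stdin.read().upper(), end='')"


def shout_in_child(text):
    """Send text to a separate Python process through a pipe and read the reply."""
    completed = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        input=text,            # written to the child's stdin pipe
        capture_output=True,   # stdout and stderr come back through pipes
        text=True,
        check=True,            # raise if the child exits with an error
    )
    return completed.stdout
```

The two programs share no memory; everything they exchange goes through the pipes, which keeps the coupling between them small.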
Threads share all the memory, so they can access anything. That is good (no communication overhead), but it can lead to abuse and instability.
Asynchronous programming is the new cool technology, and it has seen some recent uses, such as Node.js.
In a thread model, you have a supervisor (the OS) that will stop one thread and execute another, transparently for the program.
In asynchronous programming, every task waits for the others to end their dance before getting on the dance floor.
Or tasks voluntarily release control (sleep, yield, etc.). The typical way of calling the next block is to add callbacks to the code (when this result is available, use it as input for this code).
Callbacks make the code hard to follow and debug.
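To make the callback style concrete, here is a small sketch built on the standard library's asyncio event loop. The functions on_data and on_parsed, the "payload" string and the 0.01 second delay are invented for illustration; real input would come from a socket or a file rather than a timer.

```python
import asyncio


def run_callback_chain():
    """Run two chained callbacks on an event loop and return what they produced."""
    loop = asyncio.new_event_loop()
    results = []

    def on_parsed(parsed):
        # Last step of the chain: store the result and stop the loop.
        results.append(parsed)
        loop.stop()

    def on_data(data):
        # When the data is available, use it as input for the next block.
        loop.call_soon(on_parsed, data.upper())

    # Pretend the data arrives 10 ms from now.
    loop.call_later(0.01, on_data, "payload")
    loop.run_forever()
    loop.close()
    return results
```

Even with only two steps, the flow has to be read bottom-up, from the scheduling call to on_data to on_parsed, which hints at why longer chains become hard to follow.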
No threads! All the fetch functions start and there is no need for callbacks: the urlopen call will yield and resume at the same point.
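The slide's own code is not part of this transcript, so the following is a stand-in using asyncio from the standard library rather than whatever library the talk used. The coroutine fetch and its page names are hypothetical, and asyncio.sleep simulates the time a real urlopen call would spend waiting on the network.

```python
import asyncio


async def fetch(name, delay):
    """Stand-in for a network fetch; the sleep simulates waiting on I/O."""
    await asyncio.sleep(delay)  # control goes back to the event loop here
    return f"{name} done"       # execution resumes at this point afterwards


async def fetch_all():
    # All three fetches start; each one yields while it waits.
    return await asyncio.gather(
        fetch("page-1", 0.03),
        fetch("page-2", 0.01),
        fetch("page-3", 0.02),
    )


def run_fetches():
    return asyncio.run(fetch_all())
```

Everything runs in a single thread, and each fetch reads top to bottom like sequential code. gather returns the results in the order the fetches were passed in, regardless of which one finished first.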
The typical example is a web app, which normally spends most of its time getting data from a DB and only a small amount of time composing the response. You can keep a huge number of connections open, as in chat servers, etc.
Number crunching, 3D calculations, etc. In a threaded model, the rest of the threads will still be executed from time to time.
Use of Twisted and Tornado, and some Node.js.
This will only happen on multicore processors. The blocking adds some overhead. It also makes UNIX signals behave strangely (for example, CTRL + C).
The OS could (and probably does, depending on the load) schedule a thread to run on each core.
The total number of iterations is constant while the number of threads varies. Increasing the number of threads magnifies the problem because it adds more overhead: adding threads ADDS time instead of reducing it.
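A measurement along these lines can be sketched as follows. The helper names spin and timed_run and the default iteration count are assumptions made for this example, not the benchmark from the talk, and the timings depend heavily on the interpreter version and the machine.

```python
import threading
import time


def spin(iterations):
    """CPU-bound busy loop."""
    for _ in range(iterations):
        pass


def timed_run(num_threads, total_iterations=200_000):
    """Split a fixed amount of work across num_threads and time it."""
    per_thread = total_iterations // num_threads
    threads = [
        threading.Thread(target=spin, args=(per_thread,))
        for _ in range(num_threads)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start  # elapsed seconds
```

Comparing timed_run(1) with timed_run(10) on a multicore machine shows whether extra threads help or hurt for a given interpreter; a larger total_iterations makes the difference easier to see.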
It is very much a "first world problem". It only causes problems for a subset of programs; in general the effects are limited.
You only need to worry if a CPU-bound, multithreaded program is running on a multicore machine. Even one CPU-bound thread can have an effect due to the overhead, and it could be more efficient to use sequential programming.
Time vs. iterations for 10 threads/processes.
He has a couple more articles on his web page www.dabeaz.com
In general, for concurrent programming.
Probably true for all software development: keep the tasks separated and reduce communication between elements.
That also includes C extensions. You don't need a lock for reads, only for writes that are not a single Python operation.
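An increment such as value += 1 is a read followed by a write, so it is not a single operation and is a candidate for a lock. The sketch below assumes a hypothetical Counter class shared by several threads.

```python
import threading


class Counter:
    """Shared counter whose read-modify-write is protected by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Only one thread at a time may run the read-add-store sequence.
        with self._lock:
            self.value += 1


def count_with_threads(num_threads=4, per_thread=10_000):
    counter = Counter()

    def work():
        for _ in range(per_thread):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value  # read after all writers have finished: no lock needed
```

With the lock in place, the final value equals num_threads * per_thread.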
Python should be enough for most situations. Think carefully about whether you need a lock, mutex or semaphore.
There are different kinds of queues: LIFO, FIFO, priority, etc. The multiprocessing module also has queues. External queues can also be useful (RabbitMQ, AMQP, etc.). A pipe is a queue with one input and one output.
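A FIFO queue from the standard library can connect a producer thread and a consumer thread without any explicit lock. In this sketch the names producer, consumer and run_pipeline, the doubling step and the use of None as an end-of-stream marker are all illustrative choices.

```python
import queue
import threading

SENTINEL = None  # marks the end of the stream (illustrative convention)


def producer(q, items):
    for item in items:
        q.put(item)  # blocks while the queue is full
    q.put(SENTINEL)


def consumer(q, out):
    while True:
        item = q.get()  # blocks until an item is available
        if item is SENTINEL:
            break
        out.append(item * 2)


def run_pipeline(items):
    q = queue.Queue(maxsize=5)  # FIFO; a bounded size keeps the producer in check
    out = []
    threads = [
        threading.Thread(target=producer, args=(q, items)),
        threading.Thread(target=consumer, args=(q, out)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out
```

The queue is the only point of contact between the two threads, which keeps the tasks separated.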
Limit the number of workers. An unlimited number of workers is unsafe and can block the system. A lot of threads can be worse than fewer threads. Do some tests to find the sweet spot.
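One way to cap the number of workers is a thread pool with a fixed size. The sketch below uses concurrent.futures from the standard library; MAX_WORKERS = 4 is an arbitrary starting value to be tuned by testing, and slow_square is a made-up task whose sleep stands in for waiting on I/O.

```python
import time
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 4  # arbitrary starting point; measure to find the sweet spot


def slow_square(n):
    time.sleep(0.01)  # stands in for waiting on I/O
    return n * n


def square_all(numbers, max_workers=MAX_WORKERS):
    # However many numbers are submitted, at most max_workers threads run.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(slow_square, numbers))
```

Running square_all with different max_workers values while timing it is a simple way to do the tests the note recommends.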
Thread coordination is very difficult. Try to create and destroy short-lived threads instead.
This structure is typical of games (and real-time systems). Each interval (1 second), you evaluate the needs and launch the needed threads (or execute sequentially). At the next interval, the main periodic thread can cancel threads that haven't finished their task. Kill unused threads.
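The periodic structure can be sketched roughly as below. Python threads cannot be killed from outside, so this sketch assumes cooperative cancellation: each worker checks a threading.Event and returns when asked. The names worker and run_ticks, the 0.2 second interval and the task durations are all invented for the example.

```python
import threading
import time


def worker(stop_event, duration, results, name):
    """Hypothetical task: works in small steps and checks for cancellation."""
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        if stop_event.is_set():
            return  # cancelled by the main loop, nothing is recorded
        time.sleep(0.005)
    results.append(name)


def run_ticks(ticks=2, interval=0.2):
    """Each interval, launch short-lived threads and cancel the stragglers."""
    results = []
    for tick in range(ticks):
        stop_event = threading.Event()
        threads = [
            threading.Thread(
                target=worker, args=(stop_event, 0.01, results, f"fast-{tick}")),
            threading.Thread(
                target=worker, args=(stop_event, 10.0, results, f"slow-{tick}")),
        ]
        tick_end = time.monotonic() + interval
        for t in threads:
            t.start()
        for t in threads:
            # Wait only for whatever is left of this interval.
            t.join(timeout=max(0.0, tick_end - time.monotonic()))
        stop_event.set()  # ask unfinished threads to stop
        for t in threads:
            t.join()      # they exit quickly once the event is set
    return results
```

Only the tasks that finish within their interval record a result; the slow ones are cancelled at the end of each interval and no thread outlives it.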