2. Preamble
• I work for a global telco
• We push telco events and messages around
• All should be processed quickly and reliably
• Some generate a response
3. Scenario
• We send messages around in small text
files, sometimes with RFC 2822 headers
• Use the UNIX sendf/recfiled mechanism
10. pymq
• very hazy on the details
• depends on MySQL? and Django?!?
• ignored while looking at kombu / rabbitmq
11. ZeroMQ aka 0MQ
"ZeroMQ is a message orientated IPC Library."
- commenter on stackoverflow
12. AMQP
• Is a standard for message queueing
• Producers submit messages to brokers
• Brokers consist of exchanges and queues
• Exchanges route the messages
• Queues store messages
• Consumers pull messages out
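The producer/exchange/queue/consumer flow above can be sketched as a toy model in plain Python. This `Broker` class and its names are purely illustrative - it is not any real library's API - but it shows how exchanges route on bindings while queues store:

```python
from collections import defaultdict, deque

class Broker:
    """A toy AMQP-style broker: exchanges route, queues store."""
    def __init__(self):
        self.queues = defaultdict(deque)    # queue name -> stored messages
        self.bindings = defaultdict(list)   # (exchange, routing key) -> queue names

    def bind(self, exchange, routing_key, queue):
        self.bindings[(exchange, routing_key)].append(queue)

    def publish(self, exchange, routing_key, body):
        # The exchange consults its bindings and copies the message
        # into every queue bound with a matching routing key.
        for queue in self.bindings[(exchange, routing_key)]:
            self.queues[queue].append(body)

    def consume(self, queue):
        # A consumer pulls the oldest message off the queue.
        return self.queues[queue].popleft()

broker = Broker()
broker.bind('events', 'audit', 'log-forever')
broker.bind('events', 'audit', 'alert-the-big-dude')  # same key, second queue
broker.publish('events', 'audit', 'disk full')
```

Note that because two bindings share the routing key 'audit', the one publish lands a copy of the message in both queues.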
46. Issues
• pika's BlockingConnection (my use case is
simple enough) hard-codes the socket
timeout and fails to cope with latency >1s
• Fails to cope at all with packet loss
47. Deleting a Queue
import pika
connection = pika.BlockingConnection(pika.ConnectionParameters('server'))
channel = connection.channel()
channel.queue_delete(queue='hello')
channel.close()
connection.close()
49. Publisher
from kombu import BrokerConnection
with BrokerConnection('amqp://localhost') as conn:
    with conn.SimpleQueue('hello') as queue:
        queue.put('Hello World!')
        print " [x] Sent 'Hello World!'"
51. Publisher
from kombu import BrokerConnection
with BrokerConnection('amqp://localhost//') as conn:
    with conn.SimpleQueue('hello', queue_opts=dict(durable=False)) as queue:
        queue.put('Hello World!')
        print " [x] Sent 'Hello World!'"
59. stompclient
from kombu import BrokerConnection
with BrokerConnection('amqp://localhost') as conn:
    with conn.SimpleQueue('hello') as queue:
        queue.put('Hello World!')
        print " [x] Sent 'Hello World!'"
from stompclient import PublishClient
client = PublishClient('localhost', 61613)
client.connect()
client.send('/queue/hello', 'Hello, world!')
client.disconnect()
60. stompclient
from stompclient import PublishClient
client = PublishClient('localhost', 61613)
client.connect()
client.send('/queue/hello', 'Hello, world!')
client.disconnect()
from stompclient import PublishClient
with PublishClient('localhost', 61613) as client:
    client.send('/queue/hello', 'Hello, world!')
61. Celery
"a synchronous or asynchronous task queue/job
queue based on distributed message passing"
62. TcpCatcher
• TCP, SOCKS, HTTP(S) proxy & monitor
• Can introduce latency and transmission
errors
• Understands HTTP and images
• Can debug/interfere/log SSL traffic
• Free: www.tcpcatcher.fr
Editor's notes
Network latency currently means processing slows down.
One "facility" is the zipping up of multiple events into a batch and sending the zip across the WAN in one go. This obviously complicates things further and introduces additional latency onto the message transmission.
... and I never actually came back to evaluate it
Is message queueing without a broker. Kind of like a glorified socket. Messages are routed in common MQ patterns right down at the network level. If you want store and forward, you're on your own for the persistence part.
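The "glorified socket" style looks something like this minimal sketch (requires pyzmq; the inproc transport keeps it self-contained in a single process, and the socket types and endpoint name are just one common pattern):

```python
import zmq

# PUSH/PULL is one of the MQ patterns 0MQ bakes in at the socket
# level - note there is no broker process involved anywhere.
ctx = zmq.Context.instance()
push = ctx.socket(zmq.PUSH)
push.bind('inproc://events')      # inproc: in-process transport, no network
pull = ctx.socket(zmq.PULL)
pull.connect('inproc://events')   # for inproc, bind must come before connect

push.send_string('Hello World!')
message = pull.recv_string()

push.close()
pull.close()
```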
AMQP in a nutshell. I might add that amqp.org was no help whatsoever in figuring this out.
There's a bunch of AMQP implementations out there, but in the interests of keeping my own sanity I only looked at the one most popular free implementation, since they all end up implementing the same thing anyway.
Queues are where your messages end up in the broker. They sit there until a client (a.k.a. consumer) connects to the queue and siphons them off. Queues may be configured so messages are discarded if there isn’t a consumer ready to accept them. Multiple consumers may connect to a queue - the messages will be passed to each consumer in turn.
Exchanges are routers with routing tables that sit in front of queues. They're declared by clients, just like queues, except that there's a default "just pass the message to the queue" exchange for the simple case. Every message has what’s known as a “routing key”, which is simply a string. The exchange has a list of bindings (routes) that say, for example, messages with routing key “X” go to queue “spam”.
Messages come into the broker, are routed by exchanges and stored in queues until slurped off by consumers. Within a broker you may have multiple logical systems called virtual hosts. I'm not sure why. There's a default one and it's probably all I'll ever need.
Queues and exchanges are created programmatically by your producers or consumers - not via a configuration file or command-line program - your MQ configuration is in-line with your app code.
An interesting aside for performance - exchanges all run in their own processes, so adding more exchanges is a way to spread load and increase throughput.
“Routing rules” (or bindings) link an exchange to a queue based on a routing key. It is possible for two binding rules to use the same routing key. For example, maybe messages with the routing key “audit” need to go both to the “log-forever” queue and the “alert-the-big-dude” queue. To accomplish this, just create two binding rules (each one linking the exchange to one of the queues) that both trigger on routing key “audit”. In this case, the exchange duplicates the message and sends it to both queues.
There are multiple types of exchange. They all do routing, but they accept different styles of binding “rules”.
A "direct" exchange matches on the exact routing key: a binding for “dogs” matches only messages whose routing key is exactly “dogs”. The default exchange is a direct exchange.
A “topic” exchange tries to match a message’s routing key against a wildcard pattern like “dogs.*”.
A "fanout" exchange ignores the routing key and distributes the messages to all bound queues.
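The three matching styles can be condensed into one hypothetical function (the function itself is illustrative; the '*' one-word and '#' zero-or-more-words topic wildcards follow RabbitMQ's convention):

```python
def matches(exchange_type, binding_key, routing_key):
    """Toy illustration of how each exchange type decides a match."""
    if exchange_type == 'fanout':
        return True                         # routing key ignored entirely
    if exchange_type == 'direct':
        return binding_key == routing_key   # exact string match
    if exchange_type == 'topic':
        return _topic_match(binding_key.split('.'), routing_key.split('.'))
    raise ValueError('unknown exchange type: %r' % exchange_type)

def _topic_match(pattern, words):
    # '*' matches exactly one dot-separated word; '#' matches any run.
    if not pattern:
        return not words
    head, rest = pattern[0], pattern[1:]
    if head == '#':
        return any(_topic_match(rest, words[i:]) for i in range(len(words) + 1))
    return bool(words) and head in ('*', words[0]) and _topic_match(rest, words[1:])
```

So `matches('topic', 'dogs.*', 'dogs.beagle')` is true, but add a third word and only a `dogs.#` binding would still match.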
There are two levels of persistence in play in RabbitMQ: the structure of the broker and the messages in the broker's queues.
You may mark your queues and exchanges as “durable” so the queue or exchange will be re-created automatically on reboot. It does not mean the messages in the queues will survive the reboot. They won’t.
When you publish your message to an exchange, you may set a flag called “Delivery Mode” to the value 2, which means “persistent”. “Delivery Mode” usually (depending on your AMQP library) defaults to a value of 1, which means “non-persistent”.
So the steps for persistent messaging are: durable exchange, durable queue, delivery mode 2.
If you bind a durable queue to a durable exchange, RabbitMQ will automatically preserve the binding. Similarly, if you delete any exchange/queue (durable or not), any bindings that depend on it get deleted automatically.
RabbitMQ will not allow you to bind a non-durable exchange to a durable queue, or vice versa. Both the exchange and the queue must be durable for the binding operation to succeed.
You cannot change the creation flags on a queue or exchange after you’ve created it. For example, if you create a queue as “non-durable” and want to change it to “durable”, the only way to do this is to destroy the queue and re-create it. It’s a good reason to double-check your declarations.
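Those persistence steps look roughly like this in pika. This is a sketch, not runnable standalone - it assumes a RabbitMQ server on localhost, and the exchange/queue names are placeholders (also, older pika versions spell the `exchange_type` keyword as `type`):

```python
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()

# Both sides of the binding must be durable, or the bind will fail.
channel.exchange_declare(exchange='events', exchange_type='direct', durable=True)
channel.queue_declare(queue='hello', durable=True)
channel.queue_bind(exchange='events', queue='hello', routing_key='hello')

# delivery_mode=2 marks this message persistent; mode 1 (the usual
# default) means the message is lost if the broker restarts.
channel.basic_publish(exchange='events', routing_key='hello',
                      body='Hello World!',
                      properties=pika.BasicProperties(delivery_mode=2))
connection.close()
```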
This is my typical setup - I have hosts separated by a network (WAN) out of my control. I need all the events generated on the remote hosts to be processed in a timely, reliable manner by the processing server.
One solution is to write a forwarding consumer on the remote host - pretty simple code but not very elegant.
An alternative setup would have the processing server's consumer pulling from all the remote host queues. This means more configuration work when adding new remote hosts, though.
Then I discovered the Shovel plugin. I've ignored exchanges up until now, but using Shovel allows an exchange on one host to pull messages from a queue and fire them at another queue. It basically runs an Erlang client in the remote broker to forward the messages. In theory. In practice...
Look hard - it says "Fun" in there. This is NOT FUN. This was bloody hard to figure out. The main problem is that once you decide you're going to use Shovel, you're now programming Erlang. See previous statement about fun.
Not only am I now learning Erlang, but the errors you get back for malformed configuration files (Erlang programs) are really unhelpful. This error basically says "there's an error somewhere in the 37 lines of your program."
This is my minimum Shovel configuration, eventually discovered after much trial and error. I shall not share with you the entire volume of bizarre and obscure errors and warnings I waded through to get to this point, nor the number of dead-end alleyways the poor documentation led me down. Unfortunately, when you start up the server with this configuration the log file will FILL with warnings about the queue not existing until a client connects and creates the queue.
This would probably be simpler if I were familiar with Erlang, but I found the documentation to be basically impenetrable. Also, for some reason I couldn't declare the queue without declaring an exchange. Which isn't needed if I don't declare the queue. There was an awful lot of stumbling around in the dark to get this working, but in the end it does.
Default login is guest/guest...
It tries to stay fairly independent of the underlying network support library. It uses amqplib underneath by default, which is what most of the libraries do.
This code connects to our RabbitMQ server using a blocking connection (send and wait for successful delivery at the server before continuing); it declares a durable "hello" queue and publishes a simple message to the queue. The queue and undelivered messages will be persistent across restarts of RabbitMQ. It's a slightly bizarrely wordy API. The exchange argument is required, but we use the "no-op" or "default" exchange here. More on them later.
This code connects to our RabbitMQ server using a blocking connection (listen and wait for messages); it also declares a "hello" queue (just in case no publisher is connected) and consumes messages from the queue. You can see that a bunch of the code is the same - connections, channels and queues.
Here we see the publisher running on my laptop (local) and the consumer running on the server where the RabbitMQ server is also running. You can see that we can publish into the queue without anything consuming the messages, and we can consume published events immediately. We can also listen as a consumer with no publisher publishing. It's all quite disconnected.
By making one small change to our consumer, we can consume remote queues.
Here we set up a second consumer that will consume the remote queue on the server from the local host. Messages published to the queue will be passed to each consumer in turn, round-robin style. When a consumer stops consuming, the queue will transparently feed all messages to the remaining consumer. And of course this extends to having multiple publishers as well.
Can fix the hard-coded socket timeout issue, and a try/except might be able to handle the packet loss.
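A try/except retry loop is one way to paper over transient connection trouble. This hypothetical helper is library-agnostic (the attempt count, delay and exception list are arbitrary; in real use you'd pass your publish call and catch your AMQP library's connection errors), demonstrated here with a deliberately flaky stand-in function:

```python
import time

def with_retries(action, attempts=3, delay=0.01, exceptions=(OSError,)):
    """Call action(); on a listed exception, wait and retry.

    Re-raises the last error once the attempts are exhausted.
    """
    for attempt in range(attempts):
        try:
            return action()
        except exceptions:
            if attempt == attempts - 1:
                raise
            time.sleep(delay)

# A deliberately flaky stand-in for "publish over a lossy link":
calls = {'n': 0}
def flaky_publish():
    calls['n'] += 1
    if calls['n'] < 3:
        raise OSError('simulated packet loss')
    return 'delivered'

result = with_retries(flaky_publish)
```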
If you set up a queue with the wrong parameters you'll need to delete it. For some unknown reason the RabbitMQ control program doesn't provide the ability to delete queues, so you need to do it from a client. This code does that.
Kombu is a messaging framework for Python. It replaces Carrot. The aim of Kombu is to make messaging in Python as easy as possible by providing an idiomatic high-level interface for the AMQ protocol, and also to provide proven and tested solutions to common messaging problems. Its "transports" include AMQP variations and non-AMQP "virtual transports" such as Redis, MongoDB, CouchDB and Beanstalk, or "database transports" such as SQLAlchemy and the Django ORM.
This took me some time to write, as the documentation for Kombu is quite limited. Attempting to set things up using channels led nowhere. I couldn't find clear documentation on how to create a channel. In the end I found the SimpleQueue, which worked after some effort, but I'm still not clear on the details. Then I discovered that the default queue parameters were different (see next slide).
When I was testing early on I wasn't using durable queues. Kombu's SimpleQueue sets durable to True by default, which caused the above bizarro error (which basically says I'm trying to use a queue with different parameters to those it was created with). This error is not specific to Kombu, but it was unexpected and inexplicable when I first encountered it, until I guessed at the durable parameter setting.
Took me a while to figure out how to disable durable.
But again, there's no channel - it's really just API noise for simple code like this.
The puka module implements a client for the AMQP 0-9-1 protocol. It's tuned to work with the RabbitMQ broker, but should work fine with other message brokers that support this protocol. It tries to be a nicer API than pika, which is honestly quite appalling.
Everything in puka works off the Client, which is quite different but pretty convenient. The API is basically the same as pika at the business end. The big difference is in the promises and the ability to wait on the promised action being completed successfully. Puka wants to be asynchronous - the wait() calls effectively force it to be synchronous for simple code like this.
I needed to patch puka version 0.0.5 as mentioned in issue #15 on their tracker to fix a connection-time issue. Unfortunately, even with that fix this was still fragile and inexplicably stopped working at one point. Puka also does away with channels, like kombu's simple case.
STOMP is the Simple (or Streaming) Text Oriented Messaging Protocol.
"STOMP is a very simple and easy to implement protocol, coming from the HTTP school of design; the server side may be hard to implement well, but it is very easy to write a client to get yourself connected. For example you can use Telnet to login to any STOMP broker and interact with it!"
The STOMP website kinda mirrors the AMQP website: there's a specification there but little else. Not so simple. Fortunately there's a bunch of Python STOMP client libraries.
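STOMP frames really are just text: a command line, header lines, a blank line, the body, and a NUL terminator - which is why Telnet works as a client. A hypothetical frame builder makes that concrete:

```python
def stomp_frame(command, headers, body=''):
    """Serialise a STOMP frame: COMMAND, header lines, blank line, body, NUL."""
    lines = [command] + ['%s:%s' % (k, v) for k, v in headers]
    return '\n'.join(lines) + '\n\n' + body + '\x00'

# The same SEND that stompclient performs, as raw protocol text:
frame = stomp_frame('SEND', [('destination', '/queue/hello')], 'Hello, world!')
```

Anything that can write those bytes to port 61613 is a working STOMP publisher.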
Enabling STOMP was easy enough in the RabbitMQ configuration file - the docs were much clearer and the configuration much simpler than Shovel's.
So here's one Python library, stompclient, publishing to my "hello" queue.
So... not a whole lot simpler...
I made a small change to the stompclient library to allow this usage, which I think is pretty darned simple - I believe it'll be in the next release. I'm not completely sold on STOMP yet, though. I'm not sure it improves things over AMQP enough to switch from just using Kombu.
Celery is an RPC engine. It is focused on real-time operation, but supports scheduling as well. The execution units, called tasks, are executed concurrently on one or more worker servers. It's built on kombu and a variety of non-AMQP backends. The RPC nature and backend generality limit the MQ abilities somewhat, though it's nice if you're just after worker management (which I'm not).
I found this worked sometimes, but not other times. RabbitMQ did NOT like me putting this in the middle of the Shovel setup, but the Python libraries I used were happy to use it as a proxy.