Boost PC performance: How more available memory can improve productivity
On Rabbits and Elephants
1. On Rabbits and Elephants
Data Distribution with pg_amqp and RabbitMQ
Gavin M. Roy @crad - pgConf.eu 2011
2. myYearbook.com
• #1 Comscore Teen Destination, Top Trafficked Sites in
US who is hiring
• Use of PostgreSQL predates my joining as CTO in
2007 and we’re hiring
• Use of RabbitMQ goes back to 2009, we were hiring
then too
• Awesome place to work, have been known to sponsor
Europeans for work in the US and we’re hiring
Gavin M. Roy @crad - pgConf.eu 2011
3. Don't try this at home.
(try it at work or at least on your laptop)
Gavin M. Roy @crad - pgConf.eu 2011
4. Audience Participation
• You’ll need PostgreSQL 9.1rc1 with pgbench
installed
• Python 2.6 or Python 2.7
• Tarball from http://192.168.103.135/~gmr/
• Untar and then ./setup-client.py
Gavin M. Roy @crad - pgConf.eu 2011
5. Use Cases
• Adding complexity to your environment
• Ensuring data inconsistency between
canonical data sources and data warehouses
• Impressing your hipster Ruby programmer
friends
• Replace your Slony replication setup
Gavin M. Roy @crad - pgConf.eu 2011
6. Use Cases
• Write distributed map-reduce functions in
Erlang
Gavin M. Roy @crad - pgConf.eu 2011
8. What is RabbitMQ?
(don’t let the cloud based marketing deter you)
Gavin M. Roy @crad - pgConf.eu 2011
9. About AMPQ
• Platform, Vendor Neutral Messaging format
• 0.9.1 finalized in November 2008
• JP Morgan Chase, Red Hat, Rabbit Technologies,
iMatrix, IONA Technologies, Cisco Systems, Envoy
Technologies, Twist Process Innovations
• Members now include Bank of America, Barclays
Banks, Microsoft Corporation, Novell, VMWare
• 1.0 Final: October 2011 “Changes some stuff”
Gavin M. Roy @crad - pgConf.eu 2011
10. RabbitMQ & AMQP
Concepts
Going to only scratch the surface
Gavin M. Roy @crad - pgConf.eu 2011
17. Queues
• Durability: Queue survives reboot
• Auto-Delete: Delete the queue when consumer
goes away
• Exclusive: Only one consumer may consume
• x-expires: Auto-Delete after given duration
• x-message-ttl: Auto-discard a message after ttl
Gavin M. Roy @crad - pgConf.eu 2011
20. Direct exchanges
• 1:1 Message Delivery Pattern
• Routing Keys are direct, no wildcarding
• All queues bound to a routing key get the
message
Gavin M. Roy @crad - pgConf.eu 2011
21. Topic exchanges
• Pattern Matching in Routing Keys
• Publish one message type to
public.pgbench_accounts another
public.pgbench_history
• Period delimiter, * stays within the period scope,
# does not
• Example patterns: *.pgbench_accounts, public.*
Gavin M. Roy @crad - pgConf.eu 2011
22. Other Exchange Types
• Built-In: Fanout, Headers
• 3rd Party or Plugin Types:
• Consistent Hash Exchange: Round-robin distribution to queues
• External: RPC Exchange Plugin
• Riak
• Last-Value Cache
• Recent History Exchange: Keeps last 20 messages routed
• Global Fanout: Sends to all queues, regardless of routing key
Gavin M. Roy @crad - pgConf.eu 2011
23. One More thing...
Exchange-to-Exchange Bindings
Gavin M. Roy @crad - pgConf.eu 2011
25. Step-By-Step Instructions
1. Install your Erlang on the machine you into intend to run your RabbitMQ broker on. My preference is for Erlang R14b2, which almost makes me think they name Erlang releases after Star Wars characters, but they
don't. Erlang is not as fun as Debian is. It doesn't need to be a beefy machine but it should have enough ram that if you don't have an active consumer it won't run out of memory.
2. Install RabbitMQ. This is a multistep process involving black magic. Depending on your distribution you can either download a package, or if your *nix impaired there is a windows installer too. Oh and don't forget
to install the management plugin. You can download the management plugin from the RqbbitMQ site pretty easy. After you have downloaded it you put it in the RabbitMQ plugins directory. Kind of makes sense
doesn't it?
3. Go to the store and buy RabbitMQ in Action and RabbitMQ in Detail. What do you mean you can't? No pre-order either? Well there are plenty of good blogs on the intarwebs that have good information such as
*cough* mine. Seriously though if you are crazy enough to follow my advice and get this far out of actual interest in the subject, make sure you verse yourself in the operation of and the flexibility that is RabbitMQ.
4. Install one of the following: pl/Python, pl/Perl, pl/Ruby, pl/PHP, pl/PGSQL and pg_amqp, or write your own damn addon in c using librabbitmq and then distribute it using the very cool pgxn service. Buoy only need
this on the databases your going to be writing triggers on. Are you still reading this? Seriously? I would have thought I had lost you by now. Either that or you are skipping ahead because I should have changed
the slide by now. It could be that Selena is operating the slides, in which case I have no control over these slides which is a terrifying thought. You see there is very little content on the slides, a real lack of
substance. If I stay on any one slide for too long then you'll realize I really don't know what I am talking about and then I'd not be asked back to give one of the most important talks of the conference.
5. It's almost time to write your publisher, but don't get too far ahead of yourself. You first need to make sure that you have figured out a routing paradigm to use. Don't you like that word? Paradigm. I hear it in my
head as para dig'um. Anyway, back to routing paradigms. Depending on your use case you will have the ability to broadcast your events to a whole slew of databases that care about what you have to send, or
just one. You can implement a 1-to-1 type of scenario where a "direct" exchange is setup and your data will be queued in a way that no message is duplicated to your client and there is transactional guarantees in
RabbitMQ. You probably don't want that thou, as you're a fly -by-the-seat-of-your-pants type of person, aren't you? For that you'll want to turn on noAck mode in your consumers. Rabbit will just fling your
messages to your consumers as fast as possible, just like an ape at a zoo and his favorite projectile.
6. This is where things get fun,. We are going to quite. Stored procedure in the language of your choice and have it publish to RabbitMQ using an exchange and a routing key, now at this point if you are still reading
this, I have to question your sanity. This was a one off gag slide to make people at the conference think that I wax going to read all of this. Anyway, wirte your function that publishes and you are almost done. No
seriously. Well not quite done because you still need the part that calls this function and then the consumer app that reads the data from RabbitMQ and then doe the actions in PostgreSQL that you want it to do.
7. Now, basically you're going to add after insert, update, delete triggers to the tables you care about. I'm. Not going to go into detail about how to do that since you are at a PostgreSQL. Conference and should
know how to do that already. Instead I am going to assume you know what I am writing about and say that in these triggers you will want to call the function we wrote in step 6 from these trigger events,
8. Go to http://github.com/gmr/On-Rabbits-and-Elephants/ and download the sample code and use that instead of all of this nonsense.
Gavin M. Roy @crad - pgConf.eu 2011
26. pg_amqp
• Is a publisher only PostgreSQL extension
• Transactional if you use amqp.publish
• Not if you use amqp.autonomous_publish
• Needs some <3 for more advanced features
like delivery mode changing and message
properties.
Gavin M. Roy @crad - pgConf.eu 2011
27. Use a Trigger
(benefit from the value of a trigger?)
Gavin M. Roy @crad - pgConf.eu 2011
28. Not that kind of trigger.
Gavin M. Roy @crad - pgConf.eu 2011
32. Demo
• Using pgbench
• PostgreSQL 9.1rc1
• pg_amqp and plpythonu
• RabbitMQ 2.6.1
• Python 2.7 (Should work in 2.5 or 2.6 too)
Gavin M. Roy @crad - pgConf.eu 2011
33. Generic Trigger
• Publishes to schema.table_name
• Wraps the entire trigger data set into a JSON
payload
Gavin M. Roy @crad - pgConf.eu 2011
34. consumer.py
Dynamically builds SQL based upon event type and
executes on a remote PostgreSQL database
./consumer.py 192.168.103.135
Gavin M. Roy @crad - pgConf.eu 2011