Message architectures are an important part of a distributed system. They are often overlooked because the prevailing sentiment is that the storage and processing engines are the important parts.
3. Personal Vanity
• CTO of SimpleReach
• Co-author of Practical Cassandra
• Skydiver, Mixed Martial Artist, Motorcyclist, Dog dad, NY Giants fan
• IronMatt Foundation for Pediatric Brain Tumors (ironmatt.org)
Message Architectures in Distributed Systems
Eric Lubow
@elubow #ddtx14
6. SimpleReach
• Millions of URLs per day
• Over 3.25 billion page views per month
• 1.4b events per day (~16k events/second)
• Auto-scale 125-160 machines depending on traffic
• Built a predictive measurement algorithm for the social web
7. Why is Messaging Important?
• Most large scale systems discussions only talk about storage
• Direct high volumes of data around your infrastructure
• Control flow of data through your infrastructure
• Decouple important systems
• Scalability, Elasticity, Deliverability, and Redundancy
• Buffering and Asynchronous communication
8. Data Flow
App
❶ incoming request
❷ sync persist data
❸ send response
❹ async queue message
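The four steps on this slide can be sketched in a few lines. This is a minimal illustration, not SimpleReach's implementation: the `App` class, the in-memory `store` dict, and the `outbox` queue are hypothetical stand-ins for the real data store and message queue.

```python
import queue
import threading


class App:
    """Persist synchronously, hand the message off asynchronously, respond."""

    def __init__(self):
        self.store = {}              # hypothetical stand-in for the data store
        self.outbox = queue.Queue()  # hypothetical stand-in for the message queue
        self.delivered = []          # messages the background worker has sent
        threading.Thread(target=self._drain, daemon=True).start()

    def _drain(self):
        # Background worker: delivery happens off the request path.
        while True:
            message = self.outbox.get()
            self.delivered.append(message)
            self.outbox.task_done()

    def handle(self, request_id, payload):
        # (1) incoming request
        self.store[request_id] = payload  # (2) sync persist: blocks until stored
        # (4) async queue message: put() returns at once, delivery is not awaited
        self.outbox.put({"id": request_id, "payload": payload})
        return {"status": "ok", "id": request_id}  # (3) send response
```

The response depends only on the synchronous write; anything downstream of the queue can be slow or briefly unavailable without delaying the caller.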
9. Goals
• Consistent interfaces between systems
• Allow access to many toolsets
• Minimize downtime/Minimize cost of downtime
• High availability
• Clients should have minimal architecture knowledge
• Horizontal Scaling
• Controlled Data Flow Patterns
• Enrichment/In-stream Modification Schemes
• Monitoring and Instrumentation
11. What Did SimpleReach Choose?
12. NSQ
• Distributed and de-centralized topology
• At least once delivery guaranteed
• Multicast style message routing
• Simple to configure and deploy
• Allow for maintenance windows with no downtime
• Ephemeral channels for testing
• Channel sampling
github.com/bitly/nsq
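At-least-once delivery means a consumer may see the same message more than once, so handlers are usually written to tolerate duplicates. A minimal sketch of one way to do that, assuming each message carries a unique ID; `IdempotentHandler` and its unbounded in-memory `seen` set are hypothetical and only for illustration.

```python
class IdempotentHandler:
    """Apply each message at most once, even if it is delivered repeatedly."""

    def __init__(self, apply):
        self.apply = apply  # callback that performs the real side effect
        self.seen = set()   # assumption: IDs fit in memory; real systems bound this

    def handle(self, message_id, body):
        if message_id in self.seen:
            return False    # duplicate delivery: skip the side effect
        self.apply(body)
        self.seen.add(message_id)  # record only after the side effect succeeded
        return True
```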
13. Topics and Channels
• a topic is a distinct stream of messages (a single nsqd instance can have multiple topics)
• a channel is an independent queue for a topic (a topic can have multiple channels)
• consumers discover producers by querying nsqlookupd (a discovery service for topics)
• topics and channels are created at runtime (just start publishing/subscribing)
[Diagram: nsqd, topic "event", channels "metrics", "enrichment", and "writer", and consumers A and B on separate hosts]
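The topic and channel semantics above can be modeled in memory. This is a simplified sketch of the routing idea only, not NSQ's implementation; the `Topic` class and `receive` helper are hypothetical names.

```python
from collections import defaultdict, deque


class Topic:
    """A stream of messages; every channel receives its own copy."""

    def __init__(self, name):
        self.name = name
        self.channels = defaultdict(deque)  # channel name -> independent queue

    def subscribe(self, channel):
        # Channels are created at runtime, on first use.
        return self.channels[channel]

    def publish(self, message):
        # Multicast: copy the message onto every channel of this topic.
        for pending in self.channels.values():
            pending.append(message)


def receive(topic, channel):
    """Consumers on the same channel compete: each message goes to one of them."""
    pending = topic.channels[channel]
    return pending.popleft() if pending else None
```

With channels "metrics" and "writer" on one topic, each channel sees every message, while two consumers sharing "metrics" split that channel's messages between them.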
14. Everyone Speaks The Same Language
http:// + {“content-type”: “application/json”}
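As an illustration of the HTTP plus JSON convention, the sketch below builds (but does not send) a publish request for nsqd's HTTP `/pub` endpoint. The address `127.0.0.1:4151`, the topic name, and the event fields are assumptions for the example; nsqd treats the body as opaque bytes, so the JSON encoding is a convention between producers and consumers.

```python
import json
import urllib.parse
import urllib.request


def build_publish_request(nsqd_http_addr, topic, event):
    """Return a POST request that publishes `event` as JSON to `topic`."""
    query = urllib.parse.urlencode({"topic": topic})
    url = f"http://{nsqd_http_addr}/pub?{query}"
    body = json.dumps(event, sort_keys=True).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical event; pass the request to urllib.request.urlopen() to send it.
request = build_publish_request(
    "127.0.0.1:4151", "event", {"type": "pageview", "url": "http://example.com/a"}
)
```

Because any language with an HTTP client and a JSON library can do the same, producers do not need an NSQ-specific client.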
16. NSQ Tools
• nsqadmin provides a web interface to administrate and introspect an NSQ cluster at runtime (and empty, pause, or delete topics/channels)
• nsq_to_http - utility that helps transport an aggregate stream over HTTP
• nsq_to_file - utility that safely persists an aggregated stream to disk
• nsq_stat - iostat like utility for a topic/channel
• nsq_tail - tail like utility for a topic/channel
17. Right Tool For The Job
18. Goals
• Consistent interfaces between systems
• Allow access to many toolsets
19. How Does It Work?
[Diagram: each API host PUBLISHes to its own nsqd; each nsqd REGISTERs with the nsqlookupd instances; consumers DISCOVER producers through nsqlookupd and SUBSCRIBE to the nsqd instances]
20. The Schrute of the Problem
21. Goals
• Consistent interfaces between systems
• Allow access to many toolsets
• Minimize downtime/Minimize cost of downtime
• High availability
22. Simple Deployment & Automation
• Chef cookbook - github.com/simplereach/chef-nsq
• Written in Go
• Easily distributable binaries
• Deploy lookup nodes
• nsqd instances installed locally
23. Goals
• Consistent interfaces between systems
• Allow access to many toolsets
• Minimize downtime/Minimize cost of downtime
• High availability
• Clients should have minimal architecture knowledge
24. Runtime Discovery
[Diagram: a consumer sends HTTP requests to the nsqlookupd instances]
➊ regularly poll for topic producers
➋ connect to all producers
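The two discovery steps can be sketched as a pure function over nsqlookupd `/lookup` responses. The response shape used here (a `producers` list with `broadcast_address` and `tcp_port`, optionally wrapped in a `data` object as older releases do) is an assumption to check against the NSQ version in use; the function names are hypothetical.

```python
import json


def producer_addresses(lookup_body):
    """Extract "host:port" TCP addresses from one /lookup response body."""
    doc = json.loads(lookup_body)
    data = doc.get("data", doc)  # assumption: older responses wrap the payload
    addresses = set()
    for producer in data.get("producers", []):
        host = producer.get("broadcast_address") or producer.get("hostname")
        addresses.add(f"{host}:{producer['tcp_port']}")
    return addresses


def merge_lookups(bodies):
    """Union the producers reported by every nsqlookupd that was polled."""
    merged = set()
    for body in bodies:
        merged |= producer_addresses(body)
    return sorted(merged)
```

A consumer would call `merge_lookups` on each polling interval and open connections to any address it is not already connected to, so it never needs a hard-coded list of nsqd hosts.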
25. Goals
• Consistent interfaces between systems
• Allow access to many toolsets
• Minimize downtime/Minimize cost of downtime
• High availability
• Clients should have minimal architecture knowledge
• Horizontal Scaling
26. Path of a Packet
[Diagram labels: Internet, Fire Hose, API, Internal API, SC, EC, Queue, Consumers; data stores: Solr, C*, Mongo, Redis, Vertica]
28. Controlled Data Flow
[Diagram: the Social Event Collector sends Social Data into NSQ, which broadcasts it. Labeled stages: Batch & Write (Raw Data), Calculate Score, a second NSQ, Batch & Write (Processed Data), and Write]
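The "Batch & Write" stage can be sketched as a consumer that buffers messages and writes them in groups. This is a minimal illustration under assumed names (`BatchWriter`, `max_batch`); a production consumer would also flush on a timer and acknowledge messages only after the write succeeds.

```python
class BatchWriter:
    """Buffer messages and hand them to `write` in batches."""

    def __init__(self, write, max_batch=100):
        self.write = write          # callback that persists one batch
        self.max_batch = max_batch  # assumption: size-based flushing only
        self.pending = []

    def add(self, message):
        self.pending.append(message)
        if len(self.pending) >= self.max_batch:
            self.flush()

    def flush(self):
        if self.pending:
            self.write(list(self.pending))  # pass a copy so clearing is safe
            self.pending.clear()
```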
29. Broadcast Importance for Polyglottany
[Diagram: NSQ broadcasts to Mongo Writer, Redis Writer, Cassandra Writer, Solr Writer, and Vertica Writer; also labeled: Writer, Aggregator]
31. Controlled Data Flow
[Diagram: the Social Event Collector sends Social Data into NSQ, which broadcasts it. Labeled stages: Batch & Write (Raw Data), Calculate Score, a second NSQ, Batch & Write (Processed Data), and Write]
32. Goals
• Consistent interfaces between systems
• Allow access to many toolsets
• Minimize downtime/Minimize cost of downtime
• High availability
• Clients should have minimal architecture knowledge
• Horizontal Scaling
• Controlled Data Flow
33. What Is Enrichment?
A mechanism to add value to a message to enhance processing in your system
34. How Do We Enrich
[Diagram: Raw Event → NSQ (Broadcast) → Consumer A, Consumer B, Consumer C; Enriched Event]
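An enrichment step can be sketched as a function that takes a raw event, adds derived fields, and returns a new message for republishing. The event fields (`url`, `domain`, `enriched`) are hypothetical; the talk does not specify what SimpleReach adds.

```python
import json
from urllib.parse import urlparse


def enrich(raw_body):
    """Return an enriched copy of a raw JSON event, ready to republish."""
    event = json.loads(raw_body)
    enriched = dict(event)  # copy, so the raw event is left untouched
    enriched["domain"] = urlparse(event["url"]).netloc  # hypothetical derived field
    enriched["enriched"] = True  # lets downstream consumers tell the two apart
    return json.dumps(enriched, sort_keys=True).encode("utf-8")
```

Doing this once in the stream means every downstream consumer receives the derived fields instead of recomputing them.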
35. Goals
• Consistent interfaces between systems
• Allow access to many toolsets
• Minimize downtime/Minimize cost of downtime
• High availability
• Clients should have minimal architecture knowledge
• Horizontal Scaling
• Controlled Data Flow
• Enrichment
36. Monitoring / Instrumentation
• Comes with statsd support built-in
• Statsd talks to both Graphite and nsqadmin
• Nsqadmin comes with graphs for message processing stats
• Nagios plugins available for monitoring topic/channel depth
• Average end to end latency calculations are done on a per-channel basis
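A Nagios-style channel depth check can be sketched as below. It is not one of the plugins the slide refers to; it assumes the JSON layout of nsqd's `/stats?format=json` output (`topics`, `topic_name`, `channels`, `channel_name`, `depth`, optionally wrapped in `data`), and the thresholds are placeholders.

```python
import json

# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def channel_depth(stats_body, topic, channel):
    """Return the depth of topic/channel, or None if it is not reported."""
    doc = json.loads(stats_body)
    data = doc.get("data", doc)  # assumption: older responses wrap the payload
    for t in data.get("topics", []):
        if t.get("topic_name") != topic:
            continue
        for c in t.get("channels", []):
            if c.get("channel_name") == channel:
                return c.get("depth", 0)
    return None


def check_depth(depth, warn, crit):
    """Map a channel depth onto a Nagios status code."""
    if depth is None:
        return UNKNOWN
    if depth >= crit:
        return CRITICAL
    if depth >= warn:
        return WARNING
    return OK
```

A growing depth on one channel indicates that its consumers are falling behind, which is usually the first signal worth alerting on.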
37. Goals
• Consistent interfaces between systems
• Allow access to many toolsets
• Minimize downtime/Minimize cost of downtime
• High availability
• Clients should have minimal architecture knowledge
• Horizontal Scaling
• Controlled Data Flow
• Enrichment
• Monitoring and Instrumentation
38. Summary
• Large Systems are more than just storage
• Abstraction
• Highly Available
• Controlled Data Flow Patterns
• Monitoring & Automation
40. Questions are guaranteed in life.
Answers aren’t.
Eric Lubow
@elubow
elubow@simplereach.com
#ddtx14
Thank you.