3. What is messaging?
● Sending messages
○ Internally in distributed systems
○ Externally between systems
● Communication at the application level
● Messages go from sender/producer to receiver/consumer
○ Asynchronously
○ Time decoupling
8. Cloud provider limitations
● Freedom of choice
○ On-premise or in the cloud
○ Ability to choose which cloud
○ Open standards protocols allow users to choose a client freely
● Migrating from one to the other can be complex
9. EnMasse: Messaging-as-a-Service
● Open source cloud messaging running on Kubernetes and OpenShift
● enmasse.io
● github.com/enmasseproject/enmasse
10. EnMasse: Features
● Multiple communication patterns: request-response, publish-subscribe and competing consumers
● Support for store-and-forward and direct messaging mechanisms
● Scale and elasticity of message brokers
● AMQP 1.0 and MQTT support
● Simple setup, management and monitoring
● Multitenancy: manage multiple independent instances
● Deploy on-premise or in the cloud
11. EnMasse: In progress/TODO
● Authentication and authorization
● Service broker API
● HTTP(S)
● Message grouping
● Distributed transactions
● Message ordering
● Multiple flavors
○ Apache Kafka?
● ...
34. MQTT support
● MQTT gateway
○ Handles connections with remote MQTT clients
○ Bridges the MQTT and AMQP protocols
● MQTT LWT
○ Provides the “last will and testament” feature
○ In charge of recovering and sending the “will” if a client dies
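To make the LWT feature concrete, here is a minimal sketch of an MQTT client registering a will with the gateway, using the Eclipse Paho Java client; the host name, client id and topic are placeholders, not values from the project.

```java
import org.eclipse.paho.client.mqttv3.MqttClient;
import org.eclipse.paho.client.mqttv3.MqttConnectOptions;

public class LwtExample {
    public static void main(String[] args) throws Exception {
        // Connect to the MQTT gateway (host/port are placeholders)
        MqttClient client = new MqttClient("tcp://mqtt-gateway.example.com:1883", "sensor-1");

        MqttConnectOptions options = new MqttConnectOptions();
        // Register the "last will": if this client dies without a clean
        // disconnect, the mqtt-lwt service recovers and sends this message
        options.setWill("clients/sensor-1/status", "offline".getBytes(), 1, true);
        client.connect(options);

        client.publish("clients/sensor-1/status", "online".getBytes(), 1, true);
        client.disconnect();
        client.close();
    }
}
```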
Editor's notes
The purpose of this presentation is twofold
Present an open source messaging platform for the cloud
Share our experiences building this platform and show how it integrates with OpenShift/Kubernetes
Software engineer at Red Hat, working in the messaging team
Team is maintaining the messaging middleware products at Red Hat
Starting off with a quick introduction to messaging for those not familiar, and what protocols/products already exist for messaging in the cloud
Then introducing the EnMasse open source messaging platform
Then moving on to different aspects of the messaging platform
Showing how we solved some of the elasticity problems
How we configure the platform
How we do CI
How we think in terms of the platform interface
My hope is that this will be useful for more than just people interested in messaging, but also those who are building platforms on top of OpenShift or Kubernetes
Really, messaging is so generic it can sometimes be a bit hard to explain what it is
But here is an attempt at least
Messaging is sort of a software-defined network, in the sense that addresses are defined by the application and not the infrastructure
An important part of it is the time decoupling, where you have these intermediaries like a broker
In addition, you typically have different QoS levels, to allow a tradeoff between guarantees and performance
You also have more advanced features like distributed transactions, message grouping, but this should give an idea
Messaging components are often focused around the different patterns of communication you can do, and try to create simple ways to facilitate those
In the messaging world, there are many protocols
AMQP 1.0
MQTT
OpenWire
STOMP
HTTP
...
Standardized APIs
JMS
The important takeaway with standards is not that there has to be one, but that they are open
The developer is free to choose a library, using their preferred programming language
Our main focus in the messaging team today is AMQP and MQTT
AMQP 1.0 is a fairly new standard that can support messaging both with and without brokers
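To make the "open standards" point concrete, here is a minimal sketch of sending a message over AMQP 1.0 through the standard JMS API, using the Apache Qpid JMS client; the connection URL and address are placeholders, and any AMQP 1.0 client in any language could be swapped in.

```java
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.MessageProducer;
import javax.jms.Queue;
import javax.jms.Session;
import org.apache.qpid.jms.JmsConnectionFactory;

public class AmqpSendExample {
    public static void main(String[] args) throws Exception {
        // Standard JMS API, AMQP 1.0 on the wire (URL is a placeholder)
        ConnectionFactory factory =
            new JmsConnectionFactory("amqp://messaging.example.com:5672");
        Connection connection = factory.createConnection();
        connection.start();

        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        Queue queue = session.createQueue("myqueue");

        // Time decoupling: the broker stores this message until a
        // consumer is ready to fetch it
        MessageProducer producer = session.createProducer(queue);
        producer.send(session.createTextMessage("Hello, EnMasse!"));

        connection.close();
    }
}
```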
There are a lot of cloud messaging solutions today
Microsoft is heavy into enterprise messaging, supporting AMQP
AWS has SQS
Google Firebase is popular with Android developers
Although these providers work quite well and might be the solution for many companies, there are some limitations
The software behind these services is not open source
The freedom to choose is lost
Moving from one to the other is not trivial
Effort to build an open source messaging platform on Kubernetes and OpenShift
The goals are:
Scale and performance just like any cloud provider solution
Simple to deploy and manage
Support for both Kubernetes and OpenShift
Support for the messaging patterns described earlier
Store and forward
Basically message brokers, storing the message in a queue
When you don’t want to wait for consumers
Direct
Allows communicating directly with another client (or server for that matter)
When you don’t need the intermediate queue and want to get response/confirmation from the consumer directly
Elastic scaling
Want to be able to scale the backend capacity dynamically (and eventually automatically)
AMQP and MQTT
Open standards already in use by many messaging products
MQTT features like “last will and testament” and “retained messages” over AMQP
A higher level of abstraction for management and monitoring; brokers are an implementation detail
Multitenancy
Multiple address spaces that are managed independently
Using OpenShift and Kubernetes allows it to be deployed anywhere
Here is a list of coming features in the project that we are working on; some of it is already there, such as the service broker API
We want to support authentication of clients using SASL, bridging to a local or shared keycloak instance
Have an implementation of the Open Service Broker API in the address-controller
Allows integration with OpenShift Service Catalog
A standard way of requesting services
We want to support more protocols like HTTP and CoAP, possibly other messaging protocols
There are some guarantees such as message ordering that suffer when we scale up brokers
The focus so far has been on scalability and performance
More important now is features that provide guarantees such as ordering within message groups
A network of AMQP routers (qpid-dispatch)
A set of brokers to provide store-and-forward semantics
Producers and consumers connect to any of the routers
Brokers are hidden from clients, allowing transparent scaling of brokers
Messages are routed according to address configuration
Router network can be scaled from 1 to N
Brokers can be scaled from 1 to N
Broker decouples producer and consumer
Consumers can retrieve their data at any point (temporal decoupling)
Consumers can retrieve their data at their own pace (i.e. could be slower than the producers)
Broker persists messages to disk
The router requires the consumer to be active in order to give “credits” to the producer for sending messages
Router requires consumer to accept/reject message before responding to producer
Does not persist messages
Multiple routers in a network provide path redundancy
Routers and brokers
Clients are load balanced across a messaging service
Brokers connect to router network
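As a sketch of the credit-based flow control described above, here is a hypothetical consumer using the Vert.x Proton client that manually grants credits, so the router only lets producers send while this consumer has capacity; host, port and address are placeholders.

```java
import io.vertx.core.Vertx;
import io.vertx.proton.ProtonClient;
import io.vertx.proton.ProtonConnection;
import io.vertx.proton.ProtonReceiver;

public class CreditExample {
    public static void main(String[] args) {
        Vertx vertx = Vertx.vertx();
        ProtonClient client = ProtonClient.create(vertx);

        // Connect to any router in the network (host/port are placeholders)
        client.connect("messaging.example.com", 5672, result -> {
            ProtonConnection connection = result.result().open();
            ProtonReceiver receiver = connection.createReceiver("myqueue");

            receiver.setPrefetch(0); // disable automatic credit replenishment
            receiver.handler((delivery, message) -> {
                // The router forwards messages only against outstanding credit
                System.out.println("Received: " + message.getBody());
                receiver.flow(1); // grant one more credit
            });
            receiver.open();
            receiver.flow(10); // initial credit: up to 10 messages in flight
        });
    }
}
```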
Kubernetes integration
Messaging - deployment and service for routers
Admin - deployment and service with multiple containers managing configuration of brokers
Address controller - Deployment, service and Route/Ingress with API for deploying address configuration.
Originally part of Admin deployment, but moved out for multitenancy support
Implements a standalone REST/HTTP and AMQP API for deploying configuration
Implements the Open Service Broker API
myqueue and mytopic: deployments (no service) of messaging brokers handling the same address
Created by the address-controller for store-and-forward=true addresses
Subscription: deployment + service for durable subscriptions. Messages with a special address are routed to this service
Creates subscriptions in the topic brokers
mqtt-gateway: deployment + service for MQTT
mqtt-lwt: deployment for the MQTT last will and testament service
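As a sketch of how one of these components might be created programmatically, here is a hypothetical fabric8 Kubernetes client snippet that builds a router deployment; the image name, labels and namespace are placeholders, not the project's actual manifests.

```java
import io.fabric8.kubernetes.api.model.apps.Deployment;
import io.fabric8.kubernetes.api.model.apps.DeploymentBuilder;
import io.fabric8.kubernetes.client.DefaultKubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClient;

public class DeployRouters {
    public static void main(String[] args) {
        try (KubernetesClient client = new DefaultKubernetesClient()) {
            // A minimal router deployment (image/labels are placeholders)
            Deployment routers = new DeploymentBuilder()
                .withNewMetadata()
                    .withName("qdrouterd")
                    .addToLabels("name", "qdrouterd")
                .endMetadata()
                .withNewSpec()
                    .withReplicas(2)
                    .withNewSelector()
                        .addToMatchLabels("name", "qdrouterd")
                    .endSelector()
                    .withNewTemplate()
                        .withNewMetadata()
                            .addToLabels("name", "qdrouterd")
                        .endMetadata()
                        .withNewSpec()
                            .addNewContainer()
                                .withName("router")
                                .withImage("enmasseproject/qdrouterd:latest")
                                .addNewPort().withContainerPort(5672).endPort()
                            .endContainer()
                        .endSpec()
                    .endTemplate()
                .endSpec()
                .build();

            client.apps().deployments().inNamespace("enmasse").createOrReplace(routers);
        }
    }
}
```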
One of the key advantages of running in Kubernetes is that mechanisms for scaling your deployments are already available
However, since we are deploying pods that need to be connected, one needs an additional mechanism for discovering the others
For routers, we want to connect them in a mesh, to minimize the hops required for a message
The router agent is part of the admin deployment, but is exposed on a designated port in the admin service
When a router starts, it creates a connection to the router agent service
Not shown in this picture is the connection between the other routers and the router agent.
Once connected to the agent, the agent instructs the router (using AMQP management) to create a connection to all the other routers it knows about
With multiple agents, the agents discover each other by watching for other agent pods
Arrows indicate client connections to the router network
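The discovery pattern (watching for peer pods and reacting as they come and go) can be sketched with the fabric8 client as below; the namespace and label selector are placeholders, not the project's actual ones.

```java
import io.fabric8.kubernetes.api.model.Pod;
import io.fabric8.kubernetes.client.DefaultKubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClientException;
import io.fabric8.kubernetes.client.Watcher;

public class PeerDiscovery {
    public static void main(String[] args) {
        KubernetesClient client = new DefaultKubernetesClient();
        // Watch for router pods coming and going (label is a placeholder)
        client.pods().inNamespace("enmasse").withLabel("name", "qdrouterd")
            .watch(new Watcher<Pod>() {
                @Override
                public void eventReceived(Action action, Pod pod) {
                    String ip = pod.getStatus().getPodIP();
                    if (action == Action.ADDED && ip != null) {
                        // Here the agent would instruct the routers to
                        // connect to the new peer, forming a full mesh
                        System.out.println("New peer: " + ip);
                    }
                }

                @Override
                public void onClose(KubernetesClientException cause) {
                    System.out.println("Watch closed: " + cause);
                }
            });
    }
}
```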
Brokers require a bit more care than routers, because they require persistence
Today, brokers use deployments and not stateful sets, so they share persistent volumes
This may change in the future, where we might want to support local storage + replication for brokers
Queues and topics are handled a bit differently
In this diagram, there are 2 brokers handling address ‘a’, and 2 brokers handling address ‘b’
Address ‘a’ is a queue, while ‘b’ is a topic
Queues are really simple: the brokers do not have to know about each other, and the routers simply balance messages across the broker cluster
Topics are a harder problem, because subscribers might be spread across multiple brokers, thus requiring a message to go to both brokers
The intuitive approach would be for the router to handle this distribution, but this is not yet supported
The current approach is that we create a link between these two brokers
The topic brokers discover each other by watching pods with labels matching their use
When the broker starts, it simply connects to the router service, and will end up connecting to one of the routers
The router looks at the broker id, which matches that of address ‘a’, so it will create a route for messages to go to that broker
For topics, the same mechanism is used with a slight variation, so I won’t cover that in detail here
For topics, there is an additional container running alongside each broker that discovers the other brokers, and creates a forwarding link for each of the discovered brokers
When scaling brokers down, we use a Kubernetes feature called preStop hooks
Kubernetes runs the preStop hook, then sends SIGTERM to the broker pod, and SIGKILL once the grace period expires
The broker is set up to ignore the SIGTERM so that shutdown stays under the hook's control
The preStop hook kills the router link and sends the messages in the broker's queue back into the router network, which forwards them to one of the other brokers
Once finished, the preStop hook instructs the broker to shut down
What if the hook is not able to move all messages?
The terminationGracePeriod is default 30 seconds, but may be adjusted
Why scale down if you have a lot of messages to move?
Might complement the hook with another approach where the hook spins up a temporary broker pod that handles the migration by reading the data from disk
For topics, need to move durable subscriptions and messages for those subscriptions to one of the other brokers in the cluster
Uses a preStop hook as well, but slightly more advanced
Does broker discovery
Creates queues on one other broker and moves messages to that queue
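A hypothetical sketch of wiring such a hook into the broker container with the fabric8 client; the drain script path is a placeholder for whatever performs the migration described above.

```java
import io.fabric8.kubernetes.api.model.Container;
import io.fabric8.kubernetes.api.model.LifecycleBuilder;

public class BrokerLifecycle {
    // Attach a preStop hook that drains the broker before shutdown
    public static void configure(Container broker) {
        broker.setLifecycle(new LifecycleBuilder()
            .withNewPreStop()
                .withNewExec()
                    // Placeholder script: closes the router link, pushes
                    // queued messages back into the router network, then
                    // tells the broker to shut down
                    .withCommand("/opt/broker/bin/drain.sh")
                .endExec()
            .endPreStop()
            .build());
    }
}
```

The pod's terminationGracePeriodSeconds would be raised from the 30-second default so the drain has time to finish.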
Configuration management in OpenShift is quite simple, but there are two parts to this
The user interface for configuring the platform
The mechanism for distributing the configuration
Address configuration is stored in OpenShift as config maps, one per address
The address controller acts as the API and writes these config maps
Configserv consumes these maps and provides an AMQP interface for configuration data
Consumed by router agent, queue scheduler, and subscription service
The address controller supports HTTP/REST, AMQP 1.0 and the Service Broker API for deploying address configuration
In the HTTP and AMQP APIs, address configuration is modeled as Kubernetes resources
Allows for extending as a ThirdPartyResource once the API is stable
An address can currently have 4 different semantics
Queue - standard queue
Topic - pub sub
Direct anycast - no queue involved, a message is delivered to a single consumer
Direct multicast - no queue involved, a message is delivered to all consumers
The configuration is currently transformed into an internal config map ( + deployment for queues and topics)
Flavors are a mechanism for queues and topics to select a broker configuration
The available flavors in the system are controlled by the messaging operator
Stored as config maps, and read-only in the address controller
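A hypothetical sketch of creating a queue address through the address controller's HTTP API; the endpoint path, payload shape and flavor name are assumptions for illustration, not the project's exact API.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class CreateAddress {
    public static void main(String[] args) throws Exception {
        // Address definition (shape is illustrative, not the exact schema)
        String address = "{ \"metadata\": { \"name\": \"myqueue\" },"
            + " \"spec\": { \"type\": \"queue\", \"flavor\": \"vanilla-queue\" } }";

        HttpRequest request = HttpRequest.newBuilder()
            // Endpoint is a placeholder for the address controller route
            .uri(URI.create("https://address-controller.example.com/v1/addresses"))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(address))
            .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + ": " + response.body());
    }
}
```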
When creating a platform today, you need to think CI from the start
How will components deal with upgrades
How do you build and test your components and do integration testing
We split most of the components across multiple github repositories
Each of them runs a build, tests, builds a Docker image, and pushes that image to Docker Hub
Runs on Travis
Systemtests is a repository with Java tests that run against EnMasse deployed on OpenShift
The OpenShift instance is created locally within the Travis job for each test run
This approach is simple, though Travis build time limits may soon become a problem
We want to set up a Jenkins CI as well
When you create a platform, you typically end up with another role in the system
In EnMasse, there is
Application
Messaging tenant
Messaging operator
Infrastructure (Kubernetes or OpenShift) operator
Going to quickly show a few pictures, and then demo the interface
The standard console in OpenShift and Kubernetes is very low level
All deployments and pods are exposed
This is good for the messaging operators running the service
Not appropriate for messaging tenants
A platform specific UI that hides details from platform users
Focus on concepts in the domain of messaging
Senders
Receivers
Connections
Queue depths
As with the console, we need to provide application-level metrics, not just CPU/disk per pod
OpenShift has an affinity toward Grafana
Messaging components expose Prometheus or Jolokia metrics
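As an illustrative sketch (not the project's actual instrumentation), exposing a messaging-level metric with the Prometheus Java client might look like this; the metric name, label and port are placeholders.

```java
import io.prometheus.client.Gauge;
import io.prometheus.client.exporter.HTTPServer;

public class MetricsExample {
    public static void main(String[] args) throws Exception {
        // Application-level metric: queue depth per address
        // (name and labels are placeholders)
        Gauge queueDepth = Gauge.build()
            .name("enmasse_queue_depth")
            .help("Number of messages stored for an address")
            .labelNames("address")
            .register();

        queueDepth.labels("myqueue").set(42);

        // Serve /metrics for Prometheus to scrape (port is a placeholder)
        new HTTPServer(8080);
    }
}
```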