Developing real-time data
pipelines with Spring and
Kafka
Marius Bogoevici
Staff Engineer, Pivotal
@mariusbogoevici
Agenda
• The Spring ecosystem today
• Spring Integration and Spring Integration Kafka
• Data integration
• Spring XD
• Spring Cloud Data Flow
Spring Framework
• Since 2002
• Java-based enterprise application development
• “Plumbing” should not be a developer concern
• Platform agnostic
Have you seen Spring lately?
• XML-less operation (since Spring 3.0, 2009)
• Component detection via @ComponentScan
• Declarative stereotypes:
• @Component, @Controller, @Repository
• Dependency injection @Autowired
• Extensive ecosystem
A simple REST controller
@RestController
public class GreetingController {
    private static final String template = "Hello, %s!";
    private final AtomicLong counter = new AtomicLong();

    // The mapping has no path placeholder, so "name" is a request parameter;
    // @RequestParam supports defaultValue, @PathVariable does not.
    @RequestMapping("/greeting")
    public Greeting greeting(@RequestParam(value="name", defaultValue="World") String name) {
        return new Greeting(counter.incrementAndGet(),
                String.format(template, name));
    }
}
Spring ecosystem
Spring Data:
case study
Spring Data
• Spring-based data access model
• Data mapping and repository abstractions
• Retains the characteristics of underlying data store
• Framework generated implementation
• Customized query support
Spring Data Repositories
public interface PersonRepository extends CrudRepository<Person, Long> {
    Person findByFirstName(String firstName);
}
@RestController
public class PersonController {
    @Autowired PersonRepository repository;

    @RequestMapping("/")
    public Iterable<Person> getAll() {
        // CrudRepository.findAll() returns an Iterable, not a List
        return repository.findAll();
    }

    @RequestMapping("/{firstName}")
    public Person readOne(@PathVariable String firstName) {
        return repository.findByFirstName(firstName);
    }
}
Only declare the interfaces
Implementation is generated and injected
Spring Data JPA
Spring Data REST
Spring Boot
• Auto configuration: infrastructure automatically
created based on class path contents
• Smart defaults
• Standalone executable artifacts (“just run”)
• Uberjar + embedded runtime
• Configuration via CLI, environment
Spring Boot Application
@Controller
@EnableAutoConfiguration
public class SampleController {
    @RequestMapping("/")
    @ResponseBody
    String home() {
        return "Hello World!";
    }
    public static void main(String[] args) throws Exception {
        SpringApplication.run(SampleController.class, args);
    }
}
java -jar application.jar
Spring Integration
• Since 2007
• Pipes and Filters: Messages, Channels, Endpoints
• Enterprise Integration Patterns as first-class
constructs
• Large set of adapters
• Java DSL
Spring Integration
• Message: encapsulates data (headers + payload)
• Channel: transports data
• Endpoint: handles data
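The three roles can be sketched in plain Java with no Spring dependency. Everything below (`PipelineSketch`, `Message`, `Channel`, `run`) is a hypothetical stand-in written for illustration, not Spring Integration's actual types; it only mirrors the filter-then-transform shape of a pipes-and-filters flow.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical stand-ins for illustration; not Spring Integration classes.
class PipelineSketch {

    // A message encapsulates data: headers plus a payload.
    static final class Message<T> {
        final Map<String, Object> headers;
        final T payload;

        Message(Map<String, Object> headers, T payload) {
            this.headers = headers;
            this.payload = payload;
        }
    }

    // A channel transports messages to the endpoints subscribed to it.
    static final class Channel<T> {
        private final List<Consumer<Message<T>>> subscribers = new ArrayList<>();

        void subscribe(Consumer<Message<T>> endpoint) {
            subscribers.add(endpoint);
        }

        void send(Message<T> message) {
            for (Consumer<Message<T>> endpoint : subscribers) {
                endpoint.accept(message);
            }
        }
    }

    // Wires two endpoints between an input and an output channel, pushes the
    // given integers through, and returns what reaches the output channel.
    static List<String> run(List<Integer> inputs) {
        Channel<Integer> input = new Channel<>();
        Channel<String> output = new Channel<>();
        List<String> received = new ArrayList<>();

        // Endpoint 1: drop non-positive payloads, convert the rest to strings.
        input.subscribe(m -> {
            if (m.payload > 0) {
                output.send(new Message<>(m.headers, m.payload.toString()));
            }
        });
        // Endpoint 2: collect whatever arrives on the output channel.
        output.subscribe(m -> received.add(m.payload));

        Map<String, Object> headers = Map.of("source", "sketch");
        for (Integer value : inputs) {
            input.send(new Message<>(headers, value));
        }
        return received;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of(0, 1, 2)));
    }
}
```

The endpoints never call each other directly; they only see channels, which is what allows a transport to be swapped in behind a channel later.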
Example: a simple pipeline
[Diagram: integerMessageSource → inputChannel → myFlow (filter, transform) → queueChannel]
Example: a simple pipeline
@Configuration
@EnableIntegration
public class MyConfiguration {
    @Bean
    public MessageSource<?> integerMessageSource() {
        MethodInvokingMessageSource source = new MethodInvokingMessageSource();
        source.setObject(new AtomicInteger());
        source.setMethodName("getAndIncrement");
        return source;
    }
    @Bean
    public DirectChannel inputChannel() {
        return new DirectChannel();
    }
    @Bean
    public IntegrationFlow myFlow() {
        return IntegrationFlows.from(this.integerMessageSource(),
                        c -> c.poller(Pollers.fixedRate(100)))
                .channel(this.inputChannel())
                .filter((Integer p) -> p > 0)
                .transform(Object::toString)
                .channel(MessageChannels.queue())
                .get();
    }
}
Spring Integration:
Cafe Example
Spring Integration
Components
• Enterprise Integration Patterns:
• Filter, Transform, Gateway, Service Activator,
Aggregator, Channel Adapter, Routing Slip
• Adapters:
• JMS, RabbitMQ, Kafka, MongoDB, JDBC,
Splunk, AWS (S3, SQS), Twitter, Email, etc.
Spring Integration Kafka
• Started in 2011
• Goal: adapt Kafka to the Spring Messaging and
Spring Integration abstractions
• Easy access to the unique features of Kafka
• Namespace, Java DSL support
• To migrate to 0.9 once available
• Defaults focused towards performance (disable ID
generation, timestamp)
Spring Integration Kafka:
Channel Adapters
[Diagram: Kafka Inbound Channel Adapter and Kafka Outbound Channel Adapter, each exchanging messages with a message channel]
Spring Integration Kafka
Producer Configuration
• Default producer configuration
• Distinct per-topic producer configurations
• Destination target or partition controlled via
expression evaluation or headers
Spring Integration Kafka
Consumer
• Own client based on Simple Consumer API
• Listen to specific partitions!
• Offset control - when to be written and where (no
Zookeeper);
• Programmer-controlled acknowledgment;
• Concurrent message processing (preserving
per-partition ordering)
• Basic operations via KafkaTemplate
• Kafka specific headers
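One way to get concurrency while preserving per-partition ordering is to give every partition its own single-threaded worker. The sketch below is only that: a plain-Java illustration under the assumption of one executor per partition. `PartitionOrderedDispatcher` and `dispatch` are hypothetical names, and the slides do not say how the adapter implements this internally.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; not the adapter's actual implementation.
class PartitionOrderedDispatcher {

    // Simulates delivery of offsets 0..messagesPerPartition-1 for each partition
    // and returns, per partition, the order in which they were processed.
    static Map<Integer, List<Long>> dispatch(int partitions, int messagesPerPartition) {
        List<ExecutorService> workers = new ArrayList<>();
        Map<Integer, List<Long>> processed = new ConcurrentHashMap<>();
        for (int p = 0; p < partitions; p++) {
            // One thread per partition: partitions run in parallel, while
            // messages from the same partition are handled one at a time.
            workers.add(Executors.newSingleThreadExecutor());
            processed.put(p, Collections.synchronizedList(new ArrayList<>()));
        }
        // Interleave partitions, as a fetch loop might deliver them.
        for (long offset = 0; offset < messagesPerPartition; offset++) {
            for (int p = 0; p < partitions; p++) {
                final int partition = p;
                final long current = offset;
                workers.get(p).execute(() -> processed.get(partition).add(current));
            }
        }
        for (ExecutorService worker : workers) {
            worker.shutdown();
        }
        try {
            for (ExecutorService worker : workers) {
                if (!worker.awaitTermination(5, TimeUnit.SECONDS)) {
                    throw new IllegalStateException("worker did not finish in time");
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return processed;
    }

    public static void main(String[] args) {
        System.out.println(dispatch(3, 5));
    }
}
```

Ordering across partitions is not guaranteed here, which matches Kafka's own guarantee: order holds within a partition only.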
Spring Integration Kafka
Message Listener
• Auto-acknowledging
• With manual acknowledgment
public interface MessageListener {
    void onMessage(KafkaMessage message);
}
public interface AcknowledgingMessageListener {
    void onMessage(KafkaMessage message, Acknowledgment acknowledgment);
}
Spring Integration Kafka:
Offset Management
• Injectable strategy
• Allows customizing the starting offsets
• Implementations: SI MetadataStore-backed (e.g. Redis, Gemfire),
Kafka compacted topic-backed (pre-0.8.2), Kafka 0.8.2 native
• Messages can be auto acknowledged (by the adapter) or manually
acknowledged (by the user)
• Manual acknowledgment useful when messages are processed
asynchronously
• Acknowledgment passed as message header or as argument
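With asynchronous processing, acknowledgments can arrive out of order, so the offset to resume from cannot simply be "the last one acknowledged". The sketch below shows one possible bookkeeping strategy, assumed here for illustration: advance only across a gap-free run of acknowledged offsets. `ContiguousOffsetTracker` is a hypothetical name and is not part of Spring Integration Kafka.

```java
import java.util.TreeSet;

// Hypothetical sketch of one offset bookkeeping strategy; not library code.
class ContiguousOffsetTracker {
    // Offsets acknowledged ahead of the resume point, waiting for the gap to close.
    private final TreeSet<Long> pending = new TreeSet<>();
    // Lowest offset that has not been acknowledged yet.
    private long nextToCommit;

    ContiguousOffsetTracker(long startOffset) {
        this.nextToCommit = startOffset;
    }

    // Called when the message at this offset has been fully processed.
    // Returns the offset from which consumption should resume after a restart.
    synchronized long acknowledge(long offset) {
        if (offset >= nextToCommit) {
            pending.add(offset);
        }
        // Advance only while the next expected offset has been acknowledged.
        while (pending.remove(Long.valueOf(nextToCommit))) {
            nextToCommit++;
        }
        return nextToCommit;
    }

    public static void main(String[] args) {
        ContiguousOffsetTracker tracker = new ContiguousOffsetTracker(0);
        System.out.println(tracker.acknowledge(1)); // prints 0: offset 0 is still outstanding
        System.out.println(tracker.acknowledge(0)); // prints 2: offsets 0 and 1 are both done
    }
}
```

A restart under this scheme may reprocess messages acknowledged beyond a gap, so it trades duplicates for never skipping an unprocessed message.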
Stream processing with
Spring XD
• Higher abstractions are required
• Integrating seamlessly and transparently with the
middleware
• Building on top of Spring Integration and Spring
Batch
• Pre-built modules using the entire power of the
Spring ecosystem
Streams in Spring XD
Sources: HTTP, JMS, Kafka, RabbitMQ, Gemfire, File, SFTP, Mail, JDBC, Twitter, Syslog, TCP, UDP, MQTT, Trigger
Processors: Filter, Transformer, Splitter, Aggregator, HTTP Client, JPMML Evaluator, Shell, Python, Groovy, Java, RxJava, Spark Streaming
Sinks: File, HDFS, HAWQ, Kafka, RabbitMQ, Redis, Splunk, Mongo, JDBC, TCP, Log, Mail, Gemfire, MQTT, Dynamic Router, Counters
Note: Named channels allow for a directed graph of data flow
Spring XD: Stream DSL
Spring XD - Message Bus
abstraction
• Binds module inputs and outputs to a transport
• Performs serialization (Kryo)
• Transports: Local, Rabbit, Redis, and Kafka
Spring XD Architecture
[Diagram: Admin / Flo UI, Shell, and CURL clients; XD Admin; XD Containers running XD Modules; ZooKeeper; Database]
© Copyright 2015 Pivotal. All rights reserved.
Spring XD and Kafka - the
message bus
• Each pipe between modules is a topic;
• Spring XD creates topics automatically;
• Topics are pre-partitioned based on module count
and concurrency;
• Overpartitioning is available as an option;
• Multiple consumer modules ‘divide’ the partition set
of a topic using a deterministic algorithm;
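The slides do not name the algorithm used to divide partitions among consumer modules. As an illustration only, the sketch below assumes a simple modulo rule: instance `i` of `n` takes every partition `p` with `p % n == i`. `PartitionAssignment` and `partitionsFor` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; the actual Spring XD algorithm may differ.
class PartitionAssignment {

    // Returns the partitions owned by one consumer instance. Every instance can
    // compute its own share from (index, count) alone, without coordination.
    static List<Integer> partitionsFor(int instanceIndex, int instanceCount, int partitionCount) {
        if (instanceCount <= 0 || instanceIndex < 0 || instanceIndex >= instanceCount) {
            throw new IllegalArgumentException("invalid instance index or count");
        }
        List<Integer> assigned = new ArrayList<>();
        for (int p = instanceIndex; p < partitionCount; p += instanceCount) {
            assigned.add(p);
        }
        return assigned;
    }

    public static void main(String[] args) {
        // Three consumer instances sharing a topic with eight partitions.
        for (int i = 0; i < 3; i++) {
            System.out.println("instance " + i + " -> " + partitionsFor(i, 3, 8));
        }
    }
}
```

Because the result depends only on the inputs, two instances never claim the same partition, and with more partitions than instances (overpartitioning) each instance simply owns several.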
Partitioning in Spring XD
• Required in distributed stateful processing: related data must be
processed on the same node;
• Partitioning logic configured in Spring XD via deployment manifest
• partitionKeyExpression=payload.sensorId
• When using Kafka as a bus, partition key logic maps directly to Kafka
transport partitioning natively
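The effect of a partition key such as `payload.sensorId` can be sketched as: extract the key, hash it, and reduce it to a partition index. The hash-modulo rule, the `Reading` payload type, and the `PartitionKeySketch` class are assumptions made for this example; the hash actually used by Spring XD or by Kafka may differ.

```java
import java.util.function.Function;

// Hypothetical sketch of key-based partition selection; not Spring XD code.
class PartitionKeySketch {

    // Illustrative payload with the sensorId field used as the partition key.
    static final class Reading {
        final String sensorId;
        final double temperature;

        Reading(String sensorId, double temperature) {
            this.sensorId = sensorId;
            this.temperature = temperature;
        }
    }

    // Maps a payload to a partition: equal keys always give the same partition,
    // so related data ends up on the same consumer node.
    static <T> int partitionFor(T payload, Function<T, ?> keyExtractor, int partitionCount) {
        Object key = keyExtractor.apply(payload);
        // floorMod keeps the result in [0, partitionCount) for negative hashes.
        return Math.floorMod(key.hashCode(), partitionCount);
    }

    public static void main(String[] args) {
        Reading first = new Reading("sensor-1", 20.5);
        Reading second = new Reading("sensor-1", 31.0);
        System.out.println(partitionFor(first, r -> r.sensorId, 4)
                == partitionFor(second, r -> r.sensorId, 4));
    }
}
```

This is why a stateful processor, such as a per-sensor average, can keep its state locally: every reading for a given sensor reaches the same instance.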
Partitioned Streams with
Kafka
[Diagram: stream "http | avg-temperatures" with three HTTP instances, a topic with Partition 0 and Partition 1, and two Average Processor instances]
Performance metrics of
Spring XD 1.2
Spring Cloud Data Flow
(Spring XD 2.0)
Goals
• Scale without undeploying running stream or
batch pipelines
• Avoid hierarchical ‘classloader' issues, inadvertent
spiral of ‘xd/lib’
• Skip network hops within a stream
• Do rolling upgrades and continuous deployments
Spring Cloud Data Flow is
a cloud native programming and operating model
for composable data microservices on a
structured platform
@EnableBinding(Source.class)
public class Greeter {
    @InboundChannelAdapter(Source.OUTPUT)
    public String greet() {
        return "hello world";
    }
}
@EnableBinding(Source.class)
@EnableBinding(Processor.class)
@EnableBinding(Sink.class)
public interface Source {
    String OUTPUT = "output";

    @Output(Source.OUTPUT)
    MessageChannel output();
}
Spring Cloud Data Flow is
a cloud native programming and operating
model for composable data microservices on a
structured platform
continuous delivery
continuous deployment
monitoring
Spring Cloud Data Flow is
a cloud native programming and operating model
for composable data microservices on a
structured platform
Streams
http | transform | jdbc
Jobs
job foo < bar || baz & jaz > bye
[Diagram: job graph with nodes foo, bar, baz, jaz, bye]
Spring Cloud Data Flow is
a cloud native programming and operating model
for composable data microservices on a
structured platform
YARN
?
LATTICE
New Architecture
[Diagram: Admin / Flo UI, Shell, and CURL clients; Admin; target platforms including YARN; Bootified Modules]
XD [ Container ] Orchestration
ZooKeeper
HOST
XD Container
XD Container
XD Modules
Messaging-Driven
Data Microservices
HOST
Spring Cloud Stream Modules
Orchestrate Composable
Data Microservices
HOST
Cloud Foundry YARN X
Spring Cloud Data Flow
Lattice
Spring Cloud Stream Modules
Spring Cloud Stream Binders [Rabbit, Kafka, Redis]
Partitioned stream scaling
with SCDF and Kafka
[Diagram: HTTP instance (INSTANCE_INDEX=0); Kafka Service with topic http.0, partitions (0) through (6), on Broker 0, Broker 1, ... Broker 4; LOG instances with INSTANCE_INDEX=0 through 6]
stream create logger --definition "http | log"
stream deploy logger --properties module.log.partitioned=true,module.log.count=7
Summary
• Scalable pipelines composed of Spring Boot cloud
native applications
• Spring Cloud Stream provides the programming
model
• Transparently mapping to Kafka-native concepts
• Spring Cloud Data Flow provides the orchestration
model
Questions?
https://spring.io
http://projects.spring.io/spring-xd/
http://cloud.spring.io/spring-cloud-dataflow/
http://projects.spring.io/spring-integration/
http://cloud.spring.io/spring-cloud-stream/

 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 

Developing real-time data pipelines with Spring and Kafka

  • 1. Developing real-time data pipelines with Spring and Kafka Marius Bogoevici Staff Engineer, Pivotal @mariusbogoevici
  • 2. Agenda • The Spring ecosystem today • Spring Integration and Spring Integration Kafka • Data integration • Spring XD • Spring Cloud Data Flow
  • 3. Spring Framework • Since 2002 • Java-based enterprise application development • “Plumbing” should not be a developer concern • Platform agnostic
  • 4. Have you seen Spring lately? • XML-less operation (since Spring 3.0, 2009) • Component detection via @ComponentScan • Declarative stereotypes: • @Component, @Controller, @Repository • Dependency injection @Autowired • Extensive ecosystem
  • 5. A simple REST controller
      @RestController
      public class GreetingController {
          private static final String template = "Hello, %s!";
          private final AtomicLong counter = new AtomicLong();

          // name is read from the ?name= query parameter and defaults to "World"
          @RequestMapping("/greeting")
          public Greeting greeting(@RequestParam(value="name", defaultValue="World") String name) {
              return new Greeting(counter.incrementAndGet(), String.format(template, name));
          }
      }
  • 9. Spring Data • Spring-based data access model • Data mapping and repository abstractions • Retains the characteristics of underlying data store • Framework generated implementation • Customized query support
  • 10. Spring Data Repositories
      public interface PersonRepository extends CrudRepository<Person, Long> {
          // Query derived from the method name: matches on the firstName property
          Person findByFirstName(String firstName);
      }

      @RestController
      public class PersonController {
          @Autowired PersonRepository repository;

          @RequestMapping("/")
          public Iterable<Person> getAll() {
              return repository.findAll(); // CrudRepository.findAll() returns an Iterable
          }

          @RequestMapping("/{firstName}")
          public Person readOne(@PathVariable String firstName) {
              return repository.findByFirstName(firstName);
          }
      }
    Only declare the interfaces. Implementation is generated and injected.
  • 12. Spring Boot • Auto configuration: infrastructure automatically created based on class path contents • Smart defaults • Standalone executable artifacts (“just run”) • Uberjar + embedded runtime • Configuration via CLI, environment
  • 13. Spring Boot Application
      @Controller
      @EnableAutoConfiguration
      public class SampleController {
          @RequestMapping("/")
          @ResponseBody
          String home() {
              return "Hello World!";
          }

          public static void main(String[] args) throws Exception {
              SpringApplication.run(SampleController.class, args);
          }
      }
    Run with: java -jar application.jar
  • 14. Spring Integration • Since 2007 • Pipes and Filters: Messages, Channels, Endpoints • Enterprise Integration Patterns as first-class constructs • Large set of adapters • Java DSL
  • 15. Spring Integration Message Encapsulates Data (headers + payload) Channel Transports Data Endpoint Handles Data
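The three roles on this slide can be sketched in plain Java without Spring. This is an illustration only: `SimpleMessage`, `SimpleChannel`, and `MessagingSketch` are hypothetical names chosen for the example, not Spring Integration's own `Message`, `MessageChannel`, or endpoint types.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical stand-ins for the three roles; not the Spring Integration API.
final class SimpleMessage<T> {
    final Map<String, Object> headers; // metadata about the data
    final T payload;                   // the data itself

    SimpleMessage(Map<String, Object> headers, T payload) {
        this.headers = headers;
        this.payload = payload;
    }
}

// A channel transports messages to the endpoints subscribed to it.
final class SimpleChannel<T> {
    private final List<Consumer<SimpleMessage<T>>> endpoints = new ArrayList<>();

    void subscribe(Consumer<SimpleMessage<T>> endpoint) {
        endpoints.add(endpoint);
    }

    void send(SimpleMessage<T> message) {
        for (Consumer<SimpleMessage<T>> endpoint : endpoints) {
            endpoint.accept(message);
        }
    }
}

final class MessagingSketch {
    // Sends one message through a channel and returns what the endpoint handled.
    static List<String> run() {
        List<String> handled = new ArrayList<>();
        SimpleChannel<String> channel = new SimpleChannel<>();
        // The endpoint handles the data: here it records header and payload.
        channel.subscribe(m -> handled.add(m.headers.get("source") + ":" + m.payload));

        Map<String, Object> headers = new HashMap<>();
        headers.put("source", "http");
        channel.send(new SimpleMessage<>(headers, "hello"));
        return handled;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The point of the separation is that the endpoint never knows who produced the message; it only sees what arrives on the channel.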
  • 16. Example: a simple pipeline Message Translator integerMessageSource inputChannel queueChannel myFlow (transform, filter)
  • 17. Example: a simple pipeline
      @Configuration
      @EnableIntegration
      public class MyConfiguration {

          // Source: each poll calls AtomicInteger.getAndIncrement()
          @Bean
          public MessageSource<?> integerMessageSource() {
              MethodInvokingMessageSource source = new MethodInvokingMessageSource();
              source.setObject(new AtomicInteger());
              source.setMethodName("getAndIncrement");
              return source;
          }

          @Bean
          public DirectChannel inputChannel() {
              return new DirectChannel();
          }

          // Poll every 100 ms, keep positive values, convert to String, then queue
          @Bean
          public IntegrationFlow myFlow() {
              return IntegrationFlows.from(this.integerMessageSource(),
                              c -> c.poller(Pollers.fixedRate(100)))
                      .channel(this.inputChannel())
                      .filter((Integer p) -> p > 0)
                      .transform(Object::toString)
                      .channel(MessageChannels.queue())
                      .get();
          }
      }
  • 19. Spring Integration Components • Enterprise Integration Patterns: • Filter, Transform, Gateway, Service Activator, Aggregator, Channel Adapter, Routing Slip • Adapters: • JMS, RabbitMQ, Kafka, MongoDB, JDBC, Splunk, AWS (S3, SQS), Twitter, Email, etc.
  • 20. Spring Integration Kafka • Started in 2011 • Goal: adapt Kafka to the Spring Messaging and Spring Integration abstractions • Easy access to the unique features of Kafka • Namespace, Java DSL support • To migrate to Kafka 0.9 once available • Defaults focused on performance (ID generation and timestamp disabled)
  • 21. Spring Integration Kafka: Channel Adapters Kafka Inbound Channel Adapter Kafka Outbound Channel Adapter Message Channel Message Message
  • 22. Spring Integration Kafka Producer Configuration • Default producer configuration • Distinct per-topic producer configurations • Destination target or partition controlled via expression evaluation or headers
  • 23. Spring Integration Kafka Consumer • Own client based on Simple Consumer API • Listen to specific partitions! • Offset control: when offsets are written and where (no ZooKeeper) • Programmer-controlled acknowledgment • Concurrent message processing (preserving per-partition ordering) • Basic operations via KafkaTemplate • Kafka-specific headers
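One common way to get concurrency while preserving per-partition ordering is to give each partition its own single-threaded worker. The sketch below shows that idea in plain Java; it is an assumption about how such a dispatcher could look, not the adapter's actual implementation, and `PerPartitionDispatcher` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical dispatcher: partitions run in parallel, but each partition
// is handled by exactly one thread, so its messages stay in order.
final class PerPartitionDispatcher {
    private final List<ExecutorService> workers = new ArrayList<>();

    PerPartitionDispatcher(int partitionCount) {
        for (int i = 0; i < partitionCount; i++) {
            workers.add(Executors.newSingleThreadExecutor());
        }
    }

    void dispatch(int partition, Runnable task) {
        workers.get(partition).execute(task);
    }

    void shutdownAndWait() {
        for (ExecutorService worker : workers) {
            worker.shutdown();
        }
        try {
            for (ExecutorService worker : workers) {
                worker.awaitTermination(10, TimeUnit.SECONDS);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Dispatches offsets 0..4 for two partitions and returns the order
    // in which each partition's handler saw them.
    static List<List<Integer>> demo() {
        int partitions = 2;
        List<List<Integer>> seen = new ArrayList<>();
        for (int p = 0; p < partitions; p++) {
            seen.add(Collections.synchronizedList(new ArrayList<>()));
        }
        PerPartitionDispatcher dispatcher = new PerPartitionDispatcher(partitions);
        for (int offset = 0; offset < 5; offset++) {
            for (int p = 0; p < partitions; p++) {
                final int partition = p;
                final int current = offset;
                dispatcher.dispatch(partition, () -> seen.get(partition).add(current));
            }
        }
        dispatcher.shutdownAndWait();
        return seen;
    }
}
```

Ordering across partitions is not guaranteed here, which matches Kafka's own model: order is only defined within a partition.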
  • 24. Spring Integration Kafka Message Listener
      // Auto-acknowledging
      public interface MessageListener {
          void onMessage(KafkaMessage message);
      }

      // With manual acknowledgment
      public interface AcknowledgingMessageListener {
          void onMessage(KafkaMessage message, Acknowledgment acknowledgment);
      }
  • 25. Spring Integration Kafka: Offset Management • Injectable strategy • Allows customizing the starting offsets • Implementations: SI MetadataStore-backed (e.g. Redis, Gemfire), Kafka compacted topic-backed (pre-0.8.2), Kafka 0.8.2 native • Messages can be auto acknowledged (by the adapter) or manually acknowledged (by the user) • Manual acknowledgment useful when messages are processed asynchronously • Acknowledgment passed as message header or as argument
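The combination of an injectable offset strategy and manual acknowledgment can be sketched as follows. All names here (`OffsetStore`, `InMemoryOffsetStore`, `OffsetSketch`) are hypothetical and chosen for illustration; the real adapter's interfaces may differ. The sketch assumes that acknowledging offset n means the consumer should resume from n + 1.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical strategy interface: where offsets live is pluggable
// (the slide lists MetadataStore-backed and Kafka-backed options).
interface OffsetStore {
    long nextOffset(int partition);          // offset to resume from
    void commit(int partition, long offset); // record a processed offset
}

// In-memory implementation, used here only to keep the sketch self-contained.
final class InMemoryOffsetStore implements OffsetStore {
    private final Map<Integer, Long> next = new ConcurrentHashMap<>();
    private final long startingOffset;

    InMemoryOffsetStore(long startingOffset) {
        this.startingOffset = startingOffset; // customizable starting point
    }

    @Override
    public long nextOffset(int partition) {
        return next.getOrDefault(partition, startingOffset);
    }

    @Override
    public void commit(int partition, long offset) {
        // Resume after the highest offset acknowledged so far.
        next.merge(partition, offset + 1, Long::max);
    }
}

final class OffsetSketch {
    // A handle the listener can invoke later, once asynchronous
    // processing of the message has actually finished.
    static Runnable acknowledgmentFor(OffsetStore store, int partition, long offset) {
        return () -> store.commit(partition, offset);
    }
}
```

A listener would hold on to the returned `Runnable` and call `run()` only after its own work completes, which is the case the slide describes for asynchronous processing.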
  • 26. Stream processing with Spring XD • Higher abstractions are required • Integrating seamlessly and transparently with the middleware • Building on top of Spring Integration and Spring Batch • Pre-built modules using the entire power of the Spring ecosystem
  • 27. Streams in Spring XD: HTTP, JMS, Kafka, RabbitMQ, JMS, Gemfire, File, SFTP, Mail, JDBC, Twitter, Syslog, TCP, UDP, MQTT, Trigger, Filter, Transformer, Splitter, Aggregator, HTTP Client, JPMML Evaluator, Shell, Python, Groovy, Java, RxJava, Spark Streaming, File, HDFS, HAWQ, Kafka, RabbitMQ, Redis, Splunk, Mongo, Redis, JDBC, TCP, Log, Mail, Gemfire, MQTT, Dynamic Router, Counters. Note: Named channels allow for a directed graph of data flow
  • 29. Spring XD - Message Bus abstraction • Binds module inputs and outputs to a transport • Performs serialization (Kryo) • Local, Rabbit, Redis, and Kafka
  • 30. XD Modules XD Admin XD Containers ZooKeeper ZooKeeper Admin / Flo UI Shell CURL Spring XD Architecture Database © Copyright 2015 Pivotal. All rights reserved.
  • 31. Spring XD and Kafka - the message bus • Each pipe between modules is a topic; • Spring XD creates topics automatically; • Topics are pre-partitioned based on module count and concurrency; • Overpartitioning is available as an option; • Multiple consumer modules ‘divide’ the partition set of a topic using a deterministic algorithm;
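The slide does not say which deterministic algorithm is used to divide partitions among consumer modules. As an assumed example, a round-robin split by instance index gives every instance a disjoint share with no coordination between instances; `PartitionAssignment` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

// Assumed round-robin scheme, for illustration only:
// instance i owns every partition p where p % instanceCount == i.
final class PartitionAssignment {
    static List<Integer> partitionsFor(int instanceIndex, int instanceCount, int partitionCount) {
        if (instanceCount <= 0 || instanceIndex < 0 || instanceIndex >= instanceCount) {
            throw new IllegalArgumentException("instanceIndex must be in [0, instanceCount)");
        }
        List<Integer> owned = new ArrayList<>();
        for (int p = instanceIndex; p < partitionCount; p += instanceCount) {
            owned.add(p);
        }
        return owned;
    }

    public static void main(String[] args) {
        // Three consumer instances sharing a 7-partition topic.
        for (int i = 0; i < 3; i++) {
            System.out.println("instance " + i + " -> " + partitionsFor(i, 3, 7));
        }
    }
}
```

Because the result depends only on the instance index, the instance count, and the partition count, each instance can compute its own share independently and the shares never overlap.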
  • 32. Partitioning in Spring XD • Required in distributed stateful processing: related data must be processed on the same node • Partitioning logic is configured in Spring XD via the deployment manifest • partitionKeyExpression=payload.sensorId • When using Kafka as a bus, the partition key logic maps directly onto Kafka's native partitioning
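To make the partition key idea concrete, the sketch below maps a key such as a sensor id to a partition by hashing. Hash-modulo selection is an assumption used for illustration; the source does not state the exact selection function, and `PartitionKeySketch` is a hypothetical name.

```java
// Assumed hash-modulo selection, for illustration only.
final class PartitionKeySketch {
    // Equal keys always land on the same partition, so all readings
    // from one sensor are processed by the same consumer.
    static int partitionFor(Object key, int partitionCount) {
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(key.hashCode(), partitionCount);
    }

    public static void main(String[] args) {
        int partitions = 4;
        String[] sensorIds = {"sensor-1", "sensor-2", "sensor-1"};
        for (String sensorId : sensorIds) {
            System.out.println(sensorId + " -> partition " + partitionFor(sensorId, partitions));
        }
    }
}
```

The first and third lines printed always show the same partition, which is the property stateful processing such as a per-sensor average relies on.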
  • 33. Partitioned Streams with Kafka Partition 0 Partition 1 HTTP HTTP HTTP Average Processor Average Processor Topic http | avg-temperatures
  • 35. Performance metrics of Spring XD 1.2
  • 36. Spring Cloud Data Flow (Spring XD 2.0)
  • 37. Goals • Scale without undeploying running stream or batch pipelines • Avoid hierarchical ‘classloader’ issues, inadvertent spiral of ‘xd/lib’ • Skip network hops within a stream • Do rolling upgrades and continuous deployments
  • 38. Spring Cloud Data Flow is a cloud native programming and operating model for composable data microservices on a structured platform
  • 40. Spring Cloud Data Flow is a cloud native programming and operating model for composable data microservices on a structured platform
      @EnableBinding(Source.class)
      public class Greeter {
          @InboundChannelAdapter(Source.OUTPUT)
          public String greet() {
              return "hello world";
          }
      }
    @EnableBinding(Source.class) @EnableBinding(Processor.class) @EnableBinding(Sink.class)
      public interface Source {
          String OUTPUT = "output";

          @Output(Source.OUTPUT)
          MessageChannel output();
      }
  • 41. Spring Cloud Data Flow is a cloud native programming and operating model for composable data microservices on a structured platform continuous delivery continuous deployment monitoring
  • 42. Spring Cloud Data Flow is a cloud native programming and operating model for composable data microservices on a structured platform. Streams: http | transform | jdbc. Jobs: job foo < bar || baz & jaz > bye (diagram nodes: foo, bar, jaz, baz, bye)
  • 43. Spring Cloud Data Flow is a cloud native programming and operating model for composable data microservices on a structured platform YARN ? LATTICE
  • 44. Admin Admin / Flo UI Shell CURL ??X YARN Bootified Modules New Architecture
  • 45. XD [ Container ] Orchestration ZooKeeper HOST XD Container XD Container XD Modules
  • 46. Messaging-Driven Data Microservices HOST Spring Cloud Stream Modules
  • 47. Orchestrate Composable Data Microservices HOST Cloud Foundry YARN X Spring Cloud Data Flow Lattice Spring Cloud Stream Modules Spring Cloud Stream Binders [Rabbit, Kafka, Redis]
  • 48. Partitioned stream scaling with SCDF and Kafka INSTANCE_INDEX=0 HTTP … INSTANCE_INDEX=0 INSTANCE_INDEX=1 INSTANCE_INDEX=6 … LOG Kafka Service http.0 (0) http.0 (1) http.0 (2) http.0 (6) … Broker 0 Broker 1 Broker 4 stream create logger --definition "http | log" stream deploy logger --properties module.log.partitioned=true, module.log.count=7
  • 49. Summary • Scalable pipelines composed of Spring Boot cloud native applications • Spring Cloud Stream provides the programming model • Transparently mapping to Kafka-native concepts • Spring Cloud Data Flow provides the orchestration model