Leverage event streaming framework to build intelligent applications

Luca Mattia Ferrari, EMEA Solution Architect at Red Hat
Luca Ferrari, EMEA SSA for API Management
CONFIDENTIAL INTERNAL USE, 2019 RED HAT TECH EXCHANGE
What you can expect during this session:
➔ Introductions
➔ Context
➔ Basics of Kafka
➔ Use Cases
➔ ML elements
➔ ML applied
➔ Demo
➔ Key takeaways
➔ Where next
Introduction
Introductions
Name: Luca Ferrari
Role/team: EMEA SSA
Where you’re from: Barcelona & Pavia
WHY am I here?
In the news
Context
Agile Integration foundations
DISTRIBUTED INTEGRATION
- LIGHTWEIGHT
- PATTERN BASED
- EVENT-ORIENTED
- COMMUNITY-SOURCED
CONTAINERS
- CLOUD-NATIVE SOLUTIONS
- LEAN ARTIFACTS, INDIVIDUALLY DEPLOYABLE
- CONTAINER-BASED SCALING & HIGH AVAILABILITY
APIs
- WELL-DEFINED, REUSABLE, & WELL-MANAGED ENDPOINTS
- ECOSYSTEM LEVERAGE
API SERVICES
SECURITY, AUTHENTICATION, AUDIT (RH-SSO)
RED HAT FUSE
RED HAT AMQ
AMQ offering
Flexible, standards-based messaging for the enterprise, cloud and Internet of Things
Self-service Messaging
- Scalable, easy-to-manage messaging utility for OpenShift Container Platform (Beta)
- Red Hat-managed deployment (Tech Preview)
Broker (AMQ)
- Store & Forward
- Volatile & Durable
- Full JMS 2.0 Support
- Best-in-class perf
Interconnect (AMQ)
- High-performance direct messaging
- Distributed messaging backbone
Streams (AMQ Streams)
- Streaming platform
- Durable pub/sub
- Replayable streams
- Based on Apache Kafka and Strimzi
Standard Protocols
Polyglot Clients
Common Management
AMQ offering
Broker
- High-performance messaging implementation based on ActiveMQ Artemis
Interconnect
- Message router for building large-scale, redundant, application-level messaging networks over AMQP
Streams
- Simplifies the deployment, configuration, management and use of Apache Kafka on OpenShift using the Operator concept
Basics of Kafka
Pub / Sub
topic, partitions and offset
example
truck_gps
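As a rough illustration of how a topic, its partitions and offsets relate, here is a toy in-memory model in Python. It is a sketch under stated assumptions, not the Kafka client API: the `Topic` class, the truck IDs and the coordinates are hypothetical, and `zlib.crc32` only stands in for the key hashing that Kafka's default partitioner performs with murmur2.

```python
import zlib


class Topic:
    """Toy model of a Kafka topic: a fixed set of append-only partitions."""

    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def partition_for(self, key):
        # Stand-in hash (assumption): real Kafka clients use murmur2 on the key bytes.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def append(self, key, value):
        """Append a record and return its (partition, offset)."""
        p = self.partition_for(key)
        self.partitions[p].append((key, value))
        return p, len(self.partitions[p]) - 1

    def read(self, partition, offset):
        """Return every record in a partition from the given offset onward."""
        return self.partitions[partition][offset:]


# Hypothetical GPS readings keyed by truck ID.
truck_gps = Topic("truck_gps", num_partitions=3)
p1, o1 = truck_gps.append("truck-42", "45.18,9.15")
p2, o2 = truck_gps.append("truck-42", "45.19,9.16")
replayed = truck_gps.read(p1, o1)
```

Records with the same key land in the same partition, so their relative order is preserved, and a reader can replay a partition from any offset it has kept track of.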
Brokers and partitions
Replication and leaders
Producers
Consumers
Delivery semantics
Broker
Zookeeper
Complete
Kafka & AMQ
Different tradeoffs ...
… Different Use Cases
JBOSS AMQ
● You care about individual messages
● You want clients to use standard APIs (e.g., JMS) or wire protocols (e.g., AMQP)
● You need transactional sends and receives
● You’re doing request-reply messaging
● You need heterogeneous client/protocol messaging (e.g., AMQP, MQTT, STOMP)
● You send metadata/headers/properties with your messages
● You don’t want to implement broker functionality in your clients (e.g., partitioning, dispatching, coordination)
KAFKA
● You care about messages in volume
● You care about raw throughput, high performance
● You need sliding-window replay abilities
● You have large numbers of subscribers for published events
● You need to finely control the parallelism/scalability of consumers
● You want to leverage application-level replication vs HA storage
● You need total order guarantees at the partition level
Use cases
Industries
➔ Travel companies
➔ Finance and fintech companies
➔ Retailers and online shopping
➔ Automotive and manufacturing companies
➔ Video Streaming companies
➔ Social networks
➔ Transportation
➔ ...
[ https://kafka.apache.org/powered-by ]
SINGAPORE AIRLINES
[ https://speakerdeck.com/devacto/predictive-maintenance-pipeline-using-kafka-connect-streams-and-ksql ]
Problem:
● Many airplanes
● Many components per airplane
● Each airplane has a different flight plan
Predictive maintenance is hard!
SINGAPORE AIRLINES
Solution:
● Kafka Connector
● Kafka Streams App
● ML model
● Web App
ML elements
Text is everywhere
Most of it unstructured
> 40 million articles in Wikipedia
> 4.5 billion web pages
> 500 million tweets a day
> 1.5 trillion queries on Google a day
Problem Taxonomy
Machine Learning
- Supervised Learning: Classification, Regression
- Unsupervised Learning: Clustering, ...
NLP: Natural Language Processing
Natural language:
● English, Deutsch, Italiano
● #rhte2019 for a great week in #vienna
● c u l8r
Example:
● Tokenization
● Normalization & Stemming
● Frequency count
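The three steps named in the example (tokenization, normalization and stemming, frequency count) can be sketched with the Python standard library alone. This is a minimal illustration, not a production pipeline: the function names are hypothetical, the sample sentence extends the slide's tweet with made-up text, and the suffix-stripping "stemmer" is a deliberately crude stand-in for a real one such as Porter's.

```python
import re
from collections import Counter


def tokenize(text):
    # Split into word-like tokens, keeping hashtags and mixed forms such as "l8r".
    return re.findall(r"#?\w+", text)


def normalize(token):
    # Lowercase so that "Great" and "great" count as the same term.
    return token.lower()


def stem(token):
    # Crude suffix stripping, for illustration only (assumption: English text).
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def frequency_count(text):
    # Tokenize, normalize, stem, then count how often each term occurs.
    return Counter(stem(normalize(t)) for t in tokenize(text))


counts = frequency_count(
    "#rhte2019 for a great week in #vienna. Great talks, great demos!"
)
```

Here `counts["great"]` is 3, and "talks" and "demos" are reduced to "talk" and "demo" before counting.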
Common Text Problems
● How do I represent the text in a compact and computer-friendly way?
  ➔ Bag of Words model
● How do I find a text similar to the one I have?
  ➔ Similarity measure
● How do we extract the core meaning of the text? How do we get the important words?
  ➔ Term frequency - inverse document frequency (tf-idf)
● How do I classify an article based on certain categories?
  ➔ Naive Bayes model
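The first three techniques can be sketched in a few lines of plain Python. This is an illustrative example under stated assumptions: the three documents are made up, cosine similarity is used as one possible similarity measure, and tf-idf is computed with one common variant (term frequency normalized by document length, idf = log(N / df)); libraries often use smoothed variants.

```python
import math
from collections import Counter

# Hypothetical mini-corpus.
docs = [
    "kafka streams events",
    "kafka brokers store events",
    "parliament votes on brexit",
]


def bag_of_words(text):
    # Represent a text as term -> count, ignoring word order.
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    # Cosine of the angle between two bag-of-words vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def tf_idf(bags):
    # Weight terms that are frequent in a document but rare in the corpus.
    n = len(bags)
    df = Counter(t for bag in bags for t in bag)  # document frequency per term
    return [
        {t: (c / sum(bag.values())) * math.log(n / df[t]) for t, c in bag.items()}
        for bag in bags
    ]


bags = [bag_of_words(d) for d in docs]
sim_related = cosine_similarity(bags[0], bags[1])    # share "kafka" and "events"
sim_unrelated = cosine_similarity(bags[0], bags[2])  # share no terms
weights = tf_idf(bags)
```

In the first document, "streams" gets a higher tf-idf weight than "kafka", because "kafka" also appears in the second document and is therefore less distinctive.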
ML applied
Overall ML model
● 20 newsgroups training data
● Continuous Bag of Words
● TF-IDF
● Multinomial Bayes classifier
● Trained model
● Classification engine
● Twitter stream
Example: “Boris Johnson on the phone. He’s apparently not able to return to the UK to answer questions in parliament as he was booked on a Thomas Cook flight.” => Politics
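To make the classifier step concrete, here is a small multinomial Naive Bayes classifier written from scratch. It is a sketch, not the demo's code: the class name, the four training sentences and the two labels are hypothetical stand-ins for the 20 newsgroups data, and it works on raw word counts with Laplace smoothing rather than on TF-IDF weights.

```python
import math
from collections import Counter, defaultdict


class MultinomialNaiveBayes:
    """Word-count Naive Bayes with Laplace (add-alpha) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # word counts per class
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        # Words never seen in training carry no evidence and are skipped.
        words = [w for w in text.lower().split() if w in self.vocab]
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            total_words = sum(self.word_counts[label].values())
            score = math.log(n_docs / total_docs)  # log prior
            for w in words:                        # log likelihoods
                score += math.log(
                    (self.word_counts[label][w] + self.alpha)
                    / (total_words + self.alpha * len(self.vocab))
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy training set.
train_texts = [
    "parliament votes on the new law",
    "the prime minister answers questions in parliament",
    "the team wins the football match",
    "the striker scores in the final match",
]
train_labels = ["politics", "politics", "sport", "sport"]

model = MultinomialNaiveBayes().fit(train_texts, train_labels)
label = model.predict("questions in parliament about the flight")
```

With this toy training set, the sample text is assigned to "politics" because "questions" and "parliament" occur only in the politics examples.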
Overall architecture
Demo
FIRST STEP
Start a local Kafka instance:
bin/zookeeper-server-start.sh config/zookeeper.properties
bin/kafka-server-start.sh config/server.properties
Create a topic:
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic twitter-stream
SECOND STEP
Start the producer:
python twitter_kafka_producer.py
Start the consumer:
python doc_classifier.py
I had to fix some issues with the original code:
● Add the Kafka api_version in the consumer and producer
● Add timeouts and retries, since tweepy doesn’t implement them and Twitter enforces strict rate limits on its APIs
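One way to add retries around a rate-limited call is a small helper with exponential backoff. The sketch below is not the demo's actual fix: `call_with_retries`, its parameters and `flaky_fetch` are hypothetical names, and the helper wraps any callable rather than a specific tweepy or Kafka API.

```python
import random
import time


def call_with_retries(func, max_attempts=5, base_delay=1.0, max_delay=60.0,
                      retry_on=(Exception,), sleep=time.sleep):
    """Call func(); on failure wait with exponential backoff, then try again."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # give up and surface the last error
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Up to 10% jitter so many clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))


# Hypothetical stand-in for a rate-limited API call: fails twice, then succeeds.
attempts = []


def flaky_fetch():
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("rate limited")
    return "tweet batch"


waits = []  # record the delays instead of actually sleeping
result = call_with_retries(flaky_fetch, base_delay=0.5, sleep=waits.append)
```

Passing `sleep` as a parameter keeps the helper easy to test; in real use the default `time.sleep` applies, and `retry_on` would be narrowed to the exceptions that actually signal a rate limit or timeout.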
LIVE
NEXT STEPS
● Test locally by exposing the Kafka endpoint (https://strimzi.io/2019/04/30/accessing-kafka-part-3.html)
● Deploy the Python app to OpenShift and test everything on it
● Build a chart visualization
● Cache tweets locally for optimization and then leverage Debezium
● Use Faust, a stream processing library that ports the ideas from Kafka Streams to Python
Outro
Key takeaways
➢ Learn possible use cases around Kafka and understand where it fits best
➢ Learn the basic elements of an ML text classification model
➢ Learn the building blocks of a streaming application
Resources
You can find more material on a similar topic here
Related project: OpenDataHub
AMQ Streams use cases
As reference you can use this book:
https://www.amazon.es/Building-Streaming-Applications-Apache-Kafka/dp/1787283984/
linkedin.com/company/red-hat
youtube.com/user/RedHatVideos
facebook.com/redhatinc
twitter.com/RedHat
Red Hat is the world’s leading provider of enterprise
open source software solutions. Award-winning
support, training, and consulting services make
Red Hat a trusted adviser to the Fortune 500.
Thank you