[db tech showcase Tokyo 2016] E22: Getting real time Oracle data into Kafka and unlocking the data in your database by Dbvisit Software Limited Chris Lawless
1. © 2016 Dbvisit Software | dbvisit.com
Dbvisit Software
Real-time Oracle Database Streaming into Kafka
Chris Lawless
2. © 2016 Dbvisit Software | dbvisit.com
Agenda
• Oracle OLTP
• Evolution of data warehouses
• Data Lake
• Intro to Kafka - what need does it fill?
• Marriage of the two
3. © 2016 Dbvisit Software | dbvisit.com
About Dbvisit Software
• Real-time Oracle Database Streaming software solutions
• In the Cloud | Hybrid | On-Premise
• New Zealand-based, US office, Asia sales office, EU office (Prague)
• Unique offering: disaster recovery solutions for Oracle Standard Edition
• Low cost Oracle GoldenGate alternative
• Flexible licensing, pricing models available
• Peerless customer support
4. © 2016 Dbvisit Software | dbvisit.com
Result: 1,100+ customers in 6 continents
6. © 2016 Dbvisit Software | dbvisit.com
About Chris
• 4 Years at Oracle University teaching DBA courses
• 5 Years at GoldenGate Support and Product Management
• 4 Years at Oracle GoldenGate Product Management
• Past 3 years at Dbvisit
7. © 2016 Dbvisit Software | dbvisit.com
The World we live in
The Situation:
✓ The enterprise is increasingly powered by data
✓ OLTP transactional data is essential
✓ The use of real-time data for competitive advantage is disrupting most industries
✓ Traditional databases are not going away; new database technologies are being added alongside them
✓ Continuous replication data streams are becoming a “first class citizen”
8. © 2016 Dbvisit Software | dbvisit.com
Reality of RDBMS
RDBMS
✓ Millions of Oracle databases out there
✓ OLTP databases are ingrained in the business
✓ Pervasive
✓ ERPs
✓ CRMs
9. © 2016 Dbvisit Software | dbvisit.com
OLTP
RDBMS
✓ MySQL #1 leader in databases
✓ MSSQL #1 leader is sold
✓ IBM DB2 #1 in most installs
✓ Oracle #1 in most sales
✓ Oracle is reported to have over 50% of all RDBMS sales
✓ Oracle is here to stay
10. © 2016 Dbvisit Software | dbvisit.com
OLTP Structured Data
• Nice and structured
• Columns
• Rows
• Relationships
11. © 2016 Dbvisit Software | dbvisit.com
OLTP systems
• Banking
• Online shopping
• Stock Markets
• Healthcare
• ERP Systems
• Customer Relations Management
13. © 2016 Dbvisit Software | dbvisit.com
OLTP systems with Data Warehouses: Old school
• OLTP systems typically will feed Data Warehouses via Batch jobs
• Banking statements that get mailed monthly
• Sales analysis on what was sold last month
• Reporting on ERP systems
• Quarterly Financial reports
14. © 2016 Dbvisit Software | dbvisit.com
OLTP with Data Warehouse
[Diagram: OLTP Database -> Batch ETL Process -> Data Warehouse database]
15. © 2016 Dbvisit Software | dbvisit.com
OLTP systems with Data Warehouses: REAL-TIME
• Online Shopping with INSTANT emails regarding your shopping habits
• ERP systems with INSTANT information regarding current sales
• Online Banking with access to years of historical data
16. © 2016 Dbvisit Software | dbvisit.com
OLTP with Data Warehouse
[Diagram: OLTP Database -> Real-time Streaming -> Data Warehouse database]
17. © 2016 Dbvisit Software | dbvisit.com
The concept of Data Lake or Data Reservoir
Not all data is structured
• What about IoT data?
• What about machine data?
• What about log data?
• Semi Structured data?
18. © 2016 Dbvisit Software | dbvisit.com
The new concept of Data Lake or Data Reservoir
• A Data Lake is storage to hold vast amounts of RAW data that is typically kept in the native format
• Often using huge unstructured nodes
• Hadoop is the frequent repository of choice
19. © 2016 Dbvisit Software | dbvisit.com
The new concept of Data Lake or Data Reservoir
[Diagram: Machine Data, IoT, Web logs, Application logs, Streaming Web Data, Other sources and OLTP Databases feed the Data Lake via ETL and Real-time Streaming]
20. © 2016 Dbvisit Software | dbvisit.com
Kafka: a brief History
• Open Sourced in 2011
• Developed at LinkedIn and then ‘released to the world’ as part of the Apache Foundation.
• Its original developers spun off to form Confluent
- Kafka Connect: a framework which makes it simple to define connectors to move data in and out of Kafka
• Key features:
- Simple API for producers and consumers
- High Throughput
- Scaled out Architecture
- Non-formatted messages
21. © 2016 Dbvisit Software | dbvisit.com
Intro to Kafka
What is Kafka?
A distributed system where messages are kept in topics that are partitioned and replicated across multiple nodes.
Message
Simply put… the data
Messages can be in any format:
Common ones are String, JSON, Avro
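To make the message formats concrete, here is a minimal sketch of one record encoded as a plain string and as JSON. The record and its field names are hypothetical and chosen only for illustration. Avro is not shown because it needs a schema and a third-party library.

```python
import json

# Hypothetical change record; the field names are illustrative only.
record = {"table": "ORDERS", "op": "INSERT", "id": 42, "amount": 19.99}

# String format: compact, but the consumer must already know the field order.
as_string = (
    f"{record['table']},{record['op']},{record['id']},{record['amount']}"
).encode("utf-8")

# JSON format: field names travel with every message.
as_json = json.dumps(record, sort_keys=True).encode("utf-8")

# A consumer can rebuild the record from the JSON bytes alone.
decoded = json.loads(as_json.decode("utf-8"))
```

Either way the broker only sees bytes; agreeing on the format is up to the producer and the consumer.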
22. © 2016 Dbvisit Software | dbvisit.com
Intro to Kafka
Topics
One or more Partitions that are ordered sequences of messages.
Producers (Publishers)
Produce data to one or more topics
Consumers (Subscribers)
Subscribe to topics and process the messages
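The relationship between topics, partitions, producers and consumers can be sketched with a toy in-memory model. This is not the Kafka client API: the `Topic` class, its method names, and the use of `crc32` to pick a partition are all assumptions made for illustration.

```python
import zlib


class Topic:
    """Toy in-memory topic: a list of partitions, each an ordered list."""

    def __init__(self, name, num_partitions=3):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def publish(self, key, value):
        # The same key always maps to the same partition, so messages
        # for one key keep their relative order. crc32 is a stand-in hash.
        index = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        self.partitions[index].append((key, value))
        return index

    def subscribe(self):
        # Yield messages partition by partition, in order within each one.
        for index, partition in enumerate(self.partitions):
            for key, value in partition:
                yield index, key, value


# Producer side: publish six orders for two hypothetical customers.
topic = Topic("orders")
for i in range(6):
    topic.publish(f"customer-{i % 2}", f"order-{i}")

# Consumer side: read everything back.
received = list(topic.subscribe())
```

Ordering is only guaranteed within a partition, which is why the key decides where a message lands.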
23. © 2016 Dbvisit Software | dbvisit.com
Old method
[Diagram: in the old method, Sources connect directly to Targets; with Kafka, Sources and Targets connect through Kafka]
24. © 2016 Dbvisit Software | dbvisit.com
Intro to Kafka
[Diagram: Producers -> Kafka -> Consumers]
25. © 2016 Dbvisit Software | dbvisit.com
Kafka
[Diagram: Partition 0, Partition 1, Partition 2, each holding messages ordered from Old to New]
26. © 2016 Dbvisit Software | dbvisit.com
Kafka
• Kafka treats each topic partition as a log (a sequential ordered set of messages)
• You can call Kafka a log reader and a log writer
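The "partition as a log" idea can be sketched as an append-only list where each message gets an offset and each reader chooses where to start. `PartitionLog` and its methods are hypothetical names for this sketch, not part of Kafka.

```python
class PartitionLog:
    """Toy append-only log; an offset is simply a position in the list."""

    def __init__(self):
        self._messages = []

    def append(self, message):
        # Writers only ever add to the end of the log.
        self._messages.append(message)
        return len(self._messages) - 1  # offset of the new message

    def read(self, offset, max_messages=10):
        # Readers pick their own starting offset; reading removes nothing.
        return self._messages[offset:offset + max_messages]


log = PartitionLog()
offsets = [log.append(m) for m in ["a", "b", "c", "d"]]

# Two hypothetical readers consume at their own pace.
fast_reader = log.read(offset=0)
slow_reader = log.read(offset=0, max_messages=2)
resumed = log.read(offset=2)  # the slow reader picks up where it stopped
```

Because reading does not delete anything, many consumers can share one log and each can replay it from any offset.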
27. © 2016 Dbvisit Software | dbvisit.com
Kafka
• Log compaction/log retention
• Kafka Streams: the new stuff from Confluent
- No need for Spark or other tools
- Pure streaming of the data - process data “on the fly”
- Introduced in Kafka 0.10.0
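For the log compaction and log retention bullet above, a minimal sketch of the two ideas follows. Compaction keeps only the latest value per key. Retention discards the oldest entries. The function names are hypothetical, and the message-count limit is a simplification standing in for time- or size-based retention.

```python
def compact(log):
    """Keep only the most recent value for each key, preserving log order."""
    latest_offset = {key: offset for offset, (key, _) in enumerate(log)}
    return [
        entry
        for offset, entry in enumerate(log)
        if latest_offset[entry[0]] == offset
    ]


def retain(log, max_messages):
    """Retention sketch: drop the oldest entries beyond a count limit."""
    return log[-max_messages:] if max_messages > 0 else []


log = [("k1", "v1"), ("k2", "v1"), ("k1", "v2"), ("k3", "v1"), ("k2", "v2")]
compacted = compact(log)  # one entry per key, the newest one
retained = retain(log, 2)  # only the two newest entries survive
```

Compaction suits change data keyed by primary key, since the compacted log still holds the current state of every row.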
28. © 2016 Dbvisit Software | dbvisit.com
Marriage of two worlds
• If we mix the ‘old world’ log readers with the new world log readers and writers.
• Blended technology
- Using the Oracle logical replication tools with Kafka as the message broker
- Oracle becomes ‘just another feed for Kafka’
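As a rough sketch of how a row change from Oracle might become a Kafka message, the code below turns a change record into a key, a JSON value and a topic name. The record layout, the function names and the one-topic-per-table convention are all assumptions for illustration. They do not describe the format any particular replication product emits.

```python
import json


def change_to_message(change):
    """Turn a hypothetical row-change record into (key, value) bytes."""
    # Keying by table and primary key keeps all changes to one row together.
    key = f"{change['schema']}.{change['table']}:{change['pk']}".encode("utf-8")
    value = json.dumps(
        {"op": change["op"], "scn": change["scn"], "after": change["after"]},
        sort_keys=True,
    ).encode("utf-8")
    return key, value


def topic_for(change):
    # One topic per source table is an assumed convention, not a requirement.
    return f"{change['schema']}.{change['table']}".lower()


# Hypothetical update captured from the redo stream.
change = {
    "schema": "SCOTT",
    "table": "EMP",
    "pk": 7369,
    "op": "UPDATE",
    "scn": 1234567,
    "after": {"SAL": 900},
}
key, value = change_to_message(change)
topic = topic_for(change)
```

Carrying the SCN in the value lets a consumer order or deduplicate changes.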
29. © 2016 Dbvisit Software | dbvisit.com
Oracle Redo logs
• Reading the Oracle redo logs is not easy. Oracle doesn’t really publish the API.
• Because of this, replication companies have ‘sprung up’ around the moving of Oracle data.
31. © 2016 Dbvisit Software | dbvisit.com
Logical Replication to Kafka high level overview
[Diagram labels: JSON, THL]
33. © 2016 Dbvisit Software | dbvisit.com
Key Concepts
Real-Time Data/Event Streaming
• A continuous flow of instantaneous data with as close to zero latency as possible.
Real-Time Stream Processing
• Systems that continuously process incoming data, and will continue to process that incoming data until the application is stopped, rather than operating on a fixed set of data.
• Indicative use cases:
- Financial Trading
- Real-time System Monitoring
- Business Intelligence
- Real-time Analytics
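The difference between operating on a fixed set of data and continuously processing incoming data can be sketched in a few lines. The trade amounts and function names are hypothetical, and a Python generator stands in for a real stream.

```python
from typing import Iterable, Iterator, List


def batch_total(trades: List[float]) -> float:
    """Batch style: operate once on a fixed, complete set of data."""
    return sum(trades)


def running_total(trades: Iterable[float]) -> Iterator[float]:
    """Stream style: emit an updated result as each event arrives."""
    total = 0.0
    for amount in trades:
        total += amount
        yield total


events = [100.0, 250.0, 50.0]

# Batch: a single answer, available only after all data is collected.
batch_result = batch_total(events)

# Stream: an answer after every event; this would run until stopped
# if the source never ended.
stream_results = list(running_total(iter(events)))
```

The stream version has a usable answer after the first event, which is the property the use cases above rely on.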
34. © 2016 Dbvisit Software | dbvisit.com
Automotive: Kafka Streaming
OLTP and Kafka
Streaming data that can be USED as it moves
• Weather
• Tolls
• Sensor
• Mileage data
• Tire pressure
• GPS
35. © 2016 Dbvisit Software | dbvisit.com
Healthcare: Kafka Streaming
OLTP and Kafka
• Prescriptions
• Insurance
• Medical devices
• Medical history
• etc
37. © 2016 Dbvisit Software | dbvisit.com
Thank you
Q & A