"An introduction to Kx Technology - a Big Data solution", Kyra Coyne, Data Scientist at First Derivatives plc.
1. Kx Technology – a Big Data Solution
Kx Community Zurich Meetup
Kyra Coyne
November 2016
2. About Kx
• Global company, division of First Derivatives plc (listed on LSE)
• Large user community
• Widely adopted in financial services over two decades
• Software & industry solutions, consulting and implementation services
Known for:
• Processing and analysis of large volumes of real-time and historical time series data
• Extreme performance (low latency)
• Integrates with and co-exists with other technologies
• Ability to scale without requiring significant infrastructure
3. About the Technology
• Integrated in-memory, columnar database & programming system
• Streaming, real-time and historical data
• Map-Reduce built-in
• Native time-series functions
• Lightweight (~500 KB)
• Standard OS & hardware
• Extreme performance
4. Kx Technology
WHAT IS Kx TECHNOLOGY
• Integrated columnar database & programming system
• Streaming, real-time and historical data
• Built for massive data volumes
KEY FEATURES
• In-database analytics
• Parallelism
• Compression
q PROGRAMMING LANGUAGE
• Interpreted
• Event-driven
• Functional
• Array / Vector
• Query
• Time-series
WHY Kx TECHNOLOGY?
HIGH PERFORMANCE, LOW LATENCY
• We are fast not only because of our data architecture: our native programming language, q, runs inside the database, not in separate processes with costly data passing.
POWER
• We are one of the few fully 64-bit databases, and unique in having time as a native type, with nanosecond resolution and a full set of operations over time.
QUICK TO DELIVER
• Unlike many compile-link-and-run approaches, q is dynamic, allowing much shorter development and deployment cycles.
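As a rough illustration of the columnar layout and the native nanosecond time type described on this slide, here is a minimal sketch in Python (a stand-in, not q). The table, its column names and the `vwap` helper are hypothetical; the point is only that a query touches whole columns rather than whole rows, and that times are plain integer nanoseconds that support arithmetic directly.

```python
# Hypothetical trade table held column-wise: one list per column,
# loosely mirroring how a columnar store lays data out.
trades = {
    # nanoseconds since midnight (34_200 s = 09:30:00)
    "time_ns": [34_200_000_000_000, 34_200_000_500_000, 34_201_250_000_000],
    "sym": ["ABC", "XYZ", "ABC"],
    "price": [10.0, 20.0, 11.0],
    "size": [100, 50, 300],
}


def vwap(table, sym):
    """Volume-weighted average price for one symbol.

    Only the sym, price and size columns are read; time_ns is never touched.
    """
    rows = [i for i, s in enumerate(table["sym"]) if s == sym]
    notional = sum(table["price"][i] * table["size"][i] for i in rows)
    volume = sum(table["size"][i] for i in rows)
    return notional / volume


abc_vwap = vwap(trades, "ABC")  # (10*100 + 11*300) / 400

# Time arithmetic stays exact because times are integer nanoseconds.
span_ns = trades["time_ns"][-1] - trades["time_ns"][0]
```

In q the same query would run inside the database process itself; this sketch only mimics the data layout, not the performance characteristics.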
5. The Big Data Landscape
• 20-year track record; mission-critical systems
• Streaming, real-time and historical data
• Processing and analyzing data in microseconds
• Hundreds of millions of transactions per second
• Terabytes to petabytes
• Trusted globally by the largest institutions
6. 6
COST
• Fully transparent costing models
• Reduction in personnel, training, hardware and facilities costs for clients
SECURITY
• Robust, high performance infrastructure
• Highly Secure
• Comprehensive disaster recovery and business continuity planning
SERVICE
• Best practices and processes
• Large pool of highly skilled engineers
RISK
• Scalable model to respond to changes in demand
7. Downstream Case Study
• Large Canadian Utility (IESO)
• Meter Data Management System
• Processes 4.7 million meters
• 120 million meter readings per day
• Oracle RDBMS has 300+ billion records
• Could not accommodate demand for analytics
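The figures on this slide can be put in proportion with a little arithmetic. The sketch below assumes, purely for illustration, that every Oracle record is a meter reading and that the daily rate is constant; neither assumption is stated in the deck, and "300+ billion" is treated as a lower bound.

```python
meters = 4_700_000
readings_per_day = 120_000_000
oracle_records = 300_000_000_000  # slide says "300+ billion": a lower bound

# Average readings each meter contributes per day (roughly hourly).
readings_per_meter_per_day = readings_per_day / meters

# Days of history implied if every record were a reading at today's rate
# (an illustrative assumption, not a figure from the deck).
days_of_history = oracle_records / readings_per_day
years_of_history = days_of_history / 365
```

Under those assumptions each meter reports about 25.5 readings a day, and the existing store holds on the order of 2,500 days (just under seven years) of readings.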
8. Downstream Pattern Case Study (before)
[Diagram: the Meter Data Management system and its interfaces. Components: Distributors (CIS / AMI System), Consumers (Billing Statement, Web Presentment). Flows: Meter Data, Master Data, Billing Request / Billing Response, Web Service Request / Web Service Response, Reports.]
9. Downstream Pattern Case Study (after)
[Diagram: the same Meter Data Management system and interfaces as before (Distributors' CIS / AMI System, Consumers' Billing Statement and Web Presentment, with Meter Data, Master Data, Billing Request / Response, Web Service Request / Response and Reports flows), now with Meter Reads Retrieval Web Services and a Kx Technologies layer added. New elements: Bulk Data Extract (Initial Extract, Intra-Day Extract), Change Data Capture (Real Time), Transform & Load, Kdb+ Database, Queries & Visualisation.]
11. Downstream Pattern
Pros
• Maintains investment in existing system
• Rapid implementation of Kx technology
• Low risk or impact on existing system
• Functionality and availability improved
Cons
• Added storage and possibly licensing costs
• Updates must flow through existing system
• Doesn’t address streaming
[Diagram: a Data Feed enters the Existing System of Record. Time series data moves by real-time or scheduled replication into Time series & Master Data in kdb+ (Kx Technologies), which returns Query results to Ad hoc Queries, Analysis and Dashboards.]
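The downstream pattern (writes stay with the existing system of record, and a replica is fed by real-time or scheduled replication and serves the analytics) can be sketched in a few lines of Python. Everything here is hypothetical: the class names, the in-memory change log and the meter-reading shape are illustrative stand-ins, not the IESO implementation or any Kx API.

```python
from collections import defaultdict


class SystemOfRecord:
    """Stand-in for the existing system: every write also lands in a change log."""

    def __init__(self):
        self.rows = {}
        self.change_log = []  # entries are (sequence, key, value)

    def write(self, key, value):
        self.rows[key] = value
        self.change_log.append((len(self.change_log), key, value))


class AnalyticsReplica:
    """Stand-in for the downstream time-series store; it never writes back."""

    def __init__(self):
        self.readings = defaultdict(list)  # meter_id -> [(hour, kwh), ...]
        self.next_seq = 0  # first change-log entry not yet applied

    def sync(self, source):
        """Apply changes not yet seen (call on a schedule or per event)."""
        pending = source.change_log[self.next_seq:]
        for _seq, (meter_id, hour), kwh in pending:
            self.readings[meter_id].append((hour, kwh))
        self.next_seq += len(pending)
        return len(pending)

    def total_kwh(self, meter_id):
        """Example ad hoc query served without touching the system of record."""
        return sum(kwh for _hour, kwh in self.readings[meter_id])


source = SystemOfRecord()
replica = AnalyticsReplica()

source.write(("m1", 0), 1.5)
source.write(("m1", 1), 2.0)
source.write(("m2", 0), 0.5)
first_batch = replica.sync(source)   # initial extract: 3 changes

source.write(("m1", 2), 1.0)
second_batch = replica.sync(source)  # incremental capture: 1 change
```

The sketch treats readings as append-only, so a corrected reading would show up twice in the replica. It also reflects the "Cons" above: the replica only ever sees what has already passed through the existing system.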
13. Kx Use Cases
BUSINESS USE CASE: APPLICATION
• Real-time Analytics: Tick-capture and streaming data are analysed and enriched in real time to produce live indicators of current conditions for further action.
• Quantitative Research: Run trade, quote and fundamental analysis on large datasets and produce trading indicators faster.
• Risk Management: Intra-day, pricing, credit, exposure and P&L alerts with visual tools, including heat maps and OLAP drill-downs to monitor activity.
• Market Surveillance: Implement trading control alerts related to in-house and regulatory requirements, as well as the generation of planning reports.
• Depth-of-Book Analysis: Create real-time depth-of-book views for any instrument across thousands of symbols. Build order books from disk in under a second.
• Network and Hardware Management: Manage multi-server distributed environments from a single dashboard. Monitor the health of thousands of processes and servers across plants spread throughout multiple regions.
• Internet of Things (IoT): Real-time capture and processing of data generated by sensors in machines, homes, cars, smart meters, mobile phones and other devices.
16. Kx Performance
[Chart: benchmark query times and RAM used for queries]
Legend
• DNF = Did Not Finish
• RAM is memory used for queries
• Query times are in milliseconds
• kdb+ is 10 to 100 times faster than the other column stores (vertical, big3accel, hadoop / impala / parquet, ..)
• kdb+ is 100 to 1000 times faster than the row stores (postgres, big3rdbms, mongodb, spark, ..)
17. Reference Architecture | Kx Systems
[Diagram: three server groups (Server 1, 2, n). One group hosts FH 1, FH 2, FH n and the TP, fed by Real-time Data. One group hosts RDB 1, RDB n and the RTE. One group hosts the PDB and HDB 1, HDB n. A GW sits between these groups and the Client. Arrow legend: Data Flow, Queries & Results, Data Persistence.]
Acronyms
FH = Feed Handler
TP = Tickerplant
RDB = Realtime DB
RTE = CEP Engine
GW = Gateway
HDB = Historic DB
PDB = Persisting DB
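To make the roles behind these acronyms concrete, here is a minimal Python sketch of how the components might cooperate. It is an assumption-laden toy, not Kx code: all class and function names are hypothetical, the RTE is reduced to a "latest price" callback, and the PDB is collapsed into an end-of-day roll from the RDB to the HDB.

```python
class Tickerplant:
    """TP: logs each incoming record, then fans it out to subscribers."""

    def __init__(self):
        self.log = []
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, record):
        self.log.append(record)  # persistence path
        for callback in self.subscribers:
            callback(record)     # data-flow path


class RealtimeDB:
    """RDB: holds the current day's records in memory."""

    def __init__(self):
        self.rows = []

    def on_record(self, record):
        self.rows.append(record)

    def end_of_day(self, hdb):
        # Simplification: the RDB itself rolls data to the HDB (no separate PDB).
        hdb.rows.extend(self.rows)
        self.rows = []


class HistoricDB:
    """HDB: holds records from previous days."""

    def __init__(self):
        self.rows = []


class Gateway:
    """GW: single query entry point spanning historic and real-time data."""

    def __init__(self, rdb, hdb):
        self.rdb, self.hdb = rdb, hdb

    def prices(self, sym):
        return [r["price"] for r in self.hdb.rows + self.rdb.rows if r["sym"] == sym]


def feed_handler(tp, raw):
    """FH: parses a raw 'SYM,PRICE' message and publishes it to the tickerplant."""
    sym, price = raw.split(",")
    tp.publish({"sym": sym, "price": float(price)})


tp, rdb, hdb = Tickerplant(), RealtimeDB(), HistoricDB()
gw = Gateway(rdb, hdb)
latest = {}  # stands in for the RTE: a streaming "last price" calculation


def rte_on_record(record):
    latest[record["sym"]] = record["price"]


tp.subscribe(rdb.on_record)
tp.subscribe(rte_on_record)

feed_handler(tp, "ABC,10.0")
feed_handler(tp, "XYZ,20.0")
rdb.end_of_day(hdb)           # yesterday's data now lives in the HDB
feed_handler(tp, "ABC,11.0")  # today's data lives in the RDB

abc_prices = gw.prices("ABC")  # the client sees one continuous series
```

The design point the sketch tries to capture is that clients query through one gateway and never need to know whether a record currently lives in memory or in the historic store.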
19. Engineering for Performance
The world's leading time series database, specifically designed for handling massive data volumes and real-time streaming analytics.
• To improve search performance and data consumption we apply sharding, where data is split between multiple servers.
• Stream for Kx offers inbuilt horizontal scaling across all data micro services.
• Horizontal scalability is applied at the point of data capture and data querying.
• In-built Map Reduce means that results are virtually instantaneous without the additional overhead of defining unique aggregation logic.
• For increased capacity, we use replication. This is the process of mirroring our data-set.
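The combination of sharding and map-reduce described above can be illustrated with a short Python sketch. It is a generic example under stated assumptions, not Stream for Kx: the shard count, the CRC32-based routing and the helper names are all hypothetical, and the "servers" are just in-process lists.

```python
from zlib import crc32

NUM_SHARDS = 3  # hypothetical number of servers


def shard_for(sym):
    """Route a symbol to a shard deterministically (hash-based sharding)."""
    return crc32(sym.encode()) % NUM_SHARDS


shards = [[] for _ in range(NUM_SHARDS)]


def insert(record):
    shards[shard_for(record["sym"])].append(record)


def map_partial(shard):
    """Map step: each shard returns a partial (sum, count) over its own rows."""
    prices = [r["price"] for r in shard]
    return sum(prices), len(prices)


def reduce_average(partials):
    """Reduce step: combine the partials into one global average."""
    total = sum(t for t, _ in partials)
    count = sum(c for _, c in partials)
    return total / count


for sym, price in [("ABC", 10.0), ("XYZ", 20.0), ("DEF", 30.0), ("GHI", 40.0)]:
    insert({"sym": sym, "price": price})

average_price = reduce_average([map_partial(s) for s in shards])

# Replication: mirror every shard so a second set of servers can share reads.
replicas = [list(s) for s in shards]
```

An average cannot be computed by averaging per-shard averages, which is why the map step returns (sum, count) pairs. A built-in map-reduce spares the user from writing this decomposition by hand for each aggregation.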
20. How do we use our technology?
TOOLS for Kx - HARNESS THE POWER OF DATA
• Simplify both real-time & historical data using one powerful enterprise platform.
Kx - CORE TECHNOLOGY
• Streaming analytics, in-memory compute and database technology, providing a full application server with a powerful functional scripting language.
Kx SOLUTIONS - BUILD POWERFUL BUSINESS INSIGHTS
• Our tools are used to accelerate implementation of proven solutions for complex problems.
OUR PEOPLE - CHALLENGE US WITH YOUR UNIQUE PROBLEMS
• Our engineers develop, deploy and support solutions for virtually any problem involving massive amounts of data.
21. Our Industry Solutions
Kx Technology
• Stream for Kx: Tools to rapidly develop and deploy streaming real-time and historical analytics
• Dashboards for Kx
• Kx for Algo: Build, test, and deploy algorithmic trading strategies
• Kx for Analytics: Real-time and historical market analytics
• Kx for Surveillance: Real-time market monitoring; surveillance workflow
• Kx for Pharma: IMS patient record analytics, manufacturing, clinical research
• Kx for Sensors: Smart meters, utilities, geolocation, customer analytics
22. Resources
• Free 32-bit download version: http://kx.com/software-download.php
• Kx Wiki: http://code.kx.com/wiki/Main_Page
• Google Group: https://groups.google.com/forum/#!forum/personal-kdbplus
• Kx Github: http://kxsystems.github.io/
• STAC benchmarks: https://stacresearch.com/kx
• Kx Meetups: http://kx.meetup.com/
Kyra Coyne
kcoyne@firstderivatives.com
+4917659883653
Kx® and kdb+ are registered trademarks of Kx Systems, Inc., a subsidiary of First Derivatives plc