BI Converges with AI - GPUs for Fast Data
James Mesney | Principal Solution Engineer | Kinetica EMEA
Analytics challenges faced by US Army Intelligence
• Kinetica was incubated as a massively parallel computational engine for US Army INSCOM.
• 200+ sources of streaming data producing 20B new records per day.
• Requirement to do ad-hoc analysis with human-response time on hot data.
• Reduce reliance on expensive racks of premium hardware.
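To give a feel for the ingest volume quoted on this slide, the short calculation below converts the daily figure into a sustained per-second rate. It is only a back-of-envelope sketch: it assumes records arrive evenly over 24 hours and uses 200 as a lower bound for "200+" sources; the variable and function names are illustrative.

```python
# Figures taken from the slide; the uniform arrival rate is an assumption.
RECORDS_PER_DAY = 20_000_000_000   # "20B new records per day"
SOURCES = 200                      # "200+ sources", used as a lower bound
SECONDS_PER_DAY = 24 * 60 * 60


def sustained_rate(records_per_day: int) -> float:
    """Average records per second if arrivals were spread evenly over a day."""
    return records_per_day / SECONDS_PER_DAY


total_rate = sustained_rate(RECORDS_PER_DAY)
per_source_rate = total_rate / SOURCES  # upper bound, since there are 200+ sources

print(f"{total_rate:,.0f} records/s overall")
print(f"{per_source_rate:,.0f} records/s per source at most")
```

Real feeds are bursty, so peak rates would likely be higher than this average.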
Why a GPU Database? Why Now?
• Leverage Innovations in CPU and GPU technology
• Big Data
• In-memory and Parallel Processing
• Traditional Analytics
• Emerging AI/ML/Deep Learning Computing
• Real-Time and Streaming Data
• Geospatial and Temporal
• Use Commodity Hardware (and less of it)
• With Simplified Architecture / software stack
Why GPUs?
“By 2020, 80% of Big Data and Analytics deployments will need
distributed micro analytics and 40% of all business analytics software
will incorporate prescriptive analytics built on cognitive computing
functionality. Both of these trends require a dramatic increase in
processing power that could be enabled by GPUs.”
— IDC
“By 2018, over 50% of developer teams will embed cognitive services
in their apps (vs 1% today) providing U.S. enterprises with over
$60 billion annual savings by 2020.”
— IDC
What is a GPU?
GPUs are designed around thousands of small, efficient cores that are well suited to performing repeated, similar instructions in parallel. This makes them a good fit for the compute-intensive workloads that large data sets require.
• 5,000+ cores per device, versus 16 to 32 cores per typical CPU device.
• The high-performance computing trend is towards using GPUs to solve massive processing challenges.
• GPU acceleration brings high-performance compute to commodity hardware.
• Parallel processing is ideal for scanning an entire dataset and for brute-force compute.
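The "same instruction on many values" pattern described on this slide can be sketched on a CPU with a thread pool. This is only an analogy, not GPU code: the column, the threshold, and the helper names (`chunked`, `scan_chunk`, `parallel_scan`) are hypothetical, and a real GPU would run thousands of such lanes rather than four workers.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(values, n_chunks):
    """Split a list into contiguous slices of near-equal size."""
    size = max(1, -(-len(values) // n_chunks))  # ceiling division, at least 1
    return [values[i:i + size] for i in range(0, len(values), size)]


def scan_chunk(chunk, threshold):
    """Apply the same predicate to every value in one chunk; return (count, sum)."""
    matches = [v for v in chunk if v > threshold]
    return len(matches), sum(matches)


def parallel_scan(values, threshold, n_workers=4):
    """Brute-force full scan: every worker runs the same code on its own slice."""
    chunks = chunked(values, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(lambda c: scan_chunk(c, threshold), chunks))
    # Reduce step: combine the per-chunk partial results.
    count = sum(p[0] for p in partials)
    total = sum(p[1] for p in partials)
    return count, total


column = list(range(1_000))  # hypothetical numeric column
count, total = parallel_scan(column, threshold=989)
print(count, total)
```

No index is consulted: every value is inspected, and the work splits cleanly because each slice is independent of the others.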
GPU Benefits – One Tenth of the Hardware
SOLUTION
• Replacing a 300-node database cluster with 30 nodes of Kinetica powered by GPUs
BENEFITS
• 1/10 the size
• 100x to 200x faster than other in-memory databases
• Significant datacenter operations cost savings: headcount, environmental footprint, etc.
• Better deployment flexibility
• Very high performance, at scale
CPU clusters
NVIDIA
The Rise of GPU Computing
Chart: performance on a log scale (10^2 to 10^7, SpecINT) plotted against year, 1980 to 2020.
• Single-threaded perf: 1.5X per year, then 1.1X per year
• GPU-Computing perf: 1.5X per year, 1000X by 2025
Original data up to the year 2010 collected and plotted by M. Horowitz, F. Labonte, O. Shacham, K. Olukotun, L. Hammond, and C. Batten. New plot and data collected for 2010-2015 by K. Rupp.
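The growth rates on this chart compound, which is easy to underestimate. As a rough worked example, the snippet below asks how many years of 1.5X annual growth are needed to reach 1000X, and how far 1.1X annual growth gets in the same time. The chart's starting year is not stated in the transcript, so the calculation deliberately avoids assuming one.

```python
import math

GPU_RATE = 1.5   # "GPU-Computing perf 1.5X per year"
CPU_RATE = 1.1   # recent single-threaded trend, "1.1X per year"
TARGET = 1000    # "1000X"

# Solve GPU_RATE ** years == TARGET for years.
years_to_target = math.log(TARGET) / math.log(GPU_RATE)

# Growth of the slower trend over the same number of years.
cpu_gain_same_period = CPU_RATE ** years_to_target

print(f"{years_to_target:.1f} years at 1.5X/year to reach 1000X")
print(f"{cpu_gain_same_period:.1f}X gained at 1.1X/year over the same period")
```

Under these rates, roughly 17 years separates a 1000X gain from a gain of about 5X.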
Video
Mythbusters CPU vs GPU Demonstration!
Product
What is Kinetica?
• GPU-accelerated, in-memory, MPP relational database
• Natural language processing and full-text search
• Native geospatial support and data visualisation
• Real-time data handlers to ingest structured and unstructured data
• Deep integration with open source and commercial frameworks / apps: TensorFlow, Hadoop, Spark, NiFi, Kafka, Storm, Tableau, Kibana, Caravel…
• Linear, predictable scale-out for data ingestion, retention and querying
• No typical tuning, indexing, and tweaking
• Huge range of APIs: ODBC, JDBC, SQL, Java, JS, C++, Python, C#, Node.js, REST
KINETICA
Commodity Hardware with GPUs
Disk
GPU Accelerated
Columnar In-memory Database
HTTP Head Node
Kinetica: Core
ANALYTICS DATABASE ACCELERATED BY GPUs
Columnar in-memory database. Data persisted to disk
Data available much like a traditional RDBMS… tables,
rows, columns, views
Interact with Kinetica through its native REST API, Java,
Python, JavaScript, NodeJS, C++, SQL, ODBC, JDBC.
Native GIS support and Visualisation
High concurrency
Security + Administration + Backup + Monitoring + Audit
Typical hardware setup: 256GB –
1.5TB memory. 2-4 GPUs per node.
KINETICA
Commodity Hardware with GPUs
Disk
GPU Accelerated
Columnar In-memory Database
Interfaces & Orchestration
Kinetica High-Level Architecture
• VISUALIZATION via ODBC/JDBC
• APIs: Java API, JavaScript API, REST API, C# and C++ API, Node.js API, Python API
• OPEN SOURCE INTEGRATION: Apache NiFi, Apache Kafka, Apache Spark, Apache Storm
• GEOSPATIAL CAPABILITIES: Geometric Objects, Tracks, Geospatial Endpoints, WMS, WKT
• OTHER INTEGRATION: Message Queues, ETL Tools, Streaming Tools
• KINETICA CLUSTER (On-Demand Scale): SERVER 1, SERVER 2, SERVER 3, … SERVER n, each with Commodity Hardware with GPUs, Disk, a GPU Accelerated Columnar In-memory Database, and Coordination & Orchestration
Kinetica Reveal – Interactive Real-Time Data Exploration
Visualization Pipeline: Outputs, Maps, Video
RENDER MASSIVE DATASETS IN SUB-SECOND
e.g. 4bn Twitter posts on a map in < 1 second
VIDEOS for GEO-TEMPORAL VISUALISATION
HEAT MAPS
FIND MATCHING RESULTS IN A CUSTOM AREA
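Two of the outputs listed on this slide, heat maps and matching results inside a custom area, reduce to simple geometric operations. The sketch below shows one common way to do each on plain (x, y) points: grid binning for the heat map and ray casting for the area test. It is illustrative only and says nothing about how Kinetica implements either; the function names and sample points are made up.

```python
from collections import Counter


def heatmap_bins(points, cell_size):
    """Count points per square grid cell; the counts drive heat-map colour."""
    return Counter((int(x // cell_size), int(y // cell_size)) for x, y in points)


def point_in_polygon(x, y, polygon):
    """Ray casting: count how many polygon edges a ray going right from (x, y) crosses."""
    inside = False
    j = len(polygon) - 1
    for i in range(len(polygon)):
        xi, yi = polygon[i]
        xj, yj = polygon[j]
        if (yi > y) != (yj > y):  # the edge straddles the ray's y level
            x_cross = xi + (y - yi) * (xj - xi) / (yj - yi)
            if x < x_cross:
                inside = not inside
        j = i
    return inside


def find_in_area(points, polygon):
    """Return the points that fall inside a user-drawn polygon."""
    return [p for p in points if point_in_polygon(p[0], p[1], polygon)]


points = [(0.5, 0.5), (0.7, 0.2), (2.5, 2.5), (5.0, 2.0), (3.9, 3.9)]
bins = heatmap_bins(points, cell_size=1.0)
area = [(0, 0), (4, 0), (4, 4), (0, 4)]  # hypothetical custom area
matches = find_in_area(points, area)
print(bins.most_common(1), len(matches))
```

Both operations test every point independently, which is why they parallelise well across many cores.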
BI and AI Convergence
More Sophisticated Analytics Benefits from GPU
• Simple Reporting: List defaults from customers in the last 3 years.
• Standard Analytics: What is the default rate for customers over a certain age, by region? By income?
• Real-time Analytics: What is the risk profile of this customer up to and including the transactions he made 10 seconds ago?
• Machine Learning: Given location, buying history, demographic, past history, past purchases, what is the likelihood this customer will default?
• Deep Learning: Deduce from unspecified signals across a wide range of datasets the likelihood this customer will default.
INCREASING BENEFIT FROM GPUs
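The first two tiers above are ordinary SQL questions. As a small illustration, the sketch below expresses them against an in-memory SQLite table. SQLite is only a stand-in here, and the `loans` table, its columns, the sample rows, and the 2017 reference year are all invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE loans (customer_id INTEGER, region TEXT, age INTEGER, "
    "year INTEGER, defaulted INTEGER)"
)
rows = [
    (1, "North", 34, 2016, 1),
    (2, "North", 61, 2017, 0),
    (3, "South", 58, 2015, 1),
    (4, "South", 67, 2017, 1),
    (5, "South", 45, 2013, 1),
    (6, "North", 72, 2016, 0),
]
conn.executemany("INSERT INTO loans VALUES (?, ?, ?, ?, ?)", rows)

# Simple reporting: list defaults from the last 3 years (2015 to 2017 here).
recent_defaults = conn.execute(
    "SELECT customer_id FROM loans "
    "WHERE defaulted = 1 AND year >= ? ORDER BY customer_id",
    (2015,),
).fetchall()

# Standard analytics: default rate for customers over a certain age, by region.
rate_by_region = conn.execute(
    "SELECT region, AVG(defaulted) FROM loans "
    "WHERE age > ? GROUP BY region ORDER BY region",
    (50,),
).fetchall()

print(recent_defaults)
print(rate_by_region)
```

The later tiers (real-time, machine learning, deep learning) need fresher data and heavier computation per question.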
Advanced In-Database Analytics
1. User-defined functions (UDFs) can receive table data, do
arbitrary computations, and save output to a separate table
in a distributed manner.
2. UDFs have direct access to CUDA APIs – enables compute-
to-grid analytics for logic deployed within Kinetica.
3. Works with custom code, or packaged code. Opens the way
for machine learning/artificial intelligence libraries such as
TensorFlow, BIDMach, Caffe and Torch to work on data
directly within Kinetica.
4. Available now with C++, Python & Java bindings.
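The flow in point 1 (receive table data, compute, save output to a separate table, in a distributed manner) can be mimicked in a few lines of plain Python. This is a conceptual sketch only and does not use the Kinetica UDF API: the shards, the `flag_large_amounts` function, and the `run_udf` helper are hypothetical, and threads stand in for separate servers.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a distributed table: one list of rows per server.
input_shards = [
    [{"id": 1, "amount": 120.0}, {"id": 2, "amount": 80.0}],
    [{"id": 3, "amount": 310.0}],
    [{"id": 4, "amount": 45.0}, {"id": 5, "amount": 200.0}],
]


def flag_large_amounts(rows, limit=150.0):
    """Example UDF body: an arbitrary per-row computation on one shard."""
    return [{"id": r["id"], "large": r["amount"] > limit} for r in rows]


def run_udf(udf, shards):
    """Run the same UDF on every shard in parallel and return the output shards."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        return list(pool.map(udf, shards))


output_shards = run_udf(flag_large_amounts, input_shards)

# The separate output table, flattened here for inspection.
output_table = [row for shard in output_shards for row in shard]
print(output_table)
```

Each shard is processed where it lives and only results are collected, which is the property that lets such functions scale with the number of servers.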
ORCHESTRATION LAYER WITH USER-DEFINED FUNCTIONS (UDFs)
Diagram: on each of n Kinetica servers (physical or virtual), a proc server executes UDF_A, UDF_B, … UDF_n against Table A, Table B, Table C, … Table n, using the GPU and CUDA libraries. Data is returned to an output table for further analysis and visualisation.
Applications
Kinetica Enables Broad Enterprise Solutions
RETAIL/CPG
Omni-Channel
Customer Experience
Supply Chain Optimization
Targeted Marketing
UTILITIES
Smart Meters
Smart Grid Optimization
Infrastructure MGMT
CROSS INDUSTRY
Real-Time Analytics
Converge AI & BI
Location-Based Analytics
IoT Analytics
FINANCIAL SERVICES
Risk Modeling
Financial Crimes
Compliance
Customer Experience
HEALTHCARE
Drug Development
Precision Medicine
Patient 360
MEDIA/ENTERTAINMENT
Sentiment Analytics
Recommendation Engines
Ad Targeting
COMMUNICATIONS
Customer Churn
Network Optimization
Content Targeting
LAW ENFORCEMENT
INTEL & DEFENCE
Cyber Security
Counter-Terrorism
Border Control
Threat Detection
CASE STUDY: LOCATION-BASED ANALYTICS
INTELLIGENCE: US Army - INSCOM
U.S. Army INSCOM Migrated from Oracle to Kinetica
Oracle Spatial (92 minutes) vs. GPUdb (20 ms)
1 GPUdb server vs. 42 servers with Oracle 10gR2 (2011)
42x Lower Space, 28x Lower Cost, 38x Lower Power Cost
MISSION OBJECTIVE
• Kill or capture terrorists in real-time
• Move from document-based to entity-based search
NEW CAPABILITIES DELIVERED
• Intel analysts can do real-time geospatial analytics on 200B
new records per day from 200+ UAV, SIGINT, ISR, and
GEOINT streaming big data feeds
• Military analysts are able to query and visualise billions to
trillions of near real-time objects
SOLUTION OVERVIEW
• US Army’s in-memory computational engine for geospatial
and temporal data. A major joint cloud initiative within the
Intelligence Community (IC ITE)
• Queries down from 92 minutes to less than 1 second
• Replaced 42 x Oracle 10gR2 servers with SINGLE Kinetica
server – 42x lower space, 28x lower cost, 38X less power
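For scale, the query times quoted in this case study can be turned into speedup factors. The arithmetic below assumes the 92-minute and 20 ms figures refer to the same query, which the slide implies but does not state; the 1-second figure gives a conservative lower bound.

```python
# Times from the slide, expressed in milliseconds to keep the arithmetic exact.
BEFORE_MS = 92 * 60 * 1000   # "92 Minutes" on Oracle Spatial
AFTER_MS = 20                # "20ms" on GPUdb
ONE_SECOND_MS = 1000         # "less than 1 second"

speedup_20ms = BEFORE_MS / AFTER_MS
speedup_1s = BEFORE_MS / ONE_SECOND_MS

print(f"{speedup_20ms:,.0f}x faster at 20 ms")
print(f"at least {speedup_1s:,.0f}x faster at under 1 second")
```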
Availability
Kinetica Platform Availability
Certified to run on-premise with:
Or in the cloud:
Accelerated by:
Coming soon:
Test Drive Kinetica @ www.kinetica.com/trial
Demonstration
James Mesney | Principal Solution Engineer | Kinetica EMEA

Big Data LDN 2017: BI Converges with AI - GPUs for Fast Data

  • 1. BI Converges with AI - GPUs for Fast Data James Mesney | Principal Solution Engineer | Kinetica EMEA
  • 2. Analytics challenges faced by US Army Intelligence. Kinetica incubated as a massively parallel computational engine for US Army INSCOM. 200+ sources of streaming data producing 20B new records per day. Requirements to do ad-hoc analysis with human-response time on hot data. Reduce reliance on expensive racks of premium hardware.
  • 3. Why a GPU Database? Why Now? • Leverage Innovations in CPU and GPU technology • Big Data • In-memory and Parallel Processing • Traditional Analytics • Emerging AI/ML/Deep Learning Computing • Real-Time and Streaming Data • Geospatial and Temporal • Use Commodity Hardware (and less of it) • With Simplified Architecture / software stack
  • 4. Why GPUs? “By 2020, 80% of Big Data and Analytics deployments will need distributed micro analytics and 40% of all business analytics software will incorporate prescriptive analytics built on cognitive computing functionality. Both of these trends require a dramatic increase in processing power that could be enabled by GPUs.” (IDC) “By 2018, over 50% of developer teams will embed cognitive services in their apps (vs 1% today) providing U.S. enterprises with over $60 billion annual savings by 2020.” (IDC)
  • 5. What is a GPU? 5,000+ cores per device versus 16 to 32 cores per typical CPU device. High performance computing is trending towards using GPUs to solve massive processing challenges. GPU acceleration brings high performance compute to commodity hardware. Parallel processing is ideal for scanning an entire dataset and for brute-force compute. GPUs are designed around thousands of small, efficient cores that are well suited to performing repeated similar instructions in parallel. This makes them well suited to the compute-intensive workloads required of large data sets.
  • 6. GPU Benefits – One Tenth of the Hardware. SOLUTION • Replacing a 300-node database cluster with 30 nodes of Kinetica powered by GPUs. BENEFITS • 1/10 the size • 100x to 200x faster than other in-memory databases • Significant datacenter operations cost savings (headcount, environmental footprint, etc.) • Better deployment flexibility • Very high performance, at scale. Diagram labels: CPU clusters; NVIDIA.
  • 7. The Rise of GPU Computing (chart, 1980 to 2020; axis label: SpecINT). Single-threaded perf: 1.5X per year, then 1.1X per year. GPU-Computing perf: 1.5X per year, 1000X by 2025. Original data up to the year 2010 collected and plotted by M. Horowitz, F. Labonte, O. Shacham, K. Olukotun, L. Hammond, and C. Batten. New plot and data collected for 2010-2015 by K. Rupp.
  • 8. Video: Mythbusters CPU vs GPU Demonstration!
  • 11. What is Kinetica? GPU-accelerated, in-memory, MPP relational database. Natural language processing and full-text search. Native geospatial support and data visualisation. Real-time data handlers to ingest structured and unstructured data. Deep integration with open source and commercial frameworks / apps: TensorFlow, Hadoop, Spark, NiFi, Kafka, Storm, Tableau, Kibana, Caravel… Linear, predictable scale-out for data ingestion, retention and querying. No typical tuning, indexing, and tweaking. Huge range of APIs: ODBC, JDBC, SQL, Java, JS, C++, Python, C#, Node.js, REST.
  • 12. Kinetica: Core. ANALYTICS DATABASE ACCELERATED BY GPUs. Columnar in-memory database. Data persisted to disk. Data available much like a traditional RDBMS: tables, rows, columns, views. Interact with Kinetica through its native REST API, Java, Python, JavaScript, NodeJS, C++, SQL, ODBC, JDBC. Native GIS support and Visualisation. High concurrency. Security + Administration + Backup + Monitoring + Audit. Typical hardware setup: 256GB – 1.5TB memory. 2-4 GPUs per node. Diagram labels: HTTP Head Node; Interfaces & Orchestration; GPU Accelerated Columnar In-memory Database; Disk; Commodity Hardware with GPUs.
  • 13. Kinetica High-Level Architecture. VISUALIZATION via ODBC/JDBC. APIs: Java API, JavaScript API, REST API, C# and C++ API, Node.js API, Python API. OPEN SOURCE INTEGRATION: Apache NiFi, Apache Kafka, Apache Spark, Apache Storm. GEOSPATIAL CAPABILITIES: Geometric Objects, Tracks, Geospatial Endpoints, WMS, WKT. OTHER INTEGRATION: Message Queues, ETL Tools, Streaming Tools. KINETICA CLUSTER (On-Demand Scale): SERVER 1, SERVER 2, SERVER 3, SERVER n…, each labelled Coordination & Orchestration, GPU Accelerated Columnar In-memory Database, Disk, Commodity Hardware with GPUs.
  • 14. Kinetica Reveal – Interactive Real-Time Data Exploration
  • 15. Visualization Pipeline: Outputs, Maps, Video. RENDER MASSIVE DATASETS IN SUB-SECOND, e.g. 4bn Twitter posts on a map in < 1 second. VIDEOS for GEO-TEMPORAL VISUALISATION. HEAT MAPS. FIND MATCHING RESULTS IN A CUSTOM AREA.
  • 16. BI and AI Convergence
  • 17. More Sophisticated Analytics Benefits from GPU (INCREASING BENEFIT FROM GPUs, left to right). Simple Reporting: List defaults from customers in the last 3 years. Standard Analytics: What is the default rate for customers over a certain age, by region? By income? Real-time Analytics: What is the risk-profile of this customer up to and including the transactions he made 10 seconds ago? Machine Learning: Given location, buying history, demographic, past-history, past-purchases, what is the likelihood this customer will default? Deep Learning: Deduce from unspecified signals across a wide range of datasets the likelihood this customer will default.
  • 18. Advanced In-Database Analytics: ORCHESTRATION LAYER WITH USER-DEFINED FUNCTIONS (UDFs). 1. User-defined functions (UDFs) can receive table data, do arbitrary computations, and save output to a separate table in a distributed manner. 2. UDFs have direct access to CUDA APIs, which enables compute-to-grid analytics for logic deployed within Kinetica. 3. Works with custom code or packaged code. Opens the way for machine learning/artificial intelligence libraries such as TensorFlow, BIDMach, Caffe and Torch to work on data directly within Kinetica. 4. Available now with C++, Python & Java bindings. Diagram labels: PHYSICAL / VIRTUAL SERVER; n number of Kinetica servers; Table A, Table B, Table C, Table n; Proc Server; UDF_A, UDF_B, UDF_n; Execution; GPU; CUDA Libraries; Data returned to output table for further analysis & Visualisation.
  • 20. Kinetica Enables Broad Enterprise Solutions. RETAIL/CPG: Omni-Channel Customer Experience, Supply Chain Optimization, Targeted Marketing. UTILITIES: Smart Meters, Smart Grid Optimization, Infrastructure MGMT. CROSS INDUSTRY: Real-Time Analytics, Converge AI & BI, Location-Based Analytics, IoT Analytics. FINANCIAL SERVICES: Risk Modeling, Financial Crimes, Compliance, Customer Experience. HEALTHCARE: Drug Development, Precision Medicine, Patient 360. MEDIA/ENTERTAINMENT: Sentiment Analytics, Recommendation Engines, Ad Targeting. COMMUNICATIONS: Customer Churn, Network Optimization, Content Targeting. LAW ENFORCEMENT, INTEL & DEFENCE: Cyber Security, Counter-Terrorism, Border Control, Threat Detection.
  • 21. CASE STUDY: LOCATION BASED ANALYTICS. INTELLIGENCE: US Army - INSCOM. U.S. Army INSCOM migrated from Oracle to Kinetica: Oracle Spatial (92 minutes) vs. GPUdb (20ms); 1 GPUdb server vs. 42 servers with Oracle 10gR2 (2011); 42x lower space, 28x lower cost, 38x lower power cost. MISSION OBJECTIVE • Kill or capture terrorists in real-time • Move from document-based to entity-based search. NEW CAPABILITIES DELIVERED • Intel analysts can do real-time geospatial analytics on 200B new records per day from 200+ UAV, SIGINT, ISR, and GEOINT streaming big data feeds • Military analysts are able to query and visualise billions to trillions of near real-time objects. SOLUTION OVERVIEW • US Army’s in-memory computational engine for geospatial and temporal data. A major joint cloud initiative within the Intelligence Community (IC ITE) • Queries down from 92 minutes to less than 1 second • Replaced 42 x Oracle 10gR2 servers with a SINGLE Kinetica server: 42x lower space, 28x lower cost, 38x less power.
  • 23. Kinetica Platform Availability. Certified to run on-premise with: Or in the cloud: Accelerated by: Coming soon:
  • 24. Test Drive Kinetica @ www.kinetica.com/trial
  • 26. James Mesney | Principal Solution Engineer | Kinetica EMEA
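
The "Standard Analytics" question on slide 17 (what is the default rate for customers over a certain age, by region?) can be sketched as a single full scan over column-oriented data, which is the access pattern slide 5 describes as ideal for parallel processing. The snippet below is a plain-Python, CPU-only illustration and not Kinetica's API; the table contents, column names and the `default_rate_by_region` function are all hypothetical.

```python
from collections import defaultdict

# Hypothetical columnar table: one list per column, rows aligned by index.
customers = {
    "age":       [34, 61, 47, 72, 29, 66],
    "region":    ["north", "south", "north", "south", "north", "north"],
    "defaulted": [False, True, False, True, True, True],
}


def default_rate_by_region(table, min_age):
    """Scan every row once; return {region: default rate} for customers older than min_age."""
    totals = defaultdict(int)    # customers per region that pass the age filter
    defaults = defaultdict(int)  # of those, how many defaulted
    for age, region, defaulted in zip(table["age"], table["region"], table["defaulted"]):
        if age > min_age:
            totals[region] += 1
            defaults[region] += int(defaulted)
    return {region: defaults[region] / totals[region] for region in totals}


rates = default_rate_by_region(customers, min_age=40)
print(rates)
```

Each row is filtered and counted independently of every other row, so the same scan could be split across many cores and the per-region counts summed at the end. That independence is the property that makes brute-force scans a good fit for GPUs.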