How we broke Apache Ignite by adding persistence
Stephen Darlington
16 December, 2019
2018 © GridGain Systems
2019 © GridGain Systems GridGain Company Confidential
(spoiler: already fixed)
What is Ignite?
Distributed memory-centric storage: Combines the performance and scale of in-memory computing with disk durability and strong consistency in one system
Co-located Computations: Brings the computations to the servers where the data actually resides, eliminating the need to move data over the network
Distributed Key-Value: Read, write and transact with fast key-value APIs
Distributed SQL: Horizontally scalable, fault-tolerant distributed SQL database that treats memory and disk as active storage tiers
ACID Transactions: Supports distributed ACID transactions for key-value as well as SQL operations
Machine and Deep Learning: Set of simple, scalable and efficient tools that allow building predictive machine learning models without costly data transfers (ETL)
Apache Ignite In-Memory Computing Platform
[Architecture diagram, top to bottom]
Application Layer: Web, SaaS, Social, Mobile, IoT
In-Memory Data Store (Key-Value, SQL, Transactions, Compute Grid, Service Grid, Messaging, Streaming, Events, Machine and Deep Learning)
Persistent Layer: RDBMS, Mainframe, NoSQL, Hadoop, Ignite Persistence
Apache Ignite’s History
Data Grid -> Local Store -> Transactional Persistence -> ?
Circa 2011
Data Grid -> Local Store -> Transactional Persistence -> ?
Circa 2014
Data Grid -> Local Store -> Transactional Persistence -> ?
Circa 2017
• Start time does not depend on the data volume
• Can store more data than memory
• Crash recovery
• Single in-memory & native persistence architecture
Ignite 2.0: What we wanted
“We will just save everything to disk”
Circa 2017
Data Grid -> Local Store -> Transactional Persistence -> ?
Beginning: Durable Memory
• ARIES Architecture
• Page-based
• Write-ahead log (when persistence is enabled)
• Everything is off-heap
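The write-ahead-log idea on this slide can be sketched in a few lines. This is an illustrative toy in Python, not Ignite's implementation: `PageStore`, `put`, `checkpoint`, `crash`, and `recover` are hypothetical names, pages are plain dictionaries, and an in-memory list stands in for the append-only log file. It shows only the redo half of an ARIES-style design (no undo of uncommitted work).

```python
from dataclasses import dataclass, field

PAGE_COUNT = 4  # hypothetical: a tiny, fixed number of pages


@dataclass
class PageStore:
    """Toy page-based store with a write-ahead log (WAL)."""
    wal: list = field(default_factory=list)     # append-only log records
    memory: dict = field(default_factory=dict)  # page_id -> {key: value}; lost in a crash
    disk: dict = field(default_factory=dict)    # page_id -> {key: value}; survives a crash
    checkpointed: int = 0                       # WAL records already reflected on disk

    def put(self, key, value):
        page_id = hash(key) % PAGE_COUNT
        # 1. Log the change first, so it can be replayed after a crash.
        self.wal.append((page_id, key, value))
        # 2. Only then modify the in-memory page (it is now "dirty").
        self.memory.setdefault(page_id, {})[key] = value

    def checkpoint(self):
        # Copy every in-memory page to disk and remember the WAL position.
        self.disk = {pid: dict(page) for pid, page in self.memory.items()}
        self.checkpointed = len(self.wal)

    def crash(self):
        # Simulate losing memory; the WAL and the disk pages survive.
        self.memory = {}

    def recover(self):
        # Start from the last checkpoint, then replay the later WAL records.
        self.memory = {pid: dict(page) for pid, page in self.disk.items()}
        for page_id, key, value in self.wal[self.checkpointed:]:
            self.memory.setdefault(page_id, {})[key] = value

    def get(self, key):
        return self.memory.get(hash(key) % PAGE_COUNT, {}).get(key)
```

A write made after the last checkpoint exists only in memory and in the log, so replaying the log tail on restart is what brings it back.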
Beginning: Durable Memory
• PK Index: how to replace a HashMap
• Concurrent B+ Tree: a well-known data structure
• A separate PK index for each partition
• Compare key hash first
• Bonus: guaranteed iteration order in a hash map
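A small sketch of the "compare key hash first" ordering follows. It is an assumption-laden illustration, not Ignite code: a sorted Python list stands in for the concurrent B+ tree, `HashOrderedIndex` and `key_hash` are hypothetical names, and CRC32 is used only because it is a stable hash available in the standard library.

```python
import bisect
import zlib


def key_hash(key: str) -> int:
    # Hypothetical hash function; any stable hash would do for this sketch.
    return zlib.crc32(key.encode("utf-8"))


class HashOrderedIndex:
    """Sorted list standing in for a B+ tree ordered by (hash, key)."""

    def __init__(self):
        self._keys = []    # sorted list of (hash, key) pairs
        self._values = []  # values, parallel to _keys

    def put(self, key: str, value) -> None:
        probe = (key_hash(key), key)
        # Tuples compare the integer hash first; the key itself is only
        # compared when two hashes collide.
        i = bisect.bisect_left(self._keys, probe)
        if i < len(self._keys) and self._keys[i] == probe:
            self._values[i] = value
        else:
            self._keys.insert(i, probe)
            self._values.insert(i, value)

    def get(self, key: str):
        probe = (key_hash(key), key)
        i = bisect.bisect_left(self._keys, probe)
        if i < len(self._keys) and self._keys[i] == probe:
            return self._values[i]
        return None

    def keys(self):
        # Iteration order depends only on the stored keys, not on insert order.
        return [key for _, key in self._keys]
```

Because entries are kept sorted by hash and then by key, two indexes holding the same keys iterate in the same order regardless of how they were populated, which is the "bonus" the slide mentions.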
Baseline Topology
• [16:21:01] Ignite node started OK (id=326bab44)
• [16:21:01] >>> Ignite cluster is not active (limited functionality available). Use control.(sh|bat) script or IgniteCluster interface to activate.
• [16:21:01] Topology snapshot [ver=1, locNode=326bab44, servers=1, clients=0, state=INACTIVE, CPUs=8, offheap=3.2GB, heap=3.6GB]
• [16:21:01] ^-- Baseline [id=11, size=3, online=1, offline=2]
• [16:21:01] ^-- 2 nodes left for auto-activation [6213b7af-23bb-4c8d-a045-157d7f2d7718, db969788-fc01-41f4-a91c-c03f2d201f76]
• [16:21:19] Joining node doesn't have encryption data [node=89b6ef6c-1055-4678-bcfa-00fb222208ce]
• [16:21:19] Topology snapshot [ver=2, locNode=326bab44, servers=2, clients=0, state=INACTIVE, CPUs=8, offheap=6.4GB, heap=7.1GB]
• [16:21:19] ^-- Baseline [id=11, size=3, online=2, offline=1]
• [16:21:19] ^-- 1 nodes left for auto-activation [6213b7af-23bb-4c8d-a045-157d7f2d7718]
• [16:21:37] Joining node doesn't have encryption data [node=dd55ff24-da61-42cd-bbaf-c7940fab07d3]
• [16:21:37] Topology snapshot [ver=3, locNode=326bab44, servers=3, clients=0, state=INACTIVE, CPUs=8, offheap=9.6GB, heap=11.0GB]
• [16:21:37] ^-- Baseline [id=11, size=3, online=3, offline=0]
• [16:21:37] ^-- All baseline nodes are online, will start auto-activation
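The behaviour visible in this log (the cluster stays inactive until every baseline node is online) can be modelled roughly as below. This is a toy model under stated assumptions, not Ignite's logic: `BaselineCluster`, `node_joined`, and `status` are hypothetical names, and the node identifiers are made up.

```python
class BaselineCluster:
    """Toy model of a cluster that activates once all baseline nodes are online."""

    def __init__(self, baseline_ids):
        self.baseline = frozenset(baseline_ids)  # nodes expected to hold data
        self.online = set()                      # every node that has joined
        self.active = False

    def node_joined(self, node_id):
        """Register a joining node and return the baseline nodes still missing."""
        self.online.add(node_id)
        missing = sorted(self.baseline - self.online)
        if not missing:
            # All baseline nodes are present: activate automatically.
            self.active = True
        return missing

    def status(self):
        # Mirrors the size/online/offline counters shown in the log.
        online = len(self.baseline & self.online)
        return {"size": len(self.baseline), "online": online,
                "offline": len(self.baseline) - online}


cluster = BaselineCluster(["node-a", "node-b", "node-c"])
for node in ["node-a", "node-b", "node-c"]:
    missing = cluster.node_joined(node)
    print(node, cluster.status(), "missing:", missing, "active:", cluster.active)
```

In this model, a node that is not part of the baseline can join without moving the cluster any closer to activation, since only baseline members are counted.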
Disk. Predictable access speed
• Disks are slow (even NVMe)
• At peak load, a naïve implementation easily trips over its own tail
• Throughput suddenly drops to 0
Disk. Predictable access speed
• So, we need to… make Ignite slower
• Throttle input load depending on:
  • How fast we produce “dirty” pages
  • How fast we write to disk
  • How much free space is left in the copy-on-write buffer
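One way to combine the three signals above into a pause length is sketched here. The policy, the function name `throttle_delay`, its parameters, and the thresholds are all assumptions made for illustration; the slides do not describe the actual formula Ignite uses.

```python
def throttle_delay(dirty_pages_per_sec: float,
                   written_pages_per_sec: float,
                   cow_free_fraction: float,
                   base_delay: float = 0.001,
                   max_delay: float = 0.1) -> float:
    """Return how long (in seconds) a writer should pause before dirtying a page."""
    if not 0.0 <= cow_free_fraction <= 1.0:
        raise ValueError("cow_free_fraction must be between 0 and 1")

    # No pressure: the disk keeps up and the copy-on-write buffer is mostly free.
    if dirty_pages_per_sec <= written_pages_per_sec and cow_free_fraction > 0.5:
        return 0.0

    # Pressure from the speed mismatch: how many times faster we dirty than write.
    if written_pages_per_sec > 0:
        speed_ratio = dirty_pages_per_sec / written_pages_per_sec
    else:
        speed_ratio = float("inf") if dirty_pages_per_sec > 0 else 0.0

    # Pressure from the buffer: grows as the free space shrinks.
    buffer_pressure = 1.0 / max(cow_free_fraction, 1e-6)

    # Scale the base delay by the stronger signal, capped at max_delay.
    return min(max_delay, base_delay * max(speed_ratio, buffer_pressure))
```

The cap matters: a bounded pause slows writers down gradually, which is the opposite of the sudden drop to zero that an unthrottled system shows when the disk falls behind.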
Disk. Predictable access speed
• Page cache: what can go wrong
  • We already have one page cache in Ignite (durable memory)
  • The OS adds its own page cache on top
  • Together they effectively double the memory consumption
Disk. Predictable access speed
• Page cache: the solution is Direct IO
  • Available in Java 10, but we build on Java 8
  • Needs native, platform-specific calls or a Java-version-dependent module
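To show why Direct IO needs platform-specific care, here is a hedged Python sketch for Linux. It relies on `os.O_DIRECT`, which bypasses the OS page cache but requires block-aligned buffers and lengths. The 4096-byte block size and the names `align_up` and `write_page` are assumptions, and the code falls back to ordinary buffered I/O where the flag is missing or the file system rejects it. It illustrates the constraint only and is unrelated to how Ignite's own module is written.

```python
import mmap
import os

BLOCK = 4096  # assumed block size; real code should query the device


def align_up(size: int, block: int = BLOCK) -> int:
    """Round size up to the next multiple of block."""
    return (size + block - 1) // block * block


def _write(path: str, buf, extra_flags: int) -> None:
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | extra_flags, 0o644)
    try:
        os.write(fd, buf)
    finally:
        os.close(fd)


def write_page(path: str, payload: bytes) -> bool:
    """Write payload padded to whole blocks; return True if O_DIRECT was used."""
    length = align_up(max(len(payload), 1))
    # An anonymous mapping is page-aligned and zero-filled, which satisfies
    # the buffer alignment that O_DIRECT expects.
    buf = mmap.mmap(-1, length)
    try:
        buf[:len(payload)] = payload
        direct = getattr(os, "O_DIRECT", 0)  # 0 on platforms without the flag
        if direct:
            try:
                _write(path, buf, direct)
                return True
            except OSError:
                pass  # e.g. the file system does not support O_DIRECT
        _write(path, buf, 0)
        return False
    finally:
        buf.close()
```

The padding is the visible cost: a 5-byte payload still occupies a full 4096-byte block on disk, because unaligned lengths are not accepted for direct writes.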
The future?
Data Grid -> Local Store -> Transactional Persistence -> ?
Questions?
More information
• Main landing page: https://ignite.apache.org
• Documentation: https://apacheignite.readme.io/docs
• Please complete our survey on how Apache Ignite should evolve: https://docs.google.com/forms/d/e/1FAIpQLSdUveEVXer3lpkyiqfFw4175TvZzGHUOS4snPfnkO0NDku0eQ/viewform
• Realtime data loading: https://www.imcsummit.org/2019/us/session/best-practices-loading-real-time-data-distributed-systems-change-data-capture
Stephen Darlington
Senior Consultant, GridGain Systems
@sdarlington

 
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
 
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
 
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital TransformationWSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
 

How we broke Apache Ignite by adding persistence

  • 1. How we broke Apache Ignite by adding persistence. Stephen Darlington, 16 December 2019. 2018 © GridGain Systems
  • 2. 2019 © GridGain Systems GridGain Company Confidential (spoiler: already fixed)
  • 3. What is Ignite?
      • Distributed memory-centric storage: combines the performance and scale of in-memory computing with disk durability and strong consistency in one system
      • Co-located computations: brings the computations to the servers where the data actually resides, eliminating the need to move data over the network
      • Distributed key-value: read, write and transact with fast key-value APIs
      • Distributed SQL: horizontally scalable, fault-tolerant distributed SQL database that treats memory and disk as active storage tiers
      • ACID transactions: supports distributed ACID transactions for key-value as well as SQL operations
      • Machine and deep learning: a set of simple, scalable and efficient tools for building predictive machine learning models without costly data transfers (ETL)
  • 4. Apache Ignite In-Memory Computing Platform (architecture diagram)
      • Application layer: Web, SaaS, Social, Mobile, IoT
      • Components: Key-Value, SQL, Transactions, Compute Grid, Service Grid, Messaging, Streaming, Events, Machine and Deep Learning
      • In-Memory Data Store
      • Persistent layer: Ignite Persistence, RDBMS, NoSQL, Hadoop, Mainframe
  • 5. Apache Ignite’s History (timeline: Data Grid, Local Store, Transactional Persistence, ?)
  • 6. Circa 2011 (timeline: Data Grid, Local Store, Transactional Persistence, ?)
  • 7. Circa 2014 (timeline: Data Grid, Local Store, Transactional Persistence, ?)
  • 8. Circa 2017
      • Start time does not depend on the data volume
      • Can store more data than memory
      • Crash recovery
      • Single in-memory & native persistence architecture
  • 9. Ignite 2.0: What we wanted: “We will just save everything to disk”
  • 10. Circa 2017 (timeline: Data Grid, Local Store, Transactional Persistence, ?)
  • 11. Beginning: Durable Memory
  • 12. Beginning: Durable Memory
      • ARIES architecture
      • Page-based
      • Write-ahead log (when persistence is enabled)
      • Everything is off-heap
  • 13. Beginning: Durable Memory
      • PK index: how to replace a HashMap
      • Concurrent B+ tree: a well-known data structure
      • Separate PK index for each partition
      • Compare key hash first
      • Bonus: guaranteed iteration order in a hash map
  • 14. Baseline Topology
      • [16:21:01] Ignite node started OK (id=326bab44)
      • [16:21:01] >>> Ignite cluster is not active (limited functionality available). Use control.(sh|bat) script or IgniteCluster interface to activate.
      • [16:21:01] Topology snapshot [ver=1, locNode=326bab44, servers=1, clients=0, state=INACTIVE, CPUs=8, offheap=3.2GB, heap=3.6GB]
      • [16:21:01] ^-- Baseline [id=11, size=3, online=1, offline=2]
      • [16:21:01] ^-- 2 nodes left for auto-activation [6213b7af-23bb-4c8d-a045-157d7f2d7718, db969788-fc01-41f4-a91c-c03f2d201f76]
      • [16:21:19] Joining node doesn't have encryption data [node=89b6ef6c-1055-4678-bcfa-00fb222208ce]
      • [16:21:19] Topology snapshot [ver=2, locNode=326bab44, servers=2, clients=0, state=INACTIVE, CPUs=8, offheap=6.4GB, heap=7.1GB]
      • [16:21:19] ^-- Baseline [id=11, size=3, online=2, offline=1]
      • [16:21:19] ^-- 1 nodes left for auto-activation [6213b7af-23bb-4c8d-a045-157d7f2d7718]
      • [16:21:37] Joining node doesn't have encryption data [node=dd55ff24-da61-42cd-bbaf-c7940fab07d3]
      • [16:21:37] Topology snapshot [ver=3, locNode=326bab44, servers=3, clients=0, state=INACTIVE, CPUs=8, offheap=9.6GB, heap=11.0GB]
      • [16:21:37] ^-- Baseline [id=11, size=3, online=3, offline=0]
      • [16:21:37] ^-- All baseline nodes are online, will start auto-activation
  • 15. Disk. Predictable access speed
      • Disks are slow (even NVMe)
      • At peak load, a naïve implementation easily steps on its own tail
      • Performance suddenly drops to 0
  • 16. Disk. Predictable access speed
      • So, we need to… make Ignite slower
      • Throttle input load depending on:
          • How fast we produce “dirty” pages
          • How fast we write to disk
          • How free the Copy-On-Write buffer is
  • 17. Disk. Predictable access speed
      • Page cache: what can go wrong
      • We already have one page cache in Ignite (durable memory)
      • The OS-level page cache is a second one
      • Effectively doubles the memory consumption
  • 18. Disk. Predictable access speed
      • Page cache: solution is Direct IO
      • Available in Java 10, but we build on Java 8
      • Need to have native/platform-specific calls or a Java-dependent module
  • 19. The future? (timeline: Data Grid, Local Store, Transactional Persistence, ?)
  • 20. Questions?
  • 21. More information
      • Main landing page: https://ignite.apache.org
      • Documentation: https://apacheignite.readme.io/docs
      • Please complete our survey on how Apache Ignite should evolve: https://docs.google.com/forms/d/e/1FAIpQLSdUveEVXer3lpkyiqfFw4175TvZzGHUOS4snPfnkO0NDku0eQ/viewform
      • Realtime data loading: https://www.imcsummit.org/2019/us/session/best-practices-loading-real-time-data-distributed-systems-change-data-capture
  • 22. Stephen Darlington, Senior Consultant, GridGain Systems, @sdarlington
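
Slide 12 names a write-ahead log as part of durable memory, and the notes tie it to crash recovery. As a rough illustration of that idea only, here is a toy Python sketch: every update is appended to a log and fsynced before the in-memory page changes, and a restart replays the log on top of the last checkpoint. `TinyPageStore`, the JSON file layout and `crash_and_restart` are all invented for this example; Ignite's real page memory, WAL format and checkpointing are far more involved.

```python
import json
import os
import tempfile


class TinyPageStore:
    """Toy page store with a write-ahead log (illustrative, not Ignite code)."""

    def __init__(self, directory: str):
        self.wal_path = os.path.join(directory, "wal.log")
        self.pages_path = os.path.join(directory, "pages.json")
        self.pages = {}  # in-memory pages: page id -> value
        self._recover()

    def _recover(self):
        # 1. Load the last checkpoint, if there is one.
        if os.path.exists(self.pages_path):
            with open(self.pages_path) as f:
                self.pages = {int(k): v for k, v in json.load(f).items()}
        # 2. Replay WAL records written since that checkpoint.
        if os.path.exists(self.wal_path):
            with open(self.wal_path) as f:
                for line in f:
                    record = json.loads(line)
                    self.pages[record["page"]] = record["value"]

    def write(self, page_id: int, value: str):
        # Log first and force it to disk, only then touch the in-memory page.
        with open(self.wal_path, "a") as f:
            f.write(json.dumps({"page": page_id, "value": value}) + "\n")
            f.flush()
            os.fsync(f.fileno())
        self.pages[page_id] = value

    def checkpoint(self):
        # Persist all pages atomically, then truncate the log.
        tmp = self.pages_path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(self.pages, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.pages_path)
        open(self.wal_path, "w").close()


def crash_and_restart() -> dict:
    """Write one checkpointed and one WAL-only page, then reopen the store."""
    with tempfile.TemporaryDirectory() as directory:
        store = TinyPageStore(directory)
        store.write(1, "a")
        store.checkpoint()
        store.write(2, "b")  # reaches the WAL but is never checkpointed
        restarted = TinyPageStore(directory)  # stands in for a crash + restart
        return restarted.pages
```

Calling `crash_and_restart()` recovers both pages: page 1 comes from the checkpoint and page 2 from replaying the log. Replaying is safe to repeat here because each record simply overwrites the page value.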

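Slide 16 lists three inputs for throttling: how fast pages become dirty, how fast they are written to disk, and how free the copy-on-write buffer is. The Python sketch below shows one way such inputs could be combined into a per-write delay. The function name, parameters and formula are assumptions made up for illustration; the slides do not describe Ignite's actual throttling algorithm.

```python
def throttle_delay_ms(dirty_pages_per_sec: float,
                      disk_pages_per_sec: float,
                      cow_buffer_free_ratio: float,
                      max_delay_ms: float = 100.0) -> float:
    """Return how long a writer should pause before dirtying another page.

    All names and the formula are illustrative assumptions, not Ignite code.
    """
    if not 0.0 <= cow_buffer_free_ratio <= 1.0:
        raise ValueError("cow_buffer_free_ratio must be in [0, 1]")
    if dirty_pages_per_sec <= 0:
        return 0.0  # nobody is writing, nothing to throttle
    if disk_pages_per_sec <= 0:
        return max_delay_ms  # disk is stalled: apply maximum back-pressure

    # How much faster pages get dirty than the disk can flush them.
    overload = max(0.0, dirty_pages_per_sec / disk_pages_per_sec - 1.0)
    # The fuller the copy-on-write buffer, the stronger the back-pressure.
    pressure = 1.0 - cow_buffer_free_ratio
    # No overload and an empty buffer means no delay at all.
    raw = max_delay_ms * min(1.0, overload + pressure * pressure)
    return round(raw, 3)
```

With these made-up numbers, a writer producing 3000 dirty pages per second against a disk flushing 2000, with the buffer half free, would be paused for 75 ms, while a writer that the disk can keep up with and an empty buffer is not delayed at all. The point is the one on the slide: slowing writers down gradually avoids the sudden drop to zero.
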
Editor's notes

  1. First, I’m not Alexey. Second… we didn’t break it.
  2. Alternate title: How and why Apache Ignite™ is Changing from an In-Memory Data Grid into an In-Memory Database. In this talk, we will follow the path that led Apache Ignite™ from a compute grid and data grid product to a distributed database and an in-memory computing platform. We will examine technical tasks and decisions that were driving the transformations. Step back: who knows about Ignite?
  3. Traditional databases don’t scale. Buy bigger and bigger boxes until you run out of money. Traditional compute grids have to copy data across the network, which at modern scale is just impractical. Ignite scales horizontally and sends compute to the data rather than the other way around. In memory for speed. Disk persistence for volume.
  4. In-memory data grid, optionally backed by the “Cache Store”. SQL works as fork-join: the query is executed on every node and the results are joined together, which returns correct results only in limited use cases. SQL is in-memory only. A restart means loading all the data. The centralised store brings scalability issues.
  5. LocalStore was designed to allow faster restarts. It is an append-only structure with periodic compaction. Data still has to be loaded from the local store into memory (on datasets of 100s of GBs, warm-up could take 3+ hours). Keys still reside in memory, so the data set size is limited. SQL functionality is improved: two-step queries such as plain ORDER BY and GROUP BY, and some cases of distributed joins, are implemented. There are still restrictions: not all queries return correct results, and the user must check collocation rules. So better, but still missing stuff. What are our requirements for The Best system?
  6. ARIES: Algorithms for Recovery and Isolation Exploiting Semantics. Ignite 1.x had options for on- and off-heap storage; 2.x is off-heap only. Unlike LocalStore (and the competition), it uses a WAL, like legacy databases. Crash recovery.
  7. ABA problem -- thread synchronization (https://en.wikipedia.org/wiki/ABA_problem)
  8. We’ve mostly talked about the Good Stuff, but there are complications. What’s happening here? Baseline topology.
  9. … and the discussion about Java 10 brings us to … the future
  10. Improve SQL. What’s wrong with H2?
      • H2 is not a distributed database, and it has a very simple planner.
      • Ignite uses H2 internal APIs and hacks to integrate with H2.
      • No query execution graph: a query is effectively executed based on the AST with minimal transformations.
      • The H2 planner does not know about distributed SQL, so it is almost impossible to execute complex SQL queries (subqueries, complex aggregations, non-colocated tables) effectively.
      • No memory control.
      • Currently investigating Apache Calcite as a query optimizer plus a custom execution engine. See the Apache developers list (+IEP).
      Improve split-brain handling out of the box.
      • Currently requires users to implement TopologyValidator.
      • Should work without users writing a single line of code.
      Modularisation: don’t download everything.
      • Dependencies for stuff like Spark / Kafka, with different release schedules.
      • Even “internal” stuff like ML.
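
The last note says split-brain handling currently requires users to implement a TopologyValidator themselves. To give a feel for the kind of rule such a validator might encode, here is a minimal Python sketch of a majority check: a cluster segment may accept updates only if it can see a strict majority of the expected nodes. The function `has_quorum` and its parameters are hypothetical and written in Python for brevity; this is not the Ignite Java interface, and real validators may use entirely different criteria.

```python
from typing import Iterable, Set


def has_quorum(visible_nodes: Iterable[str], baseline: Set[str]) -> bool:
    """Allow updates only if a strict majority of baseline nodes is visible.

    Hypothetical helper for illustration; node ids are plain strings here.
    """
    if not baseline:
        raise ValueError("baseline must not be empty")
    # Nodes outside the baseline do not count towards the majority.
    visible_baseline = set(visible_nodes) & baseline
    return len(visible_baseline) * 2 > len(baseline)
```

With a three-node baseline, a segment that sees two of the nodes keeps accepting updates while the isolated third node does not. With an even split of a four-node baseline, neither half has a strict majority, so both stop accepting updates rather than diverge.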