In memory grids IMDG

100K times faster apps.
In Memory Grids
Prateek Jain

Agenda
• In Memory Grids
– 10,000 foot view.
– Present scenario
– Why
– Why now
• Use Cases
• Types of In Memory Grids
– Compute Grid
– Data Grid
• Reference Architecture
• Sample application demo
• Further Resources
• Questions & Feedback

In Memory Grids
• 10,000 foot view
Break a problem into parts and solve them using multiple resources on the network.
Use main memory instead of disk for data I/O.

BigData landscape
Traditional apps and their associated challenges
• RDBMS -- Used to run many analytics systems
– Challenges: Performance (not real time), scaling, cost++
• CEP -- Designed to correlate data in real time
– Challenges: Scaling (often necessary to aggregate events into a centralized source); not designed for historical data.
• Hadoop -- Designed for batch analytics and complex correlation
– Challenges: Not designed for real time.
• NoSQL -- Designed to handle large data volumes at low cost
– Challenges: Processing capability; the sheer amount of data can be challenging.
• IMDG -- Fast for storing and processing data
– Challenges: Storing vast amounts of information in memory doesn’t scale, in terms of both system scaling and cost.
Different problems call for different solutions.

Why?
• Speed matters
– Citi: 100ms == $1M
– Google: 500ms == 20% traffic drop
• Disk is up to 10^7 times slower than RAM.

In Memory Grids
• Why now?
– Hardware: ability++ and cost--
• 1TB RAM & 48 core cluster (can hold a full week of tweets) ~ $40K
[Charts: Data Growth, PB (data is growing exponentially; BigData tech. planned) and DRAM Cost, $ (30% drop each 12-18 months)]

Use Cases
• Trading Systems
– Handle large volume of transactions
• Real time risk analytics
– Analysis of trading positions and risk
• Online gaming
– Online real-time backbone for gaming
• Geo Mapping
– Real-time geographical route and traffic information
• Bio Informatics
– Real-time DNA sequencing and matching

In Memory Grids
1. In Memory -- Compute Grid.
Compute Grids allow you to take a computation, optionally split it into multiple parts, and
execute them on different grid nodes in parallel.

Functionality
• Distributed Execution Models - map-reduce, Streaming
Processing & CEP, MPP, MPI style
• Distributed Execution Management Services – task
distribution, failover, load balancing, collision resolution,
job stealing, redundant mapping support, task scheduling,
asynchronous reduction, task checkpoints
• Distributed Deployment & Provisioning.
• Distributed Resources Management - Automatic discovery
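
As a rough illustration of the compute-grid idea above (split a computation, run the parts in parallel, reduce the results), here is a minimal single-machine sketch in Python. Worker threads stand in for grid nodes, and the names `split`, `word_count` and `run_on_grid` are hypothetical, not part of any grid product's API. A real compute grid would add the management services listed above (failover, load balancing, job stealing and so on).

```python
from concurrent.futures import ThreadPoolExecutor
from collections import Counter


def split(items, parts):
    """Split a list into at most `parts` roughly equal chunks."""
    size = max(1, -(-len(items) // parts))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def word_count(lines):
    """Map step: count words in one chunk of lines."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def run_on_grid(lines, nodes=4):
    """Split the job, run each part on a worker, then reduce."""
    chunks = split(lines, nodes)
    # Assumption: threads play the role of grid nodes in this sketch.
    with ThreadPoolExecutor(max_workers=nodes) as pool:
        partials = list(pool.map(word_count, chunks))
    total = Counter()
    for partial in partials:  # reduce step: merge partial counts
        total.update(partial)
    return total
```

For example, `run_on_grid(["grid data grid", "Data compute"], nodes=2)` counts the words of each line on a separate worker and merges the two partial counts.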

In Memory Grids
2. In Memory -- Data Grid. (a.k.a. distributed data caching)
Provides applications with the ability to keep data in memory for high availability rather than
constantly fetching it from slower storage elsewhere, like RDBMS or shared file systems.

IMDG?
• Several JVMs sharing in-memory partitioned data.
• Provides extremely low latency access to, and high availability of, application data by keeping it in memory, and does so in a highly parallelized way.
• Supports most of the Big Data processing requirements.
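
To make "several JVMs sharing in-memory partitioned data" more concrete, the sketch below routes each key to a primary node and one backup node by hashing the key. It is only an illustration under stated assumptions: `Node` and `PartitionedMap` are hypothetical names, plain Python dicts stand in for the JVMs, and the "next node is the backup" rule is a simplification rather than how any particular product assigns partitions.

```python
import zlib


class Node:
    """Stand-in for one JVM in the grid: a plain in-memory dict."""

    def __init__(self, name):
        self.name = name
        self.store = {}


class PartitionedMap:
    """Routes each key to a primary node and one backup node."""

    def __init__(self, nodes):
        if len(nodes) < 2:
            raise ValueError("need at least two nodes for a backup copy")
        self.nodes = nodes

    def _owners(self, key):
        # Stable hash so every client computes the same owner for a key.
        index = zlib.crc32(str(key).encode("utf-8")) % len(self.nodes)
        primary = self.nodes[index]
        backup = self.nodes[(index + 1) % len(self.nodes)]
        return primary, backup

    def put(self, key, value):
        # Write to both copies so one node failure loses no data.
        for node in self._owners(key):
            node.store[key] = value

    def get(self, key, failed=()):
        # Read from the primary; fall back to the backup if it is down.
        for node in self._owners(key):
            if node.name not in failed:
                return node.store.get(key)
        raise RuntimeError("both copies of the key are unavailable")
```

The backup copy is what gives the high availability mentioned above: passing the primary's name in `failed` still returns the value from the backup.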

Common Features
• Distributed maps
• Caching, evictions
• Code execution (executor service, map-reduce)
• Listeners
• Queries (SQL like)
• Pluggable indexing
• Hibernate L2 cache (optional)
• ACID Transactions
• MapStore (write-behind, write-through, read-through)
• Optimized Serialization
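
The three MapStore modes in the list above can be sketched roughly as follows. This is a simplified illustration, not a product API: `DictStore` and `CachedMap` are hypothetical names, a dict stands in for the slower backing store, and write-behind is shown with an explicit `flush()` call, whereas a real grid would typically flush asynchronously.

```python
class DictStore:
    """Stand-in for a slower backing store such as an RDBMS."""

    def __init__(self):
        self.rows = {}
        self.loads = 0  # counts how often the store is actually read

    def load(self, key):
        self.loads += 1
        return self.rows.get(key)

    def save(self, key, value):
        self.rows[key] = value


class CachedMap:
    def __init__(self, store, write_behind=False):
        self.store = store
        self.write_behind = write_behind
        self.cache = {}
        self.dirty = {}  # pending writes, used only in write-behind mode

    def get(self, key):
        # Read-through: on a miss, load from the store and keep the value.
        if key not in self.cache:
            value = self.store.load(key)
            if value is None:
                return None
            self.cache[key] = value
        return self.cache[key]

    def put(self, key, value):
        self.cache[key] = value
        if self.write_behind:
            self.dirty[key] = value      # defer the store write
        else:
            self.store.save(key, value)  # write-through: save immediately

    def flush(self):
        # Write-behind: push queued writes to the store in one pass.
        for key, value in self.dirty.items():
            self.store.save(key, value)
        self.dirty.clear()
```

Write-through keeps the store consistent on every `put` at the cost of latency; write-behind returns faster but risks losing queued writes if the node fails before the flush.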

Common Features
• The same object your business logic is using can be kept in the data grid.
• No extra step of marshaling and un-marshaling.
• Embeddable (optional)

IMDG is not a
• NoSQL database
• In Memory Database (IMDB)
• How?
• Support for true distributed ACID transactions with highly optimized 2PC protocol implementation.
• Scalable Data Partitioning across a cluster, including both partitioned and fully replicated scenarios
• Ability to work directly with application domain objects rather than with primitive types or “documents”
• Tight integration with In-Memory Compute Grid (IMCG)
• Pluggable segmentation (a.k.a. "split brain" problem) resolution
• Pluggable expiration policies
• Pluggable indexing support
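
The distributed ACID transactions in the first bullet rest on two-phase commit (2PC). The sketch below shows only the basic protocol shape, with none of the optimizations a real IMDG would apply and no handling of coordinator failure. `Participant` and `two_phase_commit` are hypothetical names, and the `can_commit` flag is an assumption used to simulate a node voting "no".

```python
class Participant:
    """One grid node holding part of the data touched by a transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # assumption: simulates a "no" vote
        self.data = {}
        self.staged = None

    def prepare(self, changes):
        # Phase 1: stage the changes and vote yes or no.
        if not self.can_commit:
            return False
        self.staged = dict(changes)
        return True

    def commit(self):
        # Phase 2 (all voted yes): make staged changes visible.
        if self.staged is not None:
            self.data.update(self.staged)
        self.staged = None

    def rollback(self):
        # Phase 2 (someone voted no): discard staged changes.
        self.staged = None


def two_phase_commit(changes_by_node):
    """changes_by_node: list of (participant, changes) pairs."""
    votes = [node.prepare(changes) for node, changes in changes_by_node]
    if all(votes):
        for node, _ in changes_by_node:
            node.commit()
        return True
    for node, _ in changes_by_node:
        node.rollback()
    return False
```

Either every participant applies its changes or none does, which is the atomicity guarantee across nodes.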

Further Reading
• http://www.ventanaresearch.com/uploadedFiles/Content/Landing_Pages/Ventana_Research_Big_Data_Benchmark_Research_Presentation.pdf
• http://wikibon.org/wiki/v/Data_in_DRAM_is_a_Flash_in_the_Pan#Data_in_Memory_Solutions_for_Real-Time_High-Performance_Transaction_Analytics
• http://www.gridgain.com/book/book.html
• http://java.dzone.com/articles/compute-grids-vs-data-grids
• http://www.infoq.com/articles/in-memory-data-grids
• http://natishalom.typepad.com/nati_shaloms_blog/2011/07/real-time-analytics-for-big-data-an-alternative-approach-to-facebooks-new-realtime-analytics-system.html
• https://del.sapient.resultspace.com/scm/gmtechip/POCs/gridgain_risk_analytics
