© 2013 IBM Corporation
Brian K. Martin – Distinguished Engineer, WebSphere eXtreme Scale
17 September 2013
Lightning Fast access to Big Data
Document number: BOF-5957
Important Disclaimers
THE INFORMATION CONTAINED IN THIS PRESENTATION IS PROVIDED FOR INFORMATIONAL PURPOSES ONLY.
WHILST EFFORTS WERE MADE TO VERIFY THE COMPLETENESS AND ACCURACY OF THE INFORMATION
CONTAINED IN THIS PRESENTATION, IT IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED.
ALL PERFORMANCE DATA INCLUDED IN THIS PRESENTATION HAVE BEEN GATHERED IN A CONTROLLED
ENVIRONMENT. YOUR OWN TEST RESULTS MAY VARY BASED ON HARDWARE, SOFTWARE OR
INFRASTRUCTURE DIFFERENCES.
ALL DATA INCLUDED IN THIS PRESENTATION ARE MEANT TO BE USED ONLY AS A GUIDE.
IN ADDITION, THE INFORMATION CONTAINED IN THIS PRESENTATION IS BASED ON IBM’S CURRENT PRODUCT
PLANS AND STRATEGY, WHICH ARE SUBJECT TO CHANGE BY IBM, WITHOUT NOTICE.
IBM AND ITS AFFILIATED COMPANIES SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF THE
USE OF, OR OTHERWISE RELATED TO, THIS PRESENTATION OR ANY OTHER DOCUMENTATION.
NOTHING CONTAINED IN THIS PRESENTATION IS INTENDED TO, OR SHALL HAVE THE EFFECT OF:
- CREATING ANY WARRANTY OR REPRESENTATION FROM IBM, ITS AFFILIATED COMPANIES OR ITS OR THEIR
SUPPLIERS AND/OR LICENSORS
About me
•  Brian K. Martin
•  Chief Architect of WebSphere eXtreme Scale. Previously a lead architect on WebSphere Application Server. Currently leading CloudFoundry middleware services inside IBM
•  Email: bkmartin@us.ibm.com
•  Twitter: @bkmartin
•  Blog: http://www.attheextreme.com/
•  Visit the IBM booth #5112 and meet other IBM developers at JavaOne 2013
What is a Cache Anyway?
A cache lets you get data faster and helps you avoid repeating the same work
over and over again (work that may be redundant)
[Figure labels: far away, near, happy]
How to overwhelm an enterprise
•  Mobile Access (Laptops, Ultrabooks, Tablets, Smartphones) = Transaction Overload
•  Targeted Advertising (E-mail, SMS, Pop-up, Click-thru Promotions, Web Crawlers) = Transaction Overload
•  Social Media (TV, Movie, Sports Personality mentions / endorses product) = Transaction Overload
•  All of the above = Transaction Overload
Retail, Banking, Finance, Insurance, Telecom, Travel & Transportation …
Where do we cache?
A database cache? A page fragment cache? A service cache?
TOO SPECIFIC!
•  A cache is a tool for reducing application path length
•  OR the distance data has to travel before it gets to the customer/data sink
[Diagram labels: Web Channel; Mobile Channel; Service Logic; DB; Data; OR; Map]
But I’m already caching…
Problems with local caching:
Ø  Local cache doesn’t scale
Ø  Local cache is not fault tolerant or highly available
Ø  Need to handle invalidation across a cluster
Ø  Local cache is typically single function or application specific
Ø  Local cache memory requirements could actually degrade performance
due to Garbage Collection cycles on large JVM heap sizes
Ø  Resource contention for managing local cache (CPU, memory, I/O)
Solution: In Memory Data Grid
Distributed in-memory object cache
•  Elastic, scalable, coherent in-memory cache
•  Dynamically caches, partitions, replicates and manages application data and business logic across multiple servers
Capable of massive volumes of transactions
•  Provides qualities of service such as transaction integrity, high availability, and predictable response times
Self-healing, allows scale-out / scale-in
•  Automatic failure recovery
•  On-the-fly addition / removal of memory capacity
Splits a given dataset into shards or partitions
•  Primary and replica shards
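The idea of splitting a dataset into partitions with primary and replica shards can be sketched as follows. This is an illustration only: the class name `PartitionRouter`, the modulo-hash scheme and the round-robin placement are assumptions made for the example, not the WebSphere eXtreme Scale placement algorithm.

```java
import java.util.List;

// Hypothetical illustration only: routes keys to partitions and places
// a primary and one replica shard per partition on different servers.
final class PartitionRouter {
    private final int partitionCount;
    private final List<String> servers;

    PartitionRouter(int partitionCount, List<String> servers) {
        if (partitionCount < 1 || servers.size() < 2) {
            throw new IllegalArgumentException("need >= 1 partition and >= 2 servers");
        }
        this.partitionCount = partitionCount;
        this.servers = List.copyOf(servers);
    }

    // Hash-based partitioning: the same key always maps to the same partition.
    int partitionFor(Object key) {
        return Math.floorMod(key.hashCode(), partitionCount);
    }

    // Round-robin placement of the primary shard (an assumption of this sketch).
    String primaryFor(int partition) {
        return servers.get(partition % servers.size());
    }

    // The replica goes on the next server, so losing one server never
    // removes both copies of a partition.
    String replicaFor(int partition) {
        return servers.get((partition + 1) % servers.size());
    }

    public static void main(String[] args) {
        PartitionRouter router =
                new PartitionRouter(13, List.of("serverA", "serverB", "serverC"));
        for (String key : List.of("user:1", "user:2", "cart:42")) {
            int p = router.partitionFor(key);
            System.out.println(key + " -> partition " + p
                    + " primary=" + router.primaryFor(p)
                    + " replica=" + router.replicaFor(p));
        }
    }
}
```

Adding a server to the list changes where shards are placed, which is why a real grid has to rebalance shards when capacity is added or removed.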
In-memory Database versus In-memory Data grid
[Diagram labels: In-memory Database; In-memory Data Grid; Database Capabilities; Data Grid Capabilities; Sophisticated Query Support; Linear Scale-Out; Data Flexibility; Simple OLTP; Access Data as POJOs; Fast In-Memory Data Access]
Why you might need a data grid
Scalability issues with database servers
•  Adding extra hardware is not easy
•  Licensing costs
Large volume of data
•  Ability to handle volumes of data without slowing down data access
•  Handle data surges during product launches and live events
Fault tolerance and self-healing
•  Need for automatic mechanisms to avert system failure affecting end-users
Data redundancy and replication
•  Data integrity
•  Maintain data reliability in case of failover
Characteristics of a Data Grid
•  Data is de-normalized
•  Data stored as key-value pairs. (Think: hash map of ‘infinite’ size)
•  Data Grid can be transactional
•  Simple APIs
•  Get, Insert, Update, Delete
•  SQL-like query language
•  Map-reduce, grid based applications
•  Can be horizontally partitioned
•  List-based, hash-based, and range-based partitioning schemes can be applied to the data
•  Often (but not always) transient or referential data
•  HTTP sessions, user profile, etc.
•  Mainframe DBMS offloading
•  Read-only or read-mostly
•  Can tolerate some staleness
Enterprise Architecture
[Diagram labels: Web Server Tier; Application Server Tier; Elastic Cache (In-Memory Data Grid); RDBMS; TPM; Back-end Services]
Elastic Cache
(In-Memory Data Grid)
Application State Store Pattern
Application Server
• Single replacement for multiple local caches
• Consistent response times
• Reduces Application Server JVM heap size
• Improved memory utilization - more memory for applications
• Faster Application Server start-up
• Removes invalidation chatter of local caches
• Applications move application state to grid
• Stateless applications scale elastically
• Application state can be shared across data
centers for high availability
Applications use a single coherent, highly available, scalable cache
Elastic Cache
(In-Memory Data Grid)
HTTP Session Distribution
Application Server
•  Improved Performance
‒  Improves start-up time when bringing a new server on-line
•  Better Scalability (Less expensive)
‒  Replaces memory to memory replication
‒  Replaces need for database persistence
‒  Less expensive than scaling the database
‒  Faster, more consistent response times
‒  Makes better use of system resources
‒  Larger cache capacity
•  Higher Availability
‒  Provides fault tolerance and high availability of session
‒  Not only within the datacenter, but across datacenters
‒  Session replication or distribution is crucial in highly
available systems to provide uninterrupted user
experience (e.g. Shopping Cart).
•  No new code required! Easy to configure
Elastic Cache
(In-Memory Data Grid)
Active/Active Datacenter HTTP session failover
Application Server
Elastic Cache
(In-Memory Data Grid)
Application Server
Multi-master replication (MMR)
Load Balancers
Datacenter 1 Datacenter 2
Elastic Cache
(In-Memory Data Grid)
Application Server Elasticity – Cloud Web Application
Application Server
Intelligent Routers
Deployment
Manager
Deploy
New server
instance
• Intelligent Routers monitor
• Number of connections
• Response Time
• Application Server health
• Based on SLA, Intelligent Router coordinates with
Deployment Manager to deploy another server
instance to meet SLA
• Hypervisor instance (App Server/Portal)
• Dynamic cluster member
• Application state is immediately available to new
instance via elastic cache
• During periods of reduced load, system can scale
down and release resources for other purposes
Elastic Cache
Side Cache Pattern
Application Server
•  Client first checks the grid before using the
data access layer to connect to a back end
data store.
•  If an object is not returned from the grid (a
cache “miss”), the client uses the data access
layer as usual to retrieve the data.
•  The result is put into the grid to enable faster
access the next time.
•  The back end remains the system of record,
and usually only a small amount of the data is
cached in the grid.
•  An object is stored only once in the cache,
even if multiple clients use it. Thus, more
memory is available for caching, more data can
be cached, which increases the cache hit rate.
•  Improves performance and offloads unnecessary
workload from back-end systems.
Back-end Services
RDBMS
TPM
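The side cache steps above (check the grid, fall back to the data access layer on a miss, store the result) can be sketched as follows. This is a minimal illustration under stated assumptions: `SideCache` is a hypothetical class, a `ConcurrentHashMap` stands in for the grid, and a `Function` stands in for the data access layer. It does not use the WebSphere eXtreme Scale API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical side cache: a ConcurrentHashMap stands in for the grid and
// a Function stands in for the data access layer.
final class SideCache<K, V> {
    private final Map<K, V> grid = new ConcurrentHashMap<>();
    private final Function<K, V> dataAccessLayer;

    SideCache(Function<K, V> dataAccessLayer) {
        this.dataAccessLayer = dataAccessLayer;
    }

    V get(K key) {
        V cached = grid.get(key);               // 1. check the grid first
        if (cached != null) {
            return cached;                      // cache hit
        }
        V loaded = dataAccessLayer.apply(key);  // 2. miss: use the back end
        if (loaded != null) {
            grid.put(key, loaded);              // 3. store for the next reader
        }
        return loaded;
    }

    public static void main(String[] args) {
        int[] backEndCalls = {0};
        SideCache<String, String> cache = new SideCache<>(id -> {
            backEndCalls[0]++;                  // count trips to the back end
            return "profile-of-" + id;
        });
        cache.get("alice");
        cache.get("alice");                     // served from the cache
        System.out.println("back-end calls: " + backEndCalls[0]);
    }
}
```

The back end stays the system of record here: the sketch never writes to it and only keeps copies of what was read.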
Enterprise Service Bus – Side Cache
Elastic Cache
Application Server
RDBMS
TPM
Back-end Services
Enterprise Service Bus
•  Easily integrates into the existing business process
‒  No code changes to the client application or back-
end application
‒  Simply add the side cache mediation at the ESB
layer
•  Significantly reduces the load on the back-end
system by eliminating redundant requests
‒  Eliminates costly MIPS by eliminating redundant
requests
‒  Allows for more “REAL” work to be performed
‒  Improves overall response time
‒  Minimizes the need to scale hardware to increase
processing capacity since the back-end system
no longer has to handle redundant requests.
•  Response time from elastic cache is sub-millisecond
Enterprise Service Bus – Global Cache for Performance
Elastic Cache
RDBMS
TPM
Back-end Services
Back-end Data
ESB 1 ESB 2
Response 1
Normal response
Response 2
Faster response
• Lay foundation for low latency access to
data required for real-time analytics
• Response time is key driver in choice of
infrastructure provider for mobile
implementations
• Fast access to cached data for common
DB lookups
• Reduce load on DB by eliminating
redundant lookups
Enterprise Service Bus – Global Cache for Scalability
Elastic Cache
RDBMS
TPM
Back-end Services
Client State
ESB 1 ESB 2
• Allow the building of scalable
infrastructure to support growth in
demand for services
• Provide a cache for sharing data
between ESBs, which enables replies
to be correlated so that workload can
be shared between ESBs in
request-response scenarios
Elastic Cache
Mobile Gateway Acceleration
Mobile Gateway
• By integrating Elastic Cache with
Mobile Gateway, users can see
improved performance without the
penalty of having to scale to a large
cluster of Mobile Gateways.
• Use Side cache to cache XSLT
transforms
• Directly access the Elastic Cache to
retrieve cached objects
• Use Elastic Cache to provide session
state for stateless communication
Application Server
TPM
RDBMS
Elastic Cache
In-line cache – Database shock absorber
•  The grid can be used as a special data access layer where it
is configured to use a loader to get data from the back-end
system.
‒  Read through cache
‒  Write through cache (Synchronous writes)
‒  Write-behind cache (Asynchronous writes)
•  System of Record Data Store
‒  Cache is used as the system of record
‒  Write behind technology pushes changes asynchronously
to the backend.
Ø Changes batched
Ø Only last change written
‒  Runs through backend outages!
•  Benefits
‒  Writes faster (memory vs. disk speed)
‒  Backend load reduced, throughput improved
‒  Increased availability and scalability
Application Server
Back-end Services
RDBMS
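The write-behind behaviour described above (changes batched, only the last change per key written, writes surviving a back-end outage) can be sketched as follows. `WriteBehindCache` and its `flush` method are hypothetical names for this example; a real grid would drive the flush from a timer or batch-size threshold and would use its own loader interface.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical write-behind buffer: writes land in memory immediately and
// are pushed to the back end later as one batch.
final class WriteBehindCache<K, V> {
    private final Map<K, V> store = new HashMap<>();
    private final Map<K, V> pending = new LinkedHashMap<>();
    private final Consumer<Map<K, V>> backEndWriter;

    WriteBehindCache(Consumer<Map<K, V>> backEndWriter) {
        this.backEndWriter = backEndWriter;
    }

    synchronized void put(K key, V value) {
        store.put(key, value);
        pending.put(key, value);   // a later write to the same key replaces the earlier one
    }

    synchronized V get(K key) {
        return store.get(key);
    }

    private synchronized Map<K, V> drainPending() {
        Map<K, V> batch = new LinkedHashMap<>(pending);
        pending.clear();
        return batch;
    }

    private synchronized void requeue(Map<K, V> batch) {
        // Keep failed writes for the next flush without overwriting newer values.
        batch.forEach(pending::putIfAbsent);
    }

    // In this sketch the caller decides when to flush (timer not shown).
    void flush() {
        Map<K, V> batch = drainPending();
        if (batch.isEmpty()) {
            return;
        }
        try {
            backEndWriter.accept(batch);
        } catch (RuntimeException backEndDown) {
            requeue(batch);        // back end unavailable: retry on a later flush
        }
    }

    public static void main(String[] args) {
        List<Map<String, Integer>> batches = new ArrayList<>();
        WriteBehindCache<String, Integer> cache = new WriteBehindCache<>(batches::add);
        cache.put("stock:widget", 10);
        cache.put("stock:widget", 9);  // coalesced with the previous write
        cache.put("stock:gadget", 5);
        cache.flush();
        System.out.println(batches);
    }
}
```

Three writes reach the cache but the back end receives a single batch with two entries, which is where the reduced back-end load comes from.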
Elastic Cache
In-line cache – Real time access to Big Data
•  Use case: high speed ingest of data
‒  Read through cache
‒  Write through cache (Synchronous writes)
‒  Write-behind cache (Asynchronous writes)
•  Store data in CSV, JSON
•  Keep operational data in-memory
•  Evict data from the IMDG using a time-based or space-based
evictor (e.g. 30 days, 7 days)
•  Use Hadoop MR jobs for offline analytics
•  Benefits
‒  Writes faster (memory vs. disk speed)
‒  Reduced response time for access to operational data
Application Server
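A time-based evictor of the kind mentioned above can be sketched as follows. `TtlCache` is a hypothetical class for illustration, and the clock is passed in explicitly so the behaviour is easy to follow; a real grid evictor is configured rather than hand-written.

```java
import java.time.Duration;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical time-based evictor: entries older than the time-to-live
// are dropped from memory.
final class TtlCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long writtenAtMillis;

        Entry(V value, long writtenAtMillis) {
            this.value = value;
            this.writtenAtMillis = writtenAtMillis;
        }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;

    TtlCache(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    void put(K key, V value, long nowMillis) {
        entries.put(key, new Entry<>(value, nowMillis));
    }

    V get(K key, long nowMillis) {
        Entry<V> e = entries.get(key);
        if (e == null) {
            return null;
        }
        if (nowMillis - e.writtenAtMillis >= ttlMillis) {
            entries.remove(key, e);    // expired: evict lazily on read
            return null;
        }
        return e.value;
    }

    // Sweep that a background thread could run periodically (not shown).
    int evictExpired(long nowMillis) {
        int before = entries.size();
        entries.values().removeIf(e -> nowMillis - e.writtenAtMillis >= ttlMillis);
        return before - entries.size();
    }

    public static void main(String[] args) {
        long sevenDays = Duration.ofDays(7).toMillis();
        TtlCache<String, String> operational = new TtlCache<>(sevenDays);
        long start = System.currentTimeMillis();
        operational.put("order:1", "{\"total\": 42}", start);
        long eightDaysLater = start + Duration.ofDays(8).toMillis();
        System.out.println("evicted: " + operational.evictExpired(eightDaysLater));
    }
}
```

Data that ages out of memory in this way is still available to offline jobs if it was also written to the backing store.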
IMDG
Cache – MR Job output store
•  Use case: high throughput, low latency access to the MR
results
Application Server
MR Output
Elastic Cache
eXtreme Transaction Processing
Agent
• Lowest possible latency
• Application code (Agent) runs in
the grid itself
• Map/Reduce API supported
• Events routed to correct
partitions for processing
• Databases relegated to durable
log and reports
RDBMS
Elastic Cache
Map Reduce Parallel Processing
Agent
• Parallel Map
• Allows the entries for a set of
Entities or Objects to be
processed and returns a result
for each entry processed
• Parallel Reduction
• Processes a subset of the
entries and calculates a single
result for the group of entries
• Since the Elastic Cache is the system
of record, there is little to no load on
the back-end data stores
RDBMS
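The two agent styles above can be sketched with plain Java collections. This is an illustration under assumptions: each `Map` in the list stands in for one partition, `GridAgents` is a hypothetical helper, and parallel streams stand in for code running inside the grid. It is not the eXtreme Scale agent API.

```java
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical sketch: each Map in the list stands in for one partition.
final class GridAgents {
    // Parallel map: process every entry and return one result per entry.
    // Assumes keys are unique across partitions and the agent never returns null.
    static <K, V, R> Map<K, R> parallelMap(List<Map<K, V>> partitions,
                                           Function<V, R> agent) {
        return partitions.parallelStream()
                .flatMap(p -> p.entrySet().stream())
                .collect(Collectors.toConcurrentMap(
                        e -> e.getKey(),
                        e -> agent.apply(e.getValue())));
    }

    // Parallel reduction: each partition computes a partial result locally,
    // then the partial results are combined into a single value.
    static <K, V, R> R parallelReduce(List<Map<K, V>> partitions,
                                      R identity,
                                      BiFunction<R, V, R> accumulate,
                                      BinaryOperator<R> combine) {
        return partitions.parallelStream()
                .map(p -> {
                    R partial = identity;
                    for (V value : p.values()) {
                        partial = accumulate.apply(partial, value);
                    }
                    return partial;
                })
                .reduce(identity, combine);
    }

    public static void main(String[] args) {
        List<Map<String, Integer>> partitions =
                List.of(Map.of("a", 1, "b", 2), Map.of("c", 3));
        Map<String, Integer> doubled = parallelMap(partitions, v -> v * 2);
        Integer total = parallelReduce(partitions, 0, Integer::sum, Integer::sum);
        System.out.println("doubled=" + doubled + " total=" + total);
    }
}
```

Only the per-partition partial results travel back to the caller in the reduction, which is the reason for running the agent next to the data.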
Elastic Cache
Real-Time Business Rules / Event Processing
Agent
• Lowest possible latency
• Application code runs in the grid
itself
• Events routed to correct
partitions for processing
• Extension of Write behind
scenario
• Databases relegated to durable
log and reports
RDBMS
= Business Rules
Business Process Management
Multi-datacenter - High Availability/Distributed Computing
Datacenter 1 Datacenter 2
Public Cloud
Elastic Cache Shared Service
• Provides Elastic Caching resource for
cloud based architectures
• Elastic Cache service is multi-tenant
• Support grid capping
• Individual maps per cloud group
• Authentication/Authorization per
map/grid
• Used for
• Simple Cache
• HTTP session distribution
• Dynamic Cache provider
• http://www.bluemix.net
•  Java and .NET applications can now interact natively with the same data in the same data grid, leading the way toward a true enterprise-wide data grid.
•  Native off-heap storage, overflow to disk
•  A new REST Gateway provides simple access from other languages.
•  WXS 8.6 delivers a faster, more compact serialization format called eXtreme Data Format (XDF), which is neutral to programming languages.
•  A new transport mechanism, eXtreme IO (XIO), removes the dependency on the IBM ORB, enabling easier integration with existing environments.
•  Built-in pub/sub capabilities enable WXS 8.6 to update client “near caches” whenever data is updated, deleted, or invalidated on the server side.
•  API enhancements enable continuous query of data that is inserted and updated in the grid.
IBM Elastic Caching Delivers
Consistent Response Times, High Availability of Data & Linear Scalability for Enterprise-wide Data Grids
WebSphere eXtreme Scale V8.6
A powerful, scalable, elastic in-memory grid for your business-critical applications
Rapid, “drop-in” use with a broad range of Java and non-Java application environments
DataPower XC10 Appliance V2.5
•  Rapid drop-in use across a broad range of application server
technologies and programming languages
•  New data format (eXtreme Data Format –XDF) improves
performance and allows data to be shared natively between Java
& .NET applications
•  Built in notification infrastructure allows for client-side event
notification, continuous query cache and near-cache invalidation
•  Improved usability, serviceability
•  Supports FIPS security protocol for government and financial
sector compliance
•  Improved performance
•  Improved monitoring and administration capabilities
•  Native disk overflow, extending grid capacity by moving less
frequently used data to disk
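Near-cache invalidation through a notification mechanism, as described on this slide, can be sketched as follows. All names here (`NearCacheDemo`, `GridServer`, `NearCacheClient`) are hypothetical, and an in-process listener list stands in for the product's pub/sub transport, so this shows the idea rather than the WXS 8.6 API.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical sketch of near-cache invalidation driven by server notifications.
final class NearCacheDemo {
    // Server-side map that tells subscribers which key changed.
    static final class GridServer<K, V> {
        private final Map<K, V> data = new ConcurrentHashMap<>();
        private final List<Consumer<K>> subscribers = new CopyOnWriteArrayList<>();

        void subscribe(Consumer<K> listener) {
            subscribers.add(listener);
        }

        V get(K key) {
            return data.get(key);
        }

        void put(K key, V value) {
            data.put(key, value);
            subscribers.forEach(s -> s.accept(key));  // publish the change
        }
    }

    // Client that keeps a local copy and drops it when the server says it changed.
    static final class NearCacheClient<K, V> {
        private final Map<K, V> near = new ConcurrentHashMap<>();
        private final GridServer<K, V> server;

        NearCacheClient(GridServer<K, V> server) {
            this.server = server;
            server.subscribe(near::remove);           // invalidate on notification
        }

        V get(K key) {
            // Local hit if present, otherwise fetch from the server and keep a copy.
            return near.computeIfAbsent(key, server::get);
        }
    }

    public static void main(String[] args) {
        GridServer<String, String> server = new GridServer<>();
        NearCacheClient<String, String> client = new NearCacheClient<>(server);
        server.put("price:widget", "9.99");
        System.out.println(client.get("price:widget"));
        server.put("price:widget", "8.49");           // client copy is invalidated
        System.out.println(client.get("price:widget"));
    }
}
```

Without the notification, the client would keep serving its stale local copy until the entry was evicted for some other reason.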
Need IMDG with That….?
Distributed caching is becoming a central element of transaction processing
✓ Improve Performance, Scalability & Availability
✓ Consistent Response Times
✓ Reduces cost by eliminating redundant transactions
Infrastructure
• WebSphere Application Server
• Liberty
• Rational Team Concert
Stack Product Integration
• WebSphere Commerce
• WebSphere Portal
• IBM Mobile Platform / Worklight
Cloud
• IBM PureSystems
• IBM Workload Deployer
• Cast Iron Live (SaaS)
• IBM Smart Cloud Application services
• BlueMix
Security
• Tivoli Access Manager for eBusiness
Business Process & Connectivity
• DataPower Integration Appliance XI50/52 & XG45
• WebSphere Message Broker
• IBM Business Process Manager (WPS)
• WebSphere Registry and Repository
• IBM Operational Decision Management (iLOG JRules/WBE)
Getting Started
WebSphere eXtreme Scale Product Page
http://www-01.ibm.com/software/webservers/appserv/extremescale/
WebSphere DataPower XC10 Appliance Product Page
http://www-01.ibm.com/software/webservers/appserv/xc10/
WebSphere eXtreme Scale and WebSphere DataPower XC10 wiki
http://www.ibm.com/developerworks/connect/caching
WebSphere eXtreme Scale Free Trial
http://www.ibm.com/developerworks/downloads/ws/wsdg/
Virtual appliance free for developers
http://tinyurl.com/virtualXC10
Contact your IBM Representative

Más contenido relacionado

La actualidad más candente

Greenplum Database on HDFS
Greenplum Database on HDFSGreenplum Database on HDFS
Greenplum Database on HDFS
DataWorks Summit
 

La actualidad más candente (18)

An overview of reference architectures for Postgres
An overview of reference architectures for PostgresAn overview of reference architectures for Postgres
An overview of reference architectures for Postgres
 
Making your PostgreSQL Database Highly Available
Making your PostgreSQL Database Highly AvailableMaking your PostgreSQL Database Highly Available
Making your PostgreSQL Database Highly Available
 
An overview of reference architectures for Postgres
An overview of reference architectures for PostgresAn overview of reference architectures for Postgres
An overview of reference architectures for Postgres
 
Greenplum Database on HDFS
Greenplum Database on HDFSGreenplum Database on HDFS
Greenplum Database on HDFS
 
Hadoop & Greenplum: Why Do Such a Thing?
Hadoop & Greenplum: Why Do Such a Thing?Hadoop & Greenplum: Why Do Such a Thing?
Hadoop & Greenplum: Why Do Such a Thing?
 
Best Practices for a Complete Postgres Enterprise Architecture Setup
Best Practices for a Complete Postgres Enterprise Architecture SetupBest Practices for a Complete Postgres Enterprise Architecture Setup
Best Practices for a Complete Postgres Enterprise Architecture Setup
 
Virtualizing Latency Sensitive Workloads and vFabric GemFire
Virtualizing Latency Sensitive Workloads and vFabric GemFireVirtualizing Latency Sensitive Workloads and vFabric GemFire
Virtualizing Latency Sensitive Workloads and vFabric GemFire
 
Queues, Pools, Caches
Queues, Pools, CachesQueues, Pools, Caches
Queues, Pools, Caches
 
Greenplum Database Overview
Greenplum Database Overview Greenplum Database Overview
Greenplum Database Overview
 
Introduction to Greenplum
Introduction to GreenplumIntroduction to Greenplum
Introduction to Greenplum
 
New Integration Options with Postgres Enterprise Manager 8.0
New Integration Options with Postgres Enterprise Manager 8.0New Integration Options with Postgres Enterprise Manager 8.0
New Integration Options with Postgres Enterprise Manager 8.0
 
Store data more efficiently and increase I/O performance with lower latency w...
Store data more efficiently and increase I/O performance with lower latency w...Store data more efficiently and increase I/O performance with lower latency w...
Store data more efficiently and increase I/O performance with lower latency w...
 
The Dell EMC PowerMax 8000 outperformed another vendor's array on an OLTP-lik...
The Dell EMC PowerMax 8000 outperformed another vendor's array on an OLTP-lik...The Dell EMC PowerMax 8000 outperformed another vendor's array on an OLTP-lik...
The Dell EMC PowerMax 8000 outperformed another vendor's array on an OLTP-lik...
 
DB2 Real-Time Analytics Meeting Wayne, PA 2015 - IDAA & DB2 Tools Update
DB2 Real-Time Analytics Meeting Wayne, PA 2015 - IDAA & DB2 Tools UpdateDB2 Real-Time Analytics Meeting Wayne, PA 2015 - IDAA & DB2 Tools Update
DB2 Real-Time Analytics Meeting Wayne, PA 2015 - IDAA & DB2 Tools Update
 
Trigger maxl from fdmee
Trigger maxl from fdmeeTrigger maxl from fdmee
Trigger maxl from fdmee
 
All of the Performance Tuning Features in Oracle SQL Developer
All of the Performance Tuning Features in Oracle SQL DeveloperAll of the Performance Tuning Features in Oracle SQL Developer
All of the Performance Tuning Features in Oracle SQL Developer
 
Effective admin and development in iib
Effective admin and development in iibEffective admin and development in iib
Effective admin and development in iib
 
EDBT 2013 - Near Realtime Analytics with IBM DB2 Analytics Accelerator
EDBT 2013 - Near Realtime Analytics with IBM DB2 Analytics AcceleratorEDBT 2013 - Near Realtime Analytics with IBM DB2 Analytics Accelerator
EDBT 2013 - Near Realtime Analytics with IBM DB2 Analytics Accelerator
 

Destacado

Destacado (7)

Resource management in java bof6823 - java one 2012
Resource management in java   bof6823 - java one 2012Resource management in java   bof6823 - java one 2012
Resource management in java bof6823 - java one 2012
 
Whats Next for JCA?
Whats Next for JCA?Whats Next for JCA?
Whats Next for JCA?
 
JavaOne 2012 CON3978 Scripting Languages on the JVM
JavaOne 2012 CON3978 Scripting Languages on the JVMJavaOne 2012 CON3978 Scripting Languages on the JVM
JavaOne 2012 CON3978 Scripting Languages on the JVM
 
JPA Performance Myths -- JavaOne 2013
JPA Performance Myths -- JavaOne 2013JPA Performance Myths -- JavaOne 2013
JPA Performance Myths -- JavaOne 2013
 
JVM Multitenancy (JavaOne 2012)
JVM Multitenancy (JavaOne 2012)JVM Multitenancy (JavaOne 2012)
JVM Multitenancy (JavaOne 2012)
 
Efficient Memory and Thread Management in Highly Parallel Java Applications
Efficient Memory and Thread Management in Highly Parallel Java ApplicationsEfficient Memory and Thread Management in Highly Parallel Java Applications
Efficient Memory and Thread Management in Highly Parallel Java Applications
 
Three Key Concepts for Understanding JSR-352: Batch Programming for the Java ...
Three Key Concepts for Understanding JSR-352: Batch Programming for the Java ...Three Key Concepts for Understanding JSR-352: Batch Programming for the Java ...
Three Key Concepts for Understanding JSR-352: Batch Programming for the Java ...
 

Similar a JavaOne BOF 5957 Lightning Fast Access to Big Data

Presentation20130616
Presentation20130616Presentation20130616
Presentation20130616
Adrian Warman
 
Riverbed Granite
Riverbed GraniteRiverbed Granite
Riverbed Granite
CTI Group
 

Similar a JavaOne BOF 5957 Lightning Fast Access to Big Data (20)

NZS-4532 - Bringing Historical Data to Life with IBMs SMF Data Engine
NZS-4532 - Bringing Historical Data to Life with IBMs SMF Data EngineNZS-4532 - Bringing Historical Data to Life with IBMs SMF Data Engine
NZS-4532 - Bringing Historical Data to Life with IBMs SMF Data Engine
 
IBM Cloud Solutions Customer Deck
IBM Cloud Solutions Customer Deck IBM Cloud Solutions Customer Deck
IBM Cloud Solutions Customer Deck
 
IBM Informix on cloud webcast August 2017
IBM Informix on cloud webcast August 2017IBM Informix on cloud webcast August 2017
IBM Informix on cloud webcast August 2017
 
Enabling Continuous Availability and Reducing Downtime with IBM Multi-Site Wo...
Enabling Continuous Availability and Reducing Downtime with IBM Multi-Site Wo...Enabling Continuous Availability and Reducing Downtime with IBM Multi-Site Wo...
Enabling Continuous Availability and Reducing Downtime with IBM Multi-Site Wo...
 
Solving enterprise challenges through scale out storage & big compute final
Solving enterprise challenges through scale out storage & big compute finalSolving enterprise challenges through scale out storage & big compute final
Solving enterprise challenges through scale out storage & big compute final
 
LightEdge Partner Cloud Overview
LightEdge Partner Cloud Overview LightEdge Partner Cloud Overview
LightEdge Partner Cloud Overview
 
Cloud nativecomputingtechnologysupportinghpc cognitiveworkflows
Cloud nativecomputingtechnologysupportinghpc cognitiveworkflowsCloud nativecomputingtechnologysupportinghpc cognitiveworkflows
Cloud nativecomputingtechnologysupportinghpc cognitiveworkflows
 
Part 2: Cloudera’s Operational Database: Unlocking New Benefits in the Cloud
Part 2: Cloudera’s Operational Database: Unlocking New Benefits in the CloudPart 2: Cloudera’s Operational Database: Unlocking New Benefits in the Cloud
Part 2: Cloudera’s Operational Database: Unlocking New Benefits in the Cloud
 
Caching for Microservices Architectures: Session I
Caching for Microservices Architectures: Session ICaching for Microservices Architectures: Session I
Caching for Microservices Architectures: Session I
 
Cloud computing
Cloud computing Cloud computing
Cloud computing
 
Iod session 3423 analytics patterns of expertise, the fast path to amazing ...
Iod session 3423   analytics patterns of expertise, the fast path to amazing ...Iod session 3423   analytics patterns of expertise, the fast path to amazing ...
Iod session 3423 analytics patterns of expertise, the fast path to amazing ...
 
Cloud computing
Cloud computingCloud computing
Cloud computing
 
CirrusDB Offerings
CirrusDB OfferingsCirrusDB Offerings
CirrusDB Offerings
 
Why z/OS is a Great Platform for Developing and Hosting APIs
Why z/OS is a Great Platform for Developing and Hosting APIsWhy z/OS is a Great Platform for Developing and Hosting APIs
Why z/OS is a Great Platform for Developing and Hosting APIs
 
Presentation20130616
Presentation20130616Presentation20130616
Presentation20130616
 
Datacenter 2014: HP - Brian Andersen
Datacenter 2014: HP - Brian AndersenDatacenter 2014: HP - Brian Andersen
Datacenter 2014: HP - Brian Andersen
 
IBM Relay 2015: Open for Data
IBM Relay 2015: Open for Data IBM Relay 2015: Open for Data
IBM Relay 2015: Open for Data
 
Big Data: InterConnect 2016 Session on Getting Started with Big Data Analytics
Big Data:  InterConnect 2016 Session on Getting Started with Big Data AnalyticsBig Data:  InterConnect 2016 Session on Getting Started with Big Data Analytics
Big Data: InterConnect 2016 Session on Getting Started with Big Data Analytics
 
IBM Spectrum Scale Overview november 2015
IBM Spectrum Scale Overview november 2015IBM Spectrum Scale Overview november 2015
IBM Spectrum Scale Overview november 2015
 
Riverbed Granite
Riverbed GraniteRiverbed Granite
Riverbed Granite
 

Último

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 

Último (20)

Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 

JavaOne BOF 5957 Lightning Fast Access to Big Data

  • 1. © 2013 IBM Corporation Brian K. Martin – Distinguished Engineer, WebSphere eXtreme Scale 17 September 2013 Lightning Fast access to Big Data Document number BOF-5957
  • 2. © 2013 IBM Corporation Important Disclaimers THE INFORMATION CONTAINED IN THIS PRESENTATION IS PROVIDED FOR INFORMATIONAL PURPOSES ONLY. WHILST EFFORTS WERE MADE TO VERIFY THE COMPLETENESS AND ACCURACY OF THE INFORMATION CONTAINED IN THIS PRESENTATION, IT IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED. ALL PERFORMANCE DATA INCLUDED IN THIS PRESENTATION HAVE BEEN GATHERED IN A CONTROLLED ENVIRONMENT. YOUR OWN TEST RESULTS MAY VARY BASED ON HARDWARE, SOFTWARE OR INFRASTRUCTURE DIFFERENCES. ALL DATA INCLUDED IN THIS PRESENTATION ARE MEANT TO BE USED ONLY AS A GUIDE. IN ADDITION, THE INFORMATION CONTAINED IN THIS PRESENTATION IS BASED ON IBM’S CURRENT PRODUCT PLANS AND STRATEGY, WHICH ARE SUBJECT TO CHANGE BY IBM, WITHOUT NOTICE. IBM AND ITS AFFILIATED COMPANIES SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF THE USE OF, OR OTHERWISE RELATED TO, THIS PRESENTATION OR ANY OTHER DOCUMENTATION. NOTHING CONTAINED IN THIS PRESENTATION IS INTENDED TO, OR SHALL HAVE THE EFFECT OF: - CREATING ANY WARRANTY OR REPRESENTATION FROM IBM, ITS AFFILIATED COMPANIES OR ITS OR THEIR SUPPLIERS AND/OR LICENSORS
  • 3. © 2013 IBM Corporation About me § Brian K. Martin § Chief Architect of WebSphere eXtreme Scale. Previously a lead architect on WebSphere Application Server. Currently leading CloudFoundry middleware services inside IBM § Email: bkmartin@us.ibm.com § Twitter: @bkmartin § Blog: http://www.attheextreme.com/ § Visit the IBM booth #5112 and meet other IBM developers at JavaOne 2013
  • 4. © 2013 IBM Corporation What is a Cache Anyways? A cache allows you to get stuff faster and helps you avoid doing something over and over again (which may be redundant and may not make sense) [Figure labels: far away, near, happy]
  • 5. © 2013 IBM Corporation How to overwhelm an enterprise • Mobile Access (laptops, ultrabooks, tablets, smartphones) = Transaction Overload • Targeted Advertising (e-mail, SMS, pop-up, click-thru promotions, web crawlers) = Transaction Overload • Social Media (TV, movie, or sports personality mentions / endorses product) = Transaction Overload • All of the above = Transaction Overload • Retail, Banking, Finance, Insurance, Telecom, Travel & Transportation …
  • 6. © 2013 IBM Corporation Where do we cache? A database cache? A page fragment cache? A service Cache? TOO SPECIFIC! •  A cache is a tool for reducing application path length •  OR the distance data has to travel before it gets to the customer/data sink Web Channel Mobile Channel DBData Service Logic OR Map
  • 7. © 2013 IBM Corporation But I’m already caching… Problems with local caching: • Local cache doesn’t scale • Local cache is not fault tolerant or highly available • Need to handle invalidation across a cluster • Local cache is typically single function or application specific • Local cache memory requirements could actually degrade performance due to Garbage Collection cycles on large JVM heap sizes • Resource contention for managing local cache (CPU, memory, I/O)
  • 8. © 2013 IBM Corporation Solution: In Memory Data Grid Distributed in-memory object cache Capable of massive volumes of transactions Self-healing, allow scale-out / scale-in Splits a given dataset into shards or partitions • Elastic, scalable, coherent in-memory cache • Dynamically caches, partitions, replicates and manages application data and business logic across multiple servers • Provides qualities of service such as transaction integrity, high availability, and predictable response times •  Automatic failure recovery • on-the-fly addition / removal of memory capacity • Primary and Replica shards
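The shard idea on this slide (a dataset split into partitions, each with a primary and a replica, and automatic recovery when a server fails) can be illustrated with a small sketch. This is not WebSphere eXtreme Scale's placement algorithm; the function names, the round-robin placement, and the CRC32 key hash are all assumptions chosen for illustration.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition with a stable hash (CRC32 is an illustrative choice)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def place_shards(servers: list[str], num_partitions: int) -> dict[int, dict[str, str]]:
    """Round-robin placement: the primary goes on one server, its replica on the next."""
    if len(servers) < 2:
        raise ValueError("need at least two servers to separate primary and replica")
    placement = {}
    for p in range(num_partitions):
        placement[p] = {
            "primary": servers[p % len(servers)],
            "replica": servers[(p + 1) % len(servers)],
        }
    return placement


def fail_server(placement, failed: str, servers: list[str]):
    """Self-healing step: promote replicas of lost primaries, then rebuild lost replicas."""
    survivors = [s for s in servers if s != failed]
    healed = {}
    for p, roles in placement.items():
        primary, replica = roles["primary"], roles["replica"]
        if primary == failed:
            primary, replica = replica, None  # promote the surviving replica
        elif replica == failed:
            replica = None  # replica lost, primary unaffected
        if replica is None:
            # Pick a survivor that does not already hold this partition's primary.
            candidates = [s for s in survivors if s != primary]
            replica = candidates[p % len(candidates)] if candidates else None
        healed[p] = {"primary": primary, "replica": replica}
    return healed
```

With three servers, losing one leaves every partition with a primary and a replica on the two survivors. With only two servers, a failure leaves the replica slot empty (`None`) until capacity is added back.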
  • 9. © 2013 IBM Corporation In-memory Database versus In-memory Data grid Data Grid Capabilities In-memory Database Database Capabilities In-memory Data Grid Sophisticated Query Support Linear Scale‑Out Data Flexibility Simple OLTP Access Data as POJOs Fast In-Memory Data Access
  • 10. © 2013 IBM Corporation Why you might need a data grid Scalability issues with database servers Large volume of data Fault tolerance and self-healing Data redundancy and replication •  Adding extra hardware is not easy •  Licensing costs •  Ability to handle volumes of data without slowing down data access •  Handle data surges during product launches and live events •  Need for automatic mechanisms to avert system failure affecting end-users •  Data integrity •  Maintain data reliability in case of failover
  • 11. © 2013 IBM Corporation Characteristics of a Data Grid • Data is de-normalized • Data stored as key-value pairs. (Think: hash map of ‘infinite’ size) • Data Grid can be transactional • Simple APIs • Get, Insert, Update, Delete • SQL-like query language • Map-reduce, grid based applications • Can be horizontally partitioned • list-based, hash-based, range-based partitioning schemes can be applied to the data • Often (but not always) transient or referential data • HTTP sessions, user profile, etc. • Mainframe DBMS offloading • Read-only or read-mostly • Can tolerate some staleness
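The three partitioning schemes named on this slide can be compared side by side. The sketch below is a generic illustration, not the grid's API: the function names, the CRC32 hash, and the convention that range boundaries are exclusive upper bounds are all assumptions.

```python
import bisect
import zlib


def hash_partition(key: str, num_partitions: int) -> int:
    """Hash-based: spread keys evenly using a stable hash of the key."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def range_partition(value: int, upper_bounds: list[int]) -> int:
    """Range-based: partition i holds values below upper_bounds[i].

    upper_bounds must be sorted; values at or above the last bound fall into
    one final partition, so there are len(upper_bounds) + 1 partitions.
    """
    return bisect.bisect_right(upper_bounds, value)


def list_partition(value: str, assignments: dict[str, int], default: int) -> int:
    """List-based: an explicit lookup table assigns each listed value to a partition."""
    return assignments.get(value, default)
```

For example, `range_partition(150, [100, 200])` places the value in partition 1, and `list_partition("EU", {"EU": 0, "US": 1}, 2)` places it in partition 0. Hash partitioning balances load well but scatters neighbouring keys; range and list partitioning keep related keys together at the cost of possible hot spots.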
  • 12. © 2013 IBM Corporation Elastic Cache (In-Memory Data Grid) Enterprise Architecture Application Server Tier RDBMS TPMBack-end Services Web Server Tier
  • 13. © 2013 IBM Corporation Elastic Cache (In-Memory Data Grid) Application State Store Pattern Application Server • Single replacement for multiple local caches • Consistent response times • Reduces Application Server JVM heap size • Improved memory utilization - more memory for applications • Faster Application Server start-up • Removes invalidation chatter of local caches • Applications move application state to grid • Stateless applications scale elastically • Application state can be shared across data centers for high availability Applications use single coherent, highly available, scalable cache
  • 14. © 2013 IBM Corporation Elastic Cache (In-Memory Data Grid) HTTP Session Distribution Application Server • Improved Performance ‒ Improves start-up time when bringing a new server online • Better Scalability (Less expensive) ‒ Replaces memory to memory replication ‒ Replaces need for database persistence ‒ Less expensive than scaling the database ‒ Faster, more consistent response times ‒ Makes better use of system resources ‒ Larger cache capacity • Higher Availability ‒ Provides fault tolerance and high availability of session ‒ Not only within the datacenter, but across datacenters ‒ Session replication or distribution is crucial in highly available systems to provide uninterrupted user experience (e.g. Shopping Cart). • No new code required! Easy to configure
  • 15. © 2013 IBM Corporation Elastic Cache (In-Memory Data Grid) Active/Active Datacenter HTTP session failover Application Server Elastic Cache (In-Memory Data Grid) Application Server Multi-master replication (MMR) Load Balancers Datacenter 1 Datacenter 2
  • 16. © 2013 IBM Corporation Elastic Cache (In-Memory Data Grid) Application Server Elasticity – Cloud Web Application Application Server Intelligent Routers Deployment Manager Deploy New server instance • Intelligent Routers monitor • Number of connections • Response Time • Application Server health • Based on SLA, Intelligent Router coordinates with Deployment Manager to deploy another server instance to meet SLA • Hypervisor instance (App Server/Portal) • Dynamic cluster member • Application state is immediately available to new instance via elastic cache • During periods of reduced load, system can scale down and release resources for other purposes
  • 17. © 2013 IBM Corporation Elastic Cache Side Cache Pattern Application Server •  Client first checks the grid before using the data access layer to connect to a back end data store. •  If an object is not returned from the grid (a cache “miss”), the client uses the data access layer as usual to retrieve the data. •  The result is put into the grid to enable faster access the next time. •  The back end remains the system of record, and usually only a small amount of the data is cached in the grid. •  An object is stored only once in the cache, even if multiple clients use it. Thus, more memory is available for caching, more data can be cached, which increases the cache hit rate. •  Improve performance and offload unnecessary workload on backend systems. Back-end Services RDBMS TPM
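The side cache steps on this slide (check the grid, fall back to the data access layer on a miss, then store the result) can be sketched as follows. A plain dictionary stands in for the grid, and `SideCache` and `load_from_backend` are hypothetical names introduced for illustration.

```python
class SideCache:
    """Cache-aside wrapper: the back end stays the system of record."""

    def __init__(self, load_from_backend):
        self._grid = {}  # stand-in for the shared data grid
        self._load = load_from_backend  # the usual data access layer call
        self.hits = 0
        self.misses = 0

    def get(self, key):
        # 1. Check the grid first.
        if key in self._grid:
            self.hits += 1
            return self._grid[key]
        # 2. Cache miss: use the data access layer as usual.
        self.misses += 1
        value = self._load(key)
        # 3. Put the result into the grid for faster access next time.
        self._grid[key] = value
        return value
```

Because the grid is shared, a value loaded by one client is a hit for every other client, which is why the slide notes that each object is stored only once.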
  • 18. © 2013 IBM Corporation Enterprise Service Bus – Side Cache Elastic Cache Application Server RDBMS TPM Back-end Services Enterprise Service Bus • Easily integrates into the existing business process ‒ No code changes to the client application or back-end application ‒ Simply add the side cache mediation at the ESB layer • Significantly reduces the load on the back-end system by eliminating redundant requests ‒ Eliminates costly MIPS by eliminating redundant requests ‒ Allows for more “REAL” work to be performed ‒ Improves overall response time ‒ Minimizes the need to scale hardware to increase processing capacity since the back-end system no longer has to handle redundant requests. • Response time from elastic cache is sub-millisecond
  • 19. © 2013 IBM Corporation Enterprise Service Bus – Global Cache for Performance Elastic Cache RDBMS TPM Back-end Services Back-end Data ESB 1 ESB 2 Response 1 Normal response Response 2 Faster response • Lay foundation for low latency access to data required for real-time analytics • Response time is key driver in choice of infrastructure provider for mobile implementations • Fast access to cached data for common DB lookups • Reduce load on DB by eliminating redundant lookups
  • 20. © 2013 IBM Corporation Enterprise Service Bus – Global Cache for Scalability Elastic Cache RDBMS TPM Back-end Services Client State ESB 1 ESB 2 • Allow the building of scalable infrastructure to support growth in demand for services • Provide a cache for sharing data between ESBs, which makes it possible to correlate replies and share workload between ESBs in request-response scenarios
  • 21. © 2013 IBM Corporation Elastic Cache Mobile Gateway Acceleration Mobile Gateway • By integrating Elastic Cache with Mobile Gateway, users can see improved performance without the penalty of having to scale to a large cluster of Mobile Gateways. • Use Side cache to cache XSLT transforms • Directly access the Elastic Cache to retrieve cached objects • Use Elastic Cache to provide session state for stateless communication Application Server TPM RDBMS
  • 22. © 2013 IBM Corporation Elastic Cache In-line cache – Database shock absorber • The grid can be used as a special data access layer where it is configured to use a loader to get data from the back-end system. ‒ Read through cache ‒ Write through cache (Synchronous writes) ‒ Write-behind cache (Asynchronous writes) • System of Record Data Store ‒ Cache is used as the system of record ‒ Write behind technology pushes changes asynchronously to the backend. • Changes batched • Only last change written ‒ Runs through backend outages! • Benefits ‒ Writes faster (memory vs. disk speed) ‒ Backend load reduced, throughput improved ‒ Increased availability and scalability Application Server Back-end Services RDBMS
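The write-behind behaviour described on this slide (changes batched, only the last change per key written, operation continuing through back-end outages) can be sketched in a few lines. This is a single-threaded illustration, not the product's loader implementation: `WriteBehindCache` and `write_batch_to_backend` are hypothetical names, and treating `ConnectionError` as the outage signal is an assumption.

```python
class WriteBehindCache:
    """Writes land in memory first; a later flush pushes them to the back end."""

    def __init__(self, write_batch_to_backend):
        self._data = {}   # the in-memory system of record
        self._dirty = {}  # key -> latest value not yet written to the back end
        self._write_batch = write_batch_to_backend

    def put(self, key, value):
        self._data[key] = value
        self._dirty[key] = value  # overwrites earlier pending change for this key

    def get(self, key):
        return self._data.get(key)

    def flush(self) -> int:
        """Push pending changes as one batch; return how many keys were written."""
        if not self._dirty:
            return 0
        batch = dict(self._dirty)
        try:
            self._write_batch(batch)
        except ConnectionError:
            # Back end is unavailable: keep the changes queued and retry later.
            return 0
        self._dirty.clear()
        return len(batch)
```

In a real deployment `flush` would run on a timer or when the pending set reaches a size threshold. Three updates to the same key between flushes reach the database as a single write, which is where the reduced back-end load comes from.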
  • 23. © 2013 IBM Corporation Elastic Cache In-line cache – Real time access to Big Data •  Use case: high speed ingest of data ‒  Read through cache ‒  Write through cache (Synchronous writes) ‒  Write-behind cache (Asynchronous writes) •  Store data in CSV, JSON •  Keep operational data in-memory •  Evict data from IMDG using time based evictor or space based evictor •  (e.g. 30 days, 7 days) •  Use Hadoop MR jobs for offline analytics •  Benefits ‒  Writes faster (memory vs. disk speed) ‒  Reduced response time access to operational data Application Server
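The time-based evictor mentioned on this slide can be approximated with a time-to-live map. The sketch below is an assumption-laden illustration rather than the grid's evictor: `TtlMap` is a hypothetical name, and the injectable `clock` parameter exists only so the behaviour can be exercised without waiting.

```python
import time


class TtlMap:
    """Keeps entries for a fixed time-to-live, then evicts them."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (value, expires_at)

    def put(self, key, value):
        self._entries[key] = (value, self._clock() + self._ttl)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # lazily evict on access
            return None
        return value

    def evict_expired(self) -> int:
        """Sweep all expired entries; return how many were removed."""
        now = self._clock()
        expired = [k for k, (_, exp) in self._entries.items() if now >= exp]
        for k in expired:
            del self._entries[k]
        return len(expired)
```

A seven-day retention window would be `TtlMap(7 * 24 * 3600)`. A space-based evictor would instead cap the number of entries or bytes and remove the least recently used ones.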
  • 24. © 2013 IBM Corporation IMDG Cache – MR Job output store •  Use case: high throughput, low latency access to the MR results Application Server MR Output
  • 25. © 2013 IBM Corporation Elastic Cache eXtreme Transaction Processing Agent • Lowest possible latency • Application code (Agent) runs in the grid itself • Map/Reduce API supported • Events routed to correct partitions for processing • Databases relegated to durable log and reports RDBMS
  • 26. © 2013 IBM Corporation Elastic Cache Map Reduce Parallel Processing Agent • Parallel Map • Allows the entries for a set of Entities or Objects to be processed and returns a result for each entry processed • Parallel Reduction • Processes a subset of the entries and calculates a single result for the group of entries • Since the Elastic Cache is the system of record, there is little to no load on the back-end data stores RDBMS
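The parallel map and parallel reduction described on this slide can be sketched with one worker per partition. This is a local illustration using threads and dictionaries as stand-ins for grid partitions; it does not use the product's agent API, and `parallel_map`, `parallel_reduce`, `reduce_partition`, and `combine` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_map(partitions, fn):
    """Apply fn to every entry, one worker per partition; returns {key: result}."""

    def run(partition):
        return {key: fn(value) for key, value in partition.items()}

    results = {}
    with ThreadPoolExecutor(max_workers=max(1, len(partitions))) as pool:
        for partial in pool.map(run, partitions):
            results.update(partial)
    return results


def parallel_reduce(partitions, reduce_partition, combine):
    """Reduce each partition to one value in parallel, then combine the partials."""
    with ThreadPoolExecutor(max_workers=max(1, len(partitions))) as pool:
        partials = list(pool.map(reduce_partition, partitions))
    return combine(partials)
```

For example, with `partitions = [{"a": 1, "b": 2}, {"c": 3}]`, a total can be computed as `parallel_reduce(partitions, lambda p: sum(p.values()), sum)`. The point of running the agent inside the grid is that only the small per-partition results travel over the network, not the data itself.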
  • 27. © 2013 IBM Corporation Elastic Cache Real-Time Business Rules / Event Processing Agent • Lowest possible latency • Application code runs in the grid itself • Events routed to correct partitions for processing • Extension of Write behind scenario • Databases relegated to durable log and reports RDBMS = Business Rules Business Process Management
  • 28. © 2013 IBM Corporation Multi-datacenter - High Availability/Distributed Computing Datacenter 1 Datacenter 2 Public Cloud
  • 29. © 2013 IBM Corporation Elastic Cache Shared Service • Provides Elastic Caching resource for cloud based architectures • Elastic Cache service is multi-tenant • Support grid capping • Individual maps per cloud group • Authentication/Authorization per map/grid • Used for • Simple Cache • HTTP session distribution • Dynamic Cache provider • http://www.bluemix.net
  • 30. © 2013 IBM Corporation § Java and .NET applications can now interact natively with the same data in the same data grid, leading the way toward a true enterprise-wide data grid. § Native off-heap storage, overflow to disk § A new REST Gateway provides simple access from other languages. § WXS 8.6 delivers a faster, more compact serialization format called eXtreme Data Format (XDF), which is neutral to programming languages. § A new transport mechanism, eXtreme IO (XIO), removes the dependency on the IBM ORB, enabling easier integration with existing environments. § Built-in pub/sub capabilities enable WXS 8.6 to update client “near caches” whenever data is updated, deleted, or invalidated on the server side. § API enhancements enable continuous query of data that is inserted and updated in the grid. IBM Elastic Caching Delivers Consistent Response Times, High Availability of Data & Linear Scalability for Enterprise-wide Data Grids WebSphere eXtreme Scale V8.6 A powerful, scalable, elastic in-memory grid for your business-critical applications Rapid, “drop-in” use with a broad range of Java and non-Java application environments DataPower XC10 Appliance V2.5 • Rapid drop-in use across a broad range of application server technologies and programming languages • New data format (eXtreme Data Format – XDF) improves performance and allows data to be shared natively between Java & .NET applications • Built-in notification infrastructure allows for client-side event notification, continuous query cache and near-cache invalidation • Improved usability, serviceability • Supports FIPS security protocol for government and financial sector compliance • Improved performance • Improved monitoring and administration capabilities • Native disk overflow, extending grid capacity by moving less frequently used data to disk
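The near-cache invalidation mentioned on this slide (the server notifies clients when data changes so that their local copies are dropped) can be sketched with a simple callback-based publish/subscribe. This is an in-process illustration only: `GridServer` and `NearCacheClient` are hypothetical names, and the real product delivers notifications over its own transport.

```python
class GridServer:
    """Holds the authoritative data and publishes a notice for every change."""

    def __init__(self):
        self._data = {}
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def get(self, key):
        return self._data.get(key)

    def put(self, key, value):
        self._data[key] = value
        for notify in self._subscribers:
            notify(key)  # publish the changed key to every client


class NearCacheClient:
    """Keeps a local copy of recently read entries until the server invalidates them."""

    def __init__(self, server: GridServer):
        self._server = server
        self._near = {}
        server.subscribe(self._invalidate)

    def _invalidate(self, key):
        self._near.pop(key, None)

    def get(self, key):
        if key not in self._near:
            self._near[key] = self._server.get(key)  # remote read on a near-cache miss
        return self._near[key]
```

Repeated reads are served from the client's own memory, and a write on the server side evicts the stale local copy so the next read fetches the new value.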
  • 31. © 2013 IBM Corporation Infrastructure • WebSphere Application Server • Liberty • Rational Team Concert Stack Product Integration • WebSphere Commerce • WebSphere Portal • IBM Mobile Platform / Worklight Need IMDG with That….? ✓ Improve Performance, Scalability & Availability ✓ Consistent Response Times ✓ Reduces cost by eliminating redundant transactions Distributed caching is becoming a central element of transaction processing Cloud • IBM PureSystems • IBM Workload Deployer • Cast Iron Live (SaaS) • IBM Smart Cloud Application services • BlueMix Security Tivoli Access Manager for eBusiness Business Process & Connectivity • DataPower Integration Appliance XI50/52 & XG45 • WebSphere Message Broker • IBM Business Process Manager (WPS) • WebSphere Registry and Repository • IBM Operation Decision Management (iLOG JRules/WBE)
  • 32. © 2013 IBM Corporation Getting Started WebSphere eXtreme Scale Product Page http://www-01.ibm.com/software/webservers/appserv/extremescale/ WebSphere DataPower XC10 Appliance Product Page http://www-01.ibm.com/software/webservers/appserv/xc10/ WebSphere eXtreme Scale and WebSphere DataPower XC10 wiki http://www.ibm.com/developerworks/connect/caching WebSphere eXtreme Scale Free Trial http://www.ibm.com/developerworks/downloads/ws/wsdg/ Virtual appliance free for developers http://tinyurl.com/virtualXC10 Contact your IBM Representative