Manik Surtani is the founder and project lead of Infinispan, an open source data grid platform. In this talk he discusses data grids, NoSQL, and their role in cloud storage. Data grids evolved from distributed caches and add features such as querying, task execution, and control over data co-location. NoSQL systems are an alternative, typically disk-based form of data storage that gives up relational structure in exchange for scalability and easier distribution. JSR 347 aims to standardize data grid APIs for the Java platform. Infinispan will implement JSR 107 and JSR 347, and already serves as the reference backend for Hibernate OGM.
Data Grids, NoSQL, Cloud Storage & JSR-347
1. Infinispan
Data Grids, NoSQL, Cloud Storage & JSR-347
Manik Surtani
Founder and Project Lead, Infinispan
Red Hat, Inc.
2. Who is Manik?
• Hacker@JBoss, Red Hat’s middleware division
• Founder and Project Lead, Infinispan
• Spec lead, JSR 347
• Data Grids for Java
• EG representative, JSR 107
• Temporary Caching for Java
• http://blog.infinispan.org
• http://twitter.com/maniksurtani
3. Agenda
• A brief introduction to Infinispan
• Understanding Data Grids
• .. and NoSQL
• Their role in Cloud Storage
• JSR 347 and related standards
5. What is Infinispan?
• An open source data grid platform
• Written in Java and Scala
• Not just for the JVM though
• Distributed key/value store
• Transactional (JTA)
• Low-latency (in-memory)
• Optionally persisted to disk
• Feature-rich
8. WTF is Hot Rod?
• Wire protocol for client/server communications
• Open
• Language independent
• Built-in failover and load balancing
• Smart routing
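The idea behind smart routing and built-in failover can be sketched as a client that works out for itself which server owns a key. The following is a minimal illustration using a consistent-hash ring; the class `SmartRouter`, its methods, and the node names are hypothetical, and the real Hot Rod client's hashing and topology handling are not shown here.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: a consistent-hash ring, not the actual Hot Rod client.
final class SmartRouter {
    private static final int VIRTUAL_NODES = 64; // ring positions per server
    private final TreeMap<Integer, String> ring = new TreeMap<>();

    SmartRouter(Collection<String> servers) {
        servers.forEach(this::addServer);
    }

    void addServer(String server) {
        for (int i = 0; i < VIRTUAL_NODES; i++) {
            ring.put(hash(server + "#" + i), server);
        }
    }

    // Failover: drop a dead server; its keys fall to the next server on the ring.
    void removeServer(String server) {
        ring.values().removeIf(server::equals);
    }

    // Smart routing: the client computes the owner and can contact it directly.
    String ownerOf(String key) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no servers available");
        }
        Map.Entry<Integer, String> e = ring.ceilingEntry(hash(key));
        return (e != null ? e : ring.firstEntry()).getValue();
    }

    // Simple bit-mixing on top of String.hashCode(); chosen for brevity only.
    private static int hash(String s) {
        int h = s.hashCode();
        h ^= h >>> 16;
        h *= 0x45d9f3b;
        h ^= h >>> 16;
        return h;
    }

    public static void main(String[] args) {
        SmartRouter router = new SmartRouter(Arrays.asList("node-a", "node-b", "node-c"));
        String owner = router.ownerOf("user:42");
        System.out.println("user:42 is owned by " + owner);
        router.removeServer(owner); // simulate a failed server
        System.out.println("after failover: " + router.ownerOf("user:42"));
    }
}
```

With a ring like this, removing a server only moves the keys that server owned; every other key keeps its owner, which is why the client can fail over without a full reshuffle.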
9. Server Endpoint Comparison
Endpoint   | Protocol | Client Libraries   | Clustered? | Smart Routing | Load Balancing/Failover
REST       | Text     | N/A                | Yes        | No            | Any HTTP load balancer
Memcached  | Text     | Plenty             | Yes        | No            | Only with predefined server list
Hot Rod    | Binary   | Java, Python, Ruby | Yes        | Yes           | Dynamic
12. Why use distributed caches?
• Cache data that is expensive to retrieve/calculate
• E.g., from a database
• The need for fast, low-latency data access
• Performance or time-sensitive applications
• Very commonly used in:
• Financial Services industry
• Telcos
• Highly scalable e-commerce
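Caching data that is expensive to retrieve usually follows the cache-aside pattern: look in the cache first and only fall back to the slow source on a miss. Below is a minimal single-JVM sketch. `CacheAside` and its loader function are hypothetical names, and a plain `ConcurrentHashMap` stands in for a distributed cache.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical illustration of cache-aside; not Infinispan API.
final class CacheAside<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader;                 // the expensive lookup, e.g. a database query
    private final AtomicInteger loads = new AtomicInteger();

    CacheAside(Function<K, V> loader) {
        this.loader = loader;
    }

    // Returns the cached value, invoking the loader only on a cache miss.
    V get(K key) {
        return cache.computeIfAbsent(key, k -> {
            loads.incrementAndGet();
            return loader.apply(k);
        });
    }

    // Drops a stale entry so the next get() reloads it.
    void invalidate(K key) {
        cache.remove(key);
    }

    // Number of times the expensive loader has been called.
    int loadCount() {
        return loads.get();
    }

    public static void main(String[] args) {
        CacheAside<Integer, String> users = new CacheAside<>(id -> "user-" + id);
        System.out.println(users.get(42));   // miss: calls the loader
        System.out.println(users.get(42));   // hit: served from memory
        System.out.println("loader calls: " + users.loadCount());
    }
}
```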
13. Data grids as clustering toolkits
• To introduce high availability and failover to frameworks
• Commercial and open source frameworks
• In-house frameworks and reusable architectures
• Delegate all state management to the data grid
• Framework becomes stateless and hence elastic
14. But
Data Grids > Distributed Caches
• Querying
• Task execution and map/reduce
• Control over data co-location
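Task execution and map/reduce mean shipping the computation to where the data lives instead of pulling the data to the caller. The sketch below simulates this inside one JVM: each "node" is just a map holding its share of the entries, the map phase runs per node, and the reduce phase merges the partial results. `GridWordCount` and its methods are hypothetical and do not reflect Infinispan's or JSR 347's actual API.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical single-JVM simulation of map/reduce over partitioned data.
final class GridWordCount {

    // Map phase: runs against one node's local values and emits word -> count.
    static Map<String, Integer> mapPartition(Collection<String> localValues) {
        Map<String, Integer> counts = new HashMap<>();
        for (String value : localValues) {
            for (String word : value.toLowerCase().split("\\s+")) {
                if (!word.isEmpty()) {
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    // Reduce phase: merges the partial counts returned by every node.
    static Map<String, Integer> reduce(List<Map<String, Integer>> partials) {
        Map<String, Integer> total = new TreeMap<>();
        for (Map<String, Integer> partial : partials) {
            partial.forEach((word, count) -> total.merge(word, count, Integer::sum));
        }
        return total;
    }

    // Each element of 'nodes' stands for the entries stored on one grid node.
    static Map<String, Integer> run(List<Map<String, String>> nodes) {
        List<Map<String, Integer>> partials = nodes.parallelStream()
                .map(node -> mapPartition(node.values()))
                .collect(Collectors.toList());
        return reduce(partials);
    }

    public static void main(String[] args) {
        Map<String, String> nodeA = Collections.singletonMap("doc1", "hello grid");
        Map<String, String> nodeB = Collections.singletonMap("doc2", "hello cache");
        System.out.println(run(Arrays.asList(nodeA, nodeB)));
    }
}
```

Only the small per-node count maps cross the (simulated) network, not the documents themselves.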
16. What is NoSQL?
• An alternative form of typically disk-based data storage
• Free from relational structure
• Usually key/value or document-based
• Allows for greater scalability and easier clustering/distribution
18. NoSQL and Consistency
• BASE not ACID
• Relax consistency in exchange for high availability and partition tolerance
• Usually eventually consistent
• Which means applications need to be designed with this in mind
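What "designed with this in mind" means in practice: a read may return stale data until replicas have exchanged updates. The sketch below models two replicas that accept writes independently and converge when they synchronize. It assumes last-write-wins resolution based on a caller-supplied version number, which is only one of several possible strategies; all names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of eventual consistency with last-write-wins merging.
final class EventuallyConsistentDemo {

    static final class Versioned {
        final String value;
        final long version;

        Versioned(String value, long version) {
            this.value = value;
            this.version = version;
        }
    }

    static final class Replica {
        private final Map<String, Versioned> data = new HashMap<>();

        // Local write; other replicas do not see it until they sync.
        void write(String key, String value, long version) {
            merge(key, new Versioned(value, version));
        }

        // May return null or an old value if this replica has not synced yet.
        String read(String key) {
            Versioned v = data.get(key);
            return v == null ? null : v.value;
        }

        // Keep whichever write carries the higher version.
        private void merge(String key, Versioned incoming) {
            data.merge(key, incoming, (current, in) -> in.version > current.version ? in : current);
        }

        // Anti-entropy: pull every entry from another replica.
        void syncFrom(Replica other) {
            other.data.forEach(this::merge);
        }
    }

    public static void main(String[] args) {
        Replica a = new Replica();
        Replica b = new Replica();
        a.write("cart", "1 item", 1);
        System.out.println("b before sync: " + b.read("cart"));
        b.write("cart", "2 items", 2);
        a.syncFrom(b);
        b.syncFrom(a);
        System.out.println("a after sync: " + a.read("cart"));
        System.out.println("b after sync: " + b.read("cart"));
    }
}
```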
21. Cloud Storage
• Traditional mechanisms (RDBMSs and file systems) are hard to deal with
• Clouds are ephemeral
• All cloud components are expected to be:
• elastic
• highly available
22. Cloud Storage
• Data grids and NoSQL win over traditional storage mechanisms in the cloud
• Data grids and NoSQL are fast converging in feature sets
• E.g., data grids can write through to disk; many NoSQL engines also cache in memory
24. JSR 347
Data Grids for the Java Platform
• A new JSR proposed for inclusion in Java EE 8
• To make enterprise Java more cloud-friendly
• Standardize data grid APIs and behavior for the Java platform
• Does not define NoSQL
• Data grids primarily used from within a JVM
• NoSQL primarily used via client connectors over a socket
• Standardizing wire protocols is beyond the scope of the JCP
25. JSR 347
Data Grids for the Java Platform
• Extends JSR 107 (Temporary Caching for Java)
• Adds:
• Asynchronous, non-blocking API
• Grouping API to control co-location
• Distributed code execution and Map/Reduce APIs
• Eventually consistent API
• Possibly more
• Still very much work in progress
• Participate!
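The grouping idea listed above can be illustrated without any grid at all: if the owning node is chosen from a key's group rather than from the key itself, all keys in the same group end up on the same node. The sketch below is a deliberately simplified model (plain modulo hashing over a fixed node count). `GroupingSketch`, `BY_PREFIX`, and `nodeFor` are hypothetical names and are not taken from the JSR 347 draft.

```java
import java.util.function.Function;

// Hypothetical model of a grouping API: placement is derived from the group, not the key.
final class GroupingSketch {

    // Example grouper: "customer-7/order-1" belongs to group "customer-7".
    static final Function<String, String> BY_PREFIX = key -> {
        int slash = key.indexOf('/');
        return slash < 0 ? key : key.substring(0, slash);
    };

    // Picks the owning node index from the key's group.
    static int nodeFor(String key, Function<String, String> grouper, int nodeCount) {
        return Math.floorMod(grouper.apply(key).hashCode(), nodeCount);
    }

    public static void main(String[] args) {
        int nodes = 5;
        System.out.println(nodeFor("customer-7/order-1", BY_PREFIX, nodes));
        System.out.println(nodeFor("customer-7/profile", BY_PREFIX, nodes)); // same node as above
        System.out.println(nodeFor("customer-8/order-1", BY_PREFIX, nodes)); // may differ
    }
}
```

Co-locating related entries this way lets a task that touches a customer's orders and profile run on a single node without remote lookups.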
26. Related standards and efforts
• JSR 107
• A temporary caching API that defines:
• Basic interaction
• JTA compatibility
• Persistence: write-through and write-behind
• Listeners
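The difference between the two persistence modes is when the backing store sees a write. A minimal sketch follows, assuming a plain `Map` as the backing store and an explicit `flush()` call in place of the background thread a real write-behind implementation would typically use; the class names are hypothetical and this is not the JSR 107 API.

```java
import java.util.AbstractMap;
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical sketch contrasting write-through and write-behind persistence.
final class WriteModes {

    // Write-through: the store is updated synchronously as part of put().
    static final class WriteThroughCache {
        private final Map<String, String> memory = new HashMap<>();
        private final Map<String, String> store;

        WriteThroughCache(Map<String, String> store) {
            this.store = store;
        }

        void put(String key, String value) {
            store.put(key, value);   // caller waits for the store
            memory.put(key, value);
        }

        String get(String key) {
            return memory.get(key);
        }
    }

    // Write-behind: put() returns at once; the store is updated later in a batch.
    static final class WriteBehindCache {
        private final Map<String, String> memory = new HashMap<>();
        private final Queue<Map.Entry<String, String>> pending = new ArrayDeque<>();
        private final Map<String, String> store;

        WriteBehindCache(Map<String, String> store) {
            this.store = store;
        }

        void put(String key, String value) {
            memory.put(key, value);
            pending.add(new AbstractMap.SimpleImmutableEntry<>(key, value));
        }

        String get(String key) {
            return memory.get(key);
        }

        // Drains queued writes to the store and returns how many were written.
        int flush() {
            int written = 0;
            Map.Entry<String, String> entry;
            while ((entry = pending.poll()) != null) {
                store.put(entry.getKey(), entry.getValue());
                written++;
            }
            return written;
        }
    }

    public static void main(String[] args) {
        Map<String, String> disk = new HashMap<>();
        WriteBehindCache cache = new WriteBehindCache(disk);
        cache.put("k", "v");
        System.out.println("in store before flush? " + disk.containsKey("k"));
        cache.flush();
        System.out.println("in store after flush? " + disk.containsKey("k"));
    }
}
```

Write-behind trades durability for latency: anything still queued is lost if the node dies before the flush.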
27. Related standards and efforts
• Hibernate OGM
• JPA for key/value stores!
• Common and familiar paradigm for persisting data
• Except persistence is made to a data grid or NoSQL store
28. Related standards and efforts
• Contexts and Dependency Injection
• Interaction with caches defined in JSR 107
• Familiar and well proven programming model
• Works well with JPA and hence Hibernate OGM
• Works well even for direct access to key/value data grids
29. Where does Infinispan fit in?
• Will implement JSR 107
• Currently implements most of this at least in concept
• Will implement JSR 347
• Currently serves as a “donor” for most of JSR 347's features and API
• Is already the reference backend for Hibernate OGM
• Already supports CDI integration
30. To Summarize
• What data grids and distributed caches are
• Where NoSQL came from and the main differences between NoSQL and data grids
• Cloud storage challenges
• JSR 347: Data Grids for the Java Platform
• Infinispan and where it sits in all this
31. Questions & More Info
• http://github.com/datagrids/spec/wiki
• http://groups.google.com/group/jsr347
• http://twitter.com/jsr347
• http://www.infinispan.org
• http://twitter.com/infinispan
• http://hibernate.org/subprojects/ogm.html
Editor's notes
Welcome to this session on Infinispan. I hope you find it both informative and amusing.
A bit about me: founder and project lead of Infinispan.
Embedded setup. The app runs in a JVM and starts an Infinispan instance. Instances form a cluster. The app stores all its state in Infinispan, so it is now highly available, can be load-balanced, etc. This is how to build clustered frameworks and app servers.
Infinispan nodes form a p2p cluster as usual. They share state and communicate with each other. Each node also opens a network socket for client communications and attaches an encoder and decoder (Netty). Clients talk to Infinispan instances via sockets. Clients no longer need to be in a JVM.
Explain Hot Rod.
Talk about protocols and endpoints.
Let's talk about data grids in general.
Offers more than just what distributed caches do.
Strictly distributed NoSQL. Leaving out the likes of CouchDB, Redis, etc.
Alternative to an RDBMS. Unstructured data. Primary goal: scalability and elasticity.
RDBMSs strive for ACIDity (Atomic, Consistent, Isolated, Durable). BASE (Basic Availability, Soft-state, Eventually consistent). Eric Brewer’s CAP theorem.