5. Relational Databases
But scaling is hard!
-Replication
-Multiple instances w/ shared disk
-Sharding
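A minimal sketch of what sharding asks of the application: every key has to be routed to exactly one database instance. The class name `ShardRouterSketch` and the modulo-on-hash rule are assumptions for illustration only; real deployments often prefer consistent hashing, because changing the shard count under plain modulo remaps most keys.

```java
// Hypothetical routing helper: picks a shard index for a given key.
class ShardRouterSketch {

    // floorMod keeps the result in [0, shardCount) even for negative hash codes.
    static int shardFor(String key, int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        return Math.floorMod(key.hashCode(), shardCount);
    }

    public static void main(String[] args) {
        // The same key always lands on the same shard for a fixed shard count.
        System.out.println(shardFor("user-34", 4));
        System.out.println(shardFor("user-38", 4));
    }
}
```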
6. Relational Databases on a cloud
Master/replicas: which master?
A single master? I was promised elasticity
Less reliable “disks”
IP in configuration files? DNS update times?
Who coordinates this? How does that failover?
8. No-SQL goals
Very heterogeneous
• Large datasets
• High availability
• Low latency / higher throughput
• Specific data access pattern
• Specific data structures
• ...
9. NotOnlySQL
• Document based stores
• Column based
• Graph oriented databases
• Key / value stores
• Full-Text Search
11. Flexibility at a cost
• Programming model
• one per product :-(
• Often very tight code coupling
• No standard drivers / stable APIs
• no schema => app driven schema
• query (Map Reduce, specific DSL, ...)
• data structure shows through to the application
• Transactions ?
• durability / consistency puzzles
12. Where does Infinispan fit?
Distributed Key/Value store
• (or Replicated, local only efficient cache, invalidating cache)
Each node is equal
• Just start more nodes, or kill some
No bottlenecks
• by design
Cloud-network friendly
• JGroups
• And “cloud storage” friendly too!
13. But how to use it?
map.put( "user-34", userInstance );
map.get( "user-34" );
map.remove( "user-34" );
14. It's a ConcurrentMap !
map.put( "user-34", userInstance );
map.get( "user-34" );
map.remove( "user-34" );
map.putIfAbsent( "user-38", another );
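A small sketch of the `ConcurrentMap` contract the slide relies on. A plain `java.util.concurrent.ConcurrentHashMap` stands in for the cache here, so nothing below is Infinispan-specific; the `User` record and the `register` helper are hypothetical names used only for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Stand-in sketch: a ConcurrentHashMap plays the role of the cache.
class ConcurrentMapSketch {

    // Hypothetical value type.
    record User(String name) {}

    // Stores the candidate only if the key is free; returns whichever value ends up stored.
    static User register(ConcurrentMap<String, User> map, String key, User candidate) {
        User previous = map.putIfAbsent(key, candidate);
        return previous != null ? previous : candidate;
    }

    public static void main(String[] args) {
        ConcurrentMap<String, User> map = new ConcurrentHashMap<>();
        map.put("user-34", new User("alice"));

        // Key already taken: the existing entry is kept and returned.
        System.out.println(register(map, "user-34", new User("bob")));
        // Key free: the candidate is stored and returned.
        System.out.println(register(map, "user-38", new User("carol")));

        map.remove("user-34");
        System.out.println(map.containsKey("user-34"));
    }
}
```

`putIfAbsent` is the atomic "check then insert" step: two callers racing on the same key cannot both win, which is why the interface matters more than the plain `Map` one.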
15. Other Hibernate/Infinispan collaborations
• Second level cache for Hibernate ORM
• Hibernate Search indexing backend
• Infinispan Query
16. Cloud-hack experiments
Let's play with Infinispan's integration for Hibernate's second level cache design:
- usually configured in clustering mode INVALIDATION.
• Let's use DIST or REPL instead.
- Disable expiry/timeouts.
What's the effect on your cloud-deployed database?
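One way to picture the experiment, as a sketch under stated assumptions: a read-through cache that never expires entries keeps answering loads by PK after the first read, so repeated reads stop reaching the database. `ReadThroughCacheSketch`, `loadByPk` and the `databaseHits` counter are hypothetical names; a local `ConcurrentHashMap` stands in for a DIST or REPL cache and a `Function` stands in for the SQL lookup.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical read-through cache with no expiry and no invalidation.
class ReadThroughCacheSketch<K, V> {

    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> database;               // stand-in for a SELECT by primary key
    final AtomicInteger databaseHits = new AtomicInteger();

    ReadThroughCacheSketch(Function<K, V> database) {
        this.database = database;
    }

    // Serves from the cache when possible; only a miss reaches the database.
    V loadByPk(K pk) {
        return cache.computeIfAbsent(pk, key -> {
            databaseHits.incrementAndGet();
            return database.apply(key);
        });
    }

    public static void main(String[] args) {
        ReadThroughCacheSketch<Long, String> cache =
                new ReadThroughCacheSketch<>(pk -> "row-" + pk);
        cache.loadByPk(34L);
        cache.loadByPk(34L);
        cache.loadByPk(34L);
        // Three loads of the same PK, one database round trip.
        System.out.println(cache.databaseHits.get());
    }
}
```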
17. Cloud-hack experiments
Now introduce Hibernate Search:
- full-text queries should be handled by Lucene, NOT by the database.
Hibernate Search identifies hits from the Lucene index but, by default, loads them by PK.
20. These tools are very appropriate for the job:
Load by PK ->
second level cache ->
Key/Value store
FullText query ->
Hibernate Search ->
Lucene Indexes
What if we now shut down the database?
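A toy sketch of the two paths above with no database behind them: a term-to-PK map plays the role of the Lucene index and a PK-to-value map plays the role of the key/value store. Every name here is hypothetical, the tokenizer is deliberately naive, and re-saving a PK with new text would leave stale index entries; it only shows that a full-text query followed by loads by PK needs nothing else.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in: inverted index + key/value store, no database involved.
class SearchWithoutDatabaseSketch {

    private final Map<Long, String> store = new HashMap<>();       // PK -> value
    private final Map<String, Set<Long>> index = new HashMap<>();  // term -> PKs

    // Writes the value and indexes each lower-cased word of it.
    void save(long pk, String text) {
        store.put(pk, text);
        for (String term : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!term.isEmpty()) {
                index.computeIfAbsent(term, t -> new TreeSet<>()).add(pk);
            }
        }
    }

    // Full-text query: the index yields PKs, then each hit is loaded by PK.
    List<String> search(String term) {
        List<String> results = new ArrayList<>();
        for (Long pk : index.getOrDefault(term.toLowerCase(Locale.ROOT), Set.of())) {
            results.add(store.get(pk));
        }
        return results;
    }

    public static void main(String[] args) {
        SearchWithoutDatabaseSketch s = new SearchWithoutDatabaseSketch();
        s.save(1L, "Infinispan is a key/value store");
        s.save(2L, "Lucene indexes text");
        s.save(3L, "Hibernate Search uses Lucene");
        System.out.println(s.search("lucene"));
    }
}
```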
22. Goals
• Encourage new data usage patterns
• Familiar environment
• Ease of use
• Easy to jump in
• Easy to jump out
• Push NoSQL exploration in enterprises
• “PaaS for existing API” initiative
23. What it does
• JPA front end to key/value stores
• Object CRUD (incl polymorphism and associations)
• OO queries (JP-QL)
• Reuses
• Hibernate Core
• Hibernate Search (and Lucene)
• Infinispan
• Is not a silver bullet
• not for all NoSQL use cases
25. Schema or no schema?
• Schema-less
• moving to a new schema is very easy
• app deals with old and new structure, or migrates all data
• need strict development guidelines
• Schema
• reduce likelihood of rogue developer corruption
• share with other apps
• “didn’t think about that” bugs reduced
26. Entities as serialized blobs?
• Serialize objects into the (key) value
• store the whole graph?
• maintain consistency with duplicated objects
• guaranteed identity a == b
• concurrency / latency
• structure change and (de)serialization, class definition changes
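A sketch of the identity problem listed above, using plain Java serialization as the blob format (an assumption; any graph serializer behaves similarly). Two orders share one user in memory, but once each order is stored as its own blob, every blob carries a private copy of that user. The `User`, `Order`, `toBlob` and `fromBlob` names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class BlobIdentitySketch {

    static class User implements Serializable {
        private static final long serialVersionUID = 1L;
        String city;
        User(String city) { this.city = city; }
    }

    static class Order implements Serializable {
        private static final long serialVersionUID = 1L;
        final User user;
        Order(User user) { this.user = user; }
    }

    // Serializes the whole object graph reachable from the given root.
    static byte[] toBlob(Serializable root) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(root);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return bytes.toByteArray();
    }

    static <T> T fromBlob(byte[] blob, Class<T> type) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(blob))) {
            return type.cast(in.readObject());
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        User shared = new User("Paris");
        Order a = new Order(shared);
        Order b = new Order(shared);

        Order a2 = fromBlob(toBlob(a), Order.class);
        Order b2 = fromBlob(toBlob(b), Order.class);

        System.out.println(a.user == b.user);   // true: one shared instance in memory
        System.out.println(a2.user == b2.user); // false: each blob holds its own copy
        a2.user.city = "Lyon";
        System.out.println(b2.user.city);       // Paris: the copies have diverged
    }
}
```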
27. OGM’s approach to schema
• Keep what’s best from relational model
• as much as possible
• tables / columns / pks
• Decorrelate object structure from data structure
• Data stored as (self-described) tuples
• Core types limited
• portability
28. OGM’s approach to schema
• Store metadata for queries
• Lucene index
• CRUD operations are key lookups
29. How does it work?
• Entities are stored as tuples (Map<String,Object>)
• Or Documents?
• The key is composed of
• table name
• entity id
• Collections are represented as a list of tuples
- The key is composed of:
• table name hosting the collection information
• column names representing the FK
• column values representing the FK
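The key layout described above can be sketched as follows. This is an illustration only, not OGM's actual classes: `EntityKey`, `AssociationKey` and `TupleStoreSketch` are hypothetical names, and two in-memory maps stand in for the key/value store.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class TupleStoreSketch {

    // Entity key: table name + entity id.
    record EntityKey(String table, Object id) {}

    // Collection key: table hosting the collection + FK column names + FK column values.
    record AssociationKey(String table, List<String> fkColumns, List<Object> fkValues) {}

    private final Map<EntityKey, Map<String, Object>> entities = new ConcurrentHashMap<>();
    private final Map<AssociationKey, List<Map<String, Object>>> associations = new ConcurrentHashMap<>();

    // An entity is one tuple: column name -> value.
    void saveEntity(String table, Object id, Map<String, Object> tuple) {
        entities.put(new EntityKey(table, id), Map.copyOf(tuple));
    }

    Map<String, Object> loadEntity(String table, Object id) {
        return entities.get(new EntityKey(table, id));
    }

    // A collection is a list of tuples stored under a single key.
    void saveCollection(String table, String fkColumn, Object fkValue, List<Map<String, Object>> rows) {
        associations.put(new AssociationKey(table, List.of(fkColumn), List.of(fkValue)), List.copyOf(rows));
    }

    List<Map<String, Object>> loadCollection(String table, String fkColumn, Object fkValue) {
        return associations.getOrDefault(
                new AssociationKey(table, List.of(fkColumn), List.of(fkValue)), List.of());
    }

    public static void main(String[] args) {
        TupleStoreSketch store = new TupleStoreSketch();
        store.saveEntity("User", 34L, Map.of("id", 34L, "name", "alice"));
        store.saveCollection("User_Order", "user_id", 34L,
                List.of(Map.of("user_id", 34L, "order_id", 7L)));

        System.out.println(store.loadEntity("User", 34L).get("name"));
        System.out.println(store.loadCollection("User_Order", "user_id", 34L).size());
    }
}
```

Both lookups are single key reads, which is what makes CRUD cheap on a key/value store; anything that is not a lookup by key has to go through an index.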
32. Queries / Infinispan
• Hibernate Search indexes entities
• Store Lucene indexes in Infinispan
• JP-QL to Lucene query transformation
• Works for simple queries
• Lucene is not a relational SQL engine
33. select a from Animal a where a.size > 20
> animalQueryBuilder
    .range().onField("size").above(20).excludeLimit()
    .createQuery();
select u from Order o join o.user u where o.price > 100 and u.city = 'Paris'
> orderQB.bool()
    .must(
        orderQB.range()
            .onField("price").above(100).excludeLimit().createQuery() )
    .must(
        orderQB.keyword().onField("user.city").matching("Paris")
            .createQuery()
    ).createQuery();
34. Why Infinispan?
• We know it well
• Supports transactions
• Supports distribution of Lucene indexes
• Designed for clouds
• It's a key/value store with support for Map/Reduce
• Simple
• Likely a common point for many other “databases”
35. Why Infinispan?
• Map/Reduce as an alternative to indexed queries
• Might be chosen by a clever JP-QL engine
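A rough sketch of the idea, with Java streams standing in for a distributed Map/Reduce run (this is not Infinispan's Map/Reduce API). Instead of consulting an index, the query `select a from Animal a where a.size > 20` is answered by scanning every value: the map step emits something per entry, the reduce step combines the emissions. `Animal`, `namesLargerThan` and `countLargerThan` are hypothetical names.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class MapReduceQuerySketch {

    record Animal(String name, int size) {}

    // Map: emit the name of each matching animal. Reduce: gather names into a sorted list.
    static List<String> namesLargerThan(Map<String, Animal> store, int minSize) {
        return store.values().stream()
                .filter(a -> a.size() > minSize)
                .map(Animal::name)
                .sorted()
                .collect(Collectors.toList());
    }

    // Map: emit 1 for a match and 0 otherwise. Reduce: sum the emitted values.
    static int countLargerThan(Map<String, Animal> store, int minSize) {
        return store.values().stream()
                .map(a -> a.size() > minSize ? 1 : 0)
                .reduce(0, Integer::sum);
    }

    public static void main(String[] args) {
        Map<String, Animal> store = new HashMap<>();
        store.put("animal-1", new Animal("elephant", 300));
        store.put("animal-2", new Animal("mouse", 2));
        store.put("animal-3", new Animal("dog", 40));

        System.out.println(namesLargerThan(store, 20));
        System.out.println(countLargerThan(store, 20));
    }
}
```

The trade-off is the usual one: no index to maintain, but every query touches every entry, so this suits queries that an index cannot express or that run rarely.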
•Potential for additional query types
37. Why ?
Nothing new to learn for most common operations:
• JPA models
• JP-QL queries
Everything else is performance tuning, including:
• Move to/from different NoSQL implementations
• Move to/from a SQL implementation
• Move to/from clouds/laptops
• JPA is a well known standard: move to/from Hibernate :-)
38. Development state:
• Query via Hibernate Search
• Smart JP-QL parser is on github
• Available in master:
• EHCache
• Infinispan
• In development branches:
• MongoDB
• Voldemort