3. Couchbase NoSQL
Open Source Technology
• Vibrant and growing community focused
on distributed database technology
• Supports both key-value and document-
oriented use cases
• Community contributes via core server
and client development, connectors and
plugins, usability feedback, docs and
forums, etc.
• All project components available under the Apache License
• Packaged software available from
Couchbase, Inc.
– Community and enterprise editions
4. NoSQL + Big Data
• Operational database for web and mobile apps with high performance at scale
• Map-reduce against huge datasets to analyze and find insights and answers
5. Couchbase Server
Easy Scalability
• Grow cluster without application changes, without downtime, with a single click
Consistent High Performance
• Consistent sub-millisecond read and write response times with consistent high throughput
Always On 24x365
• No downtime for software upgrades, hardware maintenance, etc.
Flexible Data Model
• JSON document model with no fixed schema
6. Flexible Data Model
{
"ID": 1,
"FIRST": "Dipti",
"LAST": "Borkar",
"ZIP": "94040",
"CITY": "MV",
"STATE": "CA"
}
• No need to worry about the database when changing your
application
• Records can have different structures; there is no fixed
schema
• Allows painless data model changes for rapid application
development
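A minimal sketch of what "no fixed schema" can mean for application code. It assumes an in-memory dict as a stand-in for a bucket; the `save`/`load` helpers, the key names, and the second record (including its `EMAIL` field) are hypothetical and not part of any Couchbase SDK.

```python
import json

# In-memory stand-in for a document bucket: key -> JSON text.
# (Hypothetical; a real application would use a Couchbase SDK client.)
bucket = {}

def save(key, doc):
    """Serialize a dict to JSON and store it under the given key."""
    bucket[key] = json.dumps(doc)

def load(key):
    """Fetch a document by key and parse it back into a dict."""
    return json.loads(bucket[key])

# The record from the slide.
save("user::1", {"ID": 1, "FIRST": "Dipti", "LAST": "Borkar",
                 "ZIP": "94040", "CITY": "MV", "STATE": "CA"})

# A later record adds an EMAIL field and omits ZIP (made-up sample data).
save("user::2", {"ID": 2, "FIRST": "Ann", "LAST": "Lee",
                 "CITY": "SF", "STATE": "CA", "EMAIL": "ann@example.com"})

def display_name(doc):
    """Application code tolerates missing fields instead of relying on a schema."""
    email = doc.get("EMAIL", "no email on file")
    return f'{doc["FIRST"]} {doc["LAST"]} ({email})'

print(display_name(load("user::1")))
print(display_name(load("user::2")))
```

Both records live side by side without a migration step; the application decides how to handle a field that is absent.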
7. New in 2.0
• JSON support
• Indexing and querying
• Incremental Map Reduce
• Cross data center replication
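To illustrate the idea behind incremental map-reduce, here is a toy Python sketch. It is not how the server implements views (Couchbase view functions are written in JavaScript and run inside the server); the `CountIndex` class and the sample documents are assumptions made purely for illustration. The point is that when one document changes, only that document's contribution is recomputed.

```python
from collections import defaultdict

def map_doc(doc):
    """Map step: emit (key, value) pairs for one document."""
    if "STATE" in doc:
        yield doc["STATE"], 1

class CountIndex:
    """Toy index that keeps a reduced count per key and updates it per document."""

    def __init__(self):
        self.emitted = {}                 # doc id -> pairs emitted last time
        self.totals = defaultdict(int)    # reduced value per key

    def upsert(self, doc_id, doc):
        # Retract what this document contributed before, then apply the new pairs.
        for key, value in self.emitted.get(doc_id, []):
            self.totals[key] -= value
        pairs = list(map_doc(doc))
        for key, value in pairs:
            self.totals[key] += value
        self.emitted[doc_id] = pairs

    def query(self, key):
        """Return the current reduced count for a key."""
        return self.totals[key]

index = CountIndex()
index.upsert("user::1", {"FIRST": "Dipti", "STATE": "CA"})
index.upsert("user::2", {"FIRST": "Ann", "STATE": "NY"})
index.upsert("user::2", {"FIRST": "Ann", "STATE": "CA"})  # document updated
print(index.query("CA"), index.query("NY"))
```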
8. Additional Couchbase Server Features
• Built-in clustering – all nodes equal
• Data replication with auto-failover
• Zero-downtime maintenance
• Built-in managed cache
• Append-only storage layer
• Online compaction
• Monitoring and admin API & UI
• SDKs for a variety of languages
9. Common Use Cases
Social Gaming
• Couchbase stores player and game data
• Example customers include: Zynga, Tapjoy, Ubisoft, Tencent
Ad Targeting
• Couchbase stores user information for fast access
• Example customers include: AOL, Mediamind, Convertro
Session Store
• Couchbase Server as a key-value store
• Example customers include: Concur, Sabre
Mobile Apps
• Couchbase stores user info and app content
• Example customers include: Kobo, Playtika
User Profile Store
• Couchbase Server as a key-value store
• Example customers include: Tunewiki
Content & Metadata Store
• Couchbase document store with Elastic Search
• Example customers include: McGraw Hill
High Availability Cache
• Couchbase Server used as a cache tier replacement
• Example customers include: Orbitz
3rd Party Aggregation Data
• Couchbase stores social media and data feeds
• Example customers include: Sambacloud
11. Use Case: Content and Metadata Store
Types of Data
• Content metadata
• Content: articles, text
• Landing pages for website
• Digital content: eBooks, magazines, research material
Application Requirements
• Flexibility to store any kind of content
• Fast access to content metadata (most accessed objects) and content
• Full-text search across data set
• Scales horizontally as more content gets added to the system
Why NoSQL and Couchbase
• Fast access to metadata and content via object-managed cache
• JSON provides schema flexibility to store all types of content and
metadata
• Indexing and querying provides real-time analytics capabilities
across dataset
• Integration with ElasticSearch for full-text search
• Ease of scalability ensures that the data cluster can be grown
seamlessly as the amount of content and metadata grows
13. Use Case: Content and metadata store
Building a self-adapting, interactive learning portal with Couchbase
14. The Problem
As learning moves online in great numbers, there is a growing need to build interactive learning environments that:
• Scale to millions of learners
• Serve MHE as well as third-party content
- Including open content
• Support learning apps
• Self-adapt via usage data
15. The Challenge
Backend is an Interactive Content Delivery Cloud that must:
• Allow for elastic scaling under spike periods
• Catalog & deliver content from many sources
• Provide consistent low latency for metadata and stats access
• Provide full-text search support for content discovery
• Offer tunable content ranking & recommendation functions
Experimented with a combination of:
• XML Databases
• In-memory Data Grids
• SQL/MR Engines
• Enterprise Search Servers
17. The Learning Portal
• Designed and built as a
collaboration between MHE Labs
and Couchbase
• Serves as proof-of-concept and
testing harness for Couchbase +
ElasticSearch integration
• Available for download and
further development as open
source code
18. Techniques Used
• Document Modeling
• Metadata & Content Storage
• View Querying to support Content Browsing
• Elastic Search Integration (Full Text Search)
- Content Updated in near Real-Time
- Search Content Summaries
- Relevancy boosted based on User Preferences
• Real-Time Content Updates
• Event Logging for offline analysis
19. Couchbase 2.0 + Elasticsearch
1) Store full-text articles as well as document metadata for image, video and text content in Couchbase
2) Log user behavior to calculate user preference statistics (e.g. video > text)
3) Continuously accept updates from Couchbase with new content & stats
4) Combine user preference statistics with custom relevancy scoring to provide personalized search results
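One way the last step could look, as a hedged Python sketch: a base text-match score is scaled by the user's preference for a content type and by the document's popularity. The formula, the `popularity_weight` parameter, the field names, and the sample hits are all illustrative assumptions, not the portal's actual scoring or an Elasticsearch API.

```python
def personalized_score(text_score, content_type, views, user_views_by_type,
                       popularity_weight=0.1):
    """Scale a base text-match score by user preference and popularity.

    The formula is an illustrative assumption, not the portal's actual scoring.
    """
    total = sum(user_views_by_type.values())
    # Share of this user's views that went to this content type (0 if no history).
    preference = user_views_by_type.get(content_type, 0) / total if total else 0.0
    return text_score * (1.0 + preference) * (1.0 + popularity_weight * views)

def rank(hits, user_views_by_type):
    """Sort search hits by personalized score, best first."""
    return sorted(
        hits,
        key=lambda h: personalized_score(h["score"], h["type"], h["views"],
                                         user_views_by_type),
        reverse=True,
    )

hits = [
    {"id": "article-1", "type": "text", "score": 1.0, "views": 0},
    {"id": "video-1", "type": "video", "score": 1.0, "views": 0},
]
prefs = {"video": 3, "text": 1}   # this user views video more often than text
print([h["id"] for h in rank(hits, prefs)])
```

With equal text scores and no popularity difference, the video result ranks first for a user whose history favors video.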
20. Data Model
Content Metadata Bucket
• Stores content metadata for media objects and content for articles
• Includes tags, contributors, type information
• Includes pointer to the media
User Profiles Bucket
• Stores user view details per type
• Updated every time a user views a doc, with a running count
• To be used for customizing ES search results per user preference
Content Stats Bucket
• Stores content view details
• Updated every time a document is viewed
• To be used for boosting ES search results based on popularity
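The three buckets above can be sketched with plain Python dicts. This is only a rough model: the key formats, field names (`views_by_type`, `views`, `media_ref`), and the `record_view` helper are hypothetical, and a real implementation would go through a Couchbase SDK rather than local dicts.

```python
from collections import defaultdict

# Dict stand-ins for the three buckets (hypothetical key and field names).
content_metadata = {
    "content::42": {"type": "video", "tags": ["algebra"],
                    "contributors": ["mhe"], "media_ref": "media/42"},
}
user_profiles = defaultdict(lambda: {"views_by_type": defaultdict(int)})
content_stats = defaultdict(lambda: {"views": 0})

def record_view(user_id, content_id):
    """Update both running counters when a user views a document."""
    content_type = content_metadata[content_id]["type"]
    user_profiles[user_id]["views_by_type"][content_type] += 1
    content_stats[content_id]["views"] += 1

record_view("user::7", "content::42")
record_view("user::7", "content::42")
print(dict(user_profiles["user::7"]["views_by_type"]), content_stats["content::42"])
```

The per-type counts feed per-user search customization, while the per-document count feeds popularity boosting.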
21. Architecture
[Architecture diagram. Labels: App Server (Couchbase Ruby SDK; ES queries over HTTP) · External Media Store · Couchbase Server Cluster (MR Views on each node; View Query) · Elastic Search Cluster (TS Query; Data Refs) · Couchbase ES Transport via XDCR]
22. Use Case: Social Gaming
Social and Mobile Gaming
Types of Data
• User account information
• User game profile info
• User’s social graph
• State of the game
• Player badges and stats
Application Requirements
• Ability to support rapid growth
• Fast response times for awesome user experience
• Game uptime – 24x7x365
• Easy to update apps with new features
Why NoSQL and Couchbase
• Scalability ensures that games are ready to handle the millions of
users that come with viral growth.
• High performance guarantees players are never left waiting to
make their next move.
• Always-on operations means zero interruption to game play (and
revenue)
• Flexible data model means games can be developed rapidly and
updated easily with new features
24. Use Case: Social gaming
Building a social game with an
awesome user experience that
can scale to millions of players
25. The Problem
Social gaming is all about the experience
Application needs
- User-centric data (read: key-value access)
- Scalability
- Easy and simple backend
26. The Challenge
Backend must be a platform for multiple games
• Must be scalable
• Highly available
• Extreme performance (latency and throughput)
• Cost effective
• Operationally easy to maintain
Experimented with several databases:
• Couchbase
• dbShards
• MongoDB
• MySQL Cluster
27. Evaluation considerations
Products compared: Couchbase, MongoDB, dbShards, MySQL Cluster (NDB)
Criteria:
Sharding strategy
Replication
Failover support
Scalability
Customized data support
System compatibility
Coding effort
Performance
Protocol
Upgrade difficulty
Data persisting method
Map Reduce / Join
SQL compatible
Licensing Price
Bulk price
Management / monitor tool
Hardware requirement
Supported OS
Operation knowledge
Operation training
Operation difficulty
Developer company size
Market penetration
Support
Successful use cases
29. Use Case: High availability caching
Types of Data
• Application objects
• Popular search query results
• Session information
• Heavily accessed web landing pages
Application Requirements
• Consistently low response times for document / key lookups
• High-availability - 24x7x365
• Operationally easy to migrate / upgrade / maintain with app online
• Replacement for entire caching tier
Why NoSQL and Couchbase
• Sub-millisecond latency with consistently high read / write
throughput
• Always-on operations even for database upgrades and
maintenance with zero down time
• memcached compatibility for easy migration to Couchbase
without any application changes
• High availability and disaster recovery with intra-cluster and
cross-cluster replication (XDCR)
30. Use Case: Ad Targeting
Types of Data
• User profile: preferences and psychographic data
• Ad serving history by user
• Ad buying history by advertiser
• Ad serving history by advertiser
Application Requirements
• High performance to meet limited ad serving budget; time allowance is typically <40 msec
• Scalability to handle hundreds of millions of user profiles and rapidly growing amount of data
• 24x7x365 availability to avoid ad revenue loss
Why NoSQL and Couchbase
• Sub-millisecond reads/writes means less time is needed for data
access, more time is available for ad logic processing, and more
highly optimized ads will be served
• Ease of scalability ensures that the data cluster can be grown
seamlessly as the amount of user and ad data grows
• Always-on operations = always-on revenue. You will never miss
the opportunity to serve an ad because of downtime.
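A small worked example of the time-budget argument above. Only the 40 ms allowance comes from the slide; the number of lookups per request and the two per-read latencies are assumed values chosen for illustration, not measurements.

```python
def logic_time_ms(budget_ms, lookups, lookup_latency_ms):
    """Time left for ad selection logic after sequential data lookups."""
    return budget_ms - lookups * lookup_latency_ms

BUDGET_MS = 40   # time allowance from the slide
LOOKUPS = 4      # assumed number of profile/history reads per request

for latency in (0.5, 5.0):   # assumed per-read latencies in milliseconds
    print(latency, logic_time_ms(BUDGET_MS, LOOKUPS, latency))
```

Under these assumptions, sub-millisecond reads leave 38 ms for ad logic, while 5 ms reads leave only 20 ms.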