DocumentDB is a powerful NoSQL solution. It is fully managed and provides elastic scale, high performance, global distribution, and a flexible data model. If you are looking for a scaled-out OLTP solution for workloads that are too much for SQL Server to handle (e.g., millions of transactions per second), and/or you will be using JSON documents, DocumentDB is the answer.
2. About Me
Microsoft, Big Data Evangelist
In IT for 30 years, worked on many BI and DW projects
Worked as desktop/web/database developer, DBA, BI and DW architect and developer, MDM architect, PDW/APS developer
Been perm employee, contractor, consultant, business owner
Presenter at PASS Business Analytics Conference, PASS Summit, Enterprise Data World conference
Certifications: MCSE: Data Platform, Business Intelligence; MS: Architecting Microsoft Azure Solutions, Design and Implement Big Data Analytics Solutions, Design and Implement Cloud Data Platform Solutions
Blog at JamesSerra.com
Former SQL Server MVP
Author of book “Reporting with Microsoft SQL Server 2012”
4. What is NoSQL?
A database solution designed to compensate for the technical limitations of SQL
Choose the store that best fits your needs
5. Traditional approach: relational stores
Data is stored in tables that comprise:
• Schemas
• Columns
• Rows
Chappell & Associates. “Understanding NoSQL on Microsoft Azure.” 2014. http://www.davidchappell.com/writing/white_papers/Azure-NoSQL-Technologies-v2.0--Chappell.pdf.
Image based on: PricewaterhouseCoopers. “Data models in NoSQL and NewSQL databases.” 2015
http://www.pwc.com/us/en/technology-forecast/2015/remapping-database-landscape/features/assets/data-models-production.pdf
6. NoSQL approach: various types of stores
NoSQL databases fall into four categories of stores: key-value, wide-column, graph, and document.
Azure DocumentDB
• Uses all but the graph category
• Includes some key-value and column-store capabilities
7. Key-value stores
Key-value stores offer high speed through the least-complicated data model: anything can be stored as a value, as long as each value is associated with a key or name.
[Figure: table of keys and their values]
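As a minimal sketch of the key-value model, the following Python snippet uses a plain dictionary as a stand-in for a key-value store. The keys, values, and the `get` helper are made up for illustration; this is not DocumentDB API code.

```python
# A plain dict stands in for a key-value store (illustration only).
store = {}

# Any value can be stored, as long as it is associated with a key.
store["user:42:name"] = "Ian"
store["user:42:cart"] = ["Forgotten Bridges", "Mythical Bridges"]
store["session:abc"] = b"\x00\x01opaque-bytes"


def get(key, default=None):
    """Look up a value by its key; the store never interprets the value."""
    return store.get(key, default)


print(get("user:42:name"))
print(get("missing-key", "not found"))
```

The store can only answer "what is the value for this key?", which is what keeps the model simple and fast.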
8. Wide-column stores
Wide-column stores are fast and can be almost as simple as key-value stores. They
include a primary key, an optional secondary key, and anything stored as a value.
[Figure: rows addressed by a primary key and an optional secondary key, each holding values. Keys and values can be sparse or numerous.]
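The wide-column layout can be sketched as a two-level map: primary key, then secondary key, then value. This is a conceptual Python sketch only; the `put` and `get` helpers and the sample rows are hypothetical.

```python
from collections import defaultdict

# table[primary_key][secondary_key] -> value (hypothetical sample data).
table = defaultdict(dict)


def put(primary_key, secondary_key, value):
    """Store a value under a (primary key, secondary key) pair."""
    table[primary_key][secondary_key] = value


def get(primary_key, secondary_key, default=None):
    """Read a value; missing rows or columns fall back to the default."""
    return table.get(primary_key, {}).get(secondary_key, default)


# Rows are sparse: each row stores only the columns it actually has.
put("user:1", "name", "Ian")
put("user:1", "country", "Canada")
put("user:2", "name", "Alan")

print(sorted(table["user:1"]))
print(get("user:2", "country", "unknown"))
```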
9. Graph databases
[Figure: example graph with nodes labeled "Name: Ian" and "Name: Alan", nodes labeled "Title: Forgotten Bridges" and "Title: Mythical Bridges", and "Purchased" relationships dated 03/02/2011, 09/09/2011, and 05/07/2011]
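A graph store keeps both the nodes and the relationships between them as first-class data. The Python sketch below models the figure's labels as nodes and edges. It assumes the titles are books, and the pairing of people, titles, and purchase dates is an assumption: the slide text does not say who purchased what, so the edges here are made up for illustration.

```python
# Nodes: two people and two titles, as labeled in the figure.
nodes = {
    "p1": {"type": "person", "name": "Ian"},
    "p2": {"type": "person", "name": "Alan"},
    "b1": {"type": "book", "title": "Forgotten Bridges"},
    "b2": {"type": "book", "title": "Mythical Bridges"},
}

# Edges: (from, relationship, to, properties).
# ASSUMPTION: these person/title/date pairings are hypothetical.
edges = [
    ("p1", "PURCHASED", "b1", {"date": "03/02/2011"}),
    ("p1", "PURCHASED", "b2", {"date": "09/09/2011"}),
    ("p2", "PURCHASED", "b2", {"date": "05/07/2011"}),
]


def purchased_by(person_id):
    """Follow outgoing PURCHASED edges and return the titles reached."""
    return [nodes[dst]["title"] for src, rel, dst, _ in edges
            if src == person_id and rel == "PURCHASED"]


def buyers_of(book_id):
    """Follow PURCHASED edges backwards and return the buyers' names."""
    return [nodes[src]["name"] for src, rel, dst, _ in edges
            if dst == book_id and rel == "PURCHASED"]


print(purchased_by("p1"))
print(buyers_of("b2"))
```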
10. Document stores
Document stores contain data objects that are inherently hierarchical, tree-like structures (most notably JavaScript Object Notation [JSON] or Extensible Markup Language [XML]).
Note that these are not Microsoft Word documents!
11. NewSQL: another variation
Relational NewSQL stores are designed for web-scale applications, but they still require up-front schemas, joins, and table management that can be labor intensive.
12. Why NoSQL evolved
Drivers
13. SQL and NoSQL: each has its place
Fully featured RDBMS
Transactional processing
Rich query
Managed as a service
Elastic scale
Internet-accessible HTTP/REST
Schema-free data model
Arbitrary data formats
14. Azure DocumentDB
Perfect for cloud architects and developers who need an enterprise-ready NoSQL document database
JSON documents:
Document 1:
{
  "name": "John",
  "country": "Canada",
  "age": 43,
  "lastUse": "March 4, 2014"
}
Document 2:
{
  "name": "Eva",
  "country": "Germany",
  "age": 25
}
Document 3:
{
  "name": "Lou",
  "country": "Australia",
  "age": 51,
  "firstUse": "May 8, 2013"
}
Document 4:
{
  "docCount": 3,
  "last": "May 1, 2014"
}
A NoSQL document database-as-a-service, fully managed by Azure
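The four sample documents share no fixed schema, yet they can sit side by side and be filtered on any property. The Python sketch below illustrates that idea with an in-memory list; the `where` helper is hypothetical and is not how DocumentDB executes queries.

```python
import json

# The four sample documents from the slide; they share no fixed schema.
documents = [json.loads(text) for text in (
    '{"name": "John", "country": "Canada", "age": 43, "lastUse": "March 4, 2014"}',
    '{"name": "Eva", "country": "Germany", "age": 25}',
    '{"name": "Lou", "country": "Australia", "age": 51, "firstUse": "May 8, 2013"}',
    '{"docCount": 3, "last": "May 1, 2014"}',
)]


def where(docs, predicate):
    """Return the documents for which the predicate holds."""
    return [d for d in docs if predicate(d)]


# Documents without an "age" property are skipped rather than raising.
over_40 = where(documents, lambda d: d.get("age", 0) > 40)
print([d["name"] for d in over_40])

# The set of properties differs from document to document.
print(sorted({key for d in documents for key in d}))
```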
15. What documents?
Not Word documents
Perfect for: schema-agnostic JSON store for hierarchical and denormalized data at scale
{
  "name": "SmugMug",
  "permalink": "smugmug",
  "homepage_url": "http://www.smugmug.com",
  "blog_url": "http://blogs.smugmug.com/",
  "category_code": "photo_video",
  "products": [
    {
      "name": "SmugMug",
      "permalink": "smugmug"
    }
  ],
  "offices": [
    {
      "description": "",
      "address1": "67 E. Evelyn Ave",
      "address2": "",
      "zip_code": "94041",
      "city": "Mountain View",
      "state_code": "CA",
      "country_code": "USA",
      "latitude": 37.390056,
      "longitude": -122.067692
    }
  ]
}
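Because a JSON document is a tree, a nested value is reached by walking a path of property names and array positions. The following Python sketch shows this on a trimmed copy of the SmugMug document; `get_path` is a hypothetical helper written for this illustration.

```python
import json

# Trimmed copy of the sample document (only a few properties kept).
doc = json.loads("""
{
  "name": "SmugMug",
  "products": [{"name": "SmugMug", "permalink": "smugmug"}],
  "offices": [{"city": "Mountain View", "state_code": "CA",
               "latitude": 37.390056, "longitude": -122.067692}]
}
""")


def get_path(node, path, default=None):
    """Walk a path made of keys (for objects) and indexes (for arrays)."""
    for step in path:
        try:
            node = node[step]
        except (KeyError, IndexError, TypeError):
            return default
    return node


print(get_path(doc, ["offices", 0, "city"]))
print(get_path(doc, ["offices", 1, "city"], "n/a"))
```

Denormalized data such as the embedded `offices` array is read in one lookup of the document, with no join against a separate table.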
16. Azure DocumentDB details
Native support for JavaScript, SQL query, and transactions over JSON documents
Ideal for apps designed for the cloud when the following are high priorities:
Reliable and predictable performance
• Tunable consistency
• Elastic scale
Rapid development
• Build with familiar tools: REST, JSON, JavaScript
Rich query and transactions over JSON data
• Query JSON data with no secondary indices
17. Top Features
Auto-scaling/sharding
• Improved scalability and reliability due to distribution of large data sets across multiple machines
Automatic indexing
• All document properties are available for queries
• Frees you from relying on schemas or secondary indexes
SQL query language
• Make use of SQL experience and .NET LINQ
Managed service
• Spin up on demand with no setup
• Availability guarantee of 99.95%
• Linear price curve without virtual-machine step functions
• Integration with Azure HDInsight and Azure Search
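To illustrate what "all document properties are available for queries" can mean, the Python sketch below indexes every property path of every document as it is added, so no index has to be declared up front. This is a conceptual model only, not a description of how DocumentDB implements its index; `flatten`, `build_index`, and the sample data are hypothetical.

```python
from collections import defaultdict


def flatten(node, prefix=""):
    """Yield (path, value) pairs for every scalar in a nested document."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from flatten(value, f"{prefix}/{key}")
    elif isinstance(node, list):
        for item in node:
            yield from flatten(item, f"{prefix}/[]")
    else:
        yield prefix, node


def build_index(docs):
    """Map each (path, value) pair to the ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        for path, value in flatten(doc):
            index[(path, value)].add(doc_id)
    return index


docs = {
    "1": {"name": "John", "country": "Canada", "age": 43},
    "2": {"name": "Eva", "country": "Germany", "age": 25},
    "3": {"name": "SmugMug", "offices": [{"city": "Mountain View"}]},
}
index = build_index(docs)

# Any property can be looked up without declaring an index first.
print(index[("/country", "Germany")])
print(index[("/offices/[]/city", "Mountain View")])
```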
18. Top Features (continued)
Greater consistency control
• Four consistency levels provide more options for consistency, availability, and performance requirements
Atomicity, Consistency, Isolation, and Durability (ACID) transaction control
• Simpler programming model (compared to state variables)
• Use JavaScript for insert, update, and delete actions
Standards-based open API with RESTful HTTP
• Uses the JSON standard, so no mapping of Binary JSON (BSON) to JSON is needed
Granular access rights
• Controls access to documents and attachments within collections
19. Monitor an account
• View performance metrics for a DocumentDB account
• Customize performance metric views for a DocumentDB account
• Create side-by-side performance metric charts
• View usage metrics for a DocumentDB account
• Set up performance metric alerts for a DocumentDB account
20. Today’s modern apps
• Produce and consume data at a staggering rate
• Require instantaneous response times to match user expectations
• Are developed iteratively with many versions supported concurrently
• Are developed with continuously evolving data models
• Are increasingly complex
• Experience unpredictable and explosive growth
21. Well-suited for web and mobile apps
• Catalog data
• Preferences and state
• Event store
• User-generated content
• Data exchange
22. Azure DocumentDB at Microsoft
User data store
• More than 450 million unique users
• Store 20 TB of JSON document data
• Under 15 millisecond (ms) writes and single-digit ms reads
• Store for 40+ app/device combinations
• Available globally to serve all markets
24. Azure DocumentDB basics
Resource model
• Entities addressable by logical Uniform Resource Identifier (URI)
• Partitioned for scale out
• Replicated for high availability
• Entities represented as JSON
• Accounts scale out by moving a slider
Interaction model
• RESTful interaction over HTTPS
• HTTPS and TCP connectivity
• Standard HTTP verbs and semantics
Development
• .NET, Node.js, Python, Java, and JavaScript clients
• SQL for query expression, .NET LINQ
• JavaScript for server-side app logic
[Figure: resource hierarchy. An Azure DocumentDB account contains databases. A database contains users (with permissions) and collections. A collection contains documents (with attachments) and JavaScript stored procedures, triggers, and user-defined functions.]
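Each entity in the resource hierarchy is addressable by a logical URI built from its parents. The Python sketch below composes such links using the `dbs`, `colls`, `docs`, `sprocs`, `users`, and `permissions` path segments, as I understand DocumentDB's id-based addressing; treat the exact segments as an assumption to verify against the REST documentation. The database, collection, and document names are hypothetical.

```python
def database_link(db):
    """Link to a database under the account."""
    return f"dbs/{db}"


def collection_link(db, coll):
    """Link to a collection inside a database."""
    return f"{database_link(db)}/colls/{coll}"


def document_link(db, coll, doc):
    """Link to a document inside a collection."""
    return f"{collection_link(db, coll)}/docs/{doc}"


def sproc_link(db, coll, sproc):
    """Link to a stored procedure registered on a collection."""
    return f"{collection_link(db, coll)}/sprocs/{sproc}"


def permission_link(db, user, perm):
    """Link to a permission that belongs to a database user."""
    return f"{database_link(db)}/users/{user}/permissions/{perm}"


print(document_link("crm", "customers", "john"))
print(permission_link("crm", "webapp", "read-customers"))
```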
25. Azure DocumentDB collections
• Collections != tables
• Unit of partitioning
• Transaction boundary
• No enforced schema, flexible
• Documents that are queried or updated together stay in one collection
• Elasticity to 10 GB
• Request units (RUs) evenly distributed across partitions
26. …
Elastic collections
• Collection != single partition
• Partition count dynamic
• Each partition (key) holds up to 10 GB
• Online splits and merges with full availability
• Request units (RUs) evenly distributed across partitions
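The Python sketch below shows two ideas from this slide: a partition key deterministically selects a partition, and provisioned throughput is divided evenly across partitions. It uses simple hash-modulo placement as an assumption for illustration. The service's actual placement scheme is not described in the slides, and modulo placement would not by itself support online splits. The key values and RU figures are hypothetical.

```python
import hashlib


def partition_for(partition_key, partition_count):
    """Map a partition key to a partition index with a stable hash."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


def throughput_per_partition(total_rus, partition_count):
    """Provisioned RUs are split evenly across the partitions."""
    return total_rus / partition_count


for key in ("Canada", "Germany", "Australia"):
    print(key, "->", partition_for(key, 4))

print(throughput_per_partition(10000, 4))
```

One consequence of the even split: if most requests target a single partition key, that partition can exhaust its share of RUs while the others sit idle, so a key with many distinct, evenly used values spreads load better.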
27. Rich query over JSON data
Build modern, scalable apps with robust transactional querying and data processing on JSON documents
• Query on JSON data without specifying secondary indices or constructing views
• Familiar SQL-based query language
• Native JavaScript transactional processing
28. JavaScript transactions
Transactionally process multiple documents with application-defined stored procedures and triggers
• JavaScript as the procedural language
• Language integrated
• Execution wrapped in an implicit transaction
• Preregistered and scoped to a collection
• Performed with ACID guarantees
• Triggers invoked as pre- or post-operations
[Figure: JavaScript (JS) stored procedures and triggers]
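The phrase "execution wrapped in an implicit transaction" means that either every write a procedure makes is kept or none is. The sketch below simulates that behavior in Python with an in-memory collection. Real DocumentDB stored procedures are written in JavaScript and run inside the service; the `Collection` class and the `transfer` procedure here are hypothetical stand-ins.

```python
import copy


class Collection:
    """Toy in-memory collection; not the DocumentDB API."""

    def __init__(self, docs=None):
        self.docs = docs or {}

    def execute(self, procedure, *args):
        """Run a procedure atomically: keep all its writes, or none."""
        snapshot = copy.deepcopy(self.docs)
        try:
            return procedure(self.docs, *args)
        except Exception:
            self.docs = snapshot  # roll back every write the procedure made
            raise


def transfer(docs, src, dst, amount):
    """Hypothetical procedure that updates two documents together."""
    docs[src]["balance"] -= amount
    docs[dst]["balance"] += amount
    if docs[src]["balance"] < 0:
        raise ValueError("insufficient funds")


accounts = Collection({"a": {"balance": 100}, "b": {"balance": 20}})
accounts.execute(transfer, "a", "b", 30)       # succeeds and is kept

try:
    accounts.execute(transfer, "a", "b", 500)  # fails after both writes
except ValueError:
    pass

print(accounts.docs)
```

The second call fails only after both documents were modified, yet neither change survives, which is the all-or-nothing guarantee the slide describes.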
29. Reliable and predictable performance
Fast, predictable performance
Tunable consistency
Elastic scale
Defined throughput levels that scale linearly with application needs
Azure DocumentDB was born in the cloud to achieve fast, predictable performance, with reserved resources to deliver on throughput needs. It delivers reliable, tunable consistency so that you can increase performance based on application needs.
30. Four consistency levels
Strong, Bounded Staleness, Session, Eventual
Lower the consistency level on read operations:
    // Read one document with Eventual consistency for this request only.
    Document myDoc = await client.ReadDocumentAsync(
        documentLink,
        new RequestOptions { ConsistencyLevel = ConsistencyLevel.Eventual });
31. Consistency levels enable guarantees
Choose your consistency level and make predictable trade-offs between consistency, availability, and performance
• Strong: full data consistency; reads always see the latest committed write
• Bounded Staleness: total order of propagation of writes
• Session: monotonic reads (on explicit read requests) and writes
• Eventual: lowest latency for reads and writes
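The trade-off between the four levels can be made concrete with a toy model: one value, a write log, and a replica that lags behind the log. The Python sketch below is a deliberate simplification written for this illustration; the class, the level names as strings, the `max_lag` parameter, and the use of a version number as a session token are all assumptions, not DocumentDB's actual protocol.

```python
class ReplicatedRegister:
    """Toy model: one value, a write log, and a lagging replica."""

    def __init__(self):
        self.log = [None]          # log[i] is the value at version i
        self.replica_version = 0   # how far the replica has caught up

    def write(self, value):
        """Append a write and return its version (used as a session token)."""
        self.log.append(value)
        return len(self.log) - 1

    def replicate(self, steps=1):
        """Let the replica catch up by a number of versions."""
        latest = len(self.log) - 1
        self.replica_version = min(latest, self.replica_version + steps)

    def read(self, level, session_token=0, max_lag=1):
        latest = len(self.log) - 1
        if level == "strong":
            version = latest                                  # newest write
        elif level == "bounded_staleness":
            version = max(self.replica_version, latest - max_lag)
        elif level == "session":
            version = max(self.replica_version, session_token)
        elif level == "eventual":
            version = self.replica_version                    # may be stale
        else:
            raise ValueError(f"unknown consistency level: {level}")
        return self.log[version]


reg = ReplicatedRegister()
reg.write("v1")
reg.write("v2")
token = reg.write("v3")   # the replica has seen none of these writes yet
reg.replicate()           # the replica catches up to v1 only

for level in ("strong", "bounded_staleness", "session", "eventual"):
    print(level, "->", reg.read(level, session_token=token))
```

In this model a session read returns the writer's own latest write, while a client without that session token may still see older data, and an eventual read returns whatever the replica currently holds.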
32. Security model
Azure DocumentDB is designed to be secure with:
• Master key
• Access control on resources
• User operations
• Permission operations
• Code execution
33. Rapid development
Easy to start and fully managed
Enterprise-grade Azure platform
Build with familiar tools: REST, JSON, and JavaScript
Reduce development friction and complexity when building new business-class applications by using familiar tools and industry-standard platforms. Combine Azure DocumentDB with a portfolio of complementary cloud services on the Azure platform, such as the Azure HDInsight Connector and Azure Search Indexer.
35. Azure DocumentDB service summary
Unique among NoSQL stores:
• Developed for the cloud and for delivery as a service
• Truly queryable JSON store
• Transactional processing through language-integrated JavaScript
• Predictable performance and tunable consistency
36. Development scenarios
Consider Azure DocumentDB when you need:
• New web and mobile cloud-based applications
• Rapid development and high scalability
• Query and processing of user- and device-generated data
• More query and processing support for your key-value stores
• An alternative to running a document store in virtual machines
• A managed service model
37. Build your first Azure DocumentDB app today
Get started
• Sign up for Azure DocumentDB at http://aka.ms/docdbstart
• Access and configure your account at http://portal.azure.com
• Download an SDK from http://aka.ms/docdbsdks, and then build a sample at http://aka.ms/docdbsample
Get support
• Schedule a 1:1 chat directly with the Azure DocumentDB engineering team at askdocdb.com
Give feedback
• Ask questions through the forum at http://aka.ms/docdbforum
• Suggest an idea and vote to support other ideas for Azure DocumentDB at http://aka.ms/docdbideas
• On Twitter @documentdb
39. Learn more
David Chappell NoSQL overview paper on Infopedia
http://www.davidchappell.com/writing/white_papers/Azure-NoSQL-Technologies-v2.0--Chappell.pdf
Seven Databases in Seven Weeks: A Guide to Modern Databases and the NoSQL Movement [book]
http://www.pdfiles.com/pdf/files/English/Databases/Seven_Databases_In_Seven_Weeks.pdf
Replicated Data Consistency Explained Through Baseball [paper]
http://research.microsoft.com/apps/pubs/default.aspx?id=206913
40. Q & A
James Serra, Big Data Evangelist
Email me at: JamesSerra3@gmail.com
Follow me at: @JamesSerra
Link to me at: www.linkedin.com/in/JamesSerra
Visit my blog at: JamesSerra.com (where this slide deck will be posted)