11. {
"name": "SmugMug",
"permalink": "smugmug",
"homepage_url": "http://www.smugmug.com",
"blog_url": "http://blogs.smugmug.com/",
"category_code": "photo_video",
"products": [
{
"name": "SmugMug",
"permalink": "smugmug"
}
],
"offices": [
{
"description": "",
"address1": "67 E. Evelyn Ave",
"address2": "",
"zip_code": "94041",
"city": "Mountain View",
"state_code": "CA",
"country_code": "USA",
"latitude": 37.390056,
"longitude": -122.067692
}
]
}
Perfect for these documents
Schema-agnostic JSON store for hierarchical and de-normalized data at scale
12. Azure DocumentDB
Millions of RPS
Many TBs of data
Transparent partitioning
<10 ms reads and <15 ms writes at P99
Low-latency access around the globe!
Automatic indexing
Easy-to-learn query grammar
Multi-record transactions
Blazing fast, planet scale NoSQL service
99.99% SLAs for availability, latency, and throughput
18. Item | Author | Pages | Language
Harry Potter and the Sorcerer’s Stone | J.K. Rowling | 309 | English
Game of Thrones: A Song of Ice and Fire | George R.R. Martin | 864 | English
19. Item | Author | Pages | Language
Harry Potter and the Sorcerer’s Stone | J.K. Rowling | 309 | English
Game of Thrones: A Song of Ice and Fire | George R.R. Martin | 864 | English
Lenovo Thinkpad X1 Carbon | ??? | ??? | ???
22. Item | Author | Pages | Language | Processor | Memory | Storage
Harry Potter and the Sorcerer’s Stone | J.K. Rowling | 309 | English | ??? | ??? | ???
Game of Thrones: A Song of Ice and Fire | George R.R. Martin | 864 | English | ??? | ??? | ???
Lenovo Thinkpad X1 Carbon | ??? | ??? | ??? | Core i7 3.3 GHz | 8 GB | 256 GB SSD
23. Item | Author | Pages | Language
Harry Potter and the Sorcerer’s Stone | J.K. Rowling | 309 | English
Game of Thrones: A Song of Ice and Fire | George R.R. Martin | 864 | English

Item | CPU | Memory | Storage
Lenovo Thinkpad X1 Carbon | Core i7 3.3 GHz | 8 GB | 256 GB SSD
24. ProductId | Item
1 | Harry Potter and the Sorcerer’s Stone
2 | Game of Thrones: A Song of Ice and Fire
3 | Lenovo Thinkpad X1 Carbon

ProductId | Attribute | Value
1 | Author | J.K. Rowling
1 | Pages | 309
…
2 | Author | George R.R. Martin
2 | Pages | 864
…
3 | Processor | Core i7 3.3 GHz
3 | Memory | 8 GB
…
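The two tables on this slide store every attribute as its own row, so reading one product back means regrouping rows by ProductId. A minimal Python sketch of that regrouping, which also shows the document shape the deck argues for. The function name `to_documents` and the output field name `id` are assumptions made for illustration; the rows are those listed on the slide, with the elided "…" rows left out.

```python
from collections import defaultdict

# Rows from the slide's two tables (the elided "…" rows are left out).
products = [
    (1, "Harry Potter and the Sorcerer’s Stone"),
    (2, "Game of Thrones: A Song of Ice and Fire"),
    (3, "Lenovo Thinkpad X1 Carbon"),
]
attributes = [
    (1, "Author", "J.K. Rowling"),
    (1, "Pages", "309"),
    (2, "Author", "George R.R. Martin"),
    (2, "Pages", "864"),
    (3, "Processor", "Core i7 3.3 GHz"),
    (3, "Memory", "8 GB"),
]


def to_documents(products, attributes):
    """Fold attribute rows into one self-contained dict per product."""
    by_product = defaultdict(dict)
    for product_id, name, value in attributes:
        by_product[product_id][name] = value
    return [
        {"id": product_id, "Item": item, **by_product[product_id]}
        for product_id, item in products
    ]


docs = to_documents(products, attributes)
```

Each resulting dict carries only the attributes that apply to it, so the laptop has no Author or Pages field and the books have no Processor or Memory field.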
31. The Challenge
Scale with the expectation of millions of users on Day 1
Deliver real-time responsiveness for a lag-free gaming experience
Highly competitive: high scores and global leaderboards are critical
More Users, More Problems
33. The Results
#1 among free apps in the Apple App Store during launch week
>1M downloads
~1B queries per day
99th-percentile queries served in under 10 ms
38. Why is this such a hard problem?
Caches: scoreboard keeps updating…
SQL database: need to shard; schema and index management; loss of relational benefits
Azure Table Storage: secondary indexes; latency; throughput
39. Planet-Scale NoSQL
Horizontal scaling for storage and throughput
High performance with SSDs and automatic indexing
Operating on a global scale
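Horizontal scaling depends on spreading documents across partitions in a stable way. A minimal Python sketch of hash partitioning follows; it is an illustration of the general technique, not DocumentDB's actual placement logic. The function name `partition_for`, the choice of SHA-256, and the key format `player-N` are all assumptions.

```python
import hashlib


def partition_for(partition_key, partition_count):
    """Map a partition key to a partition index with a stable hash."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    # Use the first 8 bytes as an unsigned integer, then wrap into range.
    return int.from_bytes(digest[:8], "big") % partition_count


# Hypothetical player ids spread over 4 partitions.
placements = {f"player-{n}": partition_for(f"player-{n}", 4) for n in range(100)}
```

The same key always maps to the same partition, so a read for one player touches one partition, while different players spread across all of them.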
44. Request Unit (RU) is the normalized currency
Predictable Performance
A Request Unit normalizes % memory, % IOPS and % CPU
Each replica gets a fixed budget of Request Units
[Diagram: requests against resources and resource sets (documents, SQL, sprocs, args)]
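The fixed per-replica budget can be pictured as a counter that is charged on every request and refilled each second. A minimal Python sketch under stated assumptions: the class name `RequestUnitBudget`, the one-second window, and the charges `READ_COST` and `QUERY_COST` are hypothetical values chosen for illustration, not figures from the service.

```python
class RequestUnitBudget:
    """Fixed per-second budget of Request Units (RUs) for one replica."""

    def __init__(self, rus_per_second):
        self.rus_per_second = rus_per_second
        self.window = None  # the one-second window currently being charged
        self.spent = 0.0

    def try_charge(self, cost, now):
        """Charge `cost` RUs at time `now` (seconds); False means throttled."""
        window = int(now)
        if window != self.window:  # a new second starts with a full budget
            self.window = window
            self.spent = 0.0
        if self.spent + cost > self.rus_per_second:
            return False
        self.spent += cost
        return True


# Hypothetical charges, chosen only for illustration.
READ_COST, QUERY_COST = 1.0, 5.0

budget = RequestUnitBudget(rus_per_second=10)
results = [
    budget.try_charge(QUERY_COST, now=0.1),  # 5 of 10 RUs spent
    budget.try_charge(QUERY_COST, now=0.5),  # 10 of 10 RUs spent
    budget.try_charge(READ_COST, now=0.9),   # would exceed the budget
    budget.try_charge(READ_COST, now=1.2),   # next second, budget refilled
]
```

Because every operation is priced in the same unit, a caller can predict when the third request will be rejected without knowing how memory, IOPS and CPU are split underneath.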
50. Globally Distributed
• Not just for disaster recovery: DocumentDB is unreasonably highly available
• Replicate data across any number of regions of your choice
• Low-latency access to your data around the globe
• Dynamically configure your write and read regions
Azure DocumentDB gives you the ability to cheat the speed of light!
51. Strong, Bounded Staleness, Session, Eventual
Left to right: relaxed consistency => better performance and availability
(Strong consistency, high latency on the left; eventual consistency, low latency on the right)

Consistency Level | Strong | Bounded Staleness | Session | Eventual
Total global order | Yes | Yes, outside of the “staleness window” | No, partial “session” order | No
Consistent prefix guarantee | Yes | Yes | Yes | Yes
Monotonic reads | Yes | Yes, across regions outside of the staleness window and within a region all the time | Yes, for the given session | No
Monotonic writes | Yes | Yes | Yes | Yes
Read your writes | Yes | Yes (in the write region) | Yes | No

[Chart: Observed Distribution. Values: 27%, 3%, 54%, 16%. Legend: BoundedStaleness, Eventual, Session]
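The "Read your writes" row is the easiest one to see in code. The toy Python model below contrasts an eventual read, which may hit a lagging replica, with a session read that carries a token from the last write. It is a sketch of the idea only, not of how DocumentDB implements these levels; the class names `Replica` and `Store` and the use of the log length as a session token are assumptions.

```python
class Replica:
    """A replica that applies writes in commit order but may lag behind."""

    def __init__(self):
        self.log = []  # applied (key, value) writes, oldest first

    def version(self):
        return len(self.log)

    def read(self, key):
        value = None
        for k, v in self.log:  # the last write to the key wins
            if k == key:
                value = v
        return value


class Store:
    def __init__(self):
        self.primary = Replica()
        self.secondary = Replica()

    def write(self, key, value):
        self.primary.log.append((key, value))
        return self.primary.version()  # acts as the session token

    def replicate(self):
        # Ship whatever the secondary has not applied yet.
        self.secondary.log.extend(self.primary.log[self.secondary.version():])

    def read_eventual(self, key):
        return self.secondary.read(key)  # may be stale

    def read_session(self, key, token):
        # Use the lagging replica only if it has caught up to the token.
        caught_up = self.secondary.version() >= token
        replica = self.secondary if caught_up else self.primary
        return replica.read(key)


store = Store()
token = store.write("score", 42)
stale = store.read_eventual("score")        # secondary has not replicated yet
fresh = store.read_session("score", token)  # the session sees its own write
store.replicate()
converged = store.read_eventual("score")
```

The eventual read is cheap but can miss the write until replication runs; the session read pays for a token check and never misses it.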
52. App defined regional preferences
// Connect directly over TCP.
ConnectionPolicy docClientConnectionPolicy = new ConnectionPolicy
{
    ConnectionMode = ConnectionMode.Direct,
    ConnectionProtocol = Protocol.Tcp
};
// Preferred regions, in priority order.
docClientConnectionPolicy.PreferredLocations.Add(LocationNames.EastUS2);
docClientConnectionPolicy.PreferredLocations.Add(LocationNames.WestUS);
// The key below is the sample value from the slide; substitute your own account key.
docClient = new DocumentClient(
    new Uri("https://myglobaldb.documents.azure.com:443"),
    "PARvqUuBw2QTO4rRXr6d1GnLCR7VinERcYrBQvDRh6EDTJLOHtZxgjTS4pv8nQv2Lg1QQLBLfO6TVziOZKvYow==",
    docClientConnectionPolicy);
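The effect of an ordered preference list can be sketched in a few lines of Python. This illustrates the selection rule only and is not the SDK's code. The function name `pick_read_region`, the `fallback` argument, and the region display strings are assumptions.

```python
def pick_read_region(preferred, available, fallback):
    """Return the first preferred region that is available, else `fallback`."""
    for region in preferred:
        if region in available:
            return region
    return fallback


# Same order as the PreferredLocations calls above (display names assumed).
preferred = ["East US 2", "West US"]

normal = pick_read_region(
    preferred, {"East US 2", "West US", "North Europe"}, "North Europe")
during_outage = pick_read_region(
    preferred, {"West US", "North Europe"}, "North Europe")
none_preferred = pick_read_region(
    preferred, {"North Europe"}, "North Europe")
```

Listing the closest region first keeps reads local in the normal case, and the second entry takes over without an application change if the first region becomes unavailable.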
54. Automatic Indexing
• Index is a union of all the document trees
[Diagram: common structure shared by the document trees]

Terms | Postings List/Values
$/location/0/ | 1, 2
location/0/country/ | 1, 2
location/0/city/ | 1, 2
0/country/Germany | 1, 2
1/country/France | 2
… | …
0/city/Moscow | 2
0/dealers/0 | 2

http://aka.ms/docdbvldb
No need to define secondary indices / schema hints!
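The terms-and-postings table can be reproduced in miniature by walking each document tree and recording which documents contain each path. The Python sketch below uses full paths from the root rather than the shorter path fragments shown in the table, and the two sample documents are hypothetical, since the slide's source documents are not in the text. The function names `index_terms` and `build_index` are also assumptions.

```python
from collections import defaultdict


def index_terms(value, prefix="$"):
    """Yield one path term per node of a JSON-like tree."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield f"{prefix}/{key}"
            yield from index_terms(child, f"{prefix}/{key}")
    elif isinstance(value, list):
        for position, child in enumerate(value):
            yield f"{prefix}/{position}"
            yield from index_terms(child, f"{prefix}/{position}")
    else:
        yield f"{prefix}/{value}"  # leaf: the value closes the path


def build_index(documents):
    """Map each path term to the sorted ids of documents containing it."""
    postings = defaultdict(set)
    for doc_id, body in documents.items():
        for term in index_terms(body):
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


# Hypothetical documents, keyed by document id.
documents = {
    1: {"location": [{"country": "Germany", "city": "Berlin"}]},
    2: {"location": [{"country": "Germany", "city": "Bonn"},
                     {"country": "France", "city": "Paris"}]},
}
index = build_index(documents)
```

A query such as "country is Germany at position 0" becomes a single lookup of `$/location/0/country/Germany`, which returns both document ids without any schema or index definition having been supplied.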
55. Index policies
Customize index management, including storage overhead, throughput and query consistency:
• range, hash and spatial indexes
• included and excluded paths
• indexing mode: consistent or lazy
• index precision
• online, in-place index transformations
{
"indexingMode": "consistent",
"automatic": true,
"includedPaths": [
{
"path": "/*",
"indexes": [
{
"kind": "Range",
"dataType": "Number",
"precision": -1
},
{
"kind": "Hash",
"dataType": "String",
"precision": 3
},
{
"kind": "Spatial",
"dataType": "Point"
}
]
}
],
"excludedPaths": []
}
56. SQL Query Grammar
-- Nested lookup against index
SELECT Books.Author
FROM Books
WHERE Books.Author.Name = "Leo Tolstoy"

-- Transformation, Filters, Array access
SELECT { Name: Books.Title, Author: Books.Author.Name }
FROM Books
WHERE Books.Price > 10 AND Books.Languages[0] = "English"

-- Joins, User Defined Functions (UDF)
SELECT CalculateRegionalTax(Books.Price, "USA", "WA")
FROM Books
JOIN LanguagesArr IN Books.Languages
WHERE LanguagesArr.Language = "Russian"
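The JOIN in the third query is an intra-document join: each book is paired with the elements of its own Languages array, never with other books. A Python sketch of that evaluation order over in-memory data follows. The sample books and the 10% rate inside `calculate_regional_tax` are invented stand-ins, since the deck does not show the UDF body or the documents.

```python
def calculate_regional_tax(price, country, region):
    """Hypothetical stand-in for the UDF; the 10% rate is made up."""
    return round(price * 0.10, 2)


# Hypothetical documents shaped to fit the third query.
books = [
    {"Title": "War and Peace", "Price": 20.0,
     "Languages": [{"Language": "Russian"}, {"Language": "English"}]},
    {"Title": "Emma", "Price": 12.0,
     "Languages": [{"Language": "English"}]},
]


def query(books):
    results = []
    for book in books:                         # FROM Books
        for lang in book["Languages"]:         # JOIN LanguagesArr IN Books.Languages
            if lang["Language"] == "Russian":  # WHERE LanguagesArr.Language = "Russian"
                results.append(
                    calculate_regional_tax(book["Price"], "USA", "WA"))
    return results


taxes = query(books)
```

Only the first book has a Russian entry, so the projection runs once and yields a single value.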
68. No magic bullet
Think about how your data is going to be written and read, and model accordingly
{
"id": "1",
"firstName": "Thomas",
"lastName": "Andersen",
"countOfBooks": 3,
"books": [1, 2, 3],
"images": [
{"thumbnail": "http://....png"},
{"profile": "http://....png"}
]
}
{
"id": 1,
"name": "DocumentDB 101",
"authors": [
{"id": 1, "name": "Thomas Andersen", "thumbnail": "http://....png"},
{"id": 2, "name": "William Wakefield", "thumbnail": "http://....png"}
]
}
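The two documents above mix both styles: the author keeps book ids plus a precomputed count, while the book embeds a copy of the author's name. A minimal Python sketch of a write path that keeps the two sides in step, assuming in-memory dicts in place of a collection. The helper name `add_book` is hypothetical and the thumbnail field is omitted for brevity.

```python
author = {"id": "1", "firstName": "Thomas", "lastName": "Andersen",
          "countOfBooks": 0, "books": []}
books = {}


def add_book(author, books, book_id, name):
    """Store a book with an embedded author summary and update the author."""
    books[book_id] = {
        "id": book_id,
        "name": name,
        # Denormalized copy: enough to render a book page in one read.
        "authors": [{"id": author["id"],
                     "name": f'{author["firstName"]} {author["lastName"]}'}],
    }
    # Normalized side: the author keeps ids only, plus a precomputed count.
    author["books"].append(book_id)
    author["countOfBooks"] = len(author["books"])


add_book(author, books, 1, "DocumentDB 101")
```

The trade-off is the one the slide names: reads of either document need no second lookup, but a write now has to touch both, and a change to the author's name would have to be copied into every book.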
Editor's notes
Image licensed under the Creative Commons Attribution-Share Alike 2.0 Generic license.
http://commons.wikimedia.org/wiki/File:Crying-girl.jpg
Well nested, multiple properties and values
Not word documents
Query over heterogeneous documents without defining schema or managing indexes
Query arbitrary paths, properties and values without specifying secondary indexes or indexing hints
Execute queries with consistent results in the face of sustained writes
Query through fluent language integration, including LINQ for .NET developers and a “document-oriented” SQL grammar for traditional SQL developers
Extend query execution through application-supplied JavaScript UDFs
Supported SQL features include: predicates, iterations (arrays), sub-queries, logical operators, UDFs, intra-document JOINs, JSON transforms
Stored Procedures and Triggers
Familiar programming model constructs for executing application logic
Registered as named, URI addressable, durable resources
Scoped to a DocumentDB collection
JavaScript as a procedural language to express business logic
Language integration
A JavaScript throw statement aborts the transaction
Execution
JavaScript runtime is hosted on each replica
Pre-compiled on registration
The entire procedure is wrapped in an implicit database transaction
Fully resource governed and sandboxed execution
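The implicit transaction described in these notes means a procedure either applies all of its changes or none of them. A Python sketch of that all-or-nothing behaviour follows, with a raised exception playing the role of the JavaScript throw. Real stored procedures are written in JavaScript and run inside the service; the helper `run_in_transaction`, the `transfer` procedure and the sample data are assumptions for illustration.

```python
import copy


def run_in_transaction(collection, procedure, *args):
    """Run `procedure` on a working copy; commit only if it returns normally."""
    working = copy.deepcopy(collection)
    result = procedure(working, *args)  # an exception here skips the commit
    collection.clear()
    collection.update(working)
    return result


def transfer(docs, src, dst, points):
    """Hypothetical procedure: move points between two player documents."""
    docs[src]["score"] -= points
    if docs[src]["score"] < 0:
        # Plays the role of a JavaScript throw inside a stored procedure.
        raise ValueError("insufficient score")
    docs[dst]["score"] += points


players = {"a": {"score": 10}, "b": {"score": 0}}
run_in_transaction(players, transfer, "a", "b", 4)  # commits

aborted = False
try:
    run_in_transaction(players, transfer, "a", "b", 100)
except ValueError:
    aborted = True  # the partial debit was discarded with the working copy
```

After the failed call the stored scores are exactly what the first call left behind, even though the procedure had already debited one player before raising.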
Source: http://en.wikipedia.org/wiki/Denormalization
In computing, denormalization is the process of attempting to optimize the read performance of a database by adding redundant data or by grouping data. In some cases, denormalization is a means of addressing performance or scalability in relational database software.
With DocumentDB, you can also choose a hybrid model that mimics the advantages of normalization.