3. aim.
Learn to use Cosmos DB in
your serverless solutions.
Improve the performance of your
whole solution.
4. www.jankowskimichal.pl 4
way
of working
A bit of theory, then a lot of demos and practice.
I would encourage you to work together and exchange
your knowledge.
we should have fun
5. Steve Jobs, cofounder of Apple
Great things in business are
never done by one person.
They're done by a team of
people.
10. serverless characteristics
server abstraction
There are no server management
tasks.
productivity
Reduce tasks related to
infrastructure. You can focus
on development activities.
event driven
Function does not work when
there is no event triggering it. It
can also instantly scale up.
focus on features
And then you are able to focus
on business logic of your app.
microbilling
Pay only when there are events.
But beware of a DDoS attack on
your wallet.
faster time to market
All items mentioned together
allow you to reduce time to
market.
11. serverless in Azure
Azure Functions
An event-based serverless compute experience to
accelerate your development. Scale based on demand
and pay only for the resources you consume.
Event Grid
A single service for managing routing of all events
from any source to any destination. Designed for high
availability, consistent performance and dynamic
scale. Event Grid lets you focus on your app logic
rather than infrastructure.
12. serverless in Azure
Logic Apps
Provides a way to simplify and implement scalable
integrations and workflows in the cloud. It provides a
visual designer to model and automate your process
as a series of steps known as a workflow.
Flow
A service that allows you to create automated
workflows between your favourite applications and
services to synchronize files, get notifications, collect
data, and more.
13. serverless in Azure
Cosmos DB
Built from the ground up with global distribution
and horizontal scale at its core. It offers turnkey global
distribution with multi-master support across any
number of Azure regions by transparently scaling and
replicating your data wherever your users are.
15. microsoft noSQL database evolution
2010 - 2014: internal Microsoft DocumentDB service
• Office
• OneNote
• Xbox
• Part of Azure portal
21/08 2014: DocumentDB preview
• SQL grammar over schema-free JSON
• Tuneable throughput, indexing, consistency
• Server-side ACID transactions
16. microsoft noSQL database evolution
08/04 2015: DocumentDB GA
• ORDER BY
• String range queries
• Geospatial support
• Partitioning support by SDK
May 2016: …improvements
• Partitioned collections
• Geo-replication
• MongoDB wire protocol support
10/05 2017: Cosmos DB
17. azure cosmos db
limitless elastic scale around the globe
With Azure Cosmos DB, you pay only for the throughput and storage
you need. Azure Cosmos DB allows you to independently and elastically
scale storage and throughput at any time, anywhere across the globe,
making it a perfect ally for your serverless applications.
multi-model + multi-API
Only Azure Cosmos DB allows you to use key-value, graph, column-family,
and document data in one service. Azure Cosmos DB automatically
indexes all data and allows you to use your favourite API including SQL,
JavaScript, Gremlin, MongoDB, Apache® Cassandra, and Azure Table
Storage to access your data.
turnkey global distribution
Easily build globally distributed applications without the hassle of
complex, multiple-datacenter configurations. Designed as a globally
distributed database system, Azure Cosmos DB automatically replicates
your data to any number of regions of your choice for fast, responsive
access. Azure Cosmos DB supports transparent multi-homing and
guarantees 99.999% high availability.
18. azure cosmos db
industry-leading, enterprise-grade SLAs
Rest assured your apps are running on a "battle-tested" database
service built on world-class infrastructure. Azure Cosmos DB gives you
enterprise-grade security and compliance, and is the first and only
service to offer industry-leading comprehensive SLAs for 99.999% high
availability, latency at the 99th percentile, guaranteed throughput,
and consistency.
guaranteed low latency at 99th percentile
Serve read and write requests from the nearest region while
simultaneously distributing data across the globe. With its latch-free
and write-optimized database engine, Azure Cosmos DB guarantees less
than 10-ms latencies on reads and less than 15-ms latencies on
(indexed) writes at the 99th percentile.
multiple, well-defined consistency choices
Azure Cosmos DB offers five well-defined consistency levels (strong,
bounded staleness, consistent-prefix, session, and eventual) for an
intuitive programming model with low latency and high availability for
your planet-scale app.
20. Skype powers 1M searches per second over conversation data
Business need
• Provide search capabilities over TBs-PBs of Skype and Teams
conversations
• Fast ingestion with multiple writes, overlay group memberships
• Secure & compliant data storage with high privacy requirements
Key benefits
• Cosmos DB supports fast ingestion of message data from 1:1
communication, group chats
• Cosmos DB enables real-time query over message and group
conversations, with custom filters on when user enters/leaves thread
Data volumes: 44TB message data, 6TB user data, 1TB group data
[Diagram: Skype Ingestion service, Azure Cosmos DB (USERS, GROUPS, MESSAGES), Skype Query service]
Source: Building globally distributed applications with Azure Cosmos DB
21. Toyota drives connected car push forward with Azure Cosmos DB
Business need
• Need to ingest massive volumes of diagnostic data from vehicles
and take real-time actions as part of connected car platform
• Management and operations of database infrastructure to
handle exponential growth of data
Key benefits
• Cosmos DB can scale elastically without operational overhead of
MongoDB
• Perform fast queries over events to deliver recommended services,
safety notices to vehicles
• Perform staged migration via MongoDB APIs
Data volumes: 8TB vehicle telemetry, 250K Lexus cars
[Diagram: Azure Cosmos DB, Azure HDInsight Storm, Azure Storage (archival)]
Source: Building globally distributed applications with Azure Cosmos DB
22. Performance at massive scale allows millions to play mobile game
Business need
• Handle millions of players on Day 1 due to popularity of the
TV series
• Match-making of players for competitive and lag-free experience
• Provide new content weekly, and iterate on social functionality
Key benefits
• Cosmos DB provides elastic scalability for millions of users and
flexible schema to support social features and gameplay
• Global distribution allows for low latency for players spread worldwide
• Automatic indexing used to build real-time leaderboards
Figures: 1M peak active, #1 in iOS App Store, 1B daily queries
[Diagram: Azure Traffic Manager, Azure API Apps (game backend), Azure CDN, Azure Cosmos DB, Azure Functions, Azure Notification Hubs (push notifications), Azure Storage (game files)]
Source: Building globally distributed applications with Azure Cosmos DB
26. multiple APIs
• document: SQL API (json), MongoDB API (bson)
• column-family: Cassandra API
• graph: Gremlin API (graph traversal language)
• key-value: Table API (potential replacement for Azure Table Storage)
28. it is time for you
It is time for you to start your journey with this product.
Play with Cosmos DB
jankowskimichal.pl/cosmosdb-list1
30. data in Cosmos DB
Cosmos DB (document db)
• denormalized data
• referential integrity NOT enforced
• mixed data in collections
• flexible schema
• SQL-like language as well as JavaScript and others
Relational database
• normalised data
• referential integrity enforced by normalisation and relationships
• uniform data in tables
• schema is set
• SQL
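To make the comparison concrete, here is a minimal sketch in plain Python (no Cosmos DB SDK involved). The ToDo list shape and every field name are assumptions made up for illustration. It folds two normalised "tables" into one denormalized document of the kind a document database would store.

```python
# Relational style: two "tables" linked by a foreign key (hypothetical data).
todo_lists = [{"id": 1, "name": "Shopping"}]
todo_items = [
    {"id": 10, "list_id": 1, "title": "Milk"},
    {"id": 11, "list_id": 1, "title": "Bread"},
]


def to_document(todo_list, items):
    """Embed the child rows in the parent, producing one self-contained document."""
    return {
        "id": str(todo_list["id"]),
        "name": todo_list["name"],
        # No foreign key is kept: the relationship is expressed by nesting.
        "items": [
            {"title": item["title"]}
            for item in items
            if item["list_id"] == todo_list["id"]
        ],
    }


document = to_document(todo_lists[0], todo_items)
print(document)
```

Reading the list with its items is now a single document read instead of a join, at the price of keeping the embedded copies up to date yourself.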
33. database creation
we can:
• create a new one or use an existing one
• set up performance on database level – throughput
  • there is no limit for it, but we need to contact
  support when we need more than 1 000 000 RU/s
  • we need to confirm higher costs
34. request units
normalized measure of request processing cost
Item size   Reads / second   Writes / second   Request units
1 KB        500              100               1,000 RU/s
1 KB        500              500               3,000 RU/s
4 KB        500              100               1,350 RU/s
4 KB        500              500               4,150 RU/s
64 KB       500              100               9,800 RU/s
64 KB       500              500               29,000 RU/s
Table that shows how many request units to provision for items with three different sizes (1 KB, 4 KB, and 64 KB)
and at two different performance levels (500 reads/second + 100 writes/second and 500 reads/second + 500
writes/second). In this example, the data consistency is set to Session, and the indexing policy is set to None.
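The figures in the table can be reproduced with simple arithmetic. The sketch below assumes per-operation costs of 1 RU per read and 5 RU per write for 1 KB items, 1.3 and 7 for 4 KB, and 10 and 48 for 64 KB. These values are not stated on the slide; they are assumptions chosen because they match every row of the table.

```python
# Assumed RU cost of a single read / write per item size in KB.
# These numbers are illustrative assumptions that reproduce the table above.
COST_PER_OPERATION = {
    1: {"read": 1.0, "write": 5.0},
    4: {"read": 1.3, "write": 7.0},
    64: {"read": 10.0, "write": 48.0},
}


def required_throughput(item_size_kb, reads_per_second, writes_per_second):
    """Return the RU/s to provision for the given workload."""
    cost = COST_PER_OPERATION[item_size_kb]
    total = reads_per_second * cost["read"] + writes_per_second * cost["write"]
    return round(total)


for size in (1, 4, 64):
    for writes in (100, 500):
        ru = required_throughput(size, 500, writes)
        print(f"{size} KB, 500 reads/s, {writes} writes/s: {ru} RU/s")
```

For a real workload, measure the actual charge of your own requests rather than relying on these sample costs.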
• combines memory, CPU and IOPS into a single
currency
• the same request on the same data always
consumes the same number of RUs
• every response reports the cost of the
operation
• we are paying for some capacity:
  • when it is exhausted, our requests are
  throttled and have to be retried
  • we can increase or decrease the amount of
  throughput instantaneously
  • without capacity our service will stop
• a write operation costs more than a read
• pricing may differ per region
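What happens when the provisioned capacity is exhausted can be sketched as a retry loop. Nothing below is part of a real SDK: `ThrottledError`, `FakeContainer` and `call_with_retry` are hypothetical names, and the fake container merely simulates a service that rejects the first two calls with a suggested wait time. The official client libraries typically include comparable retry handling already.

```python
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a 'request rate too large' response."""

    def __init__(self, retry_after_ms):
        super().__init__("request rate too large")
        self.retry_after_ms = retry_after_ms


def call_with_retry(operation, max_attempts=5, sleep=time.sleep):
    """Run operation(); on throttling wait the suggested time and try again."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except ThrottledError as error:
            if attempt == max_attempts:
                raise  # give up after the last allowed attempt
            sleep(error.retry_after_ms / 1000)


class FakeContainer:
    """Simulated store: throttles the first two calls, then succeeds."""

    def __init__(self):
        self.calls = 0

    def read_item(self):
        self.calls += 1
        if self.calls <= 2:
            raise ThrottledError(retry_after_ms=10)
        return {"id": "1", "request_charge": 1.0}


container = FakeContainer()
item = call_with_retry(container.read_item)
print(item, "after", container.calls, "calls")
```

Frequent throttling is a sign that the provisioned throughput is too low or that the load is concentrated on too few partitions.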
37. partition key
partition key selection is the most important decision
Keep these limits in mind:
• A physical partition can store a maximum of 10GB of data
• A physical partition can serve at most 10,000 RU/s
[Diagram: Cosmos DB account > Database > Collections > Physical partition > Logical partition > Documents]
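The two limits give a quick lower bound on the number of physical partitions a collection needs. This is a back-of-the-envelope sketch that uses only the limits quoted on the slide; the function name is made up, and the service itself decides how data is actually split.

```python
import math

MAX_STORAGE_GB = 10         # storage limit per physical partition (from the slide)
MAX_THROUGHPUT_RU = 10_000  # throughput limit per physical partition (from the slide)


def min_physical_partitions(storage_gb, throughput_ru):
    """Lower bound on physical partitions needed for the given size and RU/s."""
    by_storage = math.ceil(storage_gb / MAX_STORAGE_GB)
    by_throughput = math.ceil(throughput_ru / MAX_THROUGHPUT_RU)
    # At least one partition always exists, even for an empty collection.
    return max(1, by_storage, by_throughput)


print(min_physical_partitions(storage_gb=25, throughput_ru=5_000))
print(min_physical_partitions(storage_gb=5, throughput_ru=35_000))
```

Whichever limit is hit first drives the partition count, so a small but very busy collection can still be spread over several partitions.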
43. how to do it right?
• The choice of partition key depends on the structure
of your data and on how you query it.
• It is important to choose a partition key property
that has a large number of distinct values.
• An ideal partition key is one that appears
frequently as a filter in your queries and has
sufficient cardinality to ensure your solution is
scalable.
• If the chosen partition key doesn't have many distinct
values, most requests will hit the same few
partitions, which may slow down performance.
• General:
• Do not be afraid of having too many partition
key values. In most cases, more partition key values
mean more scalability
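The effect of cardinality can be simulated in a few lines. The sketch assumes that documents are placed by hashing the partition key value; MD5 is used here only as a convenient stand-in, not as the hash the service uses, and the `userId` and `country` fields are invented for the example.

```python
import hashlib
from collections import Counter


def partition_for(key_value, partition_count):
    """Map a partition key value to a partition number with a stand-in hash."""
    digest = hashlib.md5(str(key_value).encode("utf-8")).hexdigest()
    return int(digest, 16) % partition_count


def distribution(key_values, partition_count):
    """Count how many documents land in each partition."""
    return Counter(partition_for(value, partition_count) for value in key_values)


# 9000 hypothetical documents: many distinct users, only three countries.
documents = [
    {"userId": f"user-{i}", "country": ("PL", "US", "DE")[i % 3]}
    for i in range(9000)
]

by_country = distribution((d["country"] for d in documents), 10)
by_user = distribution((d["userId"] for d in documents), 10)

print("partitions used with country as key:", len(by_country))
print("partitions used with userId as key:", len(by_user))
print("largest partition (country):", max(by_country.values()))
print("largest partition (userId):", max(by_user.values()))
```

With three distinct key values at most three of the ten partitions ever receive data, while the high-cardinality key spreads the same documents across all of them.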
45. it is time for you
1. Familiarise yourself with the presented queries.
2. Try to write queries against a bigger database.
Let's do something more serious
jankowskimichal.pl/cosmosdb-list2
48. from my perspective
a quite simple product with some limitations and great potential
we need to change how we think about data and databases
can be very expensive when it is badly designed or misused
50. serverless main benefits
cost
can be more cost-effective than
renting or purchasing a fixed quantity
of servers
operations / scalability
a serverless architecture means that
developers and operators do not need
to spend time setting up and tuning
autoscaling policies or systems
productivity
simplifying the task of back-end
software development
52. know the limits
The speed of light in vacuum is a universal physical constant important in many areas of
physics. Its exact value is 299,792,458 metres per second. The speed at which light propagates
through transparent materials, such as glass or air, is less than c.
54. geo replication.
• it has never been so easy
• you can replicate your
data to as many data
centers as you need
• you can do it with just a
few clicks
56. bounded staleness
Bounded staleness consistency guarantees that
the reads may lag behind writes by at most K
versions or prefixes of an item or t time-interval.
The cost of a read operation (in terms of RUs
consumed) with bounded staleness is higher than
session and eventual consistency, but the same as
strong consistency.
57. session
Session consistency is ideal for all scenarios
where a device or user session is involved, since it
guarantees monotonic reads, monotonic writes,
and read your own writes (RYW).
Session consistency provides predictable
consistency for a session, and maximum read
throughput while offering the lowest latency writes
and reads.
58. consistent prefix
Consistent prefix guarantees that in absence of
any further writes, the replicas within the group
eventually converge.
Consistent prefix guarantees that reads never
see out of order writes. If writes were performed
in the order A, B, C, then a client sees either A or
A,B, or A,B,C, but never out of order like A,C or
B,A,C.
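The A, B, C example can be expressed as a small check. This is an illustrative sketch in plain Python, not something the service exposes: it only tests whether what a client observed is a prefix of the order in which the writes were made.

```python
def is_consistent_prefix(write_order, observed):
    """True when observed equals the first len(observed) writes, in order."""
    observed = list(observed)
    return observed == list(write_order[: len(observed)])


writes = ["A", "B", "C"]
for seen in (["A"], ["A", "B"], ["A", "B", "C"], ["A", "C"], ["B", "A", "C"]):
    print(seen, "allowed" if is_consistent_prefix(writes, seen) else "not allowed")
```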
59. eventual
Eventual consistency guarantees that in
absence of any further writes, the replicas
within the group eventually converge.
It is the weakest form of consistency: a client
may get values that are older than the ones
it has seen before.
Provides the weakest read consistency but
offers the lowest latency for both reads and
writes with the lowest cost of a read operation.
60. 73% of Azure Cosmos DB tenants use session consistency
20% prefer bounded staleness
3% experiment with various consistency
levels initially before settling on
a specific consistency
61. regional failover
read – nothing to do
In the rare event of an Azure regional outage or data center
outage, Cosmos DB automatically triggers failovers of all Cosmos
DB accounts with a presence in the affected region.
write – automatic failover must be on
If the affected region is the current write region and automatic
failover is enabled for the Azure Cosmos DB account, then the
region is automatically marked as offline. Then, an alternative
region is promoted as the write region for the affected Azure
Cosmos DB account.
* manual failover
Can be used to implement a follow-the-clock model.
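The promotion step can be pictured as walking an ordered list of regions. The sketch below is a simplified model, not the service's actual algorithm: the function name and the region list are assumptions, and it only shows the idea of choosing the first region that is still online.

```python
def promote_write_region(failover_priorities, offline_regions):
    """Return the first region, in priority order, that is still online."""
    for region in failover_priorities:
        if region not in offline_regions:
            return region
    raise RuntimeError("no online region left to promote")


# Hypothetical account replicated to three regions, listed by priority.
priorities = ["West Europe", "North Europe", "East US"]

print(promote_write_region(priorities, offline_regions=set()))
print(promote_write_region(priorities, offline_regions={"West Europe"}))
```

A manual failover for a follow-the-clock setup amounts to reordering this list on a schedule so that the write region is close to the users who are currently active.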
62. data index customisations
scope - include or exclude documents and paths to and from the
index
index types
- hash - supports efficient equality and JOIN queries
- range - supports efficient equality queries, range queries (using
>, <, >=, <=, !=), and ORDER BY queries
- spatial - supports efficient spatial (within and distance) queries
precision - make trade-offs between index storage overhead and
query performance
index update mode - consistent, lazy, and none
indexing everything by default
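These options come together in an indexing policy document. The sketch below builds one as a Python dictionary in the shape used by the SQL API at the time of this talk, with explicit hash and range index kinds. The excluded `/description/*` path is a made-up example, and the precision values are illustrative rather than recommendations.

```python
import json

indexing_policy = {
    # index update mode: consistent, lazy or none
    "indexingMode": "consistent",
    # index every document unless told otherwise
    "automatic": True,
    "includedPaths": [
        {
            "path": "/*",  # scope: all paths ...
            "indexes": [
                # range index on numbers: range queries and ORDER BY
                {"kind": "Range", "dataType": "Number", "precision": -1},
                # hash index on strings: equality queries
                {"kind": "Hash", "dataType": "String", "precision": 3},
            ],
        }
    ],
    # ... except this hypothetical free-text property
    "excludedPaths": [{"path": "/description/*"}],
}

print(json.dumps(indexing_policy, indent=2))
```

Excluding large properties that are never filtered on reduces index storage and lowers the RU cost of writes.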
63. azure
functions.
By definition, synergy happens
when the interaction between
two elements produces an effect
greater than the individual
elements’ contribution.
65. it is time for you
1. Create a function that allows adding new ToDo
items to the database
2. Create a function that will list all ToDo items
assigned to a list
Let’s build API for the application
jankowskimichal.pl/cosmosdb-list3
68. vertex
Vertices denote discrete objects, such as a person, a place, or an event
Id: DEN
Label: airport
Properties:
• Code: DEN
• City: Denver
• Description: Denver International Airport
• Elevation: 5443
edge
Edges denote relationships between vertices
Id
Label: route
Properties:
• Distance: 542
69. sample graph
label: route
properties:
• distance: 2249
id: United States
label: country
properties:
• code: US
• name: United States
label: contains
label: contains
id: DEN
label: airport
properties:
• code: DEN
• city: Denver
• elevation: 5443
id: ATL
label: airport
properties:
• code: ATL
• city: Atlanta
• elevation: 1026
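The sample graph can be modelled in memory to see how vertices and edges relate. This is a plain Python sketch, not the Gremlin API. The edge directions (DEN to ATL for the route, country to airport for contains) are assumptions, since the slide does not show them, and `out_neighbours` is a hypothetical helper.

```python
# Vertices keyed by id, with label and properties taken from the sample graph.
vertices = {
    "DEN": {"label": "airport", "code": "DEN", "city": "Denver", "elevation": 5443},
    "ATL": {"label": "airport", "code": "ATL", "city": "Atlanta", "elevation": 1026},
    "United States": {"label": "country", "code": "US", "name": "United States"},
}

# Edges connect two vertex ids; the directions are assumed for illustration.
edges = [
    {"label": "route", "from": "DEN", "to": "ATL", "distance": 2249},
    {"label": "contains", "from": "United States", "to": "DEN"},
    {"label": "contains", "from": "United States", "to": "ATL"},
]


def out_neighbours(vertex_id, edge_label):
    """Ids of vertices reachable from vertex_id over edges with the given label.

    Roughly what the Gremlin traversal g.V(vertex_id).out(edge_label) returns.
    """
    return [
        edge["to"]
        for edge in edges
        if edge["from"] == vertex_id and edge["label"] == edge_label
    ]


print(out_neighbours("DEN", "route"))
print(out_neighbours("United States", "contains"))
print(vertices["DEN"]["city"])
```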
71. it is time for you
1. Get all the details about your favourite airport.
2. Check how you can get to your vacation location.
Maybe you should plan your holidays
jankowskimichal.pl/cosmosdb-list4
72. summary.
We gave a brief introduction
to serverless and its
connections to Cosmos DB.
We learned how to use the SQL
API and connect Cosmos DB
with Azure Functions.
You should know how to
optimise your environment.
We tried the graph API.
73. do you have any questions?
www.jankowskimichal.pl
@JankowskiMichal
mail@jankowskimichal.pl
github.com/MichalJankowskii