Link: https://youtu.be/C4rlepOPk5o
https://go.dok.community/slack
https://dok.community/
From the DoK Day EU 2022 (https://youtu.be/Xi-h4XNd5tE)
Serverless promises to change the way we consume software. It can let us pay only for what we use and can help drive operational costs down to the minimum resources necessary.
Architecting for serverless requires a unique look at app logic and the way it is deployed. It takes a combination of the logical and physical worlds. An architectural pattern has emerged in which we can scale ephemeral compute separately from the services that need to persist.
We use Kubernetes to deliver exactly this: a “serverless” experience that is driven and enabled by compute pods and storage pods. We have also used our experience running thousands of database clusters on Kubernetes to automate the operational expertise of managing a distributed database.
In this talk, we will take a deep dive into the architecture of our application and share:
* A definition and outline of the challenges of serverless
* How we reworked our logic for a serverless approach
* How we use Kubernetes to gain serverless autoscaling
-----
Jim is a recovering developer turned evangelist who loves useful, cool, cutting-edge tech. He loves to translate and distill complex concepts into compelling, simpler explanations that broader communities can consume. He is an advocate for developers and an active participant in several open source communities.
2. Before we get started
Jim Walker
• Principal Product Evangelist
• @jaymce
This session is INTERMEDIATE
• I am not a database “expert”
• I am curious and love tech
• I think this stuff is cool and these concepts define the future
• GOAL: a high-level context of the concept
6. Serverless as a computing paradigm…
1. Little to no manual server management
2. Automatic, elastic app/service scale
3. Built-in resilience and inherently fault tolerant service
4. Always available and instant access
5. Consumption-based rating or billing mechanism
6. Survive any failure domain, including regions
7. Geographic Scale and latencies
8. Infrastructure-less
Over the past 5 or so years, the serverless execution model has been most commonly used for:
● Serverless compute
● Serverless functions
● Serverless app development
8. So, how do you make a database app serverless?
Language
Execution
Storage
Language: SQL
Distributed Execution
Replication & Distribution
Most Databases
Storage
Find the divide
9. Single cluster - one logical database
Pulling it together, into serverless and beyond
[Diagram: CockroachDB Nodes 1, 2, and 3; each node contains SQL Layer, Execution, Storage & Replication, and Distribution.]
10. Serverless decouples execution and storage
[Diagram: a Virtual Cluster of CockroachDB SQL Pods 1, 2, and 3 (each: SQL Layer, Execution, Distribution), labeled “SQL”, and a Shared CockroachDB Storage ONLY Cluster of three Storage & Replication nodes, labeled “Storage”.]
11. This allows us to scale storage and execution separately
[Diagram: three virtual clusters and one Shared CockroachDB Storage ONLY Cluster of three Storage & Replication nodes. Virtual Cluster - Tenant 1: SQL Pods 1, 2, and 3. Virtual Cluster 2: Tenant 2 SQL Pod 1. Virtual Cluster - Tenant 3: SQL Pods 1, 2, and 3. Each SQL pod contains SQL Layer, Execution, and Distribution.]
12. App pods are spread across AZs, and the storage cluster is spread to optimize for resilience
[Diagram: Availability Zones 1, 2, and 3; three Storage & Replication nodes; Tenant 1 SQL Pods 1, 2, and 3 (each: SQL Layer, Execution, Distribution).]
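The spreading idea on this slide can be sketched as a simple placement rule. This is only an illustration under stated assumptions: the zone names and the `pick_zone` helper are hypothetical, and in Kubernetes this kind of placement is normally expressed declaratively (for example with topology spread constraints) rather than in application code.

```python
from collections import Counter

ZONES = ["az-1", "az-2", "az-3"]  # illustrative zone names, not from the talk


def pick_zone(existing_pod_zones):
    """Pick the zone currently holding the fewest of a tenant's pods."""
    load = Counter({zone: 0 for zone in ZONES})
    load.update(existing_pod_zones)
    # min() keeps the first zone on ties, so placement is deterministic.
    return min(ZONES, key=lambda zone: load[zone])


# Place four pods for one tenant, one at a time.
placement = []
for _ in range(4):
    placement.append(pick_zone(placement))
```

With three zones, the first three pods land in three different zones, so losing any single zone leaves the tenant with running SQL pods.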
13. App pods are spread across AZs, and the storage cluster is spread to optimize for resilience
[Diagram: Availability Zones 1, 2, and 3; three Storage & Replication nodes; Tenant 1 SQL Pods 1, 2, and 3 and Tenant 2 SQL Pod 1 (each: SQL Layer, Execution, Distribution).]
14. App pods are spread across AZs, and the storage cluster is spread to optimize for resilience
[Diagram: Availability Zones 1, 2, and 3; three Storage & Replication nodes; Tenant 1 SQL Pods 1, 2, and 3, Tenant 2 SQL Pod 1, and Tenant 3 SQL Pods 1, 2, and 3 (each: SQL Layer, Execution, Distribution).]
15. We introduce proxy pods that route tenant queries to their SQL pods
[Diagram: Availability Zones 1, 2, and 3; three Storage & Replication nodes; a LOAD BALANCER and four Proxy pods; Tenant 1 SQL Pods 1, 2, and 3, Tenant 2 SQL Pod 1, and Tenant 3 SQL Pods 1, 2, and 3 (each: SQL Layer, Execution, Distribution).]
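The routing step on this slide can be sketched roughly as follows. This is a minimal illustration, not CockroachDB's actual proxy: the `TenantRouter` class, the pod names, and the round-robin policy are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import count


class TenantRouter:
    """Hypothetical proxy-side table mapping a tenant to its SQL pods."""

    def __init__(self):
        self._pods = defaultdict(list)   # tenant id -> list of pod names
        self._next = defaultdict(count)  # tenant id -> round-robin counter

    def register(self, tenant, pod):
        self._pods[tenant].append(pod)

    def route(self, tenant):
        """Return the pod that should serve the tenant's next connection."""
        pods = self._pods.get(tenant)
        if not pods:
            # No running pod: in the architecture described in this deck,
            # this is the point where a dormant tenant would be woken up.
            raise LookupError(f"tenant {tenant!r} has no SQL pods")
        return pods[next(self._next[tenant]) % len(pods)]


router = TenantRouter()
for pod in ("t1-sql-1", "t1-sql-2", "t1-sql-3"):
    router.register("tenant-1", pod)
router.register("tenant-2", "t2-sql-1")

# Four consecutive connections for tenant 1 cycle through its three pods.
picked = [router.route("tenant-1") for _ in range(4)]
```

The point of the sketch is the isolation property: a connection for one tenant is only ever handed to a pod registered for that tenant.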
16. Serverless Scale
Data volume scale is accomplished in the storage cluster and uses native distribution and range splitting
Transactional scale is quite different. We need to:
● Accommodate elastic usage
● Deal with spikes in traffic
● Spin down to dormant usage
The autoscaler monitors CPU load on each SQL pod in the cluster and calculates the number of SQL pods per tenant based on two metrics:
• Average CPU usage over the last 5 minutes
• Peak CPU usage during the last 5 minutes
This is all accomplished using ephemeral SQL pods
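The slide names the two inputs to the autoscaler but not the formula, so the following is only a sketch of how such a calculation might look. The function name, the headroom multipliers, the per-pod vCPU size, and the pod cap are all hypothetical values chosen for illustration.

```python
import math


def target_sql_pods(avg_cpu, peak_cpu, pod_vcpus=4.0,
                    avg_headroom=4.0, peak_headroom=1.33, max_pods=10):
    """Estimate how many SQL pods a tenant needs.

    avg_cpu / peak_cpu: tenant-wide vCPUs used over the last 5 minutes.
    Every other parameter is a made-up tuning knob for this sketch.
    """
    if peak_cpu <= 0:
        return 0  # no traffic: the tenant can go dormant (scale to zero)
    # Provision for whichever is larger: sustained load or the recent spike.
    needed_vcpus = max(avg_cpu * avg_headroom, peak_cpu * peak_headroom)
    return min(max_pods, max(1, math.ceil(needed_vcpus / pod_vcpus)))
```

Using both metrics lets the average drive steady-state capacity while the peak keeps a short spike from being averaged away.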
17. Autoscaling your application
[Diagram: Availability Zones 1, 2, and 3; three Storage & Replication nodes; a LOAD BALANCER and four Proxy pods; Tenant 1 capacity: SQL Pods 1, 2, and 3 (each: SQL Layer, Execution, Distribution); pools of unassigned “hot” pods.]
18. Accommodates peaks in traffic by adding SQL pods
[Diagram: Availability Zones 1, 2, and 3; three Storage & Replication nodes; a LOAD BALANCER and four Proxy pods; Tenant 1 capacity: SQL Pods 1, 2, 3, and 4 (each: SQL Layer, Execution, Distribution); pools of unassigned “hot” pods.]
19. ...and returns to “steady state” after the event
[Diagram: Availability Zones 1, 2, and 3; three Storage & Replication nodes; a LOAD BALANCER and four Proxy pods; Tenant 1 capacity: SQL Pods 1, 2, and 3 (each: SQL Layer, Execution, Distribution); pools of unassigned “hot” pods.]
20. And when there is no traffic, the tenant goes dormant and the pods spin down to zero
[Diagram: Availability Zones 1, 2, and 3; three Storage & Replication nodes; a LOAD BALANCER and four Proxy pods; Tenant 1 capacity: no SQL pods; pools of unassigned “hot” pods.]
21. When it needs to “wake up”, it spins up a “hot” pod
[Diagram: Availability Zones 1, 2, and 3; three Storage & Replication nodes; a LOAD BALANCER and four Proxy pods; Tenant 1 capacity: SQL Pod 4 (SQL Layer, Execution, Distribution); pools of unassigned “hot” pods.]
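The scale-to-zero and wake-up behavior on the last two slides can be sketched as a pool of pre-started pods. This is a hedged illustration only: the `WarmPool` class, the pod names, and the choice to backfill the pool after each assignment are assumptions, not details confirmed by the talk.

```python
from collections import deque


class WarmPool:
    """Hypothetical pool of pre-started, unassigned ("hot") SQL pods."""

    def __init__(self, size):
        self._next_id = 0
        self.unassigned = deque()
        self.assigned = {}  # pod name -> tenant id
        for _ in range(size):
            self._start_pod()

    def _start_pod(self):
        self._next_id += 1
        self.unassigned.append(f"pod-{self._next_id}")

    def wake(self, tenant):
        """Assign a hot pod to a tenant, then backfill the pool."""
        pod = self.unassigned.popleft()  # raises IndexError if pool is empty
        self.assigned[pod] = tenant
        self._start_pod()  # assumption: keep the pool at its target size
        return pod

    def suspend(self, tenant):
        """Scale a tenant to zero by discarding all of its pods."""
        for pod in [p for p, t in self.assigned.items() if t == tenant]:
            del self.assigned[pod]


pool = WarmPool(size=3)
pod = pool.wake("tenant-1")  # the dormant tenant receives its first query
```

Because the pod is already running when the tenant wakes up, the only work left at that moment is the assignment itself, which is what makes resuming from zero fast.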
22. Serverless gives developers what they want
• Start Instantly: Spin up a CockroachDB Serverless cluster in seconds, for free and without a credit card
• No Operations: No need to manage, upgrade, or operate the database; just connect and code
• Auto Scale: Use a SQL database that scales storage and transaction volumes up and down to meet demand
• Eliminate Downtime: CockroachDB Serverless replicates your data across AZs to ensure it is always available
23. Serverless gives developers what they want
Just Build
Relational models and SQL
Guaranteed correct, low-latency transactions
OPERATIONS
NO
YES Code against an API
Wire compatible w/ PostgreSQL
DEVELOPER
FOCUS?
CODE
YES
GENEROUS free tier
For us, it’s a database!
27. Create a CockroachDB instance now…
Serverless (beta)
A single-region instance with a generous free tier and
capped pay-for-usage beyond the free limits
No credit card required!
Free, every month up to:
• 5GB Storage
• 250M Request Units
Dedicated
A full-featured, single-tenant instance. Deploy instantly
on AWS or GCP in a single region or across multiple
regions with 99.99% guaranteed uptime.
Let our SRE team provision and manage your
database.
www.cockroachlabs.com
28. What if I run out of request units (RUs)?
CockroachDB Serverless doesn’t turn off once you reach your spend limit. Instead, it
ensures you will have at least 100 RUs/second for the remainder of the billing period (month)
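One way to picture a guaranteed floor of 100 RUs/second is a token bucket that refills at that rate. The sketch below is an assumption about how such throttling could work, not a description of CockroachDB's implementation; the `RUThrottle` class and the `burst` cap are hypothetical.

```python
class RUThrottle:
    """Token bucket that refills at a guaranteed floor rate (RUs/second)."""

    def __init__(self, floor_rate=100.0, burst=100.0):
        self.floor_rate = floor_rate
        self.burst = burst  # hypothetical cap on saved-up RUs
        self.tokens = burst
        self.last = 0.0

    def try_consume(self, cost, now):
        """Return True if a request costing `cost` RUs may run at time `now`."""
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.burst, self.tokens + elapsed * self.floor_rate)
        if cost <= self.tokens:
            self.tokens -= cost
            return True
        return False


throttle = RUThrottle()
results = [
    throttle.try_consume(80, now=0.0),  # allowed: the bucket starts full
    throttle.try_consume(80, now=0.1),  # rejected: only about 30 RUs available
    throttle.try_consume(80, now=1.0),  # allowed: refilled up to the cap
]
```

The effect is that a throttled workload slows down rather than stopping: requests keep flowing as long as they fit within the floor rate.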
What is it good for?
In its current state, we believe CockroachDB Serverless will be good for:
● side projects
● smaller apps
● low code apps
● learning SQL
Gaining familiarity with
SQL or CockroachDB
29. SINGLE LOGICAL
DATABASE
CockroachDB: Architected For the Cloud
A fundamentally better database for your developers and applications
CockroachDB is a distributed, relational database that can be used for mundane and high-value workloads
It is a database cluster composed of nodes that appear as a single logical database
It gives your developers familiar, standard SQL
USER: Ashley
> INSERT (Kimball)
INTO CUSTOMER;
30. SINGLE LOGICAL
DATABASE
CockroachDB: Architected For the Cloud
A fundamentally better database for your developers and applications
Scale the database by simply adding more nodes. CockroachDB auto-balances to incorporate the new resource. No manual work is required.
● Easy scale for increases in database size
● Every node accepts reads and writes, so you also scale transactional volume
USER: Ashley
> INSERT (Kimball)
INTO CUSTOMER;
USER: Lindsay
> SELECT * FROM
ORDERS;
31. SINGLE LOGICAL
DATABASE
CockroachDB: Architected For the Cloud
A fundamentally better database for your developers and applications
REGION 1
US-WEST
REGION 1
US-WEST
REGION 1
US-WEST
Scale even further across regions
and even clouds, yet still deliver a
single logical database
It excels when deployed across
multiple data centers.
USER: Ashley
> INSERT (Kimball)
INTO CUSTOMER;
USER: Lindsay
> SELECT * FROM
ORDERS;
USER: Peter
> UPDATE (Kimball)
FNAME=”Spencer”;
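The user statements on these slides are shorthand rather than valid SQL. As a rough sketch, they could be written in standard SQL as shown below. The table layout and column names are hypothetical, and Python's built-in `sqlite3` is used only as a stand-in so the example runs without a cluster; against CockroachDB the same statements would be sent through a PostgreSQL-compatible driver.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema; the slides do not show table definitions.
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, fname TEXT, lname TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, item TEXT);
""")

# USER: Ashley   > INSERT (Kimball) INTO CUSTOMER;
conn.execute("INSERT INTO customer (lname) VALUES (?)", ("Kimball",))

# USER: Lindsay  > SELECT * FROM ORDERS;
orders = conn.execute("SELECT * FROM orders").fetchall()

# USER: Peter    > UPDATE (Kimball) FNAME="Spencer";
conn.execute("UPDATE customer SET fname = ? WHERE lname = ?",
             ("Spencer", "Kimball"))

row = conn.execute("SELECT fname, lname FROM customer").fetchone()
```

Each user in the diagram connects to a different node, but all three statements act on the same logical tables, which is the point the slides are making.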
32. SINGLE LOGICAL
DATABASE
CockroachDB: Architected For the Cloud
A fundamentally better database for your developers and applications
Scale even further across regions
and even clouds, yet still deliver a
single logical database
It excels when deployed across
multiple data centers.
...and even multi-cloud!
USER: Ashley
> INSERT (Kimball)
INTO CUSTOMER;
USER: Lindsay
> SELECT * FROM
ORDERS;
USER: Peter
> UPDATE (Kimball)
FNAME=”Spencer”;
33. SINGLE LOGICAL
DATABASE
CockroachDB: Architected For the Cloud
A fundamentally better database for your developers and applications
USER: Ashley
> INSERT (Kimball)
INTO CUSTOMER;
USER: Lindsay
> SELECT * FROM
ORDERS;
REGION 1
US-WEST
REGION 1
US-WEST
REGION 1
US-WEST
USER: Peter
> UPDATE (Kimball)
FNAME=”Spencer”;
CockroachDB is naturally resilient so you can
survive the failure of a node or even an entire
region without service disruption
● Always-on, always available w/ zero RPO/RTO
● Allows for no downtime rolling upgrades
34. REGION 2
US-EAST
REGION 3
EMEA
REGION 1
US-WEST
CockroachDB: Architected For the Cloud
A fundamentally better database for your developers and applications
USER: Random
> INSERT (Kimball)
INTO CUSTOMER;
LOAD BALANCER
Kimball
Mattis
Stewart
LOAD BALANCER LOAD BALANCER
Kimball
Mattis
Stewart
Kimball
Mattis
Stewart
Ask any node for data and it will find it in
the cluster
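The idea that any node can answer for any piece of data can be sketched as a gateway lookup over key ranges. This is a simplified illustration: the range boundaries, node names, and `read` helper are invented for the example, and replication (each range living on several nodes, as the diagram shows) is left out for brevity.

```python
from bisect import bisect_right

# Sorted range start keys and the node holding each range (illustrative).
RANGE_STARTS = ["", "h", "p"]
RANGE_OWNERS = ["node-1", "node-2", "node-3"]
DATA = {
    "node-1": {},
    "node-2": {"kimball": "Kimball", "mattis": "Mattis"},
    "node-3": {"stewart": "Stewart"},
}


def owner_of(key):
    """Find the node whose range contains the key."""
    return RANGE_OWNERS[bisect_right(RANGE_STARTS, key) - 1]


def read(gateway, key):
    """Any node can act as gateway; it forwards to the range's owner."""
    owner = owner_of(key)
    hops = 0 if owner == gateway else 1
    return DATA[owner].get(key), hops
```

For instance, `read("node-1", "kimball")` finds the row through one forwarded hop, while `read("node-2", "kimball")` serves it locally; the client gets the same answer either way.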
35. REGION 2
US-EAST
REGION 3
EMEA
REGION 1
US-WEST
CockroachDB: Architected For the Cloud
A fundamentally better database for your developers and applications
USER: Random
> INSERT (Kimball)
INTO CUSTOMER;
LOAD BALANCER
Kimball
Mattis
Stewart
USER: Kimball
> SELECT (Kimball)
FROM CUSTOMER;
LOAD BALANCER LOAD BALANCER
Ask any node for data and it will find it in
the cluster
Geo-locate data near user to reduce
read/write latencies
(or comply with regulations)
Kimball
Mattis
Stewart
Kimball
Mattis
Stewart