As businesses increasingly rely on Kubernetes, the need to scale services based on business demand becomes more important. While traditional methods like scaling on CPU and memory remain important, expressing business metrics in terms of CPU and memory isn’t always straightforward. In this light, autoscaling based on custom metrics in Kubernetes is immensely helpful.
With support for custom metrics, services can be scaled dynamically based on the request count or error count of a particular service. This helps services respond smoothly to sudden bursts and traffic variations, ensuring business continuity while allowing resources to be allocated optimally among different services.
With its new release, the WSO2 Microgateway supports scaling based on custom metrics, enabling enterprises to scale the runtimes based on request count, error rate, requests in the pipeline, and more.
This slide deck will cover:
- The importance of selecting business-related metrics
- Custom metric support in WSO2 Microgateway
- A demo on auto-scaling WSO2 Microgateway based on request count
On-demand webinar: https://wso2.com/library/webinars/adaptive-scaling-of-microgateways-on-kubernetes/
5. What is Adaptive Scaling?
● The ability to scale dynamically as the traffic varies
● As the demand rises, more instances are provisioned
● Made possible by
○ The availability of abundant computing power
○ On-demand provisioning
● Before IaaS was available
○ Instances had to be provisioned beforehand
○ Resources were sitting idle even when there was no traffic
6. What is Adaptive Scaling?
● On-demand provisioning through IaaS
○ Allowed provisioning infrastructure without long delays
○ Infrastructure still had to be allocated manually
● Autoscaling took this a step further
○ Instances were provisioned automatically
○ Could respond to sudden bursts
○ The cluster would automatically scale up and down according to the traffic
● Many IaaS providers offer autoscaling
○ AWS Auto Scaling groups
○ GCP, Azure, and many others have similar capabilities
● Regardless of the provider, autoscaling offers the same general advantages
7. Benefits of Adaptive Scaling
● Autoscaling provisions infrastructure dynamically
○ Shields the application from sudden spikes without losing traffic
○ Absorbs growing traffic without provisioning resources upfront
8. Benefits of Adaptive Scaling
● Makes services more responsive
○ More and more services are embracing microservices
○ Interactions between services are complex
○ Ensuring the responsiveness of dependent services is critical
9. Benefits of Adaptive Scaling
● Allows using resources optimally
○ Allows many services to share a single resource pool
○ Scales only the services that need it
11. Scaling Options in Kubernetes
● Cluster autoscaling
⦿ Adds more nodes to a Kubernetes cluster
⦿ Usually done through an extension to the underlying IaaS provider
● Pod autoscaling
⦿ Scales pods within a fixed resource pool
⦿ Two variations
● Vertical Pod Autoscaling
⦿ Increases the computing power of a single pod (a minimal VPA manifest is sketched below)
● Horizontal Pod Autoscaling
⦿ Increases the number of pods
● The new feature works with Horizontal Pod Autoscaling
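For contrast with the HPA examples that follow, a minimal Vertical Pod Autoscaler manifest could look like the sketch below. Note that VPA is a separate add-on from the kubernetes/autoscaler project rather than part of core Kubernetes, and the Deployment name my-app is a hypothetical placeholder.

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app            # hypothetical deployment to scale vertically
  updatePolicy:
    updateMode: "Auto"      # let the VPA apply its recommendations automatically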
12. How Horizontal Pod Autoscaling Works
● Handled by the Horizontal Pod Autoscaler
⦿ Pulls metrics from different APIs
⦿ Pulling happens at a predefined frequency
⦿ The Pod Autoscaler adjusts the pods accordingly
● A target value can be specified
● The autoscaler ensures the metric is kept at the target level (the calculation is shown below)
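At each interval the autoscaler compares the observed metric with the target and sets the replica count from the ratio; the Kubernetes documentation describes the calculation as roughly:

desiredReplicas = ceil( currentReplicas × currentMetricValue / desiredMetricValue )

For example, with a CPU utilization target of 50% and a current average of 100%, a deployment running 2 replicas would be scaled to 4.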
13. How Horizontal Pod Autoscaling Works
● The HPA resource specifies the metric to scale on
● A target value can be specified
● The autoscaler ensures the metric is maintained at the target level
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
14. Providing Metrics for Autoscaling
● Autoscaling in K8s is based on metrics (examples of each type are sketched below)
⦿ Resource metrics
⦾ CPU and memory
⦾ Pulled through the metrics-server
⦿ Custom metrics
⦾ Can be defined per Pod or for other objects
⦾ Request rate, error rate, etc.
⦿ External metrics
⦾ Metrics from a source outside the cluster
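To make the three metric types concrete, the metrics section of an autoscaling/v2beta2 HPA can mix all of them. This is only a sketch: the metric names http_requests_per_second and queue_messages_ready are illustrative assumptions and would have to be exposed by a metrics adapter in a real cluster.

metrics:
- type: Resource                      # resource metric from the metrics-server
  resource:
    name: cpu
    target:
      type: Utilization
      averageUtilization: 50
- type: Pods                          # custom metric, averaged across the pods
  pods:
    metric:
      name: http_requests_per_second  # assumed to be served by a custom metrics adapter
    target:
      type: AverageValue
      averageValue: "100"
- type: External                      # metric originating outside the cluster
  external:
    metric:
      name: queue_messages_ready      # hypothetical external metric
    target:
      type: AverageValue
      averageValue: "30"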
16. What is Supported in This Release
● With this release the microgateway publishes a set of metrics
⦿ Metrics like error count, total requests, and response delays are gathered
⦿ Metrics are periodically pulled by the Prometheus server
⦿ HPA consumes these through the Prometheus Adapter (an example adapter rule is sketched below)
● Supports scaling the microgateway with custom metrics
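As a sketch of the adapter piece, a Prometheus Adapter rule can expose one of the gateway series to the Kubernetes custom metrics API as a per-second rate. The series name and the kubernetes_namespace/kubernetes_pod_name labels are assumptions about how the gateway pods are scraped and may differ in a real deployment.

rules:
- seriesQuery: 'http_requests_total_value{kubernetes_namespace!="",kubernetes_pod_name!=""}'
  resources:
    overrides:
      kubernetes_namespace: {resource: "namespace"}
      kubernetes_pod_name: {resource: "pod"}
  name:
    matches: "(.*)_total_value"
    as: "${1}_per_second"              # exposed as http_requests_per_second
  metricsQuery: 'rate(<<.Series>>{<<.LabelMatchers>>}[2m])'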
17. Scaling at Different Levels
● Two types of metrics are published by the microgateway
⦿ Metrics about the microgateway itself
⦿ Metrics about the backend services
● Gateway-level metrics can be used to scale the microgateway, and service-level metrics to scale the backend
● Going through the different deployment models helps understand this better
● Gateway metrics
⦿ Per_Request_Duration_mean
⦿ Request_Duration_Total_mean
⦿ http_inprogress_requests_value
⦿ Http_requests_total_value
⦿ http_response_time_seconds
● Service metrics
⦿ ballerina_http_Caller_1XX_requests_total_value
⦿ ballerina_http_Caller_inprogress_requests_value
⦿ ballerina_http_Caller_requests_total_value
⦿ ballerina_http_Caller_response_time_seconds
⦿ ballerina_http_Caller_response_time_seconds_max
18. Shared Gateway Mode - Scaling the Microgateway
● Shared gateway
⦿ The gateway is the first hop after the Ingress gateway
⦿ A single gateway serves all microservices (an example HPA for this mode is sketched below)
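In shared gateway mode the HPA targets the microgateway Deployment itself, using one of the gateway-level metrics listed earlier. A minimal sketch, assuming a hypothetical Deployment named microgateway and that http_inprogress_requests_value is exposed per pod through the Prometheus Adapter:

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: microgateway-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: microgateway               # hypothetical shared gateway deployment
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: http_inprogress_requests_value
      target:
        type: AverageValue
        averageValue: "20"           # add a replica when pods average more than 20 in-flight requests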
19. Sidecar Mode - Scaling the Microservice
● Sidecar
⦿ The gateway sits in the same pod as the microservice
● Metrics published through the gateway can be used to scale the pod (an example HPA for this mode is sketched below)
● The underlying service doesn’t need to be instrumented
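In sidecar mode the HPA instead targets the microservice’s own Deployment; the gateway container in each pod supplies the metric, so the service code stays uninstrumented. A sketch, assuming a hypothetical Deployment named orders-service and that the per-pod service metric is exposed through the adapter:

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: orders-service-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: orders-service             # hypothetical microservice deployment with the gateway sidecar
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Pods
    pods:
      metric:
        name: ballerina_http_Caller_inprogress_requests_value
      target:
        type: AverageValue
        averageValue: "10"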