9. Prestobase proxy
• Written in Scala
• Finagle base RPC proxy
• Running as Docker container
• A user of Airframe
• VCR base light-weight test framework
10. Finagle
Finagle is an extensible RPC system for the JVM,
used to construct high-concurrency servers.
Finagle implements uniform client and server
APIs for several protocols, and is designed for
high performance and concurrency.
see: https://twitter.github.io/finagle/
11. Finagle
protected val service: Service[Request, Response] =
bind[SomeFilter] andThen
bind[AnotherHandler] andThen
LastFilter andThen
prestoClient
Build request pipeline by binding
filter, handlers with Airframe
12. Airframe
Airframe is a trait base dependency injection
framework using Scala macro
- https://github.com/wvlet/airframe
14. Airframe
val design : Design =
newDesign
.bind[X].toInstance(new X) // Bind type X to a concrete instance
.bind[Y].toSingleton // Bind type Y to a singleton object
.bind[Z].to[ZImpl] // Bind type Z to an instance of ZImpl
import wvlet.airframe._
trait App {
val x = bind[X]
val y = bind[Y]
val z = bind[Z]
// Do something with X, Y, and Z
}
val session = design.newSession
val app : App = session.build[App]
15. VCR testing framework
Record test suite HTTP interaction to make
test stable and deterministic
see more detail
https://testing.googleblog.com/2016/11/what-test-engineers-do-at-google.html
16. VCR testing framework
protected val service: Service[Request, Response] =
bind[SomeFilter] andThen
bind[AnotherHandler] andThen
QueryRewriter andThen
bind[RequestVCR] andThen
prestClient
protected val service: Service[Request, Response] =
bind[SomeFilter] andThen
bind[AnotherHandler] andThen
QueryRewriter andThen
bind[NoRecording] andThen
prestClient
On CI
On Production
23. Node Scheduler
NodeScheduler creates NodeSelector that
selects worker nodes on which tasks are
scheduled. NodeSelector picks up worker
nodes when there is available splits.
24. Node Scheduler in TD
Keeps worker node map that can be
candidate for launching next tasks.
- Ignore min candidates
- Limit by available memory pool
25. Node Scheduler in TD
Back to normal memory pool usage after task is completed.
26. Node Scheduler in TD
Challenges
- Smoothing CPU time metric
- Split type awareness
- Avoid problematic worker nodes
28. Resource Group
Resource Group was introduced since 0.147
→ https://prestodb.io/docs/current/admin/resource-groups.html
Resource Group aims to limit the resource
usage by account/group/query.
30. Resource Group limits
- maxQueued
- maxRunning
- softMemoryLimit
Following queries will be queued
- softCpuLimit
Impose penalty against max running queries
- hardCpuLimit
Following queries will be queued
31. Resource Group scheduling
- schedulingPolicy
- fair : FIFO
- weighted : Selected stochastically
- query_priority : Selected according to priority
- schedulingWeight
32. Resource Group
Every query must be associated to a resource
group. The matching can be done by
configured selector.
{
"user": “bob", "group": "general"
},
{
"source": “.*adhoc.*", "group": "global.adhoc.adhoc_${USER}"
}
37. Recap
Distributed system often requires each
component to be stable and scalable. We can
make Presto ecosystem reliable by doing…
- Code modification reliability with DI
- VCR testing
- Multi dimensional resource scheduling
- Resource isolation makes multi-tenant
distributed SQL engine reliable