The kernel knows more than our programs. Stop bloating our applications with copy-and-paste instrumentation code for metrics. Let's go look under the hood!
Nowadays every application exposes its metrics via an HTTP endpoint readable by Prometheus. Nevertheless, this very common pattern, by definition, only exposes metrics about the specific application being observed.
This talk, and its companion slides, presents the idea, and a reference implementation (https://github.com/bpftools/kube-bpf), of using eBPF programs to collect and automatically expose application and kernel metrics via a Prometheus endpoint.
It walks through the architecture of the proposed reference implementation - a Kubernetes operator with a custom resource for eBPF programs - and finally links to a simple demo showing how to use it to grab and present metrics without touching any application running on the demo cluster.
---
Talk given at Cloud_Native Rejekts EU - Barcelona, Spain - on May 18th, 2019
Prometheus as exposition format for eBPF programs running on Kubernetes
1. Prometheus as exposition
format for eBPF programs
running on k8s
Leonardo Di Donato. Open Source Software Engineer @ Sysdig.
2019.05.18 - Cloud_Native Rejekts EU - Barcelona, Spain
3. @leodido
• Old buzzword.
• Is this SNMP? 😂
• Focus on collecting, persisting, and alerting
on just any data!
• It might also simply become garbage.
• Data lake.
• Doing it well requires a strategy.
• Uninformed monitoring equals hope.
Monitoring
The missing buzzwords
Wait, another really cool buzzword is Tracing!
• The ability of a system to give humans
insights.
• Humans can observe, understand, and act on
the presented state of an observable system.
• The ability to make deductions about internal
state by looking only at the boundaries (inputs
vs outputs).
• Never truly achieved. Ongoing process and
mindset.
• Avoid black box data. Extract fine-grained
and meaningful data.
Observability
4. @leodido
• Monitoring landscape very fragmented
• Many solutions
• with ancient tech
• Proprietary data formats
• often incompletely implemented, or undocumented, or ...
• Hierarchical data models
• Metrics? W00t?
Before Prometheus
But there’s a thing ...
• De-facto standard
• Cloud-native metric monitoring
• Ease of use
• Explosion of /metrics endpoints
After Prometheus
The journey so far
5. What if we could exploit the Prometheus
(or OpenMetrics) exposition format’s
awesomeness without having to
instrument every application by hand?
Can we avoid clogging our applications
by using eBPF superpowers?
eBPF superpowers
@leodido
6. What eBPF is
You can now write mini programs that run on events like disk I/O,
executed in a safe virtual machine in the kernel.
The in-kernel verifier refuses to load eBPF programs with invalid
pointer dereferences, calls exceeding the maximum stack depth, or
loops without an upper bound.
It imposes a stable Application Binary Interface (ABI).
BPF on steroids 🚀
A core part of the Linux kernel.
@leodido
7. @leodido
userspace
program
bpf() syscall
eBPF program ...
user-space
kernel
eBPF map
BPF_MAP_CREATE
BPF_MAP_LOOKUP_ELEM
BPF_MAP_UPDATE_ELEM
BPF_MAP_DELETE_ELEM
BPF_MAP_GET_NEXT_KEY
http://bit.ly/bpf_map_types 📎
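The map commands above boil down to key/value operations on kernel-resident storage shared between eBPF programs and userspace. This is a minimal semantic sketch in Go, not the real bpf() syscall: it models a BPF hash map (fixed-size keys and values in the kernel; uint32 keys and uint64 values here) and the effect of the BPF_MAP_* commands.

```go
package main

import "fmt"

// sketchMap mimics the behavior of a BPF hash map as driven by the
// BPF_MAP_* commands listed above. In the kernel these operations go
// through the bpf() syscall; here they are plain map operations.
type sketchMap map[uint32]uint64

// Lookup mirrors BPF_MAP_LOOKUP_ELEM: fetch the value for a key.
func (m sketchMap) Lookup(k uint32) (uint64, bool) { v, ok := m[k]; return v, ok }

// Update mirrors BPF_MAP_UPDATE_ELEM: insert or overwrite a value.
func (m sketchMap) Update(k uint32, v uint64) { m[k] = v }

// Delete mirrors BPF_MAP_DELETE_ELEM: remove a key.
func (m sketchMap) Delete(k uint32) { delete(m, k) }

func main() {
	packets := sketchMap{} // conceptually BPF_MAP_CREATE
	packets.Update(6, 551) // e.g. protocol 6 (TCP) saw 551 packets
	if n, ok := packets.Lookup(6); ok {
		fmt.Println(n)
	}
}
```

In the real flow, the eBPF program updates the map from kernel context on each event, while a userspace program periodically iterates it (BPF_MAP_GET_NEXT_KEY + lookups) to read the counters out.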
BPF_PROG_TYPE_SOCKET_FILTER
BPF_PROG_TYPE_KPROBE
BPF_PROG_TYPE_TRACEPOINT
BPF_PROG_TYPE_RAW_TRACEPOINT
BPF_PROG_TYPE_XDP
BPF_PROG_TYPE_PERF_EVENT
BPF_PROG_TYPE_CGROUP_SKB
BPF_PROG_TYPE_CGROUP_SOCK
BPF_PROG_TYPE_SOCK_OPS
BPF_PROG_TYPE_SK_SKB
BPF_PROG_TYPE_SK_MSG
BPF_PROG_TYPE_SCHED_CLS
BPF_PROG_TYPE_SCHED_ACT
📎 http://bit.ly/bpf_prog_types
eBPF program
How does eBPF work?
8. • fully programmable
• can trace everything in a system
• not limited to a specific application
• unified tracing interface for both kernel and
userspace
• [k,u]probes, (dtrace)tracepoints and so on
are also used by other tools
• minimal (negligible) performance impact
• attaches JIT-compiled native instrumentation
code
• no long suspensions of execution
Advantages
• requires a fairly recent kernel
• definitely not for debugging
• no knowledge of the calling higher level
language implementation
• not fully running in user space
• kernel-user context switch (usually negligible)
when eBPF instruments a user process
• still not as portable as other tracers
• VM primarily developed in the Linux kernel
(ports to other platforms are in progress, btw)
Disadvantages
Why use eBPF at all to trace userspace processes?
10. 📎 http://bit.ly/k8s_crd
An extension of the
K8S API that lets you
store and retrieve
structured data.
Custom resources
📎 http://bit.ly/k8s_shared_informers
The actual control
loop that watches the
shared state using the
workqueue.
Shared informers
📎 http://bit.ly/k8s_custom_controllers
It declares and
specifies the desired
state of your resource,
continuously trying to
match it with the
actual state.
Controllers
Customize all the things
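The controller pattern above can be sketched as a reconcile loop: compare the desired state declared in a custom resource with the actual state, and act to close the gap. This is an illustrative shape only; the names (BPFProgram, reconcile) are hypothetical and not kube-bpf's real API.

```go
package main

import "fmt"

// BPFProgram represents the actual state tracked for one custom
// resource: whether its eBPF object has been loaded and attached.
type BPFProgram struct {
	Name   string
	Loaded bool
}

// reconcile drives actual state toward desired state, the way a
// Kubernetes controller does on every event from its shared informer.
func reconcile(desired map[string]bool, actual map[string]*BPFProgram) {
	for name, want := range desired {
		prog, exists := actual[name]
		if !exists {
			prog = &BPFProgram{Name: name}
			actual[name] = prog
		}
		if want && !prog.Loaded {
			// In the real operator this is where the eBPF object would
			// be loaded, attached, and its maps exposed on /metrics.
			prog.Loaded = true
			fmt.Println("loaded", name)
		}
	}
}

func main() {
	desired := map[string]bool{"packet-counter": true} // from the custom resource
	actual := map[string]*BPFProgram{}                 // nothing running yet
	reconcile(desired, actual)
}
```

A real controller would run this on a workqueue fed by shared informers, retrying on failure, rather than once in main.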
13. @leodido
Count packets by protocol Count sys_enter_write by process ID
macro to generate sections inside the object file (later interpreted by the ELF BPF loader)
14. @leodido
Compile and inspect
This is important because it communicates the
current running kernel version!
Tricky and controversial legal thing about
licenses ...
The bpf_prog_load() wrapper also has a license
parameter to provide the license that applies to
the eBPF program being loaded.
No GPL-compatible license?
The kernel won’t load your eBPF!
Exceptions apply ...
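The license gate can be pictured as a simple string check: the kernel compares the license passed to bpf_prog_load() against its list of GPL-compatible licenses, and refuses GPL-only helpers (or the program itself) otherwise. The strings below mirror commonly accepted entries, but this is an illustrative sketch; the authoritative list lives in the kernel sources, not here.

```go
package main

import "fmt"

// gplCompatible sketches the kernel's license check for eBPF programs.
// The set of accepted strings is illustrative, modeled on well-known
// GPL-compatible module licenses.
func gplCompatible(license string) bool {
	switch license {
	case "GPL", "GPL v2", "Dual BSD/GPL", "Dual MIT/GPL", "Dual MPL/GPL":
		return true
	}
	return false
}

func main() {
	fmt.Println(gplCompatible("GPL"))         // accepted
	fmt.Println(gplCompatible("Proprietary")) // refused
}
```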
eBPF
Maps
19. @leodido
# HELP test_packets No. of packets per protocol (key), node
# TYPE test_packets counter
test_packets{key="00001",node="127.0.0.1"} 8      # <- ICMP
test_packets{key="00002",node="127.0.0.1"} 1      # <- IGMP
test_packets{key="00006",node="127.0.0.1"} 551    # <- TCP
test_packets{key="00008",node="127.0.0.1"} 1      # <- EGP
test_packets{key="00017",node="127.0.0.1"} 15930  # <- UDP
test_packets{key="00089",node="127.0.0.1"} 9      # <- OSPF
test_packets{key="00233",node="127.0.0.1"} 1      # <- ?
# EOF
It is a WIP project but already open source! 🎺
Check it out @ gh:bpftools/kube-bpf 🔗
ip-10-12-0-136.ec2.internal:9387/metrics
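Rendering this exposition is mechanical: iterate the eBPF map entries (protocol number -> packet count) and print them in the Prometheus text format. A minimal sketch, assuming the metric and label names from the slide; the rendering code itself is illustrative, not kube-bpf's actual implementation.

```go
package main

import (
	"fmt"
	"sort"
)

// render turns eBPF map entries into the Prometheus text exposition
// format shown above: HELP/TYPE preamble, one sample per map key with
// the key zero-padded to five digits, and a trailing EOF marker.
func render(name, node string, counts map[uint32]uint64) string {
	out := fmt.Sprintf("# HELP %s No. of packets per protocol (key), node\n# TYPE %s counter\n", name, name)
	keys := make([]uint32, 0, len(counts))
	for k := range counts {
		keys = append(keys, k)
	}
	// Sort for a stable, readable ordering of the samples.
	sort.Slice(keys, func(i, j int) bool { return keys[i] < keys[j] })
	for _, k := range keys {
		out += fmt.Sprintf("%s{key=%q,node=%q} %d\n", name, fmt.Sprintf("%05d", k), node, counts[k])
	}
	return out + "# EOF\n"
}

func main() {
	fmt.Print(render("test_packets", "127.0.0.1", map[uint32]uint64{6: 551, 17: 15930}))
}
```

Serving this string from an HTTP handler on each node is what makes the eBPF maps scrapeable by any standard Prometheus setup.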
20. @leodido
# HELP test_dummy No. sys_enter_write calls per PID (key), node
# TYPE test_dummy counter
test_dummy{key="00001",node="127.0.0.1"} 8
test_dummy{key="00295",node="127.0.0.1"} 1
test_dummy{key="01278",node="127.0.0.1"} 1158
test_dummy{key="04690",node="127.0.0.1"} 209
test_dummy{key="04691",node="127.0.0.1"} 889
# EOF
It is a WIP project but already open source! 🎺
Check it out @ gh:bpftools/kube-bpf 🔗
ip-10-12-0-122.ec2.internal:9387/metrics
21. @leodido
It is a WIP project but already open source! 🎺
Check it out @ gh:bpftools/kube-bpf 🔗
22. @leodido
kubectl-trace
More eBPF + k8s
Run a bpftrace program (from a file)
Ctrl-C tells the
program to
plot the results
using hist()
The output histogram
Maps
23. @leodido
• Prometheus exposition format is here to stay given how simple it is 📊
• OpenMetrics will introduce improvements on such giant shoulders 📈
• We cannot monitor and observe everything from inside our applications 🎯
• We might want to have a look at the orchestrator (context) our apps live
and die in 🕸
• Kubernetes can be extended to achieve such levels of integration 🔌
• ELF is cool 🧝
• We look for better tools (eBPF) for grabbing our metrics and even more 🔮
• Nearly zero footprint ⚡
• Enable a wider range of available data 🌊
• Do not touch our applications directly 👻
• There is a PoC doing some magic at gh:bpftools/kube-bpf 🧞
Key takeaways
24. Thanks.
Reach me out @leodido on twitter & github!
SEE Y’ALL AROUND AT KUBECON
http://bit.ly/prometheus_ebpf_k8s