QEMU Sandboxing for dummies
Eduardo Otubo <otubo@redhat.com>
Senior Software Engineer
27/Jan/2018
Agenda
1. Secure Computing: the basics
2. Libseccomp
3. QEMU sandboxing v1
4. QEMU sandboxing v2 and more options
Secure Computing: the basics
● First kernel support dates from March 8th, 2005 (Linux 2.6.12)
Commit by: Andrea Arcangeli
● A process calls prctl() with PR_SET_SECCOMP and is from then on
allowed only four system calls: exit(), sigreturn(), read()
and write()
○ Any other system call kills the process with SIGKILL (SIGSYS belongs
to the later filter mode)
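A minimal sketch of the strict mode described above, assuming a Linux host with seccomp enabled. The function name strict_mode_demo and the exit codes are made up for this example. A forked child enters strict mode, so the calling process is left untouched, and the parent reports which signal ended the child.

```c
#include <signal.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <linux/seccomp.h>

/*
 * Fork a child, switch it to seccomp strict mode and let it try a
 * system call that is not on the allowed list.
 *
 * Returns the number of the signal that terminated the child,
 * minus the exit status if the child exited on its own
 * (-2: seccomp unavailable, -3: child was not killed),
 * or -1 if fork()/waitpid() failed.
 */
int strict_mode_demo(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT) != 0)
            _exit(2);                 /* strict mode not available here */

        /* write() is one of the four calls that stay allowed. */
        ssize_t n = write(STDOUT_FILENO, "in strict mode\n", 15);
        (void)n;

        /* getpid() is not allowed: the kernel kills the child here. */
        syscall(SYS_getpid);

        /* Only reached if the call above was not blocked. */
        syscall(SYS_exit, 3);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (WIFSIGNALED(status))
        return WTERMSIG(status);
    return -WEXITSTATUS(status);
}
```

On a kernel with seccomp support, the return value is expected to equal SIGKILL. Note that the child uses the raw exit system call as a fallback, because the C library's _exit() normally maps to exit_group(), which strict mode does not allow.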
Secure Computing: the basics
● Second kernel implementation with dynamic seccomp policies:
January 11th, 2011; Commit by: Will Drewry <wad@chromium.org>
● Filters are installed with prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, ...)
or, since Linux 3.17, with the seccomp() system call
● Uses BPF (Berkeley Packet Filter)
○ An in-kernel data link layer packet filter whose abstracted API
also works as a generic filter
struct sock_filter filter[] = {
/* Grab the system call number */
BPF_STMT(BPF_LD+BPF_W+BPF_ABS, syscall_nr),
/* Jump table for the allowed syscalls */
BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_rt_sigreturn, 0, 1),
BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW),
#ifdef __NR_sigreturn
BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_sigreturn, 0, 1),
BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW),
#endif
BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_exit_group, 0, 1),
BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW),
BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_exit, 0, 1),
BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW),
BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_read, 1, 0),
BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_write, 3, 2),
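The slide shows only part of a filter. As a smaller but complete illustration of the same technique, here is a sketch of a four-instruction filter that makes getppid() fail with EPERM and allows everything else. The names demo_filter, demo_filter_len and install_demo_filter are invented for this example, and the choice of getppid() is arbitrary. A production filter should also check the architecture field of struct seccomp_data before looking at the system call number; that check is left out here to keep the sketch short.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <linux/filter.h>
#include <linux/seccomp.h>

/* Offset of the system call number inside struct seccomp_data. */
#define syscall_nr (offsetof(struct seccomp_data, nr))

/* Hypothetical demo filter: getppid() returns EPERM, all else is allowed. */
static struct sock_filter demo_filter[] = {
    /* Load the system call number into the accumulator. */
    BPF_STMT(BPF_LD+BPF_W+BPF_ABS, syscall_nr),
    /* Equal to getppid: continue with the next instruction (jt = 0);
     * otherwise skip one instruction (jf = 1). */
    BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_getppid, 0, 1),
    BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ERRNO | (EPERM & SECCOMP_RET_DATA)),
    BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW),
};

/* Number of instructions in the demo filter. */
size_t demo_filter_len(void)
{
    return sizeof(demo_filter) / sizeof(demo_filter[0]);
}

/*
 * Install the filter for the calling thread. This cannot be undone.
 * Returns 0 on success, -1 on failure (errno is set by prctl()).
 */
int install_demo_filter(void)
{
    struct sock_fprog prog = {
        .len = (unsigned short)demo_filter_len(),
        .filter = demo_filter,
    };

    /* Required for unprivileged processes before installing a filter. */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
        return -1;
    return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);
}
```

The two trailing numbers in BPF_JUMP are relative jump offsets for the "true" and "false" case, which is why hand-written filters like the one on the slide are easy to get wrong once they grow.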
Libseccomp
● Paul Moore (2011)
● Userspace layer to make life easier:
○ Abstracts complex BPF constructions
○ Abstracts differences between architectures and their ABIs
○ Optimizes filter construction for best performance
○ Actions for a matched filter include kill (SIGKILL), trap (SIGSYS)
and allow, among others
Qemu sandboxing v1
static const struct QemuSeccompSyscall seccomp_whitelist[] = {
{ SCMP_SYS(timer_settime), 255 },
{ SCMP_SYS(timer_gettime), 254 },
{ SCMP_SYS(futex), 253 },
{ SCMP_SYS(select), 252 },
{ SCMP_SYS(recvfrom), 251 },
{ SCMP_SYS(sendto), 250 },
{ SCMP_SYS(read), 249 },
{ SCMP_SYS(brk), 248 },
{ SCMP_SYS(clone), 247 },
{ SCMP_SYS(mmap), 247 },
{ SCMP_SYS(mprotect), 246 },
{ SCMP_SYS(execve), 245 },
{ SCMP_SYS(open), 245 },
{ SCMP_SYS(ioctl), 245 },
{ SCMP_SYS(recvmsg), 245 },
{ SCMP_SYS(sendmsg), 245 },
QEMU sandboxing v1
● Basic whitelist approach (--sandbox on)
○ Every system call is blocked, except for the ones that are explicitly
whitelisted
● Various compatibility problems; requires a lot of testing under
different workloads
● It’s safe, right?
Qemu sandboxing v1
Not really!
● QEMU links to many different shared libraries, and there is no way
to determine which code paths QEMU triggers in these libraries and
thus which syscalls are genuinely needed.
● Sometimes a missing syscall makes QEMU abort right at the beginning,
before boot (which is good?), but sometimes the VM has been running for
days when it suddenly aborts (which is terrible)
Qemu sandboxing v2
● Extended blacklist approach (--sandbox on,...)
● Everything is allowed except for a few sets that are definitely not
allowed
○ Default system calls: basic set of forbidden system calls (kexec, swapon,
swapoff, mount, umount, etc.)
○ obsolete
○ elevateprivileges
○ spawn
○ resourcecontrol
Obsolete system calls
● Old system calls that were useful in the past but have become obsolete or
been replaced by newer versions
○ Like readdir() being replaced by getdents()
● Blocked by default, but can be re-enabled with
--sandbox on,obsolete=allow
Elevated Privileges
● Denying this option blocks all set*uid|gid system calls, which some
features, such as bridge helpers, are known to require
● It can also call prctl(PR_SET_NO_NEW_PRIVS), so that child processes
cannot escalate privileges either
● This mode can be switched with the option:
--sandbox on,elevateprivileges=allow|deny|children
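The prctl() call mentioned above can be tried in isolation. This is a small sketch, not QEMU's code; the function name enable_no_new_privs is made up for the example. Once set, the flag cannot be cleared again and is inherited by child processes, so execve() can no longer grant extra privileges (for instance through setuid binaries).

```c
#include <sys/prctl.h>

/*
 * Set the no_new_privs flag for the calling thread and read it back.
 * Returns 1 if the flag is set, -1 if prctl() failed.
 * The flag is one-way: it cannot be reset to 0 afterwards.
 */
int enable_no_new_privs(void)
{
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
        return -1;
    return prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0);
}
```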
Spawn
● Denying this option prevents QEMU from creating any new process
through fork() or exec(), privileged or not.
● Things like the bridge helper, SMB server, ifup/ifdown scripts and the
migration exec: protocol would all be disabled.
● This mode can be switched on or off with the option:
--sandbox on,spawn=allow|deny
Resource Control
● Prevents QEMU from setting process affinity, scheduler priority, etc.
● This shouldn’t be QEMU’s responsibility, but rather that of management
software like libvirt.
● This mode can be switched on or off with the option:
--sandbox on,resourcecontrol=allow|deny
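To make the four options more concrete, here is a sketch of how such an option string could be turned into a bitmask of denied sets. It is an illustration only, not QEMU's parser. Only the name QEMU_SECCOMP_SET_DEFAULT appears in the slides; the other flag names, the bit values, the extra FLAG_NO_NEW_PRIVS bit and the function names are assumptions. The sketch also assumes that a plain "on" denies the default and obsolete sets and allows the rest, and that "children" behaves like "deny" plus the no_new_privs flag.

```c
#include <string.h>

/* Assumed flag layout; see the note above. */
enum {
    QEMU_SECCOMP_SET_DEFAULT     = 1 << 0,
    QEMU_SECCOMP_SET_OBSOLETE    = 1 << 1,
    QEMU_SECCOMP_SET_PRIVILEGED  = 1 << 2,
    QEMU_SECCOMP_SET_SPAWN       = 1 << 3,
    QEMU_SECCOMP_SET_RESOURCECTL = 1 << 4,
    FLAG_NO_NEW_PRIVS            = 1 << 5,  /* hypothetical helper bit */
};

/* Apply one "key=value" pair to the mask; returns the new mask or -1. */
static int apply_option(int mask, const char *key, const char *val)
{
    int bit;

    if (strcmp(key, "obsolete") == 0)
        bit = QEMU_SECCOMP_SET_OBSOLETE;
    else if (strcmp(key, "elevateprivileges") == 0)
        bit = QEMU_SECCOMP_SET_PRIVILEGED;
    else if (strcmp(key, "spawn") == 0)
        bit = QEMU_SECCOMP_SET_SPAWN;
    else if (strcmp(key, "resourcecontrol") == 0)
        bit = QEMU_SECCOMP_SET_RESOURCECTL;
    else
        return -1;

    if (strcmp(val, "allow") == 0)
        return mask & ~bit;
    if (strcmp(val, "deny") == 0)
        return mask | bit;
    if (bit == QEMU_SECCOMP_SET_PRIVILEGED && strcmp(val, "children") == 0)
        return mask | bit | FLAG_NO_NEW_PRIVS;
    return -1;
}

/*
 * Parse e.g. "on,obsolete=allow,spawn=deny".
 * Returns the mask of denied sets, 0 for "off", -1 on a syntax error.
 */
int parse_sandbox(const char *arg)
{
    char buf[128];
    char *tok = buf;
    int first = 1;
    int mask = 0;

    if (strlen(arg) >= sizeof(buf))
        return -1;
    strcpy(buf, arg);

    while (tok != NULL) {
        char *next = strchr(tok, ',');
        if (next != NULL)
            *next++ = '\0';

        if (first) {
            if (strcmp(tok, "off") == 0)
                return 0;
            if (strcmp(tok, "on") != 0)
                return -1;
            mask = QEMU_SECCOMP_SET_DEFAULT | QEMU_SECCOMP_SET_OBSOLETE;
            first = 0;
        } else {
            char *eq = strchr(tok, '=');
            if (eq == NULL)
                return -1;
            *eq = '\0';
            mask = apply_option(mask, tok, eq + 1);
            if (mask < 0)
                return -1;
        }
        tok = next;
    }
    return mask;
}
```

With these assumptions, parse_sandbox("on,obsolete=allow") leaves only the default set denied, while parse_sandbox("on,spawn=deny") adds the spawn set to the two sets denied by default.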
Qemu sandboxing v2
static const struct QemuSeccompSyscall blacklist[] = {
/* default set of syscalls to blacklist */
{ SCMP_SYS(reboot), QEMU_SECCOMP_SET_DEFAULT },
{ SCMP_SYS(swapon), QEMU_SECCOMP_SET_DEFAULT },
{ SCMP_SYS(swapoff), QEMU_SECCOMP_SET_DEFAULT },
{ SCMP_SYS(syslog), QEMU_SECCOMP_SET_DEFAULT },
{ SCMP_SYS(mount), QEMU_SECCOMP_SET_DEFAULT },
{ SCMP_SYS(umount), QEMU_SECCOMP_SET_DEFAULT },
{ SCMP_SYS(kexec_load), QEMU_SECCOMP_SET_DEFAULT },
{ SCMP_SYS(afs_syscall), QEMU_SECCOMP_SET_DEFAULT },
{ SCMP_SYS(break), QEMU_SECCOMP_SET_DEFAULT },
{ SCMP_SYS(ftime), QEMU_SECCOMP_SET_DEFAULT },
{ SCMP_SYS(getpmsg), QEMU_SECCOMP_SET_DEFAULT },
{ SCMP_SYS(gtty), QEMU_SECCOMP_SET_DEFAULT },
{ SCMP_SYS(lock), QEMU_SECCOMP_SET_DEFAULT },
{ SCMP_SYS(mpx), QEMU_SECCOMP_SET_DEFAULT },
{ SCMP_SYS(prof), QEMU_SECCOMP_SET_DEFAULT },
{ SCMP_SYS(profil), QEMU_SECCOMP_SET_DEFAULT },
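The listing above tags every system call with the set it belongs to. The following sketch models how such a table could be evaluated against the sets selected on the command line. It deliberately avoids libseccomp and uses system call names instead of SCMP_SYS() numbers, so it only simulates the decision and installs nothing. Apart from QEMU_SECCOMP_SET_DEFAULT, the flag names, the struct SandboxRule type, the function is_denied and the non-default table entries are assumptions chosen to match the descriptions in the slides.

```c
#include <stddef.h>
#include <string.h>

/* Assumed flag layout for this sketch. */
enum {
    QEMU_SECCOMP_SET_DEFAULT     = 1 << 0,
    QEMU_SECCOMP_SET_OBSOLETE    = 1 << 1,
    QEMU_SECCOMP_SET_PRIVILEGED  = 1 << 2,
    QEMU_SECCOMP_SET_SPAWN       = 1 << 3,
    QEMU_SECCOMP_SET_RESOURCECTL = 1 << 4,
};

/* Hypothetical stand-in for QemuSeccompSyscall, keyed by name. */
struct SandboxRule {
    const char *name;
    unsigned set;
};

static const struct SandboxRule blacklist[] = {
    { "reboot",            QEMU_SECCOMP_SET_DEFAULT },
    { "swapon",            QEMU_SECCOMP_SET_DEFAULT },
    { "mount",             QEMU_SECCOMP_SET_DEFAULT },
    { "kexec_load",        QEMU_SECCOMP_SET_DEFAULT },
    { "readdir",           QEMU_SECCOMP_SET_OBSOLETE },
    { "setuid",            QEMU_SECCOMP_SET_PRIVILEGED },
    { "setgid",            QEMU_SECCOMP_SET_PRIVILEGED },
    { "fork",              QEMU_SECCOMP_SET_SPAWN },
    { "execve",            QEMU_SECCOMP_SET_SPAWN },
    { "sched_setaffinity", QEMU_SECCOMP_SET_RESOURCECTL },
    { "setpriority",       QEMU_SECCOMP_SET_RESOURCECTL },
};

/*
 * Returns 1 if the named system call belongs to one of the active
 * (denied) sets, 0 otherwise. Anything not listed is allowed, which
 * is the core difference from the v1 whitelist.
 */
int is_denied(const char *name, unsigned active_sets)
{
    size_t n = sizeof(blacklist) / sizeof(blacklist[0]);

    for (size_t i = 0; i < n; i++) {
        if (strcmp(blacklist[i].name, name) == 0)
            return (blacklist[i].set & active_sets) != 0;
    }
    return 0;
}
```

Because unknown system calls fall through to "allowed", a library that starts using a new system call no longer aborts a long-running VM; the price is that the filter only protects against the calls someone thought to list.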
Some thoughts on QEMU sandboxing
● Sandboxing is not a definitive solution for virtualization security,
but rather a good layer to stack on top of others such as:
○ MAC/DAC (Mandatory Access Control and Discretionary Access Control)
○ SELinux
○ Remote management using SSH/TLS/SSL
○ Guest image encryption
○ Virtual Trusted Platform Module (vTPM)
● The sandbox v2 options are not low-level knobs to control individual system
calls, but rather high-level knobs to control concepts.
Questions?
THANK YOU
plus.google.com/+RedHat
linkedin.com/company/red-hat
youtube.com/user/RedHatVideos
facebook.com/redhatinc
twitter.com/RedHatNews
