Dima Krasner talks about FUSE, Filesystem in Userspace, its pros and cons, usage, tips and tricks, and more.
Dima is a senior developer at Sam Seamless Network.
2. Agenda
● Intro to FUSE
● L33t haxx0r tricks
○ The layered mount() trick
○ The *at() family
● The tricks with FUSE
3. What is FUSE?
fuse(8) says:
FUSE (Filesystem in Userspace) is a simple interface for userspace programs to export a virtual filesystem to the Linux kernel.
FUSE also aims to provide a secure method for non privileged users to create and mount their own filesystem implementations.
man fuse
4. Wait, what?
3 components:
● fuse.ko, a dummy file system
○ Provides /dev/fuse
● A userspace daemon, the actual file system
● libfuse, which communicates with fuse.ko through /dev/fuse
(Interesting read: doc/how-fuse-works under the FUSE source tree)
5. (Very) Brief History
● Distros have been shipping libfuse 2.9.x for years
○ A new maintainer took over in 2016
○ Development moved to GitHub
■ My guess: some libfuse package maintainers and users haven’t noticed this :D
● Then came libfuse 3.0.0
○ API breakage, legacy cruft removed
○ Called “fuse3” in pkg-config, for coexistence with libfuse 2.9.x
● This talk addresses both
○ The demo code has #ifdefs for dual libfuse2/libfuse3 support
6. Pros
● Access to all the goodies of userspace
○ Good example: libcurl
○ Languages other than C
○ Flexible release cycles
● FUSE is super nice to work with
○ Multi-threading of file system operations
○ Mounting
○ Daemonization
○ API compatibility levels
● Quick development
○ python-fuse
8. Common Uses
● File systems that have no proper driver (no other choice)
○ ntfs-3g
○ exFAT
● Remote file systems (network latency)
○ httpfs2
○ gvfs
● Layered file systems (disk latency)
○ unionfs-fuse
○ posixovl
● (See? Overhead is not a deal breaker)
10. The Real Cons
● Reliability
○ Kernel lockup or crash following daemon crash
○ Stale mounts
● With great complexity comes great responsibility: CVE-2014-5207
○ Linux vulnerability exploited through FUSE, https://www.exploit-db.com/exploits/34923
○ Linux commit 9566d6742852c527bf5af38af5cbb878dad75705
● Something has to be privileged: CVE-2015-3202
○ Bad handling of privileges in a SUID binary, fusermount
○ The environment is user-controlled, but there is no clearenv() when ruid != euid in fusermount/mount
● File systems are not seen as an attack vector
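The CVE-2015-3202 bullet comes down to a missing environment scrub in a SUID helper. A minimal sketch of that check follows, in Python for brevity (fusermount itself is C). The function name `helper_environment` and the `SAFE_PATH` value are hypothetical and not taken from libfuse.

```python
import os

# Assumption: a fixed, trusted PATH for the helper; the value is illustrative.
SAFE_PATH = "/usr/sbin:/usr/bin:/sbin:/bin"


def helper_environment(environ, ruid, euid):
    """Return the environment to hand to a privileged helper such as mount.

    When the real and effective UIDs differ, the process runs SUID and every
    inherited variable is controlled by the calling user, so none is kept.
    """
    if ruid != euid:
        return {"PATH": SAFE_PATH}
    return dict(environ)


if __name__ == "__main__":
    env = helper_environment(os.environ, os.getuid(), os.geteuid())
    print(sorted(env))
```

The UIDs are passed in as arguments only so the decision can be exercised without an actual SUID binary.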
12. Before we go on: how do I write to /etc on this router?
13. The File System Hierarchy
My / is an ext4 on /dev/sda1.
It contains a directory, tmp.
A tmpfs is mounted at /tmp.
The ability to mount a file system on top of another is a design choice.
Photo from Wikipedia (Waffle, Pilettes, CC BY-SA 3.0)
/home, /tmp, /etc, /root, /usr, /bin, /sbin, /var
14. The Layered mount() Trick
mount() works, always
Read-only file system?
Incapable file system?
No problem.
Just mount something at the same mount point and do your magic.
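Layering can be observed in /proc/self/mountinfo, where a mount point that has something mounted on top of it shows up more than once. The sketch below finds such stacked mount points. The function name and the sample records are illustrative, and escaped characters in paths (such as \040 for a space) are not handled.

```python
from collections import Counter


def stacked_mount_points(mountinfo_text):
    """Return mount points that appear more than once in mountinfo-style text."""
    counts = Counter()
    for line in mountinfo_text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # not a mountinfo record
        counts[fields[4]] += 1  # the fifth field is the mount point
    return sorted(point for point, n in counts.items() if n > 1)


# Illustrative sample (not real output): a second tmpfs mounted over /tmp.
SAMPLE = """\
21 1 8:1 / / rw,relatime - ext4 /dev/sda1 rw
22 21 0:20 / /tmp rw,nosuid - tmpfs tmpfs rw
23 22 0:21 / /tmp rw,relatime - tmpfs tmpfs rw
"""

if __name__ == "__main__":
    print(stacked_mount_points(SAMPLE))
    with open("/proc/self/mountinfo") as f:
        print(stacked_mount_points(f.read()))
```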
16. Benefits and Annoyances
● Convenience
○ No need to sprintf(“/a/b/%s”, ...) like crazy
● A *at() call operates on a specific file system
○ Hardening against another mount() on top
○ Limiting the paths a process can access (e.g. AppArmor) isn’t enough
○ Say, httpd reads /etc/resolv.conf
■ I can mount something at /etc and alter resolv.conf, unless httpd uses openat()
● Finicky userspace support
○ The *at() syscalls appeared individually
○ libc doesn’t expose everything
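The httpd example can be sketched with Python's `dir_fd=` parameter, which maps to openat(). The idea is to pin the directory once, early, and resolve names relative to that descriptor from then on. The helper names `open_dir` and `read_at` are hypothetical.

```python
import os


def open_dir(path):
    """Pin a directory: the descriptor keeps referring to it, whatever later happens at the path."""
    return os.open(path, os.O_RDONLY | os.O_DIRECTORY)


def read_at(dir_fd, name):
    """Read a file relative to dir_fd (an openat() call under the hood)."""
    fd = os.open(name, os.O_RDONLY, dir_fd=dir_fd)
    with os.fdopen(fd, "rb") as f:
        return f.read()


# Usage, assuming /etc/resolv.conf exists:
#   etc = open_dir("/etc")            # at startup, before anything is mounted over /etc
#   data = read_at(etc, "resolv.conf")
```

A file system mounted at /etc after `open_dir()` returns does not change what `read_at()` sees, because the descriptor still points at the original directory.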
17. Magic!
strace says:
After the umount2(), the file system is hidden.
1. We mount a tmpfs at . (the current directory)
2. echo -n abcd > lol
3. echo -n efgh > rofl
4. We get a file descriptor under the tmpfs (. counts too!)
5. We lazily unmount the tmpfs with umount2()
6. (The tmpfs disappears from mounts)
7. We can still read the files with openat()!
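The steps above can be sketched as follows. This is an approximation under stated assumptions: it calls mount() and umount2() through ctypes, needs root (CAP_SYS_ADMIN) to run `hidden_tmpfs()`, and the helper names are hypothetical. The slide's own demo code is not reproduced here.

```python
import ctypes
import ctypes.util
import os

MNT_DETACH = 2  # lazy unmount flag from <sys/mount.h>

libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
libc.mount.argtypes = [ctypes.c_char_p, ctypes.c_char_p, ctypes.c_char_p,
                       ctypes.c_ulong, ctypes.c_void_p]
libc.umount2.argtypes = [ctypes.c_char_p, ctypes.c_int]


def _check(ret):
    """Turn a -1 return from libc into an OSError carrying errno."""
    if ret != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def hidden_tmpfs(mountpoint):
    """Mount a tmpfs, fill it, detach it; return a directory fd that still reaches it."""
    target = os.fsencode(mountpoint)
    _check(libc.mount(b"tmpfs", target, b"tmpfs", 0, None))       # step 1
    for name, data in (("lol", b"abcd"), ("rofl", b"efgh")):      # steps 2-3
        with open(os.path.join(mountpoint, name), "wb") as f:
            f.write(data)
    dir_fd = os.open(mountpoint, os.O_RDONLY | os.O_DIRECTORY)    # step 4
    _check(libc.umount2(target, MNT_DETACH))                      # steps 5-6
    return dir_fd


def read_at(dir_fd, name):
    """Step 7: openat() relative to the descriptor, not to any visible path."""
    fd = os.open(name, os.O_RDONLY, dir_fd=dir_fd)
    with os.fdopen(fd, "rb") as f:
        return f.read()
```

As root, `read_at(hidden_tmpfs("/mnt"), "lol")` would be expected to return the contents written in step 2, even though the tmpfs is no longer listed among the mounts; /mnt is only an example path.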
19. Security: A Double-Edged Sword
● *at() calls can be used to protect against mount()
● But, *at() calls can be used to hide file systems
● Semi-legit use cases: packers and self-extractors
21. Showcase
● Full control of a directory can be achieved in real-time from userspace using:
○ mount()
○ *at()
○ FUSE
● First, we’ll take a look at overheadfs, which is useless
● Then, we’ll look at logfence, which does something good with this technique
22. overheadfs
● Run overheadfs /tmp
● Nothing special happens
overheadfs delegates all file system operations to the file system mounted below it.
A proxy file system
github.com/dimkr/overheadfs
24. How overheadfs Works
● fd = open(path, O_DIRECTORY)
● libfuse does its magic and mounts overheadfs
● fd allows access to the underlying file system through *at()
Instead of acting as a transparent proxy, we can use this for fun (evil?) stuff!
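The delegation idea can be sketched without any FUSE binding: open the directory first, then serve every request relative to that descriptor. The class below is a hypothetical illustration, not overheadfs's actual code; the method names only mirror common FUSE operations and are not wired to libfuse.

```python
import os


class Proxy:
    """Handlers a FUSE binding could call; each delegates through a directory fd."""

    def __init__(self, mountpoint):
        # Must happen before the FUSE mount covers mountpoint.
        self.below = os.open(mountpoint, os.O_RDONLY | os.O_DIRECTORY)

    @staticmethod
    def _relative(path):
        # FUSE paths are absolute within the mount ("/a/b"); *at() wants "a/b".
        return path.lstrip("/") or "."

    def getattr(self, path):
        return os.stat(self._relative(path), dir_fd=self.below,
                       follow_symlinks=False)

    def readdir(self, path):
        fd = os.open(self._relative(path), os.O_RDONLY | os.O_DIRECTORY,
                     dir_fd=self.below)
        try:
            return sorted(os.listdir(fd))
        finally:
            os.close(fd)

    def read(self, path, size, offset):
        fd = os.open(self._relative(path), os.O_RDONLY, dir_fd=self.below)
        try:
            return os.pread(fd, size, offset)
        finally:
            os.close(fd)
```

Anything other than plain delegation (logging, denying, rewriting) would go inside these handlers.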
25. logfence
● Run logfence /var/log
● Start nginx
○ nginx open()s a log file for writing
● No other process can open the same file for writing
An attacker who exploited an nginx vulnerability cannot modify the logs to hide their traces.
Prevents tampering with logs
github.com/dimkr/logfence
26. How logfence Works
● Same concept
● Manages a list of locked files
● Using flock() is probably a better idea
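A list of locked files could look roughly like the sketch below. Everything in it is an assumption made for illustration (the class, its method names, and the rule that the first writer owns the file until it closes it); it is not taken from the logfence source. A FUSE daemon can learn the caller's PID from fuse_get_context().

```python
class LockTable:
    """Tracks which process owns each file that is open for writing."""

    def __init__(self):
        self._owners = {}  # path -> PID of the first writer

    def may_open(self, path, pid, for_writing):
        """Decide whether pid may open path; the first writer becomes the owner."""
        if not for_writing:
            return True  # reads are never fenced in this sketch
        owner = self._owners.setdefault(path, pid)
        return owner == pid

    def release(self, path, pid):
        """Forget the lock once its owner closes the file."""
        if self._owners.get(path) == pid:
            del self._owners[path]


if __name__ == "__main__":
    table = LockTable()
    log = "/var/log/nginx/access.log"  # example path
    print(table.may_open(log, 100, True))   # the first writer (say, nginx)
    print(table.may_open(log, 200, True))   # a second writer
```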
28. More Ideas
● A file system where syscalls fail randomly, for tests
● A file system that verifies signatures of files: elfence
● A file system that uses memfd_create() as cache
○ In-memory SQLite databases that get flushed to disk every once in a while
○ Transparent AT_STATX_DONT_SYNC for all stat()
● A file system that provides an alternative to inotify
○ Embedded devices with INOTIFY_USER=n
● Everything is a file, right?
○ How does mount know what’s currently mounted?
○ How does ps know what processes are running?
○ You know the drill
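On Linux, both answers are available as files under /proc. A minimal sketch that reads them directly; the function names are hypothetical, and the paths are parameters only so the code can be pointed at a test directory.

```python
import os


def running_processes(proc_root="/proc"):
    """Map PID -> command name by reading the numeric directories under proc_root."""
    processes = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "mounts" or "self"
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                processes[int(entry)] = f.read().strip()
        except OSError:
            continue  # the process exited while we were looking
    return processes


def mounted_file_systems(mounts_path="/proc/mounts"):
    """Return (mount point, type) pairs from a /proc/mounts-style file."""
    with open(mounts_path) as f:
        return [(fields[1], fields[2])
                for fields in map(str.split, f) if len(fields) >= 3]


if __name__ == "__main__":
    print(len(running_processes()), "processes")
    print(mounted_file_systems()[:3])
```

A FUSE file system can expose state in the same way: as files whose contents are generated when they are read.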
29. Conclusions
● Delegation of kernel work to userspace has its costs
○ Give the user a finger and a haxx0r will grab you by the neck by exploiting through libfuse
● Complexity is the devil
○ mount() and affected stuff don’t have one linear flow because of all the flags
● The devil is in the detail
○ fusermount: something has to call mount() after all, hence SUID root
● “Everything is a file” is nice, but …
○ It’s a lie - e.g. Netlink sockets
○ (Just a symptom: the Unix philosophy is long dead too, e.g. systemd)
○ Files can be created by userspace, unlike a special Netlink family/group
○ Security-wise, file access isn’t everything