Python FUSE       File-System in USErspace


                       Beyond the Traditional File-Systems




Matteo Bertozzi (Th30z)
     http://th30z.blogspot.com
Talk Overview
What is a File-System

Brief File-Systems History

What is FUSE

Beyond the Traditional File-System

API Overview

Examples (Finally some code!!!)
http://mbertozzi.develer.com/python-fuse


Q&A
What is a File-System
A method of storing and organizing data
    to make it easy to find and access.




...to interact with an object
  You name it, and you say
      what you want it to do.



  The Filesystem takes the name you give,
  looks through the disk to find the object,
  and gives the object your request to do
  something.
What is a File-System



On Disk Format (...serialized struct)
ext2, ext3, reiserfs, btrfs...


Namespace
(Mapping between name and content)
/home/th30z/, /usr/local/share/test.c, ...


Runtime Service: open(), read(), write(), ...
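
To make the "namespace" and "runtime service" parts concrete, here is a minimal sketch that uses Python's os module, which wraps the open(), read() and write() calls listed above. The helper name roundtrip and the file name are only illustrative.

```python
import os
import tempfile

def roundtrip(payload: bytes) -> bytes:
    """Write payload through the OS file API, then read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        # The path is the name in the namespace.
        path = os.path.join(tmp, "test.c")

        # Runtime service: open() + write() + close()
        fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o644)
        try:
            os.write(fd, payload)
        finally:
            os.close(fd)

        # Runtime service: open() + read() + close()
        fd = os.open(path, os.O_RDONLY)
        try:
            return os.read(fd, len(payload))
        finally:
            os.close(fd)

print(roundtrip(b"int main(void) { return 0; }\n"))
```

The program never sees the on-disk format: it only names an object and says what it wants done with it.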
(The Origins)

                       ...A bit of History
Multics 1965 (File-System Paper)
A General-Purpose File System For Secondary Storage
Unix Late 1969



                                      User Program

               User Space

               Kernel Space
                                  System Call Layer

                                      The File-System




 Only One File-System
(The Evolution)




                                        User Program

               User Space

               Kernel Space
                                    System Call Layer
                                            (which?)

                            The File-System 1      The File-System 2
(The Evolution)




                                       User Program

               User Space

               Kernel Space
                                     System Call Layer
                                            (which?)

                              FS 1   FS 2   FS 3   FS 4       ...        FS N
(The Solution)

Sun Microsystems 1984

                                            User Program

               User Space

               Kernel Space

                                      System Call Layer


                                 Vnode/VFS Layer

                              FS 1   FS 2    FS 3   FS 4   ...     FS N
Virtual File-System
Provides an abstraction within the kernel which allows different
filesystem implementations to coexist.

Provides the filesystem interface to userspace programs.

VFS Concepts
A super-block object represents a filesystem.

I-Nodes are filesystem objects such as regular files, directories, FIFOs, ...

A file object represents a file opened by a process.

  User Space:    C Library (open(), read(), write(), ...)
  Kernel Space:  System Calls (sys_open(), sys_read(), ...)
                 VFS (vfs_read(), vfs_write(), ...)
                 Kernel Supported File-Systems:
                   ext2, ext3, ext4, ReiserFS, Reiser4, Btrfs, XFS, JFS, HFS+, ...
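
The three VFS concepts can be observed from user space. The sketch below is only an approximation: os.statvfs() reports per-filesystem data (roughly the super-block view), os.stat() reports per-object data (the inode view), and a file descriptor stands for the per-open file object with its own offset. The function name vfs_objects is an assumption made for this example.

```python
import os
import tempfile

def vfs_objects(path):
    """Collect one value from each of the three VFS views of a path."""
    fs = os.statvfs(path)             # per-filesystem data (super-block view)
    inode = os.stat(path)             # per-object data (inode view)
    fd = os.open(path, os.O_RDONLY)   # per-open data (file object view)
    try:
        # Every open file object carries its own read/write offset.
        offset = os.lseek(fd, 0, os.SEEK_CUR)
    finally:
        os.close(fd)
    return {
        "block_size": fs.f_bsize,
        "inode_number": inode.st_ino,
        "mode": inode.st_mode,
        "offset_after_open": offset,
    }

with tempfile.NamedTemporaryFile() as tmp:
    print(vfs_objects(tmp.name))
```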
Wow, it doesn't seem that
difficult to write a filesystem
Why File-Systems are Complex

You need to know the Kernel   (No helper libraries: Qt, Glib, ...)

Reduce Disk Seeks / SSD Block limited write cycles

Be consistent, Power Down, Unfinished Write...
(Journal, Soft-Updates, Copy-on-Write, ...)

Bad Blocks, Disk Error

Don't waste too much space on Metadata

Extra Features: Deduplication, Compression,
Cryptography, Snapshots…
File-Systems Lines of Code
                 Kernel Space    User Space
MinixFS                 2,000
MinixFS Fuse                            800
UFS                     2,000
UFS Fuse                              1,000
FAT                     6,000
ext2                    8,000
ext3                   16,000
ReiserFS               27,000
ext4                   30,000
btrfs                  50,000
NFS                    10,000         8,000
Fuse                    7,000         9,000
FtpFS                                   800
SshFS                                 2,000
Building a File-System is Difficult


Writing good code is not easy (Bugs, Typo, ...)

Writing good code in Kernel Space is much
more difficult.

Too many reboots during development

Too many Kernel Panics during reboot

We need more flexibility and Speedups!
FUSE, develop your file-system
with your favorite language and library
             in user space
What is FUSE


Kernel module! (like ext2, ReiserFS, XFS, ...)

Allows non-privileged users to create their own file-systems
without editing the kernel code. (User Space)

FUSE is particularly useful for writing "virtual file
systems" that act as a view or translation of an
existing file-system or storage device. (Facilitates Disk-Based,
Network-Based and Pseudo File-Systems)

Bindings: Python, Objective-C, Ruby, Java, C#, ...
File-Systems in User Space?
            ...Make File Systems Development Super Easy




All UserSpace Libraries are Available

...Debugging Tools

No Kernel Recompilation

No Machine Reboot!
...File-System upgrade/fix
2 sec downtime, app restart!
Yeah, ...but what’s FUSE?
It’s a File-System with user-space callbacks


  ntfs-3g, sshfs, ifuse, ChironFS, zfs-fuse, YouTubeFS,
  gnome-vfs2, ftpfs, cryptoFS, RaleighFS
FUSE Kernel Space and User Space
The FUSE kernel module and the FUSE library communicate via a
special file descriptor, which is obtained by opening /dev/fuse.

  User Space:    Your FS, SshFS, FtpFS, ...  <->  lib Fuse  <->  /dev/fuse
  Kernel Space:  VFS  ->  FUSE, ext2, ext4, ..., Btrfs
                 (next to drivers, firmware, kernel, ...)

  Request path:  User Input (ls -l /myfuse/)  ->  Kernel: VFS  ->  FUSE
                 ->  lib FUSE  ->  Your Fuse FS
...be creative

Beyond the Traditional File-Systems

ImapFS: Access your mail with grep.

SocialFS: Log all your social networks to collect
news, jokes and other social things.

YouTubeFS: Watch YouTube videos as if they were on your disk.

GMailFS: Use your mailbox as a backup disk.

Thousands of tools available: cat/grep/sed

open() is the most used function in our applications
FUSE API Overview
create(path, mode)                         mkdir(path, mode)
truncate(path, size)                       unlink(path)
mknod(path, mode, dev)                     readdir(path)
open(path, mode)                           rmdir(path)
write(path, data, offset)                  rename(opath, npath)
read(path, length, offset)                 link(srcpath, dstpath)
release(path)
fsync(path)
chmod(path, mode)
chown(path, uid, gid)

  User Input (ls -l /myfuse/)  ->  Kernel: VFS  ->  FUSE  ->  lib FUSE  ->  Your Fuse FS
(File Operations)

FUSE API Overview
Reading:     cat /myfuse/test.txt
             getattr() -> open() -> read() -> read() -> release()

Writing:     echo Hello > /myfuse/test2.txt
             getattr() -> create() -> write() -> flush() -> release()

Appending:   echo World >> /myfuse/test2.txt
             getattr() -> open() -> write() -> flush() -> release()

Truncating:  echo Woo > /myfuse/test2.txt
             getattr() -> truncate() -> open() -> write() -> flush() -> release()

Removing:    rm /myfuse/test.txt
             getattr() -> unlink()
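
One way to discover such call sequences for yourself is to wrap every callback in a tracing decorator. The sketch below is a hypothetical stand-in: ToyFS, traced and calls are names invented for this example, no FUSE binding is involved, and the sequence for cat is replayed by hand rather than produced by the kernel.

```python
import functools

calls = []  # (callback name, path) pairs, in call order

def traced(func):
    """Record each callback invocation before running it."""
    @functools.wraps(func)
    def wrapper(self, path, *args):
        calls.append((func.__name__, path))
        return func(self, path, *args)
    return wrapper

class ToyFS:
    """Hypothetical stand-in for a FUSE filesystem class."""

    @traced
    def getattr(self, path):
        return {}

    @traced
    def open(self, path, flags):
        return 0

    @traced
    def read(self, path, size, offset):
        return b""

    @traced
    def release(self, path, flags):
        return 0

# Replay by hand the sequence listed for `cat /myfuse/test.txt`.
fs = ToyFS()
fs.getattr("/test.txt")
fs.open("/test.txt", 0)
fs.read("/test.txt", 4096, 0)
fs.read("/test.txt", 4096, 0)
fs.release("/test.txt", 0)
print(" -> ".join(name for name, _ in calls))
```

In a real filesystem the same decorator could write to a log file, so that running a shell command against the mount point shows which callbacks it triggers.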
(Directory Operations)

FUSE API Overview
Creating:  mkdir /myfuse/folder
           getattr() -> mkdir()

Reading:   ls /myfuse/folder/
           getattr() -> opendir() -> readdir() -> releasedir()

Removing:  rmdir /myfuse/folder
           getattr() -> rmdir()

Other Methods (getattr() is always called)
chown th30z:develer /myfuse/test.txt           getattr() -> chown()
chmod 755 /myfuse/test.txt                     getattr() -> chmod()
ln -s /myfuse/test.txt /myfuse/test-link.txt   getattr() -> symlink()
mv /myfuse/folder /myfuse/fancy-folder         getattr() -> rename()
First Code Example!
        HTFS
  (HashTable File-System)
HTFS Overview
Traditional Filesystem Object with Metadata (mode, uid, gid, ...)

HashTable (dict): keys are paths, values are Items.

FS Item/Object
  Metadata
    Time of last access
    Time of last modification
    Time of last status change
    Protection and file-type (mode)
    User ID of owner (UID)
    Group ID of owner (GID)
    Extended Attributes (Key/Value)
  Data

An Item can be a Regular File or Directory or FIFO...

Data is raw data, or a filename list if the item is a directory.

(Disk - Storage HashTable)
  Path 1 ... Path 5  ->  Item 1 ... Item 4
HTFS Item
import stat
import time

class Item(object):
    def __init__(self, mode, uid, gid):
        # ----------------------------------- Metadata --
        self.atime = time.time()   # time of last access
        self.mtime = self.atime    # time of last modification
        self.ctime = self.atime    # time of last status change

        self.mode = mode    # protection and file-type
        self.uid = uid      # user ID of owner
        self.gid = gid      # group ID of owner

        # Extended Attributes
        self.xattr = {}

        # --- Data -----------
        if stat.S_ISDIR(mode):
            self.data = set()   # a directory holds the names of its entries
        else:
            self.data = b''     # a regular file holds raw bytes

This is a File! We have metadata, data and even xattr.
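
The mode field packs the file type and the permission bits into one integer, which is why the Item constructor can use stat.S_ISDIR() on it. A short sketch with Python's standard stat module (the two example modes are chosen here for illustration):

```python
import stat

dir_mode = 0o755 | stat.S_IFDIR    # directory, rwxr-xr-x
file_mode = 0o644 | stat.S_IFREG   # regular file, rw-r--r--

print(stat.S_ISDIR(dir_mode))         # True
print(stat.S_ISDIR(file_mode))        # False
print(stat.filemode(dir_mode))        # drwxr-xr-x
print(stat.filemode(file_mode))       # -rw-r--r--
print(oct(stat.S_IMODE(file_mode)))   # 0o644 (permission bits only)
print(stat.S_IFMT(file_mode) == stat.S_IFREG)   # True (type bits only)
```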
(Data Helper)

HTFS Item
def read(self, offset, length):
    return self.data[offset:offset+length]

def write(self, offset, data):
    length = len(data)
    # Writing past the current end: fill the gap with zero bytes first
    if offset > len(self.data):
        self.data += b'\x00' * (offset - len(self.data))
    self.data = self.data[:offset] + data + self.data[offset+length:]
    return length

def truncate(self, length):
    if len(self.data) > length:
        self.data = self.data[:length]
    else:
        # Growing the file: pad with zero bytes
        self.data += b'\x00' * (length - len(self.data))

...a couple of utility methods to read/write and interact with data.
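
A worked example helps to check the slicing in these helpers. The sketch below re-creates them on a standalone class (DataBuffer is a name invented here, and data is held as Python 3 bytes) and walks through an overwrite, a shrink, and a write past the end of the data.

```python
class DataBuffer:
    """Standalone copy of the read/write/truncate helpers, for experimenting."""

    def __init__(self):
        self.data = b""

    def read(self, offset, length):
        return self.data[offset:offset + length]

    def write(self, offset, data):
        if offset > len(self.data):
            # Fill the hole between the old end and the write offset.
            self.data += b"\x00" * (offset - len(self.data))
        self.data = self.data[:offset] + data + self.data[offset + len(data):]
        return len(data)

    def truncate(self, length):
        if len(self.data) > length:
            self.data = self.data[:length]
        else:
            self.data += b"\x00" * (length - len(self.data))

buf = DataBuffer()
buf.write(0, b"Hello World")
buf.write(6, b"FUSE!")     # overwrite in place
print(buf.data)            # b'Hello FUSE!'
buf.truncate(5)            # shrink
print(buf.data)            # b'Hello'
buf.write(7, b"!")         # write past the end leaves a zero-filled hole
print(buf.data)            # b'Hello\x00\x00!'
print(buf.read(0, 5))      # b'Hello'
```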
HTFS Fuse Operations
import errno
import os
import stat
import fuse

class HTFS(fuse.Fuse):
    def __init__(self, *args, **kwargs):
        fuse.Fuse.__init__(self, *args, **kwargs)

        self.uid = os.getuid()
        self.gid = os.getgid()

        # The File-System must be initialized with the / directory
        root_dir = Item(0o755 | stat.S_IFDIR, self.uid, self.gid)
        self._storage = {'/': root_dir}

    def getattr(self, path):
        if path not in self._storage:
            return -errno.ENOENT

        # Lookup Item and fill the stat struct
        # (zstat is a helper that is not shown on the slide)
        item = self._storage[path]
        st = zstat(fuse.Stat())
        st.st_mode = item.mode
        st.st_uid = item.uid
        st.st_gid = item.gid
        st.st_atime = item.atime
        st.st_mtime = item.mtime
        st.st_ctime = item.ctime
        st.st_size = len(item.data)
        return st

def main():
    server = HTFS()
    server.main()

getattr() is called before any operation. It tells the VFS whether
you can access the specified file, and its "State".

Your FUSE File-System is like a Server...
(File Operations)

HTFS Fuse Operations
def create(self, path, flags, mode):
    self._storage[path] = Item(mode | stat.S_IFREG, self.uid, self.gid)
    self._add_to_parent_dir(path)

def truncate(self, path, size):
    self._storage[path].truncate(size)

def read(self, path, size, offset):
    return self._storage[path].read(offset, size)

def write(self, path, buf, offset):
    return self._storage[path].write(offset, buf)

def unlink(self, path):
    self._remove_from_parent_dir(path)
    del self._storage[path]

def rename(self, oldpath, newpath):
    item = self._storage.pop(oldpath)
    # Keep the directory listings in sync with the new name
    self._remove_from_parent_dir(oldpath)
    self._storage[newpath] = item
    self._add_to_parent_dir(newpath)

Disk is just a big dictionary...
...and files are items: key = name, value = data
(Directory Operations)

HTFS Fuse Operations
def mkdir(self, path, mode):
    self._storage[path] = Item(mode | stat.S_IFDIR, self.uid, self.gid)
    self._add_to_parent_dir(path)

def rmdir(self, path):
    self._remove_from_parent_dir(path)
    del self._storage[path]

def readdir(self, path, offset):
    dir_items = self._storage[path].data
    for item in dir_items:
        yield fuse.Direntry(item)

def _add_to_parent_dir(self, path):
    parent_path = os.path.dirname(path)
    filename = os.path.basename(path)
    self._storage[parent_path].data.add(filename)

# Not shown on the slide; assumed to mirror _add_to_parent_dir
def _remove_from_parent_dir(self, path):
    parent_path = os.path.dirname(path)
    filename = os.path.basename(path)
    self._storage[parent_path].data.remove(filename)

A Directory is a File that contains File names as data!
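
The parent-directory bookkeeping can be tried without FUSE at all. The sketch below is a simplified model, not the talk's code: TinyNamespace is a hypothetical class in which a directory is a set of child names and a file is a bytes value, and rename() only handles entries without children.

```python
import posixpath

class TinyNamespace:
    """Paths map to entries; a directory entry is a set of child names."""

    def __init__(self):
        self.storage = {"/": set()}   # must start with the / directory

    def _parent(self, path):
        return self.storage[posixpath.dirname(path)]

    def create(self, path, data=b""):
        self.storage[path] = data
        self._parent(path).add(posixpath.basename(path))

    def mkdir(self, path):
        self.storage[path] = set()
        self._parent(path).add(posixpath.basename(path))

    def unlink(self, path):
        self._parent(path).remove(posixpath.basename(path))
        del self.storage[path]

    def rename(self, old, new):
        # Only safe for files and empty directories in this sketch.
        entry = self.storage.pop(old)
        self._parent(old).remove(posixpath.basename(old))
        self.storage[new] = entry
        self._parent(new).add(posixpath.basename(new))

    def readdir(self, path):
        return sorted(self.storage[path])

ns = TinyNamespace()
ns.mkdir("/docs")
ns.create("/docs/a.txt", b"hi")
ns.rename("/docs/a.txt", "/b.txt")
print(ns.readdir("/"))       # ['b.txt', 'docs']
print(ns.readdir("/docs"))   # []
```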
(XAttr Operations)

HTFS Fuse Operations
def setxattr(self, path, name, value, flags):
    self._storage[path].xattr[name] = value

def getxattr(self, path, name, size):
    value = self._storage[path].xattr.get(name, '')
    if size == 0:   # We are asked for the size of the value
        return len(value)
    return value

def listxattr(self, path, size):
    attrs = list(self._storage[path].xattr.keys())
    if size == 0:   # We are asked for the size of the name list
        # one terminator per name plus the names themselves
        return len(attrs) + len(''.join(attrs))
    return attrs

def removexattr(self, path, name):
    if name in self._storage[path].xattr:
        del self._storage[path].xattr[name]

Extended attributes extend the basic attributes associated with files and
directories in the file system. They are stored as name:data pairs
associated with file system objects.
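
The size returned by listxattr() when size == 0 is the number of bytes needed for all names, each followed by one NUL terminator. A small sketch of that arithmetic (xattr_list_size and the attribute names are made up for this example):

```python
def xattr_list_size(names):
    """Bytes needed for a list of NUL-terminated attribute names."""
    return sum(len(name.encode()) + 1 for name in names)

names = ["user.author", "user.mime"]

# Same value as the formula used in listxattr():
# one terminator per name plus the length of all names joined together.
slide_formula = len(names) + len("".join(names))

print(xattr_list_size(names))   # 22
print(slide_formula)            # 22
```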
(Other Operations)

HTFS Fuse Operations
def chmod(self, path, mode):
    item = self._storage[path]
    item.mode = mode

def chown(self, path, uid, gid):
    item = self._storage[path]
    item.uid = uid
    item.gid = gid

def symlink(self, path, newpath):
    item = Item(0o644 | stat.S_IFLNK, self.uid, self.gid)
    item.data = path
    self._storage[newpath] = item
    self._add_to_parent_dir(newpath)

def readlink(self, path):
    return self._storage[path].data

Look up the Item, access its information/data, then return it or write it.
This is the File-System's Job.

Symlinks contain just the path of the pointed file.
Other small Examples
Simulate Terabyte Files
class TBFS(fuse.Fuse):
    def getattr(self, path):
        st = zstat(fuse.Stat())
        if path == '/':
            st.st_mode = 0o644 | stat.S_IFDIR
            st.st_size = 1
            return st
        elif path == '/tera.data':
            st.st_mode = 0o644 | stat.S_IFREG
            st.st_size = 128 * (2 ** 40)
            return st
        return -errno.ENOENT

    def read(self, path, size, offset):
        return b'0' * size

    def readdir(self, path, offset):
        if path == '/':
            yield fuse.Direntry('tera.data')

Read-Only FS with 1 file of 128TiB.
No Disk/RAM Space Required!
read() sends data only when it is requested.
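
The read() above always returns size bytes, even beyond the advertised file size. A possible refinement, shown as a standalone sketch (virtual_read and FILE_SIZE are names assumed for this example), clamps the generated data at the end of the virtual file:

```python
FILE_SIZE = 128 * (2 ** 40)   # 128 TiB, the size reported by getattr()

def virtual_read(size, offset, file_size=FILE_SIZE):
    """Generate file content on demand; nothing is stored anywhere."""
    if offset >= file_size:
        return b""                      # reading at or past end of file
    return b"0" * min(size, file_size - offset)

print(len(virtual_read(4096, 0)))                # 4096
print(len(virtual_read(4096, FILE_SIZE - 10)))   # 10
print(virtual_read(4096, FILE_SIZE))             # b''
```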
X^OR File-System
def _xorData(data):
    # XOR every byte with the key 10; applying it twice restores the input
    return bytes(bytearray(b ^ 10 for b in bytearray(data)))

class XorFS(fuse.Fuse):
    ...
    def write(self, path, buf, offset):
        data = _xorData(buf)
        return _writeData(path, offset, data)

    def read(self, path, length, offset):
        data = _readData(path, offset, length)
        return _xorData(data)
    ...

res = _xorData(b"xor")
print(res)     # "rex"
res2 = _xorData(res)
print(res2)    # "xor"

10101010 ^ 01010101 = 11111111
11111111 ^ 01010101 = 10101010
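
The XOR round trip is easy to verify in isolation. This Python 3 sketch works on bytes and uses the same key, 10, as the slide; the function name xor_data is an assumption of this example.

```python
KEY = 10

def xor_data(data: bytes, key: int = KEY) -> bytes:
    """XOR every byte with the key; applying it twice restores the input."""
    return bytes(b ^ key for b in data)

res = xor_data(b"xor")
print(res)                # b'rex'
print(xor_data(res))      # b'xor'

# The bit pattern from the slide:
print(format(0b10101010 ^ 0b01010101, "08b"))   # 11111111
print(format(0b11111111 ^ 0b01010101, "08b"))   # 10101010
```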
Dup Write File-System
class DupFS(fuse.Fuse):
    def __init__(self, *args, **kwargs):
        ...
        # Keep the handles on self so that write() can reach them
        self.fd_disk1 = open('/dev/hda1', ...)
        self.fd_disk2 = open('/dev/hdb5', ...)
        self.fd_log = open('/home/th30z/testfs.log', ...)
        self.fd_net = socket.socket(...)
        ...
    ...
    def write(self, path, buf, offset):
        ...
        disk_write(self.fd_disk1, path, offset, buf)
        disk_write(self.fd_disk2, path, offset, buf)
        net_write(self.fd_net, path, offset, buf)
        log_write(self.fd_log, path, offset, buf)
        ...

Write on your Disk partition 1 and 2.
Send data over Network.
Log your file-system operations.
...do other fancy stuff
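
The fan-out idea can be tried with in-memory stand-ins before touching real devices. In this sketch FanOutWriter is a hypothetical class, io.BytesIO objects play the role of the two disks, and io.StringIO plays the role of the log file; the network sink is left out.

```python
import io

class FanOutWriter:
    """Duplicate every write to several sinks and log the operation."""

    def __init__(self, sinks, log):
        self.sinks = sinks
        self.log = log

    def write(self, path, buf, offset):
        for sink in self.sinks:
            sink.seek(offset)
            sink.write(buf)
        self.log.write("write %s offset=%d len=%d\n" % (path, offset, len(buf)))
        return len(buf)

disk1, disk2 = io.BytesIO(), io.BytesIO()
log = io.StringIO()
writer = FanOutWriter([disk1, disk2], log)

writer.write("/test.txt", b"Hello", 0)
writer.write("/test.txt", b" World", 5)

print(disk1.getvalue())   # b'Hello World'
print(disk2.getvalue())   # b'Hello World'
print(log.getvalue(), end="")
```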
One more thing
(Files and Folders don't fit)


Rethink the File-System
                 I don't know
                 where I have to
                place this file...

                ...Ok, for now
                 Desktop is a
                 good place...
(Mobile/Home Devices)


Rethink the File-System
            Small Devices
             Small Files
            EMails, Text...

          We need to look up
          our data quickly:
          Tags, Full-Text Search...

                            ...Encourage people
                          to view their content
                                     as objects.
(Large Clusters, The Cloud...)


  Rethink the File-System

Distributed data
  Scalability
   Fail over
    Cluster
  Rebalancing
Q&A
                                 Python FUSE




Matteo Bertozzi (Th30z)
     http://th30z.blogspot.com

Handwritten Text Recognition for manuscripts and early printed texts
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 

Python Fuse

  • 1. Python FUSE File-System in USErspace Beyond the Traditional File-Systems Matteo Bertozzi (Th30z) http://th30z.blogspot.com
  • 2. Talk Overview What is a File-System Brief File-Systems History What is FUSE Beyond the Traditional File-System API Overview Examples (Finally some code!!!) http://mbertozzi.develer.com/python-fuse Q&A
  • 3. What is a File-System: a method of storing and organizing data to make it easy to find and access. ...to interact with an object, you name it, and you say what you want it to do. The filesystem takes the name you give, looks through the disk to find the object, and gives the object your request to do something.
  • 4. What is a File-System On Disk Format (...serialized struct) ext2, ext3, reiserfs, btrfs... Namespace (Mapping between name and content) /home/th30z/, /usr/local/share/test.c, ... Runtime Service: open(), read(), write(), ...
  • 5. (The Origins) ...A bit of History. Multics 1965 (File-System Paper: A General-Purpose File System For Secondary Storage). Unix Late 1969. Diagram: User Program (User Space) -> System Call Layer -> The File-System (Kernel Space). Only One File-System.
  • 6. (The Evolution) ...A bit of History. Multics 1965 (File-System Paper: A General-Purpose File System For Secondary Storage). Unix Late 1969. Diagram: User Program (User Space) -> System Call Layer (which?) -> The File-System 1, The File-System 2 (Kernel Space).
  • 7. (The Evolution) ...A bit of History. Multics 1965 (File-System Paper: A General-Purpose File System For Secondary Storage). Unix Late 1969. Diagram: User Program (User Space) -> System Call Layer (which?) -> FS 1, FS 2, FS 3, FS 4, ..., FS N (Kernel Space).
  • 8. (The Solution) ...A bit of History. Multics 1965 (File-System Paper: A General-Purpose File System For Secondary Storage). Unix Late 1969. Sun Microsystems 1984. Diagram: User Program (User Space) -> System Call Layer -> Vnode/VFS Layer -> FS 1, FS 2, FS 3, FS 4, ..., FS N (Kernel Space).
  • 9. Virtual File-System: Provides an abstraction within the kernel which allows different filesystem implementations to coexist. Provides the filesystem interface to userspace programs. VFS Concepts: A super-block object represents a filesystem. I-Nodes are filesystem objects such as regular files, directories, FIFOs, ... A file object represents a file opened by a process. Diagram: C Library (open(), read(), write(), ...) in User Space -> System Calls (sys_open(), sys_read(), ...) -> VFS (vfs_read(), vfs_write(), ...) -> Kernel Supported File-Systems (ext2, ext3, ext4, ReiserFS, Reiser4, Btrfs, XFS, JFS, HFS+, ...) in Kernel Space.
  • 10. Wow, it doesn't seem that difficult to write a filesystem
  • 11. Why File-Systems are Complex: You need to know the Kernel (no helper libraries: Qt, Glib, ...). Reduce disk seeks / SSD blocks have limited write cycles. Be consistent across power down, unfinished writes... (Journal, Soft-Updates, Copy-on-Write, ...). Bad blocks, disk errors. Don't waste too much space on metadata. Extra features: Deduplication, Compression, Cryptography, Snapshots...
  • 12. File-Systems Lines of Code (chart, Kernel Space vs. User Space): MinixFS 2,000; MinixFS Fuse 800; UFS 2,000; UFS Fuse 1,000; FAT 6,000; ext2 8,000; ext3 16,000; ReiserFS 27,000; ext4 30,000; btrfs 50,000; NFS 10,000 / 8,000; Fuse 7,000 / 9,000; FtpFS 800; SshFS 2,000.
  • 13. Building a File-System is Difficult: Writing good code is not easy (bugs, typos, ...). Writing good code in Kernel Space is much more difficult. Too many reboots during development. Too many kernel panics during reboot. We need more flexibility and speedups!
  • 14. FUSE, develop your file-system with your favorite language and library in user space
  • 15. What is FUSE: Kernel module! (like ext2, ReiserFS, XFS, ...) Allows non-privileged users to create their own file-system without editing the kernel code (User Space). FUSE is particularly useful for writing "virtual file systems", which act as a view or translation of an existing file-system or storage device. (Facilitates Disk-Based, Network-Based and Pseudo File-Systems.) Bindings: Python, Objective-C, Ruby, Java, C#, ...
  • 16. File-Systems in User Space? ...Make File-System development super easy: all user-space libraries are available. ...Debugging tools: no kernel recompilation, no machine reboot! ...File-System upgrade/fix: 2 sec downtime, app restart!
  • 17. Yeah, ...but what's FUSE? It's a File-System with user-space callbacks. Examples: ntfs-3g, sshfs, ifuse, ChironFS, zfs-fuse, YouTubeFS, gnome-vfs2, ftpfs, cryptoFS, RaleighFS. Unix.
  • 18. FUSE Kernel Space and User Space: The FUSE kernel module and the FUSE library communicate via a special file descriptor which is obtained by opening /dev/fuse. Diagram: Your FS, SshFS, FtpFS, ... on top of lib Fuse (User Space); /dev/fuse; FUSE, ext2, ext4, Btrfs, ... below the VFS (Kernel Space); then drivers and firmware. Request path: User Input (ls -l /myfuse/) -> Kernel VFS -> FUSE -> lib FUSE -> Your Fuse FS.
  • 19. ...be creative. Beyond the Traditional File-Systems: ImapFS: Access your mail with grep. SocialFS: Log all your social networks to collect news/jokes and other social things. YouTubeFS: Watch YouTube videos as if they were on your disk. GMailFS: Use your mailbox as a backup disk. Thousands of tools available (cat/grep/sed). open() is the most used function in our applications.
  • 20. FUSE API Overview: create(path, mode), truncate(path, size), mknod(path, mode, dev), open(path, mode), write(path, data, offset), read(path, length, offset), release(path), fsync(path), chmod(path, mode), chown(path, oid, gid), mkdir(path, mode), unlink(path), readdir(path), rmdir(path), rename(opath, npath), link(srcpath, dstpath). Diagram: User Input (ls -l /myfuse/) -> Kernel VFS -> FUSE -> lib FUSE -> Your Fuse FS.
  • 21. (File Operations) FUSE API Overview
        Reading (cat /myfuse/test.txt): getattr() -> open() -> read() -> read() -> release()
        Writing (echo Hello > /myfuse/test2.txt): getattr() -> create() -> write() -> flush() -> release()
        Appending (echo World >> /myfuse/test2.txt): getattr() -> open() -> write() -> flush() -> release()
        Truncating (echo Woo > /myfuse/test2.txt): getattr() -> truncate() -> open() -> write() -> flush() -> release()
        Removing (rm /myfuse/test.txt): getattr() -> unlink()
  • 22. (Directory Operations) FUSE API Overview
        Creating (mkdir /myfuse/folder): getattr() -> mkdir()
        Removing (rmdir /myfuse/folder): getattr() -> rmdir()
        Reading (ls /myfuse/folder/): getattr() -> opendir() -> readdir() -> releasedir()
        Other Methods (getattr() is always called):
        chown th30z:develer /myfuse/test.txt: getattr() -> chown()
        chmod 755 /myfuse/test.txt: getattr() -> chmod()
        ln -s /myfuse/test.txt /myfuse/test-link.txt: getattr() -> symlink()
        mv /myfuse/folder /myfuse/fancy-folder: getattr() -> rename()
  • 23. First Code Example! HTFS (HashTable File-System)
  • 24. HTFS Overview. Traditional Filesystem: Object with Metadata (mode, uid, gid, ...). HashTable (dict): keys are paths, values are Items. An Item can be a Regular File or Directory or FIFO... Data is raw data, or a filename list if the item is a directory. FS Item/Object: Metadata (Time of last access, Time of last modification, Time of last status change, Protection and file-type (mode), User ID of owner (UID), Group ID of owner (GID)), Extended Attributes (Key/Value), Data. Diagram (Disk - Storage HashTable): Path 1 ... Path 5 map to Item 1 ... Item 4.
  • 25. HTFS Item. This is a File! We've metadata, data and even xattr.

        import stat
        import time

        class Item(object):
            def __init__(self, mode, uid, gid):
                # ----------------------------------- Metadata --
                self.atime = time.time()   # time of last access
                self.mtime = self.atime    # time of last modification
                self.ctime = self.atime    # time of last status change

                self.mode = mode           # protection and file-type
                self.uid = uid             # user ID of owner
                self.gid = gid             # group ID of owner

                # Extended Attributes
                self.xattr = {}

                # --- Data -----------
                if stat.S_ISDIR(mode):
                    self.data = set()      # directory: set of child file names
                else:
                    self.data = ''         # regular file: raw data
  • 26. (Data Helper) HTFS Item. ...a couple of utility methods to read/write and interact with data.

            def read(self, offset, length):
                return self.data[offset:offset+length]

            def write(self, offset, data):
                length = len(data)
                self.data = self.data[:offset] + data + self.data[offset+length:]
                return length

            def truncate(self, length):
                if len(self.data) > length:
                    self.data = self.data[:length]
                else:
                    # pad with NUL bytes when the file grows
                    self.data += '\x00' * (length - len(self.data))
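The data helpers above can be tried outside of FUSE. The following is a minimal sketch, not the talk's code: it assumes Python 3 and `bytes` instead of `str`, the class name `DataItem` is hypothetical, and the zero padding for writes that start past the end of the data is an added assumption that the slide's version does not handle.

```python
class DataItem:
    """Byte-buffer stand-in for the Item data helpers (hypothetical name)."""

    def __init__(self):
        self.data = b""

    def read(self, offset, length):
        # Slicing past the end simply returns fewer bytes.
        return self.data[offset:offset + length]

    def write(self, offset, data):
        # Assumption: pad with zero bytes when the write starts beyond
        # the current end, so the data lands at the requested offset.
        if offset > len(self.data):
            self.data += b"\x00" * (offset - len(self.data))
        end = offset + len(data)
        self.data = self.data[:offset] + data + self.data[end:]
        return len(data)

    def truncate(self, length):
        if len(self.data) > length:
            self.data = self.data[:length]
        else:
            self.data += b"\x00" * (length - len(self.data))


item = DataItem()
item.write(0, b"Hello World")
item.write(6, b"There")
print(item.read(0, 11))   # b'Hello There'
item.truncate(5)          # shrink to b'Hello'
item.truncate(8)          # grow again, padded with NUL bytes
print(item.data)          # b'Hello\x00\x00\x00'
```

Overwriting in the middle keeps the tail of the old data, and growing with truncate() fills the gap with NUL bytes, which is what a reader of a sparse region would expect to see.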
  • 27. HTFS Fuse Operations. getattr() is called before any operation: it tells the VFS whether you can access the specified file, and its "State". The File-System must be initialized with the / directory. Your FUSE File-System is like a Server...

        class HTFS(fuse.Fuse):
            def __init__(self, *args, **kwargs):
                fuse.Fuse.__init__(self, *args, **kwargs)

                self.uid = os.getuid()
                self.gid = os.getgid()

                root_dir = Item(0o755 | stat.S_IFDIR, self.uid, self.gid)
                self._storage = {'/': root_dir}

            def getattr(self, path):
                if path not in self._storage:
                    return -errno.ENOENT

                # Lookup Item and fill the stat struct
                # (zstat is not defined on the slides)
                item = self._storage[path]
                st = zstat(fuse.Stat())
                st.st_mode = item.mode
                st.st_uid = item.uid
                st.st_gid = item.gid
                st.st_atime = item.atime
                st.st_mtime = item.mtime
                st.st_ctime = item.ctime
                st.st_size = len(item.data)
                return st

        def main():
            server = HTFS()
            server.main()
  • 28. (File Operations) HTFS Fuse Operations. Disk is just a big dictionary... and files are items: key = name, value = data.

            def create(self, path, flags, mode):
                self._storage[path] = Item(mode | stat.S_IFREG, self.uid, self.gid)
                self._add_to_parent_dir(path)

            def truncate(self, path, length):
                self._storage[path].truncate(length)

            def read(self, path, size, offset):
                return self._storage[path].read(offset, size)

            def write(self, path, buf, offset):
                return self._storage[path].write(offset, buf)

            def unlink(self, path):
                self._remove_from_parent_dir(path)
                del self._storage[path]

            def rename(self, oldpath, newpath):
                item = self._storage.pop(oldpath)
                self._storage[newpath] = item
  • 29. (Directory Operations) HTFS Fuse Operations. A Directory is a File that contains File names as data!

            def mkdir(self, path, mode):
                self._storage[path] = Item(mode | stat.S_IFDIR, self.uid, self.gid)
                self._add_to_parent_dir(path)

            def rmdir(self, path):
                self._remove_from_parent_dir(path)
                del self._storage[path]

            def readdir(self, path, offset):
                dir_items = self._storage[path].data
                for item in dir_items:
                    yield fuse.Direntry(item)

            def _add_to_parent_dir(self, path):
                parent_path = os.path.dirname(path)
                filename = os.path.basename(path)
                self._storage[parent_path].data.add(filename)
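The slide shows `_add_to_parent_dir` but not its counterpart. The sketch below illustrates the same bookkeeping with a plain dictionary and no FUSE dependency. All function names here are hypothetical, the removal helper is a guess at what `_remove_from_parent_dir` might do, and the ENOTEMPTY check is an added assumption (the slide's rmdir() does not perform it).

```python
import errno
import os
import posixpath

# path -> set of child names (only directories are modelled here)
storage = {"/": set()}


def add_to_parent_dir(path):
    parent, name = posixpath.split(path)
    storage[parent].add(name)


def remove_from_parent_dir(path):
    # Hypothetical mirror image of add_to_parent_dir.
    parent, name = posixpath.split(path)
    storage[parent].discard(name)


def mkdir(path):
    storage[path] = set()
    add_to_parent_dir(path)


def rmdir(path):
    # Assumption: refuse to remove a directory that still has entries.
    if storage[path]:
        raise OSError(errno.ENOTEMPTY, os.strerror(errno.ENOTEMPTY))
    remove_from_parent_dir(path)
    del storage[path]


mkdir("/folder")
mkdir("/folder/sub")
print(sorted(storage["/"]))        # ['folder']
print(sorted(storage["/folder"]))  # ['sub']
rmdir("/folder/sub")
rmdir("/folder")
print(storage)                     # {'/': set()}
```

In a python-fuse handler the error would typically be reported by returning a negative errno value, as getattr() does on the earlier slide, rather than by raising.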
  • 30. (XAttr Operations) HTFS Fuse Operations. Extended attributes extend the basic attributes associated with files and directories in the file system. They are stored as name:data pairs associated with file system objects.

            def setxattr(self, path, name, value, flags):
                self._storage[path].xattr[name] = value

            def getxattr(self, path, name, size):
                value = self._storage[path].xattr.get(name, '')
                if size == 0:   # We are asked for size of the value
                    return len(value)
                return value

            def listxattr(self, path, size):
                attrs = self._storage[path].xattr.keys()
                if size == 0:
                    return len(attrs) + len(''.join(attrs))
                return attrs

            def removexattr(self, path, name):
                if name in self._storage[path].xattr:
                    del self._storage[path].xattr[name]
  • 31. (Other Operations) HTFS Fuse Operations. Lookup the Item, access its information/data, then return or write it. This is the File-System's Job. Symlinks contain just the pointed file path.

            def chmod(self, path, mode):
                item = self._storage[path]
                item.mode = mode

            def chown(self, path, uid, gid):
                item = self._storage[path]
                item.uid = uid
                item.gid = gid

            def symlink(self, path, newpath):
                item = Item(0o644 | stat.S_IFLNK, self.uid, self.gid)
                item.data = path
                self._storage[newpath] = item
                self._add_to_parent_dir(newpath)

            def readlink(self, path):
                return self._storage[path].data
  • 33. Simulate Tera Byte Files. Read-Only FS with 1 file of 128TiB. No Disk/RAM Space Required! read() sends data only when it is requested.

        class TBFS(fuse.Fuse):
            def getattr(self, path):
                st = zstat(fuse.Stat())
                if path == '/':
                    st.st_mode = 0o644 | stat.S_IFDIR
                    st.st_size = 1
                    return st
                elif path == '/tera.data':
                    st.st_mode = 0o644 | stat.S_IFREG
                    st.st_size = 128 * (2 ** 40)
                    return st
                return -errno.ENOENT

            def read(self, path, size, offset):
                return '0' * size

            def readdir(self, path, offset):
                if path == '/':
                    yield fuse.Direntry('tera.data')
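The slide's read() returns `size` bytes regardless of the offset. As a small sketch of the same idea, assuming Python 3 and `bytes`, the hypothetical helper `synthetic_read` below generates the content on demand and additionally clamps the result to the advertised file size. The clamping is an assumption added here, not something shown in the talk.

```python
FILE_SIZE = 128 * (2 ** 40)  # 128 TiB, the size reported by getattr() on the slide


def synthetic_read(size, offset, file_size=FILE_SIZE):
    """Generate file content on demand; nothing is stored on disk or in RAM."""
    if offset >= file_size:
        return b""  # reading at or past end-of-file yields no data
    # Never hand out more bytes than remain before end-of-file.
    return b"0" * min(size, file_size - offset)


print(len(synthetic_read(4096, 0)))               # 4096
print(len(synthetic_read(4096, FILE_SIZE - 10)))  # 10
print(synthetic_read(4096, FILE_SIZE))            # b''
```

Only the requested chunk ever exists in memory, which is why a 128 TiB file costs nothing until somebody reads it.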
  • 34. X^OR File-System. 10101010 ^ 01010101 = 11111111; 11111111 ^ 01010101 = 10101010.

        def _xorData(data):
            data = [chr(ord(c) ^ 10) for c in data]
            return ''.join(data)

        class XorFS(fuse.Fuse):
            ...
            def write(self, path, buf, offset):
                data = _xorData(buf)
                return _writeData(path, offset, data)

            def read(self, path, length, offset):
                data = _readData(path, offset, length)
                return _xorData(data)
            ...

        res = _xorData("xor")
        print(res)     # "rex"
        res2 = _xorData(res)
        print(res2)    # "xor"
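The round trip works because XOR with the same key is its own inverse. Here is a minimal sketch of that property, assuming Python 3 where file data is `bytes`; the function name `xor_data` and the `key` parameter are illustrative and not part of the talk's code.

```python
KEY = 10  # same single-byte key as on the slide


def xor_data(data: bytes, key: int = KEY) -> bytes:
    # XOR every byte with the key; applying it twice restores the input.
    return bytes(b ^ key for b in data)


scrambled = xor_data(b"xor")
print(scrambled)            # b'rex'
print(xor_data(scrambled))  # b'xor'
```

A single-byte XOR is only an obfuscation and offers no real confidentiality; it serves here to show where a transformation hooks into write() and read().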
  • 35. Dup Write File-System. Write on your Disk partition 1 and 2. Send data over Network. Log your file-system operations. ...do other fancy stuff.

        class DupFS(fuse.Fuse):
            def __init__(self, *args, **kwargs):
                ...
                fd_disk1 = open('/dev/hda1', ...)
                fd_disk2 = open('/dev/hdb5', ...)
                fd_log = open('/home/th30z/testfs.log', ...)
                fd_net = socket.socket(...)
                ...
            ...
            def write(self, path, buf, offset):
                ...
                disk_write(fd_disk1, path, offset, buf)
                disk_write(fd_disk2, path, offset, buf)
                net_write(fd_net, path, offset, buf)
                log_write(fd_log, path, offset, buf)
                ...
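The duplication idea can be sketched without real devices. In the example below, in-memory `io.BytesIO` buffers stand in for the two disk partitions, and the class name `FanOutWriter` is hypothetical. It only illustrates fanning one write out to several backends; it does not reproduce the `disk_write`/`net_write`/`log_write` helpers, whose bodies are not shown on the slide.

```python
import io


class FanOutWriter:
    """Apply every write to all backends (hypothetical helper)."""

    def __init__(self, backends):
        self.backends = backends

    def write(self, offset, buf):
        for backend in self.backends:
            backend.seek(offset)   # position each backend at the same offset
            backend.write(buf)
        return len(buf)


disk1, disk2 = io.BytesIO(), io.BytesIO()  # stand-ins for two partitions
writer = FanOutWriter([disk1, disk2])
writer.write(0, b"Hello")
writer.write(5, b" World")
print(disk1.getvalue())                      # b'Hello World'
print(disk1.getvalue() == disk2.getvalue())  # True
```

A real implementation would also have to decide what to do when one backend fails while the others succeed, which this sketch ignores.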
  • 37. (Files and Folders don't fit) Rethink the File-System: I don't know where I have to place this file... ...Ok, for now Desktop is a good place...
  • 38. (Mobile/Home Devices) Rethink the File-System Small Devices Small Files EMails, Text... We need to lookup quickly our data. Tags, Full-Text Search... ...Encourage people to view their content as objects.
  • 39. (Large Clusters, The Cloud...) Rethink the File-System Distributed data Scalability Fail over Cluster Rebalancing
  • 40. Q&A Python FUSE Matteo Bertozzi (Th30z) http://th30z.blogspot.com