Operating Systems
        CMPSCI 377
Dynamic Memory Management
                     Emery Berger
  University of Massachusetts Amherst




  UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science
Dynamic Memory Management
    How the heap manager is implemented


        malloc, free
    

        new, delete
    




Memory Management
    Ideal memory manager:


        Fast
    

              Raw time, asymptotic runtime, locality
          


        Memory efficient
    

              Low fragmentation
          


    With multicore & multiprocessors:


        Scalable to multiple processors
    

    New issues:


        Secure from attack
    

        Reliable in face of errors
    

Memory Manager Functions
    Not just malloc/free


        realloc
    

              Change size of object, copying old contents
          

                     ptr = realloc (ptr, 10);
                 

              But: realloc(ptr, 0) = ?
          

              How about: realloc (NULL, 16) ?
          


    Other fun


        calloc
    

        memalign
    

    Needs ability to locate size & object start
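
The two questions above have answers in the C standard: realloc(NULL, n) behaves like malloc(n), while realloc(ptr, 0) is implementation-defined in C11/C17 and undefined behavior as of C23, so portable code avoids it. A minimal sketch of a wrapper that sidesteps both pitfalls follows; the name `resize` and its return convention are assumptions made for illustration, not part of the slides.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: resize *pp to new_size bytes.
 * Returns 1 on success, 0 on failure (in which case *pp is untouched). */
int resize(void **pp, size_t new_size) {
    assert(pp != NULL);
    if (new_size == 0) {            /* never call realloc(ptr, 0): not portable */
        free(*pp);
        *pp = NULL;
        return 1;
    }
    /* If *pp is NULL this acts like malloc(new_size). */
    void *tmp = realloc(*pp, new_size);
    if (tmp == NULL)
        return 0;                   /* old block is still valid; do not leak it */
    *pp = tmp;                      /* old contents were copied if the block moved */
    return 1;
}
```

Assigning the result of realloc to a temporary first is the important design choice: writing `ptr = realloc(ptr, n)` directly loses the only pointer to the old block when realloc fails.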


Fragmentation
    Intuitively, fragmentation stems from “breaking” up heap into unusable spaces
        More fragmentation = worse utilization of memory
    External fragmentation


        Wasted space outside allocated objects
    

    Internal fragmentation


        Wasted space inside an object
    




Classical Algorithms
    First-fit


        find first chunk of at least the desired size
    




Classical Algorithms
    Best-fit


        find chunk that fits best
    

              Minimizes wasted space
          




Classical Algorithms
    Worst-fit


        find chunk that fits worst
    

        then split object
    




    Reclaim space: coalesce free adjacent objects into one big object
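
The three classical policies differ only in which free chunk they pick. A minimal sketch, assuming the free chunks are described by a plain array of sizes (a real heap would walk a free list instead); the function names and the `NO_FIT` sentinel are illustrative assumptions.

```c
#include <assert.h>
#include <stddef.h>

#define NO_FIT ((size_t)-1)   /* returned when no chunk can hold the request */

/* First-fit: first chunk that is large enough. */
size_t first_fit(const size_t *chunks, size_t n, size_t want) {
    assert(chunks != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (chunks[i] >= want) return i;
    return NO_FIT;
}

/* Best-fit: smallest chunk that is large enough (least leftover space). */
size_t best_fit(const size_t *chunks, size_t n, size_t want) {
    size_t best = NO_FIT;
    for (size_t i = 0; i < n; i++)
        if (chunks[i] >= want && (best == NO_FIT || chunks[i] < chunks[best]))
            best = i;
    return best;
}

/* Worst-fit: largest chunk that is large enough (the caller then splits it). */
size_t worst_fit(const size_t *chunks, size_t n, size_t want) {
    size_t worst = NO_FIT;
    for (size_t i = 0; i < n; i++)
        if (chunks[i] >= want && (worst == NO_FIT || chunks[i] > chunks[worst]))
            worst = i;
    return worst;
}
```

For free chunks of 32, 8, 64 and 16 bytes and a 12-byte request, first-fit picks the 32-byte chunk, best-fit the 16-byte chunk, and worst-fit the 64-byte chunk.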
Implementation Techniques
    Freelists


        Linked lists of objects in same size class
    

              Range of object sizes
          


    First-fit, best-fit in this context





Implementation Techniques
    Segregated size classes


        Use free lists, but never coalesce or split
    

    Choice of size classes


        Exact
    

        Powers-of-two
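
A minimal sketch of segregated power-of-two size classes, assuming a smallest class of 8 bytes and 16 classes (both assumptions). Freed objects go back on the free list of their class and are never split or coalesced. The names `sc_malloc` and `sc_free` are hypothetical, and the caller passes the size to `sc_free` as a stand-in for the size lookup a real allocator does itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MIN_SHIFT   3     /* smallest class holds 2^3 = 8 bytes (assumption) */
#define NUM_CLASSES 16    /* classes for 8 bytes up to 256 KiB (assumption) */

typedef struct FreeNode { struct FreeNode *next; } FreeNode;
static FreeNode *free_lists[NUM_CLASSES];   /* one free list per size class */

/* Smallest class c with (8 << c) >= size; NUM_CLASSES if size is too large. */
int size_class(size_t size) {
    int c = 0;
    while (c < NUM_CLASSES && ((size_t)1 << (c + MIN_SHIFT)) < size)
        c++;
    return c;
}

void *sc_malloc(size_t size) {
    int c = size_class(size);
    if (c >= NUM_CLASSES) return NULL;       /* large objects are out of scope */
    if (free_lists[c] != NULL) {             /* reuse a freed object, no split */
        FreeNode *n = free_lists[c];
        free_lists[c] = n->next;
        return n;
    }
    return malloc((size_t)1 << (c + MIN_SHIFT));   /* round up to class size */
}

void sc_free(void *p, size_t size) {
    if (p == NULL) return;
    int c = size_class(size);
    assert(c < NUM_CLASSES);
    FreeNode *n = p;                         /* freed object becomes list node */
    n->next = free_lists[c];                 /* push, never coalesce */
    free_lists[c] = n;
}
```

Rounding up is where internal fragmentation comes from: a 24-byte request lands in the 32-byte class and wastes 8 bytes, whereas exact size classes would waste none but need many more lists.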
    




Implementation Techniques
    Big Bag of Pages (BiBOP)


        Page or pages (multiples of 4K)
    

        Usually segregated size classes
    

    Header contains metadata


        Locate with bitmasking
    

    Limits external fragmentation


    Can be very fast
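
A minimal sketch of the bitmasking lookup, assuming 4K pages that are page-aligned and carry their metadata at the start of the page. The `PageHeader` fields and all function names are illustrative assumptions; `aligned_alloc` is the C11 standard function.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define PAGE_SIZE 4096u

/* Metadata kept at the start of every page (fields are illustrative). */
typedef struct PageHeader {
    size_t object_size;    /* every object on the page has this size */
    size_t live_objects;
} PageHeader;

/* Allocate one page-aligned page and initialize its header. */
PageHeader *page_create(size_t object_size) {
    PageHeader *h = aligned_alloc(PAGE_SIZE, PAGE_SIZE);
    if (h != NULL) {
        h->object_size = object_size;
        h->live_objects = 0;
    }
    return h;
}

/* Address of the i-th object on the page; objects follow the header. */
void *page_object(PageHeader *h, size_t i) {
    assert(sizeof(PageHeader) + (i + 1) * h->object_size <= PAGE_SIZE);
    return (unsigned char *)h + sizeof(PageHeader) + i * h->object_size;
}

/* Bitmasking: clearing the low 12 bits of any pointer into the page
 * yields the page start, which is where the header lives. */
PageHeader *header_of(const void *p) {
    return (PageHeader *)((uintptr_t)p & ~(uintptr_t)(PAGE_SIZE - 1));
}

/* Size lookup (needed by free and realloc) costs one mask and one load. */
size_t object_size_of(const void *p) {
    return header_of(p)->object_size;
}
```

Because the size lives in the page header, individual objects need no per-object header, which is part of why this layout can be both compact and fast.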





Runtime Analysis
    Key components


        Cost of malloc (best, worst, average)
    

        Cost of free
    

        Cost of size lookup (for realloc & free)
    

    Examine for first-fit, best-fit, segregated (with BiBOP)
Space Bounds
    Fragmentation worst-case for “optimal”: O(log(M/m))
        M = largest object size
    

        m = smallest object size
    

    Best-fit = O(M * m) !


    Goal: perform well for typical programs


        Considerations:
    

              Internal fragmentation
          

              External fragmentation
          

              Headers (metadata)
          


Performance Issues
    We’ll talk about scalability later


    Reliability, too


    But: general-purpose allocator often seen as too slow
Custom Memory Allocation
    Programmers replace new/delete, bypassing system allocator
        Reduce runtime – often
        Expand functionality – sometimes
        Reduce space – rarely
    Very common practice
        Apache, gcc, lcc, STL, database servers…
        Language-level support in C++
        Widely recommended
            “Use custom allocators”
Drawbacks of Custom Allocators

    Avoiding system allocator:


        More code to maintain & debug
    

        Can’t use memory debuggers
    

        Not modular or robust:
    

              Mix memory from custom and general-purpose allocators → crash!
    Increased burden on programmers



    Are custom allocators really a win?

(I) Per-Class Allocators

    Recycle freed objects from a free list


    a = new Class1;
    b = new Class1;
    c = new Class1;
    delete a;
    delete b;
    delete c;
    a = new Class1;
    b = new Class1;
    c = new Class1;

    [Diagram: Class1 free list linking the freed objects a, b, c]

    + Fast
        + Linked list operations
    + Simple
        + Identical semantics
        + C++ language support
    - Possibly space-inefficient
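
A minimal sketch of the per-class idea in C, with `class1_new` and `class1_delete` standing in for a class-specific operator new and delete in C++. The `Class1` fields and all names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Class1 { int id; double payload; } Class1;   /* stand-in type */

/* A freed object is reused as a list node, so the free list costs no
 * extra memory. */
typedef union Slot {
    Class1 object;
    union Slot *next_free;
} Slot;

static Slot *class1_free_list = NULL;

Class1 *class1_new(void) {
    Slot *s = class1_free_list;
    if (s != NULL)
        class1_free_list = s->next_free;   /* pop a recycled object: O(1) */
    else
        s = malloc(sizeof(Slot));          /* list empty: ask the system */
    return s != NULL ? &s->object : NULL;
}

void class1_delete(Class1 *obj) {
    if (obj == NULL) return;
    Slot *s = (Slot *)obj;
    s->next_free = class1_free_list;       /* push onto the free list: O(1) */
    class1_free_list = s;
}
```

Memory on the free list is never handed back to the system or to other classes, which is the "possibly space-inefficient" drawback listed above.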
(II) Custom Patterns
    Tailor-made to fit allocation patterns
        Example: 197.parser (natural language parser)

    char[MEMORY_LIMIT]
    [Diagram: a, b, c allocated in sequence in the array, with end_of_array advancing after each allocation; d later reuses b's space]

       a = xalloc(8);
       b = xalloc(16);
       c = xalloc(8);
       xfree(b);
       xfree(c);
       d = xalloc(8);

    + Fast
        + Pointer-bumping allocation
    - Brittle
        - Fixed memory size
        - Requires stack-like lifetimes
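
A minimal sketch of a stack-like pointer-bumping allocator in the spirit of the xalloc/xfree example. This is not the 197.parser source: the header layout, the arena size, and the rule that a freed block is only reclaimed once everything allocated after it is also free are assumptions chosen to reproduce the sequence on the slide.

```c
#include <assert.h>
#include <stddef.h>

#define MEMORY_UNITS 256          /* fixed arena size in Block units (assumption) */
#define NONE ((size_t)-1)

typedef union Block {
    struct { size_t prev; int in_use; } h;   /* header in front of each object */
    max_align_t align;                       /* keeps payloads aligned */
} Block;

static Block memory[MEMORY_UNITS];
static size_t end_of_array = 0;   /* index of the first unused unit */
static size_t top = NONE;         /* header index of the newest block */

void *xalloc(size_t size) {
    if (size > sizeof memory) return NULL;
    size_t units = 1 + (size + sizeof(Block) - 1) / sizeof(Block);
    if (units > MEMORY_UNITS - end_of_array) return NULL;   /* arena is full */
    Block *b = &memory[end_of_array];
    b->h.prev = top;
    b->h.in_use = 1;
    top = end_of_array;
    end_of_array += units;        /* the pointer bump */
    return b + 1;                 /* payload starts right after the header */
}

void xfree(void *p) {
    if (p == NULL) return;
    Block *b = (Block *)p - 1;
    assert(b->h.in_use);
    b->h.in_use = 0;
    /* Reclaim space only while the newest block is free: stack-like lifetimes. */
    while (top != NONE && !memory[top].h.in_use) {
        end_of_array = top;
        top = memory[top].h.prev;
    }
}
```

In the slide's sequence, xfree(b) reclaims nothing because c is still live; xfree(c) then pops both c and b, so d lands where b used to be. An object that is freed early but sits below a long-lived one stays unusable, which is why the scheme is brittle.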
(III) Regions


    Separate areas, deletion only en masse

    regioncreate(r)
    regionmalloc(r, sz)
    regiondelete(r)

    + Fast
        + Pointer-bumping allocation
        + Deletion of chunks
    + Convenient
        + One call frees all memory
    - Risky
        - Dangling references
        - Too much space

    Increasingly popular custom allocator
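
A minimal sketch of a region allocator using the three operation names from the slide. The signatures, the `Region` and `Chunk` types, and the 4096-byte chunk size are assumptions; the slide gives only the names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define CHUNK_SIZE 4096           /* default chunk capacity (assumption) */

typedef struct Chunk {
    struct Chunk *next;
    size_t used, cap;
    _Alignas(max_align_t) unsigned char data[];   /* payload area */
} Chunk;

typedef struct Region { Chunk *chunks; } Region;

void regioncreate(Region *r) {
    assert(r != NULL);
    r->chunks = NULL;
}

void *regionmalloc(Region *r, size_t sz) {
    const size_t a = _Alignof(max_align_t);
    if (sz > SIZE_MAX - CHUNK_SIZE) return NULL;   /* avoid overflow below */
    sz = (sz + a - 1) / a * a;                     /* keep objects aligned */
    Chunk *c = r->chunks;
    if (c == NULL || c->cap - c->used < sz) {      /* need a fresh chunk */
        size_t cap = sz > CHUNK_SIZE ? sz : CHUNK_SIZE;
        c = malloc(sizeof(Chunk) + cap);
        if (c == NULL) return NULL;
        c->next = r->chunks;
        c->used = 0;
        c->cap = cap;
        r->chunks = c;
    }
    void *p = c->data + c->used;
    c->used += sz;                                 /* pointer bump */
    return p;
}

/* One call frees every object in the region, chunk by chunk. */
void regiondelete(Region *r) {
    Chunk *c = r->chunks;
    while (c != NULL) {
        Chunk *next = c->next;
        free(c);
        c = next;
    }
    r->chunks = NULL;
}
```

There is no per-object free, so a single long-lived object keeps the whole region alive ("too much space"), and any pointer into the region dangles after regiondelete.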
Custom Allocators Are Faster…

    [Figure: Runtime - Custom Allocator Benchmarks. Bar chart of normalized runtime (axis 0 to 1.75) for Custom vs. Win32, with panels labeled non-regions and regions. Benchmarks, left to right: 197.parser, boxed-sim, c-breeze, 175.vpr, 176.gcc, apache, lcc, mudlle.]

    As good as and sometimes much faster than Win32
Not So Fast…

    [Figure: Runtime - Custom Allocator Benchmarks. Bar chart of normalized runtime (axis 0 to 1.75) for Custom vs. Win32 vs. DLmalloc, with panels labeled non-regions and regions. Benchmarks, left to right: 197.parser, boxed-sim, c-breeze, 175.vpr, 176.gcc, apache, lcc, mudlle.]

    DLmalloc: as fast or faster for most benchmarks
The Lea Allocator (DLmalloc 2.7.0)
    Mature public-domain general-purpose allocator
    Optimized for common allocation patterns
        Per-size quicklists ≈ per-class allocation
    Deferred coalescing (combining adjacent free objects)
        Highly-optimized fastpath
    Space-efficient
Space Consumption: Mixed Results

    [Figure: Space - Custom Allocator Benchmarks. Bar chart of normalized space (axis 0 to 1.75) for Custom vs. DLmalloc, with panels labeled non-regions and regions. Benchmarks, left to right: 197.parser, boxed-sim, c-breeze, 175.vpr, 176.gcc, apache, lcc, mudlle.]
Custom Allocators?
    Generally not worth the trouble: use good general-purpose allocator
        Avoids risky software engineering errors
Problems with Unsafe Languages
       C, C++: pervasive apps, but the languages are memory-unsafe
       Numerous opportunities for security vulnerabilities, errors
           Double free
       

           Invalid free
       

           Uninitialized reads
       

           Dangling pointers
       

           Buffer overflows (stack & heap)
       




Soundness for “Erroneous” Programs

        Normally: memory errors ⇒ ⊥ …
        Consider infinite-heap allocator:
            All news fresh; ignore delete
                 No dangling pointers, invalid frees, double frees
            Every object infinitely large
                 No buffer overflows, data overwrites
             


        Transparent to correct program
    

        “Erroneous” programs sound
    

Probabilistic Memory Safety

   Approximate the infinite heap with M-heaps (e.g., M=2)

       DieHard: fully-randomized M-heap
   

           Increases odds of benign errors
       

           Probabilistic memory safety
       

                i.e., P(no error) n
            


           Errors independent across heaps
       

                E(users with no error)             n * |users|
            




       ? Efficient implementation…



Implementation Choices

       Conventional, freelist-based heaps
   

            Hard to randomize, protect from errors
       

                 Double frees, heap corruption
             


       What about bitmaps? [Wilson90]
   

       – Catastrophic fragmentation
                 Each small object likely to occupy one page
             




           [Diagram: small objects scattered across pages, roughly one object per page]
Randomized Heap Layout
00000001           1010       10            metadata
size = 2^(i+3)     2^(i+4)    2^(i+5)       heap

        Bitmap-based, segregated size classes
              Bit represents one object of given size
                   i.e., one bit = 2^(i+3) bytes, etc.
               


              Prevents fragmentation
        




Randomized Allocation
00000001         1010       10            metadata
size = 2^(i+3)   2^(i+4)    2^(i+5)       heap

    malloc(8):
              compute size class = ceil(log2(sz)) – 3
              randomly probe bitmap for zero-bit (free)

        Fast: runtime O(1)
              M=2 – E[# of probes] ≤ 2
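
A minimal sketch of randomized allocation for a single size class, assuming a 64-slot region of 8-byte objects tracked by one 64-bit bitmap. This is not DieHard's code: the names, the region size, and the use of `rand()` as the random source are assumptions. Keeping the region at most 1/M full with M = 2 means each probe hits a free slot with probability at least 1/2, so the expected number of probes is at most 2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SLOTS       64    /* objects in this size class's region (assumption) */
#define OBJECT_SIZE 8     /* size class 0 holds 2^(0+3) = 8 bytes */

static uint64_t bitmap;   /* bit i set = slot i is allocated */
static size_t live;       /* number of allocated slots */
static _Alignas(8) unsigned char heap[SLOTS * OBJECT_SIZE];

/* ceil(log2(sz)) - 3, clamped to 0 for requests of 8 bytes or fewer. */
int size_class(size_t sz) {
    int c = 0;
    while (c < 60 && (UINT64_C(8) << c) < sz)
        c++;
    return c;
}

void *random_malloc(size_t sz) {
    if (size_class(sz) != 0) return NULL;    /* only class 0 in this sketch */
    if (live >= SLOTS / 2) return NULL;      /* keep the region at most 1/M full */
    for (;;) {
        unsigned i = (unsigned)rand() % SLOTS;        /* random probe */
        if ((bitmap & (UINT64_C(1) << i)) == 0) {     /* zero bit = free slot */
            bitmap |= UINT64_C(1) << i;
            live++;
            return heap + (size_t)i * OBJECT_SIZE;
        }
    }
}
```

The occupancy check is what makes the probe loop cheap: without it, a nearly full bitmap would need many probes to find a free slot.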
Randomized Allocation (after malloc(8))
00010001         1010       10            metadata
size = 2^(i+3)   2^(i+4)    2^(i+5)       heap
Randomized Deallocation
00010001         1010       10            metadata
size = 2^(i+3)   2^(i+4)    2^(i+5)       heap

        free(ptr):
              Ensure object valid – aligned to right address
              Ensure allocated – bit set
              Resets bit

        Prevents invalid frees, double frees
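
A minimal sketch of the three checks, assuming the same single 64-slot region of 8-byte objects described by one bitmap. `checked_free` and `take_slot` are hypothetical names; `take_slot` exists only to set up an allocated slot for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SLOTS       64
#define OBJECT_SIZE 8

static uint64_t bitmap;   /* bit i set = slot i is allocated */
static _Alignas(8) unsigned char heap[SLOTS * OBJECT_SIZE];

/* Returns 1 if the object was freed, 0 if the request was ignored. */
int checked_free(void *ptr) {
    uintptr_t p = (uintptr_t)ptr;
    uintptr_t base = (uintptr_t)heap;
    if (p < base || p >= base + sizeof heap)
        return 0;                            /* pointer is not from this heap */
    size_t offset = (size_t)(p - base);
    if (offset % OBJECT_SIZE != 0)
        return 0;                            /* invalid free: not an object start */
    uint64_t bit = UINT64_C(1) << (offset / OBJECT_SIZE);
    if ((bitmap & bit) == 0)
        return 0;                            /* double free: bit already clear */
    bitmap &= ~bit;                          /* reset the bit */
    return 1;
}

/* Example-only helper: mark slot i allocated and return its address. */
void *take_slot(unsigned i) {
    assert(i < SLOTS);
    bitmap |= UINT64_C(1) << i;
    return heap + (size_t)i * OBJECT_SIZE;
}
```

Because the metadata lives in a bitmap outside the objects, an erroneous free is simply ignored instead of corrupting a free list.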
Randomized Deallocation (after free(ptr))
00000001         1010       10            metadata
size = 2^(i+3)   2^(i+4)    2^(i+5)       heap
Randomized Heaps & Reliability
    [Diagram: two heaps holding the same numbered objects (object sizes 2^(i+3) and 2^(i+4)) at different random positions]
    My Mozilla: “malignant” overflow
    Your Mozilla: “benign” overflow

            Objects randomly spread across heap
            Different run = different heap
                 Errors across heaps independent
DieHard software architecture


    [Diagram: input is broadcast to replica1, replica2 and replica3, which are seeded with seed1, seed2 and seed3 and executed as separate processes; a vote over the replicas determines the output]

      Replication-based fault-tolerance
          Requires randomization: errors independent
DieHard Results

        Analytical results (pictures!)
    

            Buffer overflows
        

            Uninitialized reads
        

            Dangling pointer errors (the best)
        

        Empirical results
    

            Runtime overhead
        

            Error avoidance
        

                  Injected faults & actual applications
              




Analytical Results: Buffer Overflows

     Model overflow: random write of live data
         Heap half full (max occupancy)
Analytical Results: Buffer Overflows

     Replicas: Increase odds of avoiding overflow in at least one replica

     P(Overflow in all replicas) = (½)^3 = 1/8
     P(No overflow in ≥ 1 replica) = 1 - (½)^3 = 7/8
Analytical Results: Buffer Overflows


    F = free space
    H = heap size
    N = # objects worth of overflow
    k = replicas

         Overflow one object
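
The formula that accompanied these definitions did not survive extraction. The following is a reconstruction under a simplifying assumption that the slide does not state explicitly: each of the N overwritten object slots falls on free space independently with probability F/H, and the k replicas have independent heap layouts.

```latex
% Probability that an overflow of N objects' worth of data
% hits only free space in one replica:
P(\text{masked in one replica}) = \left(\frac{F}{H}\right)^{N}

% Probability that at least one of k independent replicas masks it:
P(\text{masked in} \geq 1 \text{ replica}) = 1 - \left[\,1 - \left(\frac{F}{H}\right)^{N}\right]^{k}

% Check for a half-full heap (F/H = 1/2), N = 1, k = 3:
1 - \left[\,1 - \tfrac{1}{2}\right]^{3} = 1 - \tfrac{1}{8} = \tfrac{7}{8}
```

The check reproduces the 7/8 figure for three replicas and a half-full heap given earlier in the deck.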
Empirical Results: Runtime




Empirical Results: Error Avoidance
       Injected faults:
   

           Dangling pointers (@50%, 10 allocations)
       

                glibc: crashes; DieHard: 9/10 correct
            


           Overflows (@1%, 4 bytes over)
       

                glibc: crashes 9/10, inf loop; DieHard: 10/10 correct
            


       Real faults:
   

           Avoids Squid web cache overflow
       

                Crashes BDW & glibc
            


           Avoids dangling pointer error in Mozilla
       

                DoS in glibc & Windows
            




The End




   UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science   47

Más contenido relacionado

Similar a Operating Systems - Dynamic Memory Management

Operating Systems - Virtual Memory
Operating Systems - Virtual MemoryOperating Systems - Virtual Memory
Operating Systems - Virtual MemoryEmery Berger
 
It's always sunny with OpenJ9
It's always sunny with OpenJ9It's always sunny with OpenJ9
It's always sunny with OpenJ9DanHeidinga
 
DotNetFest - Let’s refresh our memory! Memory management in .NET
DotNetFest - Let’s refresh our memory! Memory management in .NETDotNetFest - Let’s refresh our memory! Memory management in .NET
DotNetFest - Let’s refresh our memory! Memory management in .NETMaarten Balliauw
 
Processes and Threads
Processes and ThreadsProcesses and Threads
Processes and ThreadsEmery Berger
 
.NET Fest 2018. Maarten Balliauw. Let’s refresh our memory! Memory management...
.NET Fest 2018. Maarten Balliauw. Let’s refresh our memory! Memory management....NET Fest 2018. Maarten Balliauw. Let’s refresh our memory! Memory management...
.NET Fest 2018. Maarten Balliauw. Let’s refresh our memory! Memory management...NETFest
 
javascript teach
javascript teachjavascript teach
javascript teachguest3732fa
 
JSBootcamp_White
JSBootcamp_WhiteJSBootcamp_White
JSBootcamp_Whiteguest3732fa
 
Memory Management for High-Performance Applications
Memory Management for High-Performance ApplicationsMemory Management for High-Performance Applications
Memory Management for High-Performance ApplicationsEmery Berger
 
Trends in Programming Technology you might want to keep an eye on af Bent Tho...
Trends in Programming Technology you might want to keep an eye on af Bent Tho...Trends in Programming Technology you might want to keep an eye on af Bent Tho...
Trends in Programming Technology you might want to keep an eye on af Bent Tho...InfinIT - Innovationsnetværket for it
 
Operating Systems - Concurrency
Operating Systems - ConcurrencyOperating Systems - Concurrency
Operating Systems - ConcurrencyEmery Berger
 
Brian Oliver Pimp My Data Grid
Brian Oliver  Pimp My Data GridBrian Oliver  Pimp My Data Grid
Brian Oliver Pimp My Data Griddeimos
 
A Re-Introduction to JavaScript
A Re-Introduction to JavaScriptA Re-Introduction to JavaScript
A Re-Introduction to JavaScriptSimon Willison
 
Exploring .NET memory management - JetBrains webinar
Exploring .NET memory management - JetBrains webinarExploring .NET memory management - JetBrains webinar
Exploring .NET memory management - JetBrains webinarMaarten Balliauw
 
MapReduce: A useful parallel tool that still has room for improvement
MapReduce: A useful parallel tool that still has room for improvementMapReduce: A useful parallel tool that still has room for improvement
MapReduce: A useful parallel tool that still has room for improvementKyong-Ha Lee
 
Reproducible Linear Algebra from Application to Architecture
Reproducible Linear Algebra from Application to ArchitectureReproducible Linear Algebra from Application to Architecture
Reproducible Linear Algebra from Application to ArchitectureJason Riedy
 
JetBrains Day Seoul - Exploring .NET’s memory management – a trip down memory...
JetBrains Day Seoul - Exploring .NET’s memory management – a trip down memory...JetBrains Day Seoul - Exploring .NET’s memory management – a trip down memory...
JetBrains Day Seoul - Exploring .NET’s memory management – a trip down memory...Maarten Balliauw
 
ICIAM 2019: Reproducible Linear Algebra from Application to Architecture
ICIAM 2019: Reproducible Linear Algebra from Application to ArchitectureICIAM 2019: Reproducible Linear Algebra from Application to Architecture
ICIAM 2019: Reproducible Linear Algebra from Application to ArchitectureJason Riedy
 

Operating Systems - Dynamic Memory Management

  • 1. Operating Systems, CMPSCI 377: Dynamic Memory Management. Emery Berger, University of Massachusetts Amherst, Department of Computer Science.
  • 2. Dynamic Memory Management. How the heap manager is implemented: malloc, free; new, delete.
  • 3. Memory Management. Ideal memory manager: fast (raw time, asymptotic runtime, locality); memory efficient (low fragmentation). With multicore & multiprocessors: scalable to multiple processors. New issues: secure from attack; reliable in the face of errors.
  • 4. Memory Manager Functions. Not just malloc/free. realloc: change size of object, copying old contents, e.g. ptr = realloc (ptr, 10); But: realloc(ptr, 0) = ? How about: realloc (NULL, 16) ? Other fun: calloc, memalign. Needs ability to locate size & object start.
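The two questions on slide 4 have standard answers worth spelling out: realloc(NULL, 16) behaves like malloc(16), while the result of realloc(ptr, 0) is implementation-defined in C11/C17 (and undefined in C23), so portable code usually avoids it. The following is a minimal sketch of a wrapper that sidesteps both traps; `resize_buffer` is a hypothetical helper name, not something from the slides.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical helper (not from the slides): resize *buf to new_size bytes.
// Returns true on success. On failure *buf is left untouched, so the old
// block is neither leaked nor lost.
bool resize_buffer(char** buf, std::size_t new_size) {
    assert(buf != nullptr);
    if (new_size == 0) {
        // Avoid realloc(ptr, 0): its result is not portable.
        std::free(*buf);          // free(NULL) is a no-op
        *buf = nullptr;
        return true;
    }
    // When *buf is NULL this call behaves like malloc(new_size).
    void* p = std::realloc(*buf, new_size);
    if (p == nullptr) {
        return false;             // the old block is still valid
    }
    *buf = static_cast<char*>(p); // contents up to min(old, new) size are kept
    return true;
}
```

Assigning the result of realloc to a temporary first is the important design choice: writing `ptr = realloc(ptr, n)` directly, as on the slide, loses the only pointer to the old block if the call fails.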
  • 5. Fragmentation. Intuitively, fragmentation stems from “breaking” up the heap into unusable spaces; more fragmentation = worse utilization of memory. External fragmentation: wasted space outside allocated objects. Internal fragmentation: wasted space inside an object.
  • 6. Classical Algorithms. First-fit: find first chunk of desired size.
  • 7. Classical Algorithms. Best-fit: find chunk that fits best; minimizes wasted space.
  • 8. Classical Algorithms. Worst-fit: find chunk that fits worst, then split object. Reclaim space: coalesce free adjacent objects into one big object.
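The three policies on slides 6 to 8 differ only in which sufficiently large free chunk they pick. As a rough sketch, the free list is modeled here as a plain vector of chunk sizes; the names `Fit` and `choose_chunk` are illustrative assumptions, and splitting and coalescing are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Fit { First, Best, Worst };

// Returns the index of the free chunk chosen for `request` bytes under the
// given policy, or -1 if no chunk is large enough.
int choose_chunk(const std::vector<std::size_t>& free_sizes,
                 std::size_t request, Fit policy) {
    int chosen = -1;
    for (std::size_t i = 0; i < free_sizes.size(); ++i) {
        if (free_sizes[i] < request) continue;      // too small
        if (chosen < 0) {
            chosen = static_cast<int>(i);
            if (policy == Fit::First) break;        // first fit stops here
            continue;
        }
        const std::size_t current = free_sizes[static_cast<std::size_t>(chosen)];
        const bool better = (policy == Fit::Best)
                                ? free_sizes[i] < current   // tighter fit
                                : free_sizes[i] > current;  // looser fit
        if (better) chosen = static_cast<int>(i);
    }
    return chosen;
}
```

For free chunks of 32, 16, 64 and 24 bytes and a 20-byte request, first-fit takes the 32-byte chunk, best-fit the 24-byte chunk and worst-fit the 64-byte chunk. First-fit can stop early, whereas best-fit and worst-fit scan the whole list in this naive form.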
  • 9. Implementation Techniques. Freelists: linked lists of objects in the same size class (a range of object sizes); first-fit, best-fit in this context.
  • 10. Implementation Techniques. Segregated size classes: use free lists, but never coalesce or split. Choice of size classes: exact; powers-of-two.
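Powers-of-two size classes trade lookup speed for internal fragmentation. The sketch below rounds a request up to its class and reports the bytes wasted inside the object; the 8-byte minimum class and both function names are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Assumed smallest size class; real allocators choose their own minimum.
constexpr std::size_t kMinClass = 8;

// Round a request up to the next power-of-two size class.
// (Sketch only: no overflow check for requests near SIZE_MAX.)
std::size_t round_up_pow2(std::size_t sz) {
    std::size_t cls = kMinClass;
    while (cls < sz) cls <<= 1;
    return cls;
}

// Internal fragmentation: bytes allocated but not requested.
std::size_t internal_waste(std::size_t sz) {
    return round_up_pow2(sz) - sz;
}
```

A 65-byte request lands in the 128-byte class and wastes 63 bytes, which is close to the worst case for this scheme: just under half of each object.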
  • 11. Implementation Techniques. Big Bag of Pages (BiBOP): page or pages (multiples of 4K); usually segregated size classes; header contains metadata; locate with bitmasking; limits external fragmentation; can be very fast.
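"Locate with bitmasking" works because every BiBOP page starts on a page-aligned address: clearing the low bits of any object pointer yields the page header. A minimal sketch, assuming single 4K pages and a hypothetical `PageHeader` layout that stores only the object size:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

constexpr std::uintptr_t kPageSize = 4096;  // 4K pages, as on the slide

// Hypothetical per-page metadata stored at the start of each page.
struct PageHeader {
    std::size_t object_size;  // every object on this page has this size
};

// Clear the low 12 bits to get the address of the enclosing page.
inline std::uintptr_t page_start(std::uintptr_t addr) {
    return addr & ~(kPageSize - 1);
}

// Find the header for any object that lives inside a BiBOP page.
inline const PageHeader* header_of(const void* obj) {
    const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(obj);
    return reinterpret_cast<const PageHeader*>(page_start(addr));
}
```

Because the size comes from the page rather than from a per-object header, the size lookup needed by free and realloc is a mask and a load, and small objects carry no individual metadata.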
  • 12. Runtime Analysis. Key components: cost of malloc (best, worst, average); cost of free; cost of size lookup (for realloc & free). Examine for first-fit, best-fit, segregated (with BiBOP).
  • 13. Space Bounds. Fragmentation worst-case for “optimal”: O(log M/m), where M = largest object size and m = smallest object size. Best-fit = O(M * m)! Goal: perform well for typical programs. Considerations: internal fragmentation; external fragmentation; headers (metadata).
  • 14. Performance Issues. We’ll talk about scalability later; reliability, too. But: general-purpose allocator often seen as too slow.
  • 15. Custom Memory Allocation. Programmers replace new/delete, bypassing the system allocator, to: reduce runtime (often); expand functionality (sometimes); reduce space (rarely). Very common practice: Apache, gcc, lcc, STL, database servers… Language-level support in C++. Widely recommended: “Use custom allocators”.
  • 16. Drawbacks of Custom Allocators. Avoiding system allocator: more code to maintain & debug; can’t use memory debuggers; not modular or robust (mix memory from custom and general-purpose allocators → crash!). Increased burden on programmers. Are custom allocators really a win?
  • 17. (I) Per-Class Allocators. Recycle freed objects from a per-class free list (here, the Class1 free list). Example: a = new Class1; b = new Class1; c = new Class1; delete a; delete b; delete c; a = new Class1; b = new Class1; c = new Class1; Pros: + fast (linked list operations); + simple (identical semantics, C++ language support). Cons: - possibly space-inefficient.
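A per-class allocator can be sketched with class-scope operator new and operator delete, which is the C++ language support the slide mentions. The free-list layout below is an assumption made for illustration (the slide only names Class1); freed objects are threaded onto a singly linked list and handed back on the next allocation.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

class Class1 {
    // A freed object is reused as a list node (illustrative layout).
    struct FreeNode { FreeNode* next; };

    // Function-local static avoids static-initialization-order issues.
    static FreeNode*& free_list() {
        static FreeNode* head = nullptr;
        return head;
    }

public:
    int payload = 0;

    static void* operator new(std::size_t sz) {
        assert(sz == sizeof(Class1));   // sketch: no derived classes
        FreeNode*& head = free_list();
        if (head != nullptr) {          // fast path: pop a recycled object
            FreeNode* node = head;
            head = node->next;
            return node;
        }
        // Make sure a freed block is big enough to hold a FreeNode.
        const std::size_t bytes = sz < sizeof(FreeNode) ? sizeof(FreeNode) : sz;
        return ::operator new(bytes);
    }

    static void operator delete(void* p) noexcept {
        if (p == nullptr) return;
        FreeNode*& head = free_list();
        head = ::new (p) FreeNode{head};  // push onto the free list
    }
};
```

Both paths are a couple of pointer operations, which is where the speed comes from. Memory is never returned to the system allocator in this sketch, which illustrates the "possibly space-inefficient" drawback.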
  • 18. (II) Custom Patterns. Tailor-made to fit allocation patterns. Example: 197.parser (natural language parser), which bumps end_of_array through a char[MEMORY_LIMIT] array: a = xalloc(8); b = xalloc(16); c = xalloc(8); xfree(b); xfree(c); d = xalloc(8); Pros: + fast (pointer-bumping allocation). Cons: - brittle (fixed memory size; requires stack-like lifetimes).
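The slide's trace can be reproduced with a small stack-like arena. This is a sketch of the general pattern only, not the actual 197.parser code: the `StackArena` type, its header layout and the rule "pop every freed block from the top" are assumptions, while xalloc, xfree, MEMORY_LIMIT and end_of_array are the names shown on the slide.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

struct StackArena {
    static constexpr std::size_t MEMORY_LIMIT = 1024;  // fixed memory size
    static constexpr std::size_t kAlign = alignof(std::max_align_t);
    static constexpr std::size_t kNone = static_cast<std::size_t>(-1);

    // Per-block header: offset of the previous block and a freed flag.
    struct Header { std::size_t prev; bool freed; };
    static constexpr std::size_t kHeader =
        (sizeof(Header) + kAlign - 1) / kAlign * kAlign;

    alignas(std::max_align_t) unsigned char memory[MEMORY_LIMIT];
    std::size_t end_of_array = 0;  // bump pointer, as an offset
    std::size_t top = kNone;       // offset of the most recent block

    Header* header_at(std::size_t off) {
        return reinterpret_cast<Header*>(memory + off);
    }

    void* xalloc(std::size_t sz) {
        if (sz > MEMORY_LIMIT) return nullptr;
        sz = (sz + kAlign - 1) / kAlign * kAlign;
        if (end_of_array + kHeader + sz > MEMORY_LIMIT) return nullptr;
        ::new (memory + end_of_array) Header{top, false};
        top = end_of_array;
        end_of_array += kHeader + sz;        // pointer-bumping allocation
        return memory + top + kHeader;
    }

    void xfree(void* p) {
        if (p == nullptr) return;
        const std::size_t off =
            static_cast<unsigned char*>(p) - memory - kHeader;
        header_at(off)->freed = true;
        // Space is reclaimed only once the topmost block is free.
        while (top != kNone && header_at(top)->freed) {
            end_of_array = top;
            top = header_at(top)->prev;
        }
    }
};
```

Running the slide's sequence, xfree(b) reclaims nothing because c is still on top; xfree(c) then pops both c and b, so d reuses b's address. A block freed underneath a long-lived one stays unavailable, which is why the slide calls the scheme brittle and dependent on stack-like lifetimes.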
  • 19. (III) Regions. Separate areas, deletion only en masse: regioncreate(r); regionmalloc(r, sz); regiondelete(r). Pros: + fast (pointer-bumping allocation; deletion of chunks); + convenient (one call frees all memory). Cons: - risky (dangling references; too much space). Increasingly popular custom allocator.
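A region can be sketched as a linked list of chunks with a bump pointer in the newest chunk. The function names follow the slide, but the signatures, the `Region` and `Chunk` structs and the 4096-byte chunk size are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

struct Chunk {
    Chunk* next;           // older chunk in the same region
    std::size_t used;      // bytes handed out from this chunk
    std::size_t capacity;  // payload bytes available in this chunk
};

struct Region {
    Chunk* chunks = nullptr;  // newest chunk first
};

constexpr std::size_t kAlign = alignof(std::max_align_t);
constexpr std::size_t kChunkHeader =
    (sizeof(Chunk) + kAlign - 1) / kAlign * kAlign;
constexpr std::size_t kChunkPayload = 4096;  // assumed default chunk size

void regioncreate(Region& r) { r.chunks = nullptr; }

void* regionmalloc(Region& r, std::size_t sz) {
    if (sz > static_cast<std::size_t>(-1) / 2) return nullptr;
    sz = (sz + kAlign - 1) / kAlign * kAlign;
    if (r.chunks == nullptr || r.chunks->used + sz > r.chunks->capacity) {
        const std::size_t cap = sz > kChunkPayload ? sz : kChunkPayload;
        Chunk* c = static_cast<Chunk*>(std::malloc(kChunkHeader + cap));
        if (c == nullptr) return nullptr;
        c->next = r.chunks;
        c->used = 0;
        c->capacity = cap;
        r.chunks = c;
    }
    // Pointer-bumping allocation inside the newest chunk.
    char* base = reinterpret_cast<char*>(r.chunks) + kChunkHeader;
    void* p = base + r.chunks->used;
    r.chunks->used += sz;
    return p;
}

// One call frees every object ever allocated from the region.
void regiondelete(Region& r) {
    while (r.chunks != nullptr) {
        Chunk* next = r.chunks->next;
        std::free(r.chunks);
        r.chunks = next;
    }
}
```

There is deliberately no per-object free. That is the source of both drawbacks on the slide: any pointer that outlives regiondelete dangles, and a single live object keeps the whole region's memory allocated.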
  • 20. Custom Allocators Are Faster… [Figure: Runtime - Custom Allocator Benchmarks; normalized runtime of Custom vs. Win32, with benchmarks grouped into non-regions and regions.] As good as and sometimes much faster than Win32.
  • 21. Not So Fast… [Figure: Runtime - Custom Allocator Benchmarks; normalized runtime of Custom vs. Win32 vs. DLmalloc, with benchmarks grouped into non-regions and regions.] DLmalloc: as fast or faster for most benchmarks.
  • 22. The Lea Allocator (DLmalloc 2.7.0). Mature public-domain general-purpose allocator. Optimized for common allocation patterns: per-size quicklists ≈ per-class allocation; deferred coalescing (combining adjacent free objects); highly-optimized fastpath. Space-efficient.
  • 23. Space Consumption: Mixed Results. [Figure: Space - Custom Allocator Benchmarks; normalized space of Custom vs. DLmalloc, with benchmarks grouped into non-regions and regions.]
  • 24. Custom Allocators? Generally not worth the trouble: use a good general-purpose allocator. Avoids risky software engineering errors.
  • 25. Problems with Unsafe Languages. C, C++: pervasive apps, but the languages are memory unsafe. Numerous opportunities for security vulnerabilities, errors: double free; invalid free; uninitialized reads; dangling pointers; buffer overflows (stack & heap).
  • 26. Soundness for “Erroneous” Programs. Normally: memory errors ⇒ ⊥ … Consider infinite-heap allocator: all news fresh; ignore delete (no dangling pointers, invalid frees, double frees); every object infinitely large (no buffer overflows, data overwrites). Transparent to correct program; “erroneous” programs sound.
  • 27. Probabilistic Memory Safety. Approximate with M-heaps (e.g., M=2). DieHard: fully-randomized M-heap; increases odds of benign errors. Probabilistic memory safety, i.e., P(no error) ≥ n; errors independent across heaps, so E(users with no error) ≥ n * |users|. ? Efficient implementation…
  • 28. Implementation Choices. Conventional, freelist-based heaps: hard to randomize, protect from errors (double frees, heap corruption). What about bitmaps? [Wilson90]: catastrophic fragmentation, since each small object is likely to occupy one page.
  • 29. Randomized Heap Layout. [Figure: heap with bitmap metadata 00000001, 1010, 10 for object sizes 2^(i+3), 2^(i+4), 2^(i+5).] Bitmap-based, segregated size classes. Each bit represents one object of a given size, i.e., one bit = 2^(i+3) bytes, etc. Prevents fragmentation.
  • 30. Randomized Allocation. [Figure: heap with bitmap metadata 00000001, 1010, 10 for object sizes 2^(i+3), 2^(i+4), 2^(i+5).] malloc(8): compute size class = ceil(log2 sz) - 3; randomly probe bitmap for zero-bit (free). Fast: runtime O(1); for M=2, E[# of probes] ≤ 2.
  • 31. Randomized Allocation. [Figure: the same heap after allocation, bitmap metadata 00010001, 1010, 10.] malloc(8): compute size class = ceil(log2 sz) - 3; randomly probe bitmap for zero-bit (free). Fast: runtime O(1); for M=2, E[# of probes] ≤ 2.
  • 32. Randomized Deallocation. [Figure: heap with bitmap metadata 00010001, 1010, 10.] free(ptr): ensure object valid (aligned to right address); ensure allocated (bit set); resets bit. Prevents invalid frees, double frees.
  • 33. Randomized Deallocation. [Figure: heap with bitmap metadata 00010001, 1010, 10.] free(ptr): ensure object valid (aligned to right address); ensure allocated (bit set); resets bit. Prevents invalid frees, double frees.
  • 34. Randomized Deallocation. [Figure: the same heap after deallocation, bitmap metadata 00000001, 1010, 10.] free(ptr): ensure object valid (aligned to right address); ensure allocated (bit set); resets bit. Prevents invalid frees, double frees.
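The allocation and deallocation steps on slides 30 to 34 can be sketched for a single size class as follows. This illustrates the idea only and is not DieHard's implementation: the class name, the std::vector-backed storage and the std::mt19937 generator are assumptions, and the heap is capped at half full to model M = 2.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Size class index from the slides: ceil(log2 sz) - 3, so 8 bytes -> class 0.
inline int size_class(std::size_t sz) {
    int cls = 0;
    for (std::size_t cap = 8; cap < sz; cap <<= 1) ++cls;
    return cls;
}

// One size class of a randomized, bitmap-based heap (illustrative sketch).
class RandomizedClassHeap {
public:
    RandomizedClassHeap(std::size_t object_size, std::size_t slots, unsigned seed)
        : object_size_(object_size), in_use_(slots, false),
          storage_(object_size * slots), rng_(seed), live_(0) {
        assert(object_size > 0);
    }

    void* allocate() {
        // M = 2: refuse to go beyond half full, so each random probe
        // succeeds with probability >= 1/2 (expected probes <= 2).
        if (2 * (live_ + 1) > in_use_.size()) return nullptr;
        std::uniform_int_distribution<std::size_t> pick(0, in_use_.size() - 1);
        for (;;) {
            const std::size_t i = pick(rng_);
            if (!in_use_[i]) {
                in_use_[i] = true;
                ++live_;
                return storage_.data() + i * object_size_;
            }
        }
    }

    // Returns false (and changes nothing) for invalid or double frees.
    bool deallocate(void* p) {
        const auto addr = reinterpret_cast<std::uintptr_t>(p);
        const auto base = reinterpret_cast<std::uintptr_t>(storage_.data());
        if (addr < base || addr >= base + storage_.size()) return false;
        const std::size_t offset = addr - base;
        if (offset % object_size_ != 0) return false;  // not an object start
        const std::size_t i = offset / object_size_;
        if (!in_use_[i]) return false;                 // already free
        in_use_[i] = false;                            // reset the bit
        --live_;
        return true;
    }

private:
    std::size_t object_size_;
    std::vector<bool> in_use_;            // the bitmap: one bit per object
    std::vector<unsigned char> storage_;  // the objects themselves
    std::mt19937 rng_;
    std::size_t live_;
};
```

Because the metadata lives in a separate bitmap rather than in headers next to the objects, an overflow cannot corrupt it, and the two checks in deallocate are enough to reject invalid and double frees.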
  • 35. Randomized Heaps & Reliability object size = 2i+3 object size = 2i+4 … 24 5 3 1 6 3 My Mozilla: “malignant” overflow Objects randomly spread across heap  Different run = different heap  Errors across heaps independent  Your Mozilla: “benign” overflow … 1 6 3 2 54 1 UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science
  • 36. DieHard software architecture replica1 seed1 input output replica2 seed2 vote broadcast replica3 seed3 execute replicas (separate processes) Replication-based fault-tolerance  Requires randomization: errors independent  UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science
  • 37. DieHard Results. Analytical results (pictures!): buffer overflows, uninitialized reads, dangling pointer errors (the best). Empirical results: runtime overhead; error avoidance with injected faults & actual applications.
  • 38–40. Analytical Results: Buffer Overflows. Model overflow as a random write of live data; heap half full (max occupancy).
  • 41–43. Analytical Results: Buffer Overflows. Replicas increase the odds of avoiding the overflow in at least one replica. With 3 replicas and a half-full heap: P(overflow in all replicas) = (½)^3 = 1/8; P(no overflow in ≥ 1 replica) = 1 − (½)^3 = 7/8.
  • 44. Analytical Results: Buffer Overflows. F = free space; H = heap size; N = # objects' worth of overflow; k = replicas. Overflow one object.
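The formula on this slide did not survive in the transcript. The sketch below assumes the form 1 − (1 − (F/H)^N)^k, which treats replicas as independent and reproduces the 7/8 figure for a half-full heap with 3 replicas; `p_mask_overflow` is a hypothetical name.

```python
from fractions import Fraction


def p_mask_overflow(free: int, heap: int, n_objects: int, replicas: int) -> Fraction:
    """P(the overflow lands only on free space in at least one replica).

    Assumed model: each of the N overflowed objects' worth of data hits free
    space with probability F/H, and the k replicas are independent.
    """
    p_one_replica_safe = Fraction(free, heap) ** n_objects
    return 1 - (1 - p_one_replica_safe) ** replicas


# Half-full heap (F/H = 1/2), one object's worth of overflow, three replicas.
half_full_three = p_mask_overflow(free=1, heap=2, n_objects=1, replicas=3)
```

Using `Fraction` keeps the result exact, so it can be compared directly with the fractions quoted on the slides.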
  • 45. Empirical Results: Runtime
  • 46. Empirical Results: Error Avoidance. Injected faults: dangling pointers (@50%, 10 allocations): glibc crashes, DieHard 9/10 correct; overflows (@1%, 4 bytes over): glibc crashes 9/10, infinite loop, DieHard 10/10 correct. Real faults: avoids Squid web cache overflow (crashes BDW & glibc); avoids dangling pointer error in Mozilla (DoS in glibc & Windows).
  • 47. The End