3. Introduction
● Database Engine: Optimized for Memory-resident data
● Targeted for OLTP workloads
● Integrated into SQL Server and uses T-SQL
● Fully transactional and durable
● Tables - Compiled into machine code
● Two Index Types: Hash Index and Range Index
● High level of concurrency
OLTP (Online Transaction Processing)
T-SQL (Transact-SQL)
6. Architectural Principles
● Optimize Indexes for main memory
● Uses lock-free hash tables and Bw-trees for optimized indexing
● Index operations not logged
● Rebuilding indexes during recovery
● Eliminate Latches and Locks
● Latch-free data structure – No latches or spinlocks
● Optimistic Multi-version concurrency control – transaction isolation
● Compile requests to native code
● Decisions: Compile time rather than Runtime
● Converts statements in T-SQL into customized, highly efficient machine
code
7. Partitioning – We don’t like..
● Problem with Partitioning
● Secondary Indexes
● Works great ONLY if workload is also partitionable
● Not sufficiently robust for SQL Server
● Any thread can access any part of the database
● Single Shared hash table
8. High Level Architecture
● Hekaton Storage Engine
● Manages user data and indexes
● Base mechanism for storage, check-pointing and high-availability
● Hekaton Compiler
● Abstract tree representation of T-SQL stored procedure
● Compiles the procedure into native code
● Hekaton Runtime System
● Integration with SQL Server resources
● Common library of additional functionality
10. Storage and Indexing
● Two types of Index
● Hash Index: Lock-free hash tables
● Range Index: Bw-trees
● Use of Multiversioning – Updates create new version
● Reads:
● Read operation specifies a logical read time and only versions whose valid
time overlaps the read time are visible to the read
● At most one version is visible
● Updates:
● Delete Old - Insert New
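The read and update rules above can be sketched in a few lines of Python. This is a minimal illustration, not Hekaton's implementation: the names Version, VersionedRow, begin and end are hypothetical, and treating the valid time as a half-open interval (begin inclusive, end exclusive) is an assumption made here so that exactly one version matches at a boundary.

```python
from dataclasses import dataclass
from typing import List, Optional

INFINITY = float("inf")  # end timestamp of a version that is still current


@dataclass
class Version:
    payload: dict
    begin: float            # logical time at which the version became valid
    end: float = INFINITY   # logical time at which it stopped being valid


class VersionedRow:
    """All versions of one logical row, oldest first (hypothetical layout)."""

    def __init__(self) -> None:
        self.versions: List[Version] = []

    def insert(self, payload: dict, commit_time: float) -> None:
        self.versions.append(Version(payload, begin=commit_time))

    def update(self, payload: dict, commit_time: float) -> None:
        # "Delete old": close the valid time of the current version.
        self.versions[-1].end = commit_time
        # "Insert new": the new version becomes valid at the same instant.
        self.versions.append(Version(payload, begin=commit_time))

    def read(self, read_time: float) -> Optional[dict]:
        # Valid times never overlap, so at most one version can match.
        for version in self.versions:
            if version.begin <= read_time < version.end:
                return version.payload
        return None
```

A row inserted at time 10 and updated at time 20 keeps both versions: a read at time 15 sees the old payload, a read at time 25 sees the new one, and a read at time 5 sees nothing.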
13. Programmability and Query Processing
● Compile-once Execute-many-times
● High level of language compatibility
● Reuse of SQL Server T-SQL compilation stack
● Output of Hekaton compiler is C code
● Invoking the compiler:
● During creation of a memory optimized table
● During creation of a compiled stored procedure
14. Schema Compilation
● Hekaton storage engine treats records as opaque objects
● Hekaton compiler provides the engine with customized callback
functions for each table
● Task of Callback functions
● Computing a hash function on a key or record
● Comparing two records
● Serializing a record into a log buffer
● Callback functions are compiled into Native code which makes index
operations extremely efficient
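The division of labour between engine and compiler can be illustrated with a small Python sketch. Everything here is hypothetical: the record layout (a 4-byte integer key followed by a payload), the names TableCallbacks and OpaqueHashIndex, and the length-prefixed log format are assumptions for illustration only. The point is that the index code never inspects record bytes itself; it only calls the per-table functions.

```python
import struct
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class TableCallbacks:
    """Per-table functions, standing in for what a schema compiler would emit."""
    hash_key: Callable[[bytes], int]
    compare: Callable[[bytes, bytes], int]
    serialize: Callable[[bytes], bytes]


def make_callbacks_for_int_key_table() -> TableCallbacks:
    # Assumed layout: 4-byte little-endian signed integer key, then payload.
    def key_of(record: bytes) -> int:
        return struct.unpack_from("<i", record, 0)[0]

    def hash_key(record: bytes) -> int:
        return hash(key_of(record))

    def compare(a: bytes, b: bytes) -> int:
        ka, kb = key_of(a), key_of(b)
        return (ka > kb) - (ka < kb)  # -1, 0 or 1

    def serialize(record: bytes) -> bytes:
        # Assumed log format: 4-byte length prefix followed by the record.
        return struct.pack("<I", len(record)) + record

    return TableCallbacks(hash_key, compare, serialize)


class OpaqueHashIndex:
    """Engine-side index that treats records as opaque byte strings."""

    def __init__(self, callbacks: TableCallbacks, bucket_count: int = 8) -> None:
        self.cb = callbacks
        self.buckets: List[List[bytes]] = [[] for _ in range(bucket_count)]

    def _bucket(self, record: bytes) -> List[bytes]:
        return self.buckets[self.cb.hash_key(record) % len(self.buckets)]

    def insert(self, record: bytes) -> None:
        self._bucket(record).append(record)

    def lookup(self, probe: bytes) -> List[bytes]:
        return [r for r in self._bucket(probe) if self.cb.compare(r, probe) == 0]
```

In the real system the callbacks are native code rather than Python closures, which is where the efficiency claim comes from; the sketch only shows the interface boundary.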
15. Compiled Stored Procedure
● Compatibility issues between T-SQL and C datatypes
● Problem Solver:
● MAT (Mixed Abstract Tree)
● PIT (Pure Imperative Tree)
● Each operator implements a common interface so that they can be
composed into arbitrarily complex plans
● Entire query plan is collapsed into a single function using labels and gotos
● Supports both blocking and non-blocking operators
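The idea of a common operator interface can be sketched in Python. This is an illustration of composition only: Python has no goto, so the sketch uses ordinary method calls where the generated code described above uses labels and gotos inside one function. The open()/next() interface and the Scan, Filter and Sort classes are hypothetical names chosen for the example.

```python
from typing import Callable, Dict, Iterable, List, Optional

Row = Dict[str, int]


class Operator:
    """Common interface: open() prepares the operator, next() returns a row or None."""

    def open(self) -> None:
        raise NotImplementedError

    def next(self) -> Optional[Row]:
        raise NotImplementedError


class Scan(Operator):
    def __init__(self, rows: Iterable[Row]) -> None:
        self._rows = list(rows)
        self._pos = 0

    def open(self) -> None:
        self._pos = 0

    def next(self) -> Optional[Row]:
        if self._pos >= len(self._rows):
            return None
        row = self._rows[self._pos]
        self._pos += 1
        return row


class Filter(Operator):
    """Non-blocking: passes qualifying rows through one at a time."""

    def __init__(self, child: Operator, predicate: Callable[[Row], bool]) -> None:
        self._child = child
        self._predicate = predicate

    def open(self) -> None:
        self._child.open()

    def next(self) -> Optional[Row]:
        row = self._child.next()
        while row is not None and not self._predicate(row):
            row = self._child.next()
        return row


class Sort(Operator):
    """Blocking: consumes its whole input before returning the first row."""

    def __init__(self, child: Operator, key: Callable[[Row], int]) -> None:
        self._child = child
        self._key = key
        self._sorted = Scan([])

    def open(self) -> None:
        self._child.open()
        buffered: List[Row] = []
        row = self._child.next()
        while row is not None:
            buffered.append(row)
            row = self._child.next()
        self._sorted = Scan(sorted(buffered, key=self._key))
        self._sorted.open()

    def next(self) -> Optional[Row]:
        return self._sorted.next()


def run(plan: Operator) -> List[Row]:
    """Drive a plan to completion and collect its output rows."""
    plan.open()
    out: List[Row] = []
    row = plan.next()
    while row is not None:
        out.append(row)
        row = plan.next()
    return out
```

Because every operator exposes the same two methods, a plan such as Sort(Filter(Scan(rows), ...), ...) nests to any depth without the operators knowing about each other.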
16. Example
Fig.1: Sample T-SQL Procedure
Fig.2: Query Plan
Fig.3: Operator interconnections for Sample Procedure
17. Query Interop
● Restrictions of Compiled Stored Procedures
● Supports limited set of options
● Stored procedures must execute in a predefined security context
● Must execute in the context of a single transaction
● Ad-hoc mechanism that enables conventional query execution engine
to access memory optimized tables
● Features
● Import and Export for memory optimized tables
● Ad-hoc queries and data repair support
● Support for transactions that access both kinds of tables
● Ease of app migration
18. Transaction Management
● Hekaton utilizes optimistic multiversion concurrency control (MVCC)
to provide snapshot, repeatable read and serializable transaction
isolation without locking
● Serializable – guarantees that a transaction would see exactly the same
data if all its reads were repeated at the end of the transaction
● Properties to ensure serializability:
● Read stability
● Phantom avoidance
● Timestamps are used to specify
● Valid Time
● Logical Read Time
● Commit/End time
● Version visible if Begin Time < Read Time < End Time
19. Transaction Commit Processing
● Validation and Dependencies
● Obtain End timestamp
● Validate for Read Stability and Phantom Avoidance
● Commit Dependency
● Dependency counter
● Read barrier
● Commit Logging and Post-Processing
● Changes to database are logged to transaction log
● Update versions with end timestamp of transactions
● Transaction Rollback
● Invalidate all versions created by the transaction using Write Sets.
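The validation step can be sketched in Python under simplifying assumptions. The names Transaction, read_set, scan_set and validate are hypothetical, scans are modelled as plain predicates over a list of versions, and a transaction's own writes and commit dependencies are ignored. Read stability checks that every version read is still visible at the end timestamp; phantom avoidance repeats each scan and rejects the commit if a version has appeared that was not visible when the transaction began.

```python
from dataclasses import dataclass, field
from typing import Callable, List

INFINITY = float("inf")


@dataclass
class Version:
    key: int
    begin: float
    end: float = INFINITY


def visible(version: Version, time: float) -> bool:
    # Assumed half-open valid time: begin inclusive, end exclusive.
    return version.begin <= time < version.end


@dataclass
class Transaction:
    begin_ts: float
    read_set: List[Version] = field(default_factory=list)
    scan_set: List[Callable[[Version], bool]] = field(default_factory=list)


def validate(txn: Transaction, end_ts: float, store: List[Version]) -> bool:
    """Return True if txn may commit at end_ts under serializable isolation."""
    # Read stability: every version that was read must still be visible.
    for version in txn.read_set:
        if not visible(version, end_ts):
            return False
    # Phantom avoidance: repeating a scan must not find new versions.
    for predicate in txn.scan_set:
        for version in store:
            is_new = visible(version, end_ts) and not visible(version, txn.begin_ts)
            if predicate(version) and is_new:
                return False
    return True
```

If validation fails, the transaction is rolled back and the versions it created are invalidated, as in the last bullet above.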
20. Transaction Durability
● Uses transaction logs and checkpoints to ensure durability
● Integrated with Always-On component that maintains highly available
replicas
● Data on external storage consists of –
● Log streams (logical effects of committed transactions, sufficient to redo them)
● Checkpoint streams (compressed representation of the log), of two types:
● Data streams (all versions inserted during a timestamp interval)
● Delta streams (a dense list of integers identifying the deleted versions of the
corresponding data stream)
● Note: Index operations are not logged; They are reconstructed on
recovery.
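How a data stream and its delta stream combine during recovery can be sketched as follows. This is a simplified Python illustration, not the on-disk format: identifying a version by a small integer id stored with it, and using a dict as the rebuilt index, are assumptions made for the example. It also shows why index operations need not be logged, since the index is reconstructed while the surviving versions are loaded.

```python
from typing import Dict, Iterable, List, Tuple

# Assumed data-stream entry: (version id, key, payload).
DataEntry = Tuple[int, str, int]


def recover(data_stream: Iterable[DataEntry],
            delta_stream: Iterable[int]) -> Dict[str, int]:
    """Rebuild an in-memory index from one data stream and its delta stream."""
    deleted = set(delta_stream)  # version ids that were later deleted
    index: Dict[str, int] = {}
    for version_id, key, payload in data_stream:
        if version_id in deleted:
            continue  # filtered out: this version was deleted
        index[key] = payload  # index entry is recreated, never read from a log
    return index
```

With a data stream holding versions 0, 1 and 2 and a delta stream listing version 0, only versions 1 and 2 are loaded and indexed.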
21. Transaction Logging and Checkpoints
● Transaction Logging
● Log records are generated only at transaction commit
● Does not use WAL (Write-ahead logging)
● Uses a single log stream per database
● Checkpoints
● Continuous Checkpointing
● Streaming I/O
● Checkpoint Files and Checkpoint Process
● Recovery
● Parallelism within Hekaton
● Parallelism between SQL Server and Hekaton
22. Garbage Collection
● Version of a record is garbage if it is no longer visible to any active
transaction
● Properties of GC subsystem: Non-blocking, co-operative, incremental,
parallelizable and scalable
● Garbage Correctness
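● Version whose end timestamp is earlier than the begin timestamp of the oldest
active transaction is not visible to any transaction
● Version becomes garbage if -
● Deleted (explicit DELETE or through UPDATE) by a committed transaction and
no longer readable or usable by any active transaction
● Created by a transaction that subsequently rolled back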
● Garbage Removal
● Unlink from indexes
● Reclaim the version
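The correctness rule and the two removal steps can be sketched in Python. The sketch is single-threaded and therefore says nothing about the non-blocking, cooperative or parallel properties listed above; the names Version and collect_garbage and the dict-of-lists index are hypothetical. A version is treated as garbage when its end timestamp is earlier than the begin timestamp of the oldest active transaction.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List

INFINITY = float("inf")


@dataclass
class Version:
    key: int
    begin: float
    end: float = INFINITY  # INFINITY means the version is still current


def collect_garbage(index: Dict[int, List[Version]],
                    active_begin_timestamps: Iterable[float]) -> List[Version]:
    """Unlink garbage versions from the index and return them for reclamation."""
    # With no active transactions, every ended version is garbage.
    oldest_active = min(active_begin_timestamps, default=INFINITY)
    reclaimed: List[Version] = []
    for key in list(index):
        keep: List[Version] = []
        for version in index[key]:
            if version.end < oldest_active:
                reclaimed.append(version)  # step 1: unlink from the index
            else:
                keep.append(version)
        if keep:
            index[key] = keep
        else:
            del index[key]
    # Step 2: the caller reclaims the memory; in Python, dropping the
    # returned list is enough for the runtime to free the objects.
    return reclaimed
```

A version that ended at time 10 is removed when the oldest active transaction began at time 20, while a version that ended at time 30 is kept because that transaction may still read it.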
23. Experimental Results - CPU Efficiency
Fig.: Comparison of CPU efficiency for lookups
Fig.: Comparison of CPU efficiency for updates
24. Experimental Results - Scaling Under Contention
● Experiment illustrating scalability of Hekaton engine
25. Conclusion
● In-memory database engine from Microsoft, optimized for OLTP
workloads
● Fully integrated with SQL Server
● Uses latch-free data structures, multiversion concurrency control, and
compiled T-SQL stored procedures
● Ensures durability through logging and checkpointing
● High availability – SQL Server’s Always-On feature
● Order of magnitude improvement in efficiency and scalability with
minimal changes to user applications.