Microsoft Hekaton

HEKATON
SQL Server’s Memory-Optimized OLTP Engine
Presented by: Prutha Date and Siraj Memon

Outline
● Introduction
● Design Considerations
● High-Level Architecture
● Storage and Indexing
● Programmability and Query Processing
● Transaction Management and Logging
● Garbage Collection
● Experimental Results
● Conclusion
● Demo

Introduction
● Database Engine: Optimized for Memory-resident data
● Targeted for OLTP workloads
● Integrated into SQL Server and uses T-SQL
● Fully transactional and durable
● Table access routines and stored procedures are compiled into machine code
● Two Index Types: Hash Index and Range Index
● High level of concurrency
OLTP (Online Transaction Processing)
T-SQL (Transact-SQL)

Terminology
● Hekaton Table
● Hekaton Index
● Regular Table
● Regular Index
● Compiled Stored Procedure
● Interpreted Stored Procedure

Competitors
● Commercial
● VoltDB
● SAP in-memory computing
● Oracle TimesTen
● IBM SolidDB
● Research
● Hyrise
● H-store
● HyPer

Architectural Principles
● Optimize Indexes for main memory
● Uses lock-free hash tables and Bw-trees for optimized indexing
● Index operations not logged
● Rebuilding indexes during recovery
● Eliminate Latches and Locks
● Latch-free data structure – No latches or spinlocks
● Optimistic Multi-version concurrency control – transaction isolation
● Compile requests to native code
● Decisions: Compile time rather than Runtime
● Converts statements in T-SQL into customized, highly-efficient machine
code

Partitioning: Why Hekaton Avoids It
● Problem with Partitioning
● A lookup through a secondary index on a non-partitioning key must be sent to every partition
● Works great ONLY if workload is also partitionable
● Not sufficiently robust for SQL Server
● Any thread can access any part of the database
● Single Shared hash table

High Level Architecture
● Hekaton Storage Engine
● Manages user data and indexes
● Base mechanisms for storage, checkpointing and high availability
● Hekaton Compiler
● Takes an abstract tree representation of a T-SQL stored procedure as input
● Compiles the procedure into native code
● Hekaton Runtime System
● Integration with SQL Server resources
● Common library of additional functionality

Storage and Indexing
● Two types of Index
● Hash Index: Lock-free hash tables
● Range Index: Bw-trees
● Use of multiversioning: an update creates a new version of the record
● Reads:
● Read operation specifies a logical read time and only versions whose valid
time overlaps the read time are visible to the read
● At most one version is visible
● Updates:
● Delete Old - Insert New
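
The read and update rules above can be sketched in a few lines of Python. This is only an illustration, not Hekaton's implementation: the `Version` class, the function names and the half-open valid-time interval [begin, end) are assumptions made for the sketch, and a plain list stands in for the version chain that an index would point to.

```python
from dataclasses import dataclass
from typing import List, Optional

INFINITY = float("inf")  # end timestamp of a version that is still current


@dataclass
class Version:
    payload: str
    begin: float            # commit time of the transaction that created it
    end: float = INFINITY   # commit time of the transaction that replaced it


def visible_version(chain: List[Version], read_time: float) -> Optional[Version]:
    """Return the single version whose valid time contains read_time, if any."""
    for v in chain:
        if v.begin <= read_time < v.end:
            return v
    return None


def update(chain: List[Version], new_payload: str, commit_time: float) -> None:
    """Delete old, insert new: close the current version, append its successor."""
    current = visible_version(chain, commit_time)
    if current is not None:
        current.end = commit_time
    chain.append(Version(new_payload, begin=commit_time))


chain = [Version("balance=100", begin=10)]
update(chain, "balance=150", commit_time=20)
```

A reader with logical read time 15 still sees "balance=100", while a reader with read time 25 sees "balance=150"; neither blocks the other, because the old version is never overwritten in place.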

Storage and Indexing (continued)

Architecture of Hekaton Compiler

Programmability and Query Processing
● Compile-once Execute-many-times
● High level of language compatibility
● Reuse of SQL Server T-SQL compilation stack
● Output of Hekaton compiler is C code
● Invoking the compiler:
● During creation of a memory optimized table
● During creation of a compiled stored procedure

Schema Compilation
● Hekaton storage engine treats records as opaque objects
● Hekaton compiler provides the engine with customized callback
functions for each table
● Task of Callback functions
● Computing a hash function on a key or record
● Comparing two records
● Serializing a record into a log buffer
● Callback functions are compiled into Native code which makes index
operations extremely efficient
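
As a rough illustration of the callback idea, the sketch below keeps the index code ignorant of the record layout and hands it three per-table functions. Everything here is hypothetical: the `TableCallbacks` and `OpaqueHashIndex` names, the record layout (a 4-byte little-endian integer key followed by a payload) and the bucket count are assumptions for the example, and Python closures stand in for the generated native code.

```python
import struct
from typing import Callable, List, NamedTuple


class TableCallbacks(NamedTuple):
    hash_key: Callable[[bytes], int]        # hash of the record's key
    compare: Callable[[bytes, bytes], int]  # <0, 0, >0 like a C comparator
    serialize: Callable[[bytes], bytes]     # record -> log buffer bytes


def make_record(key: int, payload: bytes) -> bytes:
    return struct.pack("<i", key) + payload


def make_callbacks_for_int_key_table() -> TableCallbacks:
    """Stand-in for what a schema compiler would generate for one table."""
    def key_of(record: bytes) -> int:
        return struct.unpack_from("<i", record, 0)[0]

    def hash_key(record: bytes) -> int:
        return key_of(record) & 0xFFFFFFFF

    def compare(a: bytes, b: bytes) -> int:
        ka, kb = key_of(a), key_of(b)
        return (ka > kb) - (ka < kb)

    def serialize(record: bytes) -> bytes:
        return struct.pack("<I", len(record)) + record  # length-prefixed

    return TableCallbacks(hash_key, compare, serialize)


class OpaqueHashIndex:
    """Engine-side index: it never inspects record bytes itself."""

    def __init__(self, callbacks: TableCallbacks, buckets: int = 8) -> None:
        self.cb = callbacks
        self.buckets: List[List[bytes]] = [[] for _ in range(buckets)]

    def insert(self, record: bytes) -> None:
        slot = self.cb.hash_key(record) % len(self.buckets)
        self.buckets[slot].append(record)

    def lookup(self, probe: bytes) -> List[bytes]:
        slot = self.cb.hash_key(probe) % len(self.buckets)
        return [r for r in self.buckets[slot] if self.cb.compare(r, probe) == 0]


callbacks = make_callbacks_for_int_key_table()
index = OpaqueHashIndex(callbacks)
index.insert(make_record(7, b"alice"))
index.insert(make_record(15, b"bob"))   # lands in the same bucket as key 7
found = index.lookup(make_record(7, b""))
```

The index resolves the bucket collision purely through the `compare` callback, which is the point of the design: the generic engine stays schema-agnostic while the per-table code knows the exact layout.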

Compiled Stored Procedure
● Compatibility issues between T-SQL and C datatypes
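● Solution: two intermediate representations
● MAT (Mixed Abstract Tree): unified representation of metadata, imperative logic, expressions and query plans
● PIT (Pure Imperative Tree): simpler representation that is easy to convert to C code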
● Each operator implements a common interface so that they can be
composed into arbitrarily complex plans
● The entire query plan is collapsed into a single function using labels and gotos
● Supports both blocking and non-blocking operators
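
The composability point can be illustrated with a small operator sketch. This is not the generated code: Hekaton emits a single C function with labels and gotos, whereas the version below uses ordinary Python method calls, and the `open`/`next` interface names are assumptions chosen for the example. It does show a non-blocking operator (`Filter`) and a blocking one (`Sort`) behind the same interface.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, int]


class Operator:
    """Common interface: open() resets state, next() returns a row or None."""

    def open(self) -> None:
        raise NotImplementedError

    def next(self) -> Optional[Row]:
        raise NotImplementedError


class Scan(Operator):
    def __init__(self, rows: List[Row]) -> None:
        self.rows = rows
        self.pos = 0

    def open(self) -> None:
        self.pos = 0

    def next(self) -> Optional[Row]:
        if self.pos >= len(self.rows):
            return None
        row = self.rows[self.pos]
        self.pos += 1
        return row


class Filter(Operator):
    """Non-blocking: passes qualifying rows through one at a time."""

    def __init__(self, child: Operator, predicate: Callable[[Row], bool]) -> None:
        self.child = child
        self.predicate = predicate

    def open(self) -> None:
        self.child.open()

    def next(self) -> Optional[Row]:
        while True:
            row = self.child.next()
            if row is None or self.predicate(row):
                return row


class Sort(Operator):
    """Blocking: consumes its whole input before returning the first row."""

    def __init__(self, child: Operator, key: Callable[[Row], int]) -> None:
        self.child = child
        self.key = key
        self.buffer: List[Row] = []
        self.pos = 0

    def open(self) -> None:
        self.child.open()
        self.buffer = []
        row = self.child.next()
        while row is not None:
            self.buffer.append(row)
            row = self.child.next()
        self.buffer.sort(key=self.key)
        self.pos = 0

    def next(self) -> Optional[Row]:
        if self.pos >= len(self.buffer):
            return None
        row = self.buffer[self.pos]
        self.pos += 1
        return row


def run(plan: Operator) -> List[Row]:
    plan.open()
    out: List[Row] = []
    row = plan.next()
    while row is not None:
        out.append(row)
        row = plan.next()
    return out


rows = [{"id": 3, "qty": 5}, {"id": 1, "qty": 0}, {"id": 2, "qty": 7}]
plan = Sort(Filter(Scan(rows), lambda r: r["qty"] > 1), key=lambda r: r["id"])
result = run(plan)
```

Because every operator exposes the same two methods, `Sort(Filter(Scan(...)))` can be nested to any depth; a compiler that knows the full plan can then flatten these calls into jumps within one function.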

Example
Fig.1: Sample T-SQL Procedure
Fig.2: Query Plan
Fig.3: Operator interconnections for Sample Procedure

Query Interop
● Restrictions of Compiled Stored Procedures
● Supports limited set of options
● Stored procedures must execute in a predefined security context
● Must execute in the context of a single transaction
● Ad-hoc mechanism that enables conventional query execution engine
to access memory optimized tables
● Features
● Import and Export for memory optimized tables
● Ad-hoc queries and data repair support
● Support for transactions that access both kinds of tables
● Ease of app migration

Transaction Management
● Hekaton utilizes optimistic multiversion concurrency control (MVCC)
to provide snapshot, repeatable read and serializable transaction
isolation without locking
● Serializable: guarantees that a transaction would see exactly the same
data if all its reads were repeated at the end of the transaction
● Properties to ensure serializability:
● Read stability
● Phantom avoidance
● Timestamps are used to specify
● Valid Time
● Logical Read Time
● Commit/End time
● Version visible if Begin Time < Read Time < End Time

Transaction Commit Processing
● Validation and Dependencies
● Obtain End timestamp
● Validate for Read Stability and Phantom Avoidance
● Commit Dependency
● Dependency counter
● Read barrier
● Commit Logging and Post-Processing
● Changes to database are logged to transaction log
● Update versions with end timestamp of transactions
● Transaction Rollback
● Invalidate all versions created by the transaction using Write Sets.
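
A minimal sketch of the validation step follows, assuming a transaction records the versions it read (`read_set`) and the predicates it scanned with (`scan_set`). These names, the half-open visibility interval and the use of a flat list as the table are assumptions for illustration; the sketch also ignores the transaction's own writes and commit dependencies.

```python
from dataclasses import dataclass, field
from typing import Callable, List

INFINITY = float("inf")


@dataclass
class Version:
    key: int
    begin: float
    end: float = INFINITY


def visible(v: Version, t: float) -> bool:
    return v.begin <= t < v.end


@dataclass
class Transaction:
    begin_ts: float
    read_set: List[Version] = field(default_factory=list)
    scan_set: List[Callable[[Version], bool]] = field(default_factory=list)


def validate(txn: Transaction, table: List[Version], end_ts: float) -> bool:
    """Return True if repeating all reads at end_ts would give the same result."""
    # Read stability: every version that was read is still visible at end_ts.
    if any(not visible(v, end_ts) for v in txn.read_set):
        return False
    # Phantom avoidance: repeat each scan and look for versions that are
    # visible at end_ts but were not visible when the transaction began.
    for predicate in txn.scan_set:
        for v in table:
            if predicate(v) and visible(v, end_ts) and not visible(v, txn.begin_ts):
                return False
    return True


def small_keys(v: Version) -> bool:
    return v.key < 10


v1 = Version(key=1, begin=5)
table = [v1]
txn = Transaction(begin_ts=10, read_set=[v1], scan_set=[small_keys])

ok_without_conflict = validate(txn, table, end_ts=20)

table.append(Version(key=2, begin=15))          # concurrent insert: a phantom
ok_after_phantom = validate(txn, table, end_ts=20)

table.pop()
v1.end = 15                                     # concurrent update of v1
ok_after_update = validate(txn, table, end_ts=20)
```

The first call succeeds, the second fails on phantom avoidance, and the third fails on read stability. A transaction that fails validation would be rolled back rather than made to wait, which is what makes the scheme optimistic.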

Transaction Durability
● Uses transaction logs and checkpoints to ensure durability
● Integrated with Always-On component that maintains highly available
replicas
● Data on external storage consists of –
● Log streams (logical effects of committed transactions, sufficient to redo them)
● Checkpoint streams (Compressed representation of the log)
● Data Stream (all inserted versions during a timestamp interval)
● Delta Stream (a dense list of integers identifying deleted versions for its
corresponding data stream)
● Note: Index operations are not logged; indexes are reconstructed on
recovery.
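
How a data stream and its delta stream combine during recovery can be sketched as follows. The tuple layout `(version_id, key, payload)`, the `recover` function and the use of a dict as the rebuilt index are assumptions for the example; replaying the tail of the log after the checkpoint is left out.

```python
from typing import Dict, List, Tuple

DataFile = List[Tuple[int, str, str]]   # (version_id, key, payload) inserts
DeltaFile = List[int]                   # ids of versions deleted later


def recover(pairs: List[Tuple[DataFile, DeltaFile]]) -> Dict[str, str]:
    """Reload live versions and rebuild the index from data/delta pairs."""
    index: Dict[str, str] = {}
    for data_file, delta_file in pairs:
        deleted = set(delta_file)        # the delta acts as a filter
        for version_id, key, payload in data_file:
            if version_id not in deleted:
                index[key] = payload     # index entry rebuilt, never logged
    return index


# Interval 1: versions 1 and 2 inserted; version 1 was later replaced.
pair1 = ([(1, "a", "a-old"), (2, "b", "b-v1")], [1])
# Interval 2: version 3 is the update of "a"; nothing deleted yet.
pair2 = ([(3, "a", "a-new")], [])

recovered = recover([pair1, pair2])
```

Here the old version of "a" is skipped because its id appears in the delta list, so only live versions are loaded and indexed. Since each pair is independent, different pairs could in principle be loaded in parallel.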

Transaction Logging and Checkpoints
● Transaction Logging
● One log record per transaction, generated only at commit time
● Does not use WAL (Write-ahead logging)
● Uses a single log stream per database
● Checkpoints
● Continuous Checkpointing
● Streaming I/O
● Checkpoint Files and Checkpoint Process
● Recovery
● Parallelism within Hekaton
● Parallelism between SQL Server and Hekaton

Garbage Collection
● Version of a record is garbage if it is no longer visible to any active
transaction
● Properties of GC subsystem: Non-blocking, co-operative, incremental,
parallelizable and scalable
● Garbage Correctness
● A version whose end timestamp is earlier than the begin timestamp of the
oldest active transaction is not visible to any transaction
● Version becomes garbage if -
●Deleted (Explicit DELETE or through UPDATE)
●Cannot be read or acted upon by any active transaction
●Transaction Rollback
● Garbage Removal
● Unlink from indexes
● Reclaim the version
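
The garbage test and the two removal steps can be sketched as below. The `collect_garbage` function, the dict-of-lists index and the single-threaded sweep are assumptions for illustration; the actual collector is cooperative and runs in parallel, and versions left behind by rolled-back transactions are not modelled here.

```python
from dataclasses import dataclass
from typing import Dict, List

INFINITY = float("inf")


@dataclass
class Version:
    payload: str
    begin: float
    end: float = INFINITY


def collect_garbage(index: Dict[str, List[Version]],
                    active_begin_ts: List[float]) -> int:
    """Unlink and reclaim versions that no active transaction can see."""
    oldest_active = min(active_begin_ts, default=INFINITY)
    reclaimed = 0
    for key in list(index):
        chain = index[key]
        # A version is garbage once it ended before the oldest active
        # transaction began; current versions have end == INFINITY.
        live = [v for v in chain if v.end >= oldest_active]
        reclaimed += len(chain) - len(live)
        if live:
            index[key] = live            # unlink garbage from the index
        else:
            del index[key]               # nothing left under this key
    return reclaimed


index = {
    "a": [Version("a1", begin=1, end=5),
          Version("a2", begin=5, end=12),
          Version("a3", begin=12)],
}

first_pass = collect_garbage(index, active_begin_ts=[10, 14])
second_pass = collect_garbage(index, active_begin_ts=[])
```

In the first pass only "a1" is reclaimed, because the transaction that began at time 10 can still read "a2". Once no transaction is active, "a2" becomes garbage as well and only the current version remains.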

Experimental Results - CPU Efficiency
Comparison of CPU efficiency for lookups
Comparison of CPU efficiency for updates

Experimental Results - Scaling Under Contention
● Experiment illustrating scalability of Hekaton engine

Conclusion
● Microsoft's database engine optimized for in-memory OLTP workloads
● Fully integrated with SQL Server
● Uses latch-free data structures, optimistic multiversion concurrency control,
and compiled T-SQL stored procedures
● Ensures durability by logging and checkpointing
● High availability – SQL Server’s Always-On feature
● Order of magnitude improvement in efficiency and scalability with
minimal changes to user applications.

References
● http://vldb.org/pvldb/vol5/p298_per-akelarson_vldb2012.pdf
● http://nms.csail.mit.edu/~stavros/pubs/OLTP_sigmod08.pdf
● http://www.cs.cmu.edu/~pavlo/courses/fall2013/static/papers/edbt09shoremt.pdf
● http://research.microsoft.com/pubs/178758/bw-tree-icde2013-final.pdf
● https://voltdb.com/
● http://llvm.org/
● http://www.oracle.com/technetwork/database/database-technologies/timesten/overview/index.html
