Impact Of Column Oriented Main Memory Databases On Enterprise Applications
2. Disclaimer
This presentation outlines our general product direction and should not be
relied on in making a purchase decision. This presentation is not subject to
your license agreement or any other agreement with SAP. SAP has no
obligation to pursue any course of business outlined in this presentation or to
develop or release any functionality mentioned in this presentation. This
presentation and SAP's strategy and possible future developments are
subject to change and may be changed by SAP at any time for any reason
without notice. This document is provided without a warranty of any kind,
either express or implied, including but not limited to, the implied warranties
of merchantability, fitness for a particular purpose, or non-infringement. SAP
assumes no responsibility for errors or omissions in this document, except if
such damages were caused by SAP intentionally or grossly negligent.
© SAP & HPI 2009 / SAP TechEd 09 / Impact of Column-Oriented Main-Memory Databases on Enterprise Applications, Dr. Alexander Zeier, Matthieu-P. Schapranow, Christian Tinnefeld / BI123 / Page 2
3. Agenda
1. The Hasso Plattner Institute
2. Technical Foundation of Columnar In-Memory Databases
3. Impact on Enterprise Applications
5. Key Facts about the Hasso Plattner Institute
Founded as a public private partnership
in 1998 in Potsdam near Berlin, Germany
Institute belongs to the
University of Potsdam
Ranked 1st in “CHE”
340 B.Sc. and M.Sc. students
10 professors, 91 PhD students
Course of study: IT Systems Engineering
6. Research Group
Enterprise Platform & Integration Concepts
Prof. Dr. h.c. Hasso Plattner / Dr. Alexander Zeier
Research focuses on the technical aspects of enterprise software and
design of complex applications
Memory-Based Data Management for Enterprise Applications
Human-Centered Software Design and Engineering
Maintenance and Evolution of Service-Oriented Enterprise Software
Integration of RFID Technology in Enterprise Platforms
Architecture-based Performance Simulation
Research co-operations with
Stanford, MIT, etc.
Industry co-operations with
SAP, Siemens, Audi, etc.
Partner of Stanford Center for Design Research
Partner of MIT in Supply Chain Innovation
7. Agenda
1. The Hasso Plattner Institute
2. Technical Foundation of Columnar In-Memory Databases
3. Impact on Enterprise Applications
8. Two separate worlds: OLTP and OLAP?
                      OLTP                                    OLAP/DSS
Level of operation    Full row                                Selected attributes only
Query complexity      Simple                                  Complex
Level of detail       Row-level, e.g. entire customer record  Column-level, e.g. aggregation or group-by
Dominant operation    INSERT, UPDATE, and SELECT              Mainly SELECT
Transaction duration  Short running                           Long running
Size of result set    Small                                   Large
Query forecast        Pre-determined                          Ad hoc
Processing            Real-time updates                       Batch updates
10. 3 Aspects for a Hybrid Solution
Columnar Storage
New database layout accessing only needed portions of data
Improve access for subsets of attributes
In-Memory
Fastest possible data access
Spatial proximity
Compression
Reduce amount of data to fit in main memory
Use cache and bus capacities more efficiently
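The columnar idea above can be illustrated with a minimal sketch. It is not the presenters' implementation; the table contents and the names `rows`, `columns`, `sum_row_layout`, and `sum_column_layout` are hypothetical and chosen only to contrast the two layouts.

```python
# Hypothetical sales table, stored two ways.
rows = [
    # (order_id, customer, amount)
    (1, "Alpha", 100),
    (2, "Beta", 250),
    (3, "Alpha", 50),
]

# Column layout: one contiguous list per attribute.
columns = {
    "order_id": [r[0] for r in rows],
    "customer": [r[1] for r in rows],
    "amount": [r[2] for r in rows],
}


def sum_row_layout(table):
    """Aggregate over rows: every full tuple is visited to read one field."""
    return sum(row[2] for row in table)


def sum_column_layout(cols):
    """Aggregate over one column: only the needed attribute is read."""
    return sum(cols["amount"])
```

Both functions return the same total; the difference is how much unrelated data has to be walked past to get there, which is the access pattern the slide refers to.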
11. Columnar Storage: Architecture
Claim: Columnar storage is suited for update-intensive applications
12. In-Memory: Aggregate Processing Time
The value of an attribute changes by calculation
13. Compression: Types
Ordered, few distinct values: sequence of triples (value, offset position, # occurrences)
Ordered, many distinct values: delta representation
Unordered, few distinct values: sequence of tuples (value, bitmap for positional occurrence)
Unordered, many distinct values: ?
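The three named schemes can be sketched as follows. This is an illustrative reading of the slide, not the encoding used in the presenters' system; all function names are hypothetical, and the case the slide leaves open (unordered, many distinct values) is deliberately not covered.

```python
def rle_encode(values):
    """Ordered, few distinct values: (value, offset, occurrences) triples."""
    triples = []
    for pos, v in enumerate(values):
        if triples and triples[-1][0] == v:
            value, offset, count = triples[-1]
            triples[-1] = (value, offset, count + 1)
        else:
            triples.append((v, pos, 1))
    return triples


def bitmap_encode(values):
    """Unordered, few distinct values: one positional bitmap per value."""
    bitmaps = {}
    for pos, v in enumerate(values):
        bitmaps.setdefault(v, [0] * len(values))[pos] = 1
    return bitmaps


def delta_encode(values):
    """Ordered, many distinct values: first value, then successive differences."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    """Rebuild the original sequence by running a cumulative sum."""
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total)
    return out
```

For example, `rle_encode(["DE", "DE", "DE", "US", "US"])` yields `[("DE", 0, 3), ("US", 3, 2)]`, and `delta_encode([1000, 1003, 1004, 1010])` yields `[1000, 3, 1, 6]`.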
14. Scalability: Multiple CPU Cores
Set processing is the most frequent access type in enterprise applications (EAs)
(scan is the dominant pattern)
Sequential column-wise scans show best bandwidth utilization between
CPU cores and main memory
Independence of tuples per column allows:
easy partitioning, and
parallel processing (see Hennessy [1])
Faster memory scans by improved memory bandwidth in next
generation CPUs
Neither materialized views nor aggregates:
everything is calculated on the fly
[1] John L. Hennessy, David A. Patterson: Computer Architecture: A Quantitative Approach
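The partition-then-scan pattern described on this slide can be sketched as below. The names `partition`, `scan_sum`, and `parallel_scan_sum` are hypothetical. Note the assumption: Python threads do not give real CPU parallelism for this kind of loop, so the sketch shows only the structure (independent chunks, partial results, final combine), not the bandwidth behaviour the slide claims.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(column, parts):
    """Split a column into at most `parts` contiguous chunks."""
    size = max(1, -(-len(column) // parts))  # ceiling division
    return [column[i:i + size] for i in range(0, len(column), size)]


def scan_sum(chunk, predicate):
    """Sequentially scan one chunk and sum the matching values."""
    return sum(v for v in chunk if predicate(v))


def parallel_scan_sum(column, predicate, workers=4):
    """Scan the partitions independently, then combine the partial sums."""
    chunks = partition(column, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(lambda c: scan_sum(c, predicate), chunks)
        return sum(partials)
```

Because each value in a column can be tested without looking at any other value, the chunks need no coordination until the final `sum`, which is what makes the partitioning "easy" in the sense of the slide.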
15. Myth 1: Adapting existing databases leverages
column-oriented performance improvement
[Diagram: Traditional vs. Column-Oriented layer stacks; labels: Application Cache, Database Cache, Pre-Built Aggregates, Raw Data, + Stored Procedures, + Mathematical Algorithms]
Neither application nor database caches are necessary
Redundant data objects are eliminated
Neither indices nor aggregates need to be maintained
Number of layers is minimized
No updates
Application logic is adjacent to raw data
No database locks required
Data movements are minimized
Sustain use of existing resources
16. Myth 2: The entire set of business data does
not fit into main memory
[Diagram labels: SCM, SRM, CRM, FI, etc.]
Use cumulated memory capacity of various blades
Only few columns have many different attribute values
Up to ten times higher compression possible
Only relevant data in memory
Partitioning across hardware
Redundancy-free data
17. Myth 3: Update/Insert of Huge Amounts of
Data Degrades Columnar Performance
[Diagram labels: Traditional Storing, Columnar Storing, Updates, Insert, Insert Only]
Our research activities at the HPI in Potsdam showed:
Updates are performed rarely
Only very few columns are affected by updates
Further insights available at SAP TechEd 2009 HPI booth.
18. Agenda
1. The Hasso Plattner Institute
2. Technical Foundation of Columnar In-Memory Databases
3. Impact on Enterprise Applications
19. Architecture of Existing Financials Systems
20. Architecture of Simplified Financials Systems
Only base tables and algorithms
21. Analyzing Real Customer Data
       Customer1  Customer2  Customer3  Customer4
BKPF   23M        20M        13M        122K
BSEG   268M       85M        28M        1M
Years  2003-2008  2004-2008  2003-2007  2008/2009
1M records in BSEG ~ 1GB disk storage
22. Accounting Document Header
[Figure panels: Customer 1, Customer 2, Customer 3, Customer 4]
99 attributes per customer
23. Value Updates
Percentage of rows updated
24. Dunning
25. Available to Promise
26. Demand Planning
27. Insert Only
Tuple visibility indicated by timestamps
(POSTGRES-style time-travel [2])
Additional storage requirements can be
neglected due to low update frequency
Timestamp columns are not compressed to avoid
additional merge costs
Snapshot isolation
Application-level locks
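The timestamp-based visibility described above can be sketched as a small append-only store. This is a simplified illustration under stated assumptions, not the presenters' design: the class `InsertOnlyTable`, its logical `clock`, and the `valid_from`/`valid_to` pair are hypothetical names, and locking and snapshot isolation are not modelled.

```python
class InsertOnlyTable:
    """Append-only store: an update adds a new version instead of overwriting."""

    def __init__(self):
        self.rows = []   # each row: [key, value, valid_from, valid_to]
        self.clock = 0   # hypothetical logical timestamp

    def put(self, key, value):
        """Insert a new version and close the previously visible one."""
        self.clock += 1
        for row in self.rows:
            if row[0] == key and row[3] is None:
                # Only the timestamp field changes; the old value is kept.
                row[3] = self.clock
        self.rows.append([key, value, self.clock, None])
        return self.clock

    def get(self, key, as_of=None):
        """Return the value visible at timestamp `as_of` (default: now)."""
        ts = self.clock if as_of is None else as_of
        for k, value, valid_from, valid_to in self.rows:
            if k == key and valid_from <= ts and (valid_to is None or ts < valid_to):
                return value
        return None
```

A short usage example: after `t1 = table.put("doc1", 100)` and `table.put("doc1", 120)`, `table.get("doc1")` returns `120` while `table.get("doc1", as_of=t1)` still returns `100`, which is the time-travel behaviour the slide mentions.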
28. Memory Consumption
Experiments show a general factor 10 in compression (using
dictionary compression and bit vector encoding)
Additional storage savings by removing materialized
aggregates, save ~2×
Keep only the active partition of the data in memory (based
on fiscal year), save ~5×
Next generation blade servers will allow up to 500GB RAM.
Arrays of 100 blades already available
50 TB main memory would cover the majority of
SAP Business Suite customers
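Dictionary compression with fixed-width codes can be sketched as follows. This is one plausible reading of "dictionary compression and bit vector encoding", not a description of the system measured in the experiments; the function names are hypothetical, and a Python integer stands in for the packed bit vector.

```python
def dictionary_encode(values):
    """Replace each value by its index in a sorted dictionary of distinct values."""
    dictionary = sorted(set(values))
    index = {v: i for i, v in enumerate(dictionary)}
    return dictionary, [index[v] for v in values]


def bits_per_code(dictionary):
    """Bits needed to address every dictionary entry (at least 1)."""
    return max(1, (len(dictionary) - 1).bit_length())


def pack(codes, width):
    """Pack fixed-width codes into one integer used as a bit vector."""
    packed = 0
    for pos, code in enumerate(codes):
        packed |= code << (pos * width)
    return packed


def unpack(packed, width, count):
    """Read `count` fixed-width codes back out of the bit vector."""
    mask = (1 << width) - 1
    return [(packed >> (pos * width)) & mask for pos in range(count)]
```

With three distinct currency codes, each entry needs only 2 bits instead of a full string, which is why columns with few distinct values compress well. The actual factor depends on the data and is not something this sketch measures.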
29. Impact on Application Development
Formalized logic must be moved close to the engine - calculations must
take place close to the data
Reduction of application code
OLTP queries must use minimal projections
(SELECT * is not allowed)
No caching necessary anymore
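The first and third points can be illustrated with a small sketch. SQLite is used here only as a convenient stand-in to show the shape of the queries; it is a row store, not a column-oriented in-memory database, and the table `line_items` with its columns is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE line_items (doc_id INTEGER, account TEXT, amount REAL, note TEXT)"
)
conn.executemany(
    "INSERT INTO line_items VALUES (?, ?, ?, ?)",
    [(1, "4000", 100.0, "a"), (1, "4000", 50.0, "b"), (2, "5000", 70.0, "c")],
)


def total_in_application(connection, account):
    """Discouraged shape: fetch full rows, then aggregate in application code."""
    rows = connection.execute("SELECT * FROM line_items").fetchall()
    return sum(r[2] for r in rows if r[1] == account)


def total_in_engine(connection, account):
    """Preferred shape: minimal projection, aggregation done by the engine."""
    cur = connection.execute(
        "SELECT SUM(amount) FROM line_items WHERE account = ?", (account,)
    )
    return cur.fetchone()[0]
```

Both functions return the same total, but the second names only the one column it needs and moves a single number to the application. In a column store, that means the unused columns are never touched.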
30. Conclusion
Technology improvements allow re-thinking of how we build
enterprise apps:
A combined OLTP and OLAP system can share the same
in-memory column store database
Our experiments with real applications and data prove it
Open research challenges:
Disaster recovery, extension for unstructured data,
life cycle based data management
31. Further Information
SAP Public Web:
EPIC@HPI: https://epic.hpi.uni-potsdam.de
Hasso Plattner Institute: http://www.hpi-web.de
32. Work at the speed of thought: Memory-Resident
Technology and the Future of Business
Strategy Session and One-on-One Conversations about what faster, more flexible data access
could mean to you now and in the future.
Today | 12:30 – 14:00 | West Meeting Room 103A
33. Thank you! Contact us!
Hasso Plattner Institute
EA²L / Enterprise Platform & Integration Concepts
Matthieu-P. Schapranow
August-Bebel-Str. 88
D-14482 Potsdam, Germany
Matthieu-P. Schapranow
matthieu.schapranow@hpi.uni-potsdam.de
Responsible: Deputy Prof. of Prof. Hasso Plattner
Dr. Alexander Zeier
zeier@hpi.uni-potsdam.de