A Time Traveller’s Guide to DB2: Technology Themes for 2014 and Beyond
1. #IDUG
A Time Traveller’s Guide to DB2:
Technology Themes for 2014 and
Beyond
Julian Stuhler
Principal Consultant
Triton Consulting
2. #IDUG
Disclaimer
“Prediction is very difficult, especially if it's about
the future”
Niels Bohr, Nobel laureate in Physics
• Any mention of future features, products or overall strategic direction is purely my personal
opinion, and no reliance should be placed upon them ever coming to pass
• The user assumes the entire risk related to its use of information in this presentation. Triton
Consulting provides such information "as is," and disclaims any and all warranties, whether
express or implied, including (without limitation) any implied warranties of merchantability or
fitness for a particular purpose. In no event will Triton Consulting be liable to you or to any
third party for any direct, indirect, incidental, consequential, special or exemplary damages or
lost profit resulting from any use or misuse of this data.
• DB2 for OS/390 and DB2 for z/OS are trademarks of International Business Machines
Corporation. This presentation uses many terms that are trademarks. Wherever we are
aware of trademarks the name has been spelled in capitals.
4. #IDUG
DB2 Today
“The only constant is change”
“Good character is not formed in a week or a
month. It is created little by little, day by day.
Protracted and patient effort is needed to develop
good character.”
Heraclitus
Greek Philosopher
(c.535 BC – 475 BC)
[Slide overlay rewrites the second quote for DBAs: “DBAs are”, “They are”, “DBAs.”]
5. #IDUG
DB2 Today
[Figure: word clouds of the IBM website, June 1 2008 and Sept 24 2013]
Wordles generated by Tagxedo
http://www.tagxedo.com
Old IBM website from Wayback Machine
http://web.archive.org
http://www-01.ibm.com/software/data/
7. #IDUG
DB2 Technology Themes
• Cost Reduction
• High Availability
• In-Memory Computing
• DB2 Skills Availability
• Database Commoditisation
• Big Data
8. #IDUG
Cost Reduction – Today
• Ongoing focus on improving profitability and ruthlessly
eliminating unnecessary costs
• IT spending is a major cost component for all organisations
• Gartner’s 2013 Worldwide IT Spending analysis showed growth rate
of just 0.4% for 2013
• Managing hardware costs
• Moore’s law is still alive and well as it approaches its 50th birthday
• Compression can dramatically reduce DASD costs
• Adaptive compression in DB2 for LUW V10 can yield spectacular gains
• Further savings possible via actionable compression in BLU
• Overall cost savings being partially offset by move to more expensive
SSD devices (although they are getting cheaper too)
• Virtualisation and consolidation technologies are helping to improve
hardware utilisation rates
• Linux on System z offers some intriguing possibilities here
9. #IDUG
Cost Reduction – Today
• Managing Software licence fees
• MLC pricing on the mainframe means that CPU burned during peak period
(4HRA) directly impacts software costs
• Ongoing focus within DB2 for z/OS to drive down
CPU consumption
• DB2 code optimisation in DB2 10 (and now in DB2 11)
• Increased use of System z speciality engines and
hybrid solutions such as the IBM DB2 Analytics
Accelerator
• Aggressive new packaging options for DB2 for LUW
• AWSE and AESE include lots of additional functionality
such as compression, BLU, pureScale, etc
• Linux on System z can offer major software licence savings
• Managing people costs
• Salary increases have been generally outstripping increase in overall IT
spend, so we’re all consuming a greater proportion of the IT budget
• From ALTER to Autonomics, it’s all about improving productivity and doing
more with less
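To make the 4HRA point concrete, here is a minimal Python sketch of a peak rolling four-hour average. It assumes one MSU reading per hour and uses made-up numbers; real sub-capacity reporting works from finer-grained measurement data, and the function name is purely illustrative.

```python
from collections import deque


def peak_rolling_average(hourly_msu, window=4):
    """Return the highest average over any `window` consecutive readings."""
    if len(hourly_msu) < window:
        raise ValueError("need at least one full window of readings")
    current = deque(hourly_msu[:window], maxlen=window)
    running_total = sum(current)
    peak = running_total / window
    for reading in hourly_msu[window:]:
        # Slide the window: drop the oldest reading, add the newest.
        running_total += reading - current[0]
        current.append(reading)
        peak = max(peak, running_total / window)
    return peak


# Hypothetical hourly MSU readings (illustrative values only).
hourly_msu = [200, 220, 250, 400, 520, 560, 540, 500, 300, 250, 220, 200]
print(peak_rolling_average(hourly_msu))
```

With these illustrative readings the peak comes from the four busiest consecutive hours (520, 560, 540, 500). CPU saved inside that window lowers the peak; CPU saved elsewhere in the day does not.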
10. #IDUG
Cost Reduction – Tomorrow
• Signs that pressure is easing on
overall IT budgets
• Latest Gartner estimates show 3.2%
annual increase for 2014, to $3.8
trillion
• 6.9% increase in enterprise software
spending, with CRM, DBMS and data
management the major items
• However, Gartner expects a
renewed focus on implementing
new IT systems which will consume
budget
• Current cost management pressures
unlikely to reduce
11. #IDUG
Cost Reduction – Tomorrow
• Hardware
• Moore’s Law under pressure, only has 6-8 years left before physics
dictates fundamental shift from CMOS to other technologies
• Photonics, quantum computers
• Ongoing focus on reducing operational
costs will continue to deliver benefits
• Recent Intel POC submerged high-end
servers in 3M’s dielectric “Novec
Engineered Fluid” to increase server
density and cut cooling costs by up
to 95%
• System z hardware approaching thermal limits for indirect cooling so
mainframes may go this way too
12. #IDUG
Cost Reduction – Tomorrow
• Software Licence Fees
• Increased offload to zIIP, IDAA and other speciality processors and hybrid
solutions
• But what happens when we approach 100% CP offload?
• New MLC models to recognise the changing role of the mainframe
• IBM announcement on 8th April 2014 for new model offering up to 60% reduction on
processor capacity reported for Mobile transactions
http://www-03.ibm.com/press/uk/en/pressrelease/43619.wss
• Practice of “bundling” likely to continue as a way of maintaining software
revenues on distributed platforms
• People Costs
• Skills shortages likely to continue to increase people costs. See skills section
later
• Continued emphasis on autonomics, ease of use and productivity features
13. #IDUG
High Availability – Today
• Impact of down time in critical IT systems has never been
higher
• Revenue loss
• Reputational damage
• Remedial costs
• Regulatory and Contract Compliance Impact
• How much?
• A 2011 Ponemon Institute report calculated average of $5,617 per
minute for large US data centres
• Amazon “went dark” for 49 minutes in Jan 2013, at estimated cost of
$66,240 per minute
• Unplanned outage is usually the most painful, but planned
outage hurts too
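As a quick worked example of the per-minute figures above, the Python sketch below multiplies them out. The function name is hypothetical, and the one-hour outage used for the Ponemon average is an assumed duration for illustration.

```python
def outage_cost(minutes, cost_per_minute):
    """Total cost of an outage at a flat per-minute rate."""
    return minutes * cost_per_minute


# Amazon, Jan 2013: 49 minutes at an estimated $66,240 per minute.
amazon_total = outage_cost(49, 66_240)

# Ponemon 2011 average of $5,617 per minute, over an assumed one-hour outage.
ponemon_hour = outage_cost(60, 5_617)

print(f"Amazon outage: ${amazon_total:,}")
print(f"One hour at the Ponemon average: ${ponemon_hour:,}")
```

On these figures the Amazon outage comes to roughly $3.2 million, and a single hour at the Ponemon average to about $337,000.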
14. #IDUG
High Availability – Today
• Relax, you’re working with IBM – DB2 on both platforms is
in good shape for reducing unplanned outage
• Data Sharing on DB2 for z/OS is mature and generally much
better understood by customers than it used to be
• “Gold standard” for continuous availability
• DB2 11 for z/OS contains some valuable new performance
enhancements
• DB2 for LUW pureScale feature implements similar architecture
• Included in AWSE and AESE
• Until recently pureScale supported only on IBM POWER and System
x servers, but as of DB2 10.1 FP2 or DB2 10.5 FP1, non-IBM x86
servers also supported
15. #IDUG
High Availability – Today
• Eliminating planned outage is an ongoing challenge, but
news is generally good and improving all of the time
• Schema change
• Housekeeping
• Preventative maintenance
• Version upgrades
16. #IDUG
High Availability – Tomorrow
• Further data sharing and GDPS enhancements for DB2
for z/OS to re-open the gap with competitors
• Continued expansion of dynamic schema change
capabilities for LUW and z/OS
• Online version upgrades
• Further strides towards truly online version upgrades for DB2 for
z/OS
• First steps for pureScale
17. #IDUG
In-Memory Computing – Today
• Disk access speeds are increasing, but processor speeds are increasing
at an even greater rate
• Therefore, relative “cost” of I/O operations is getting bigger
• Even new (expensive) SSDs are orders of magnitude slower than accessing
processor storage
• Caching data in memory avoids I/O
• Improves elapsed time
• Reduces CPU
• Reduces operational cost
• Allows novel access patterns to be used
• Availability of NAND / flash memory
reduces impact if I/O is required
• SSD
• Flash Express
• Pricing is volatile/complicated,
but memory is a one-off cost
[Figure: access latency by storage tier]
Buffer Pool: nanoseconds (10^-9 s)
DASD - Cache: <2 milliseconds (10^-3 s)
DASD - Disk: >5 milliseconds
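The latency gap between tiers can be turned into a rough average access time. The Python sketch below is illustrative only: the 100 ns memory figure, the 2 ms and 5 ms values (taken as the bounds quoted on the slide), the hit ratios and the function name are all assumptions, not measurements.

```python
def average_access_seconds(buffer_hit_ratio, dasd_cache_hit_ratio,
                           memory_s=100e-9, dasd_cache_s=2e-3, dasd_disk_s=5e-3):
    """Expected time per page access for the given hit ratios (assumed latencies)."""
    for ratio in (buffer_hit_ratio, dasd_cache_hit_ratio):
        if not 0.0 <= ratio <= 1.0:
            raise ValueError("hit ratios must be between 0 and 1")
    # A buffer pool miss goes to DASD: either the controller cache or the disk.
    io_s = (dasd_cache_hit_ratio * dasd_cache_s
            + (1.0 - dasd_cache_hit_ratio) * dasd_disk_s)
    return buffer_hit_ratio * memory_s + (1.0 - buffer_hit_ratio) * io_s


for hit_ratio in (0.90, 0.99, 0.999):
    avg = average_access_seconds(hit_ratio, dasd_cache_hit_ratio=0.5)
    print(f"buffer pool hit ratio {hit_ratio:.1%}: "
          f"{avg * 1e6:.1f} microseconds per access")
```

Under these assumptions a miss is tens of thousands of times slower than a hit, so the misses dominate the average: moving the buffer pool hit ratio from 90% to 99% cuts the average access time by close to a factor of ten.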
18. #IDUG
In-Memory Computing – Today
• OLTP
• Today’s server platforms can cache large amounts of data in memory
• zEC12 can support up to 3TB per CEC (1TB per LPAR)
• High-end Intel-based servers support 6-8TB per server
• Average deployed server memory is increasing on both mainframe and
distributed platforms
• Specific steps being taken to allow DB2 customers to exploit larger
memory footprints for OLTP workloads
• PGFIX(YES) in DB2 9
• PGSTEAL(NONE) and high-performance DBATs in DB2 10
• 1MB / 2GB page frames in DB2 10 / DB2 11
• Large (16MB) and Huge (16GB, AIX only) OS page support in DB2 for
LUW
19. #IDUG
In-Memory Computing – Today
• Analytics
• DB2 10.5 for LUW (AWSE & AESE) includes “BLU” technology - a collection of
novel technologies for optimising analytic queries,
including some specific in-memory techniques
• Columnar data store with patented dynamic
in-memory optimisation for data prefetch and
retention – “treats DRAM as disk”
• Data held in compressed format in memory, while
still allowing joins and predicate evaluation –
“actionable compression”
• Very impressive query performance across a wide
variety of analytic (and even some “heavy” OLTP)
workloads
• 10x – 25x elapsed time improvement is common
• Ability to more fully utilise all of the available
memory / CPU in a given server configuration
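The idea behind “actionable compression” (evaluating predicates without decompressing) can be illustrated with a toy order-preserving dictionary in Python. This is a sketch of the general technique only, not BLU's actual patented encoding; the column data and function names are hypothetical.

```python
from bisect import bisect_left


def encode_column(values):
    """Replace each value with a small integer code that preserves sort order."""
    dictionary = sorted(set(values))
    code_of = {value: code for code, value in enumerate(dictionary)}
    return dictionary, [code_of[value] for value in values]


def rows_at_least(dictionary, codes, threshold):
    """Row numbers where value >= threshold, comparing codes only."""
    # Translate the predicate into code space once, then scan the codes.
    min_code = bisect_left(dictionary, threshold)
    return [row for row, code in enumerate(codes) if code >= min_code]


# Hypothetical column of sales amounts.
sales = [120, 75, 300, 75, 120, 450]
dictionary, codes = encode_column(sales)
print(rows_at_least(dictionary, codes, 150))
```

Because the codes sort in the same order as the values they stand for, the range predicate is answered by comparing small integers, and the original values are never materialised during the scan.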
20. #IDUG
In-Memory Computing – Tomorrow
• Future zEnterprise machines likely to significantly increase maximum
memory capacity per CEC / LPAR
• Cost per GB likely to continue with general downward trend
• Average installed memory per CEC will continue to increase
• DB2 for z/OS may page-fix buffer pools by default
• More common customer use of large / huge page frames
• Page fixing and large page frame support for other DB2 storage areas
(e.g. EDM pool)
• Possible use of pageable 1MB page frames supported by zEC12
• Increased autonomic capability, reduction of memory-specific system
parameters
• DB2 BLU will continue to evolve
• Big push just starting on DB2 BLU in the cloud
21. #IDUG
Database Commoditisation – Today
• We’ve always lived in a heterogeneous world, but perception of
databases as a commodity is increasing
• Many reasons, including
• The ubiquity of SQL
• The rise of packaged solutions
• Java (JDBC, frameworks)
• RDBMS vendor compatibility / migration initiatives
• SOA
• Skills availability and support team size
• The result
• Lack of management awareness of business value of a specific
database
• Support teams and developers working with many database systems
• Lowest common denominator approach
22. #IDUG
Database Commoditisation – Today
• Fight back!
• Make it your mission to keep your management aware of
the unique business value of DB2
• If you have to be a Jack of all trades, at
least try to become a master of one
• Guess which one?
• Take pragmatic approach to lowest
common denominator issue
• Fight the battles worth winning
• Accept the rest
23. #IDUG
DB2 Skills – Today
• DB2 is getting more complex / capable in every release
• At the same time, IBM is trying to make it easier to use / understand
• Great until something needs fixing “under the hood”
• DB2 skills demographic is changing
• Source: My own observations only – no scientific backup!
[Figure: two charts plotting % of DB2 technicians against skill level]
25. #IDUG
DB2 Skills – Tomorrow
• Jury still out on longer-term impact of greying mainframe
workforce
• IBM making efforts with its Academic Initiative
• Training provided for 80,000 students at over 1,000 schools in 70
countries during past 7 years
• 3 mainframe Massive Open On-line Courses
(MOOCs) will be made available in stages
throughout the year (no cost and
available to anyone, anywhere,
at any time)
• Expansion of DB2’s autonomic
capabilities will help, but requirement
for some deeper specialist skills
likely to continue for foreseeable
future
[Figure labels: Task Complexity, Autonomics]
26. #IDUG
DB2 Skills – Tomorrow
Permanent UK jobs requiring specific skills as proportion of total demand
[Figure panels: SQL, DBA, Performance Tuning, Big Data]
27. #IDUG
Big Data – Today
• Big Data and Analytics are everywhere you look
• What’s a DB2 guy (or girl) to do?
• Things to keep in mind
• Hadoop is not a replacement for existing infrastructure, but a tool to
augment it
• Your role is still vital to your organisation!
• “90% of the world’s data is unstructured, but 90% of the world’s most
important data is structured”
David Barnes, IBM, 2012 IDUG Europe Keynote Speaker
• Database people have been doing big data and analytics for the past
40 years or so, just with different tools and terms (and capitalisation)
• If you have the right attitude / mind-set, a DBA background is an
excellent stepping stone to becoming a wealthy “Data Scientist”
28. #IDUG
Big Data – Today
• One of the secrets to DB2’s longevity is to “embrace and
extend” new technologies, and Big Data is no exception
• DB2 for z/OS
• IBM DB2 Analytics Accelerator for efficiently running complex
query workloads
• SQL extensions in most recent releases to improve query /
analytic workloads
• DB2 for LUW
• BLU Acceleration to dramatically speed up analytics and
reporting, by multiple orders of magnitude
• Part of DB2 for LUW V10.5 (included in AWSE and AESE)
• Remember that DB2 for LUW still holds Guinness World Record
for Largest Data Warehouse (3PB)
29. #IDUG
Big Data – Today
• Integration between DB2 and Hadoop opens new
possibilities for gaining actionable insight
30. #IDUG
Big Data – Tomorrow
• DB2 will continue with “embrace and extend” philosophy
• Efficient interaction with highly optimised big data platforms such as Hadoop /
BigInsights
• Further expand internal analytic / big data capabilities
• One size does NOT fit all!
• Each approach has strengths and
weaknesses, best one is dependent
on application requirements
• NoSQL = Not Only SQL (or YeSQL)
• Several NoSQL databases have
added SQL capabilities
• NoSQL for z/OS!
• Simple Key / value NoSQL database for z/OS, currently freeware
• http://www.nosqlz.com
31. #IDUG
Some Questions to Ponder
• What have you done recently to:
• Reduce the operational costs of the systems you support?
• Improve your personal productivity?
• Make the savings that you’ve made visible to the budget holders?
• Test your failover / disaster recovery arrangements?
• Review your housekeeping / maintenance / upgrade procedures to
ensure you’re maximising availability?
• Improve and expand your DB2 skills?
• Make management aware of the business value of DB2?
• Keep yourself relevant in a Big Data world?
• Prepare for the future?
“The future depends on what you do today”
Mahatma Gandhi
32. #IDUG
Where’s the future I was promised?
• Portable fusion reactor
• Self-tying laces
• Hoverboard
• Flying cars
33. #IDUG
A Time Traveller’s Guide to DB2:
Technology Themes for 2014 and
Beyond
Julian Stuhler
Principal Consultant
Triton Consulting
Editor’s notes
Predated Plato by about 100 years. “The weeping philosopher”. The first DBA?
IBM acquired Netezza in 2010. pureData brand announced in late 2011.
Moore's law: over the history of computing hardware, the number of transistors on integrated circuits doubles approximately every two years. Named after Intel co-founder Gordon Moore, who described the trend in his 1965 paper. From thousands of transistors in the early 1970s to billions now (Intel Xeon Phi has 5 billion, 61 cores on 22nm). The 14nm Knights Landing generation has 72 cores.
Adaptive compression actually uses two compression approaches. The first employs the same table-level compression dictionary used in classic row compression to compress data based on repetition within a sampling of data from the table as a whole. The second approach uses a page-level dictionary-based compression algorithm to compress data based on data repetition within each page of data. The dictionaries map repeated byte patterns to much smaller symbols; these symbols then replace the longer byte patterns in the table. The table-level compression dictionary is stored within the table object for which it is created, and is used to compress data throughout the table. The page-level compression dictionary is stored with the data in the data page, and is used to compress only the data within that page.
DB2 with BLU Acceleration introduces several patented techniques permitting DB2 to not only store data more efficiently, but also to better process it while it is still compressed. BLU Acceleration applies predicates, performs joins, and does grouping, all on the compressed values of column-organized tables. Since no secondary indexes or MQTs are needed on column-organized tables, you save storage space. This combination brings together all resources (I/O bandwidth, buffer pools, memory bandwidth, processor caches, and even machine cycles) through single-instruction, multiple-data (SIMD) operations.
SSDs now well under $1 per GB ($40 per GB in 2008). Mention Flash Express?
Hardware utilisation rates of 20% or less are common on distributed servers.
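The two-level dictionary scheme described in the adaptive compression note can be sketched in Python as follows. This is a toy under loud assumptions: it substitutes whole column values rather than byte patterns, it "samples" the entire table, and every name and dictionary size in it is hypothetical rather than DB2's actual implementation.

```python
from collections import Counter


def build_dictionary(values, prefix, size, exclude=()):
    """Map up to `size` of the most frequent repeated values to short symbols."""
    counts = Counter(v for v in values if v not in exclude)
    frequent = [v for v, n in counts.most_common(size) if n > 1]
    # Symbols are tuples so they can never collide with the string data.
    return {value: (prefix, i) for i, value in enumerate(frequent)}


def compress_page(page, table_dict, page_dict_size=2):
    """Apply the table-level dictionary first, then a per-page dictionary."""
    page_dict = build_dictionary(page, "P", page_dict_size, exclude=table_dict)
    encoded = [table_dict.get(v, page_dict.get(v, v)) for v in page]
    return page_dict, encoded


def decompress_page(encoded, table_dict, page_dict):
    """Reverse both dictionaries to recover the original page."""
    reverse = {sym: val for d in (table_dict, page_dict) for val, sym in d.items()}
    return [reverse.get(item, item) for item in encoded]


# Hypothetical table split into two pages of city names.
pages = [
    ["LONDON", "LONDON", "PARIS", "OSLO", "OSLO"],
    ["LONDON", "LONDON", "ROME", "ROME", "PARIS"],
]
all_rows = [value for page in pages for value in page]
table_dict = build_dictionary(all_rows, "T", size=1)  # stored once per table

for page in pages:
    page_dict, encoded = compress_page(page, table_dict)  # stored with the page
    assert decompress_page(encoded, table_dict, page_dict) == page
    print(encoded)
```

The table-level dictionary captures values that repeat across the whole table, while each page-level dictionary picks up repetition that is only local to that page, which is the division of labour the note describes.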
DB2 11 "out-of-the-box" DB2 CPU savings:
• Up to 10% for complex OLTP
• Up to 10% for update-intensive batch workloads
• Up to 25% for complex reporting queries against uncompressed tables (up to 40% CPU savings against compressed tables)
zIIP used for pseudo-deleted index cleanup, log write/read, utilities.
Additional CPU savings and performance improvements may be possible with application and/or system changes that take advantage of new DB2 11 enhancements including log replication capture, data sharing using extended log record format, and pureXML.
120 PVUs per core on zLinux EC12 (100 on Intel).
Autonomics:
• Pseudo-deleted index cleanup, reorg avoidance
• Self-tuning memory manager on LUW (less successful on z)
• Temporal, transparent archiving
Intel Broadwell architecture uses 14nm, existing roadmap goes down to 5nm. 7-5nm accepted as limit of current technology, around 2020-2022.
New materials: indium gallium arsenide (InGaAs), indium phosphide (InP) and silicon germanium (SiGe), graphene.
Photonics.
zEC12 still has water cooling options.
zEC12 has 2:1 ratio of zIIP to CPs.
Three times the number of mobile phones in the world as computers.
DB2 AWSE/AESE bundle Optim tooling, compression, pureScale, HADR, etc.
JC will discuss DB2 11 data sharing enhancements – LRSN spin reduction, GBP write-around protocol.
DB2 10.5 supports HADR with pureScale.
Roy Cecil will talk about GDPC later.
Schema change – ALTER TABLE DROP COLUMN..
DB2 LUW 10.5 online fixpak
Data sharing is key System z competitive differentiator, but competition is closing gap
• Improved automatic DB2 peer recovery
• Asynch CF Lock duplexing for multi-site continuous availability
• Improved IFI 306 performance for active/active log capture
Dynamic schema change – change DB owner, view changes, change tablespace type, etc. Major theme for Cypress
DRAM/NAND prices decreased historically, but now static/rising due to non-technology factors (volume down due to PC shipments down, lower demands of mobile devices, recession, supplier fires, etc)
Pools for a given subsystem can be up to 1TB total in DB2 8, 9 and 10, with limit of 2x available real storage.
Huge page memory is only available on AIX.
By isolating workloads, DB2 can achieve higher cache coherency and eliminate low-level thrashing of the cache.
10x compression in BLU compared to traditional techniques – due to column-based format.
No MQT, indexes etc. needed so more savings.
Data scientist – from David Barnes
• Need curiosity and cleverness as well as technical skills
• Ask questions of the data
• Learn when and how to simplify the data – and when it’s good enough
IDAA V4 just announced
• Accelerate greater range of SQL (e.g. static SQL)
• Workload balancing/failover across multiple accelerators
SQL Extensions - DB2 11 includes advanced aggregation: grouping sets, group by rollup
BLU concepts
• Columnar store
• Data skipping (skip irrelevant data)
• Dynamic in-memory technology (RAM instead of disk)
• Actionable compression (keep data compressed for processing)
• Parallel vector processing (use of parallel cores/processors)
"Embrace, extend, and extinguish", also known as "Embrace, extend, and exterminate", is a phrase that the U.S. Department of Justice found was used internally by Microsoft to describe its strategy for entering product categories involving widely used standards, extending those standards with proprietary capabilities, and then using those differences to disadvantage its competitors.
Object relational, XML, columnar all examples of DB2 embracing and extending.
NoSQL v SQL – remember Object DB v Relational?
NoSQL:
• Column: Accumulo, Cassandra, HBase
• Document: Clusterpoint, Couchbase, MarkLogic, MongoDB
• Key-value: Dynamo, MemcacheDB, Project Voldemort, Redis, Riak
• Graph: Allegro, Neo4J, OrientDB, Virtuoso