A detailed comparison of two common legacy databases, HP's SQL/MP running in the NonStop Guardian environment and IBM's DB2 running on its z/OS platform, across a range of functionalities.
Cognizant 20-20 Insights

Dynamics of Leading Legacy Databases
A comparison of the concepts and capabilities of HP’s SQL/MP and IBM’s DB2 will enable IT organizations to manage their critical, run-the-business processes as they guide the transition to new environments.
Executive Summary
Legacy systems are computing environments installed in the 1970s by leading commercial adopters of information technology (banks, telecoms, stock exchanges, etc.). Almost a half century later, they plod on, delivering run-the-business functionality.
Because these legacy systems are often critical to
the operation of most enterprises, deploying more
modern systems in one fell swoop introduces
an unacceptable level of operational risk. And
because most organizations with a large legacy
systems footprint cannot migrate to modern
technology all at once, a phased approach must
be considered. However, a staged approach brings
its own set of challenges, such as:
• Ensuring complete business coverage with well-understood and implemented overlapping functionality.
• Deploying more current technologies, such as HP NonStop Cobol or IBM z/OS Cobol, upon which most business-critical applications rely.
• Adding modern databases such as HP NonStop SQL/MP or IBM z/OS DB2, which are critical to surviving in the big data era.
Modernizing databases offers many advantages, including cost reduction, stability and efficiency, but it is fraught with technological and change management difficulties.

cognizant 20-20 insights | november 2013
This white paper delves into legacy database concepts and compares, wherever possible, the technological challenges and conceptual underpinnings of database environments such as SQL/MP in the HP NonStop Guardian environment and DB2 running under IBM z/OS. The aim is to provide key insights for architects seeking to migrate to modern databases, or from one legacy system to another, per the organization’s needs.
Distributed Database
SQL/MP provides local autonomy within and across the network nodes and at the application programming level. As such, programs operate as if all the data were local, so programmers do not need to specify the location of the data. This also allows database administrators to add extra partitions at the remote systems without changing any applications. SQL/MP has a subsystem, NonStop Transaction Manager/Massively Parallel (TM/MP), to ensure transaction integrity across system boundaries via a two-phase commit protocol, which commits the changes to the database only when all system components complete their portion of the transaction (see Figure 1).
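The two-phase commit idea can be sketched in a few lines of Python. The classes below are invented for illustration only, not TM/MP's actual interface: a coordinator asks every participant to prepare, and only if all vote yes are the changes committed; otherwise every participant rolls back.

```python
# Toy sketch of two-phase commit (hypothetical classes, not TM/MP's API).

class Participant:
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self):
        # Phase 1: stage the changes durably and vote yes/no.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        # Phase 2 (success): make the staged changes permanent.
        self.state = "committed"

    def rollback(self):
        # Phase 2 (failure): discard the staged changes.
        self.state = "rolled-back"

def two_phase_commit(participants):
    # Commit only if every participant votes yes in the prepare phase.
    if all(p.prepare() for p in participants):
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:
        p.rollback()
    return "rolled-back"

nodes = [Participant("NodeA"), Participant("NodeB")]
print(two_phase_commit(nodes))                        # committed

nodes = [Participant("NodeA"), Participant("NodeB", can_commit=False)]
print(two_phase_commit(nodes))                        # rolled-back
```

The point of the protocol is exactly what the sketch shows: no participant makes its changes permanent until every participant has declared it can.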
Figure 1: Distributed Database Concept Within HP NonStop SQL/MP. An application or user on Node A (the local node) reaches Table 1 through the SQL Data Access Manager on Node A and, transparently, through the SQL Data Access Managers on remote Nodes B through N, each holding its own copy or partition of Table 1.
DB2 also supports distributed database schemas through its Distributed Relational Database Architecture (DRDA) offering. SQL objects can be spread across interconnected systems, with each system having a DB2 manager to communicate and cooperate in such a way that application programs can access data across various systems. In simple terms, any DB2 server other than the local system is called the remote DB2 server, and the operation is considered distributed. Each DB2 manager uses the location name to determine the local or remote system; a maximum of eight location names can be defined for a DB2 system. DB2 also acts as a transaction manager to enable the database’s two-phase commit process and guarantees that units of work are consistently committed or rolled back (see Figure 2).
Parallel Processing
SQL/MP achieves parallel processing through:

• Parallel query processing: This is achieved by spreading the database across multiple partitions. SQL/MP uses multiple I/O processes for each partition during query execution.
• Parallel join operations: These are performed by the SQL executor during query processing to increase performance.
• Parallel index maintenance: This is achieved by different disk processes so that maintaining multiple indexes does not degrade performance.
• Parallel index loading: This is achieved by different processes loading all partitions of the index at the same time.
Figure 2: Distributed Database Concept Within IBM DB2. An application or user on System A (the local system) accesses Table 1 through DB2 Manager A and, on remote Systems B through N, through their respective DB2 managers.
• Parallel sorting: This is achieved by the FastSort product, which SQL/MP uses for sort operations; it runs multiple sub-sort processes for parallel sort operations.
• Parallel table loads (using multiple SQLCI LOAD PARTONLY operations) and index loads (using the CREATE INDEX or LOAD command with PARALLEL EXECUTION ON) to reduce the time required to load the object.

In contrast, DB2 achieves parallel processing mainly through:

• Query I/O parallelism: This feature allows and manages multiple I/O requests for a single query to improve I/O performance. It is used as a last option to increase performance when the other methods bring no significant benefit for a particular query execution.
• Query CP parallelism: This feature splits a large query into multiple small queries that are executed simultaneously on multiple processors, accessing data in parallel. In the latest versions this is extended as Sysplex query parallelism, where the large query is split across different DB2 members of a data sharing group.

The following concepts are used for processing:

• Static and dynamic queries.
• Local or remote data access.
• Query using single table scans and multi-table joins.
• Access through an index.
• Sort.
• Automatic recompilation or partial recompilation, which eliminates the need to terminate program execution when changes in database structure or the environment make rebinding necessary.
• Ability to defer name resolution in SQL statements until execution time.

High Availability
Theoretically, SQL/MP has better mechanisms in place to achieve high availability than DB2. This is achieved through:

• Online dumps using the TMFCOM DUMP FILES command, with complete support for TMF file recovery to recover a database in case of a disaster.
• Online database reorganization capabilities such as:
>> Online index creation, with concurrent read and update capability.
>> Online partition moving.
>> Partition splitting and row redistribution, with concurrent read and update capability.

On the other hand, DB2 can achieve high availability, but unplanned outages are difficult to avoid entirely. It has no specific built-in mechanisms, but a well-conceived and tested backup, recovery and restart strategy can make a significant difference:

• Some restart processing can occur concurrently with new work, and the organization can choose to postpone some processing.
• During a restart, DB2 applies data changes from the log. This technique ensures that data changes are not lost, even if some data was not saved at the time of the failure. Some of the processes that apply log changes can run in parallel.
• The database administrator can register DB2 with the Automatic Restart Manager of z/OS. This facility automatically restarts DB2 if it goes down as a result of a failure.
• Data sharing allows access to data even when one DB2 subsystem in a group is stopped.

Figure 3: Key Concepts for Achieving High Availability in NonStop SQL/MP and IBM DB2

NonStop SQL/MP: online dumps, online database reorganization, parallel table/index loading.
IBM DB2: concurrent restart, backup data logs, automatic restart manager.

Database Protection
Many people view “data” as just the input and output for programs or the information used to generate reports. Since there is a cost to remediate
a data breach, data must have value and thus is no longer just information but an asset.

In general, there are three main mechanisms that allow a DBA to implement a database security plan:

• Authentication: This is the first security feature you’ll encounter when you attempt to access a database.
• Authorization: This involves determining the operations that users and/or groups can perform, and the data objects that they can access.
• Privileges: These are a bit more granular than authorization and can be assigned to users and/or groups. Privileges help define the objects that a user can create or drop.

These three mechanisms are available in both SQL/MP and DB2. Authorization to operate on SQL tables, views, indexes, collations and SQL programs that run in the Guardian environment is maintained by the Guardian security subsystem. In DB2, by contrast, whether a process can gain access to a specific DB2 subsystem can be controlled outside of DB2; access is granted through RACF or a similar security system.

Recovery
Both SQL/MP and DB2 offer data recovery features. Both databases offer:

• Facilities and provisions to maintain two copies of the same data, one on-site and the other outside the enterprise.
• Facilities to define database changes within a logical unit of work.
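The three security mechanisms described above (authentication, authorization and privileges) can be sketched as nested checks. This toy Python example uses invented users, operations and tables and stands in for no real product (neither Guardian security nor RACF):

```python
# Toy illustration of the three security mechanisms: authentication
# verifies identity, authorization checks permitted operations, and
# privileges gate access to specific objects. All names are invented.
USERS = {"alice": "s3cret"}                          # hypothetical credentials
AUTHORIZED_OPS = {"alice": {"select", "update"}}     # operations per user
PRIVILEGES = {"alice": {"orders"}}                   # objects per user

def can_run(user, password, op, obj):
    if USERS.get(user) != password:                  # authentication
        return False
    if op not in AUTHORIZED_OPS.get(user, set()):    # authorization
        return False
    return obj in PRIVILEGES.get(user, set())        # privilege on the object

print(can_run("alice", "s3cret", "select", "orders"))  # True
print(can_run("alice", "s3cret", "drop", "orders"))    # False: not authorized
```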
Data Integrity
The other important aspect of data handling is the integrity of data and its maintenance. Both SQL/MP and DB2 protect the integrity of the database by ensuring the entered data meets definitional requirements. Application programs, therefore, do not need to perform data checking.

The following data definition features are available in both SQL/MP and DB2 to ensure definitional integrity:
• Column definitions.
• Protection views.
• Constraints.
• Indexes.
Additional features available in DB2, but not in
SQL/MP, include:
• Referential integrity.
• Triggers.
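How definitional integrity relieves the application of data checking can be demonstrated with SQLite as a neutral stand-in (no SQL/MP or DB2 syntax is implied; the schema and values are invented). The example combines a column definition, a constraint and, as in DB2, referential integrity:

```python
# Sketch of definitional integrity enforced by the database itself:
# the column definition, CHECK constraint and foreign key reject bad
# rows, so the application does not have to validate them.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")             # referential integrity
conn.execute("CREATE TABLE branch (branch_id INTEGER PRIMARY KEY)")
conn.execute("""
    CREATE TABLE account (
        acct_no   INTEGER PRIMARY KEY,
        balance   INTEGER NOT NULL CHECK (balance >= 0),  -- constraint
        branch_id INTEGER REFERENCES branch(branch_id)    -- referential integrity
    )""")
conn.execute("INSERT INTO branch VALUES (1)")
conn.execute("INSERT INTO account VALUES (100, 500, 1)")  # accepted

for bad_row in [(101, -5, 1),       # violates the CHECK constraint
                (102, 50, 99)]:     # violates the foreign key
    try:
        conn.execute("INSERT INTO account VALUES (?, ?, ?)", bad_row)
    except sqlite3.IntegrityError:
        print("rejected:", bad_row)
```

Only the valid row survives; the invalid ones never reach the table, regardless of what the application code does.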
Figure 4: NonStop SQL/MP: Example to Show How Security Is Achieved. On each node, a super ID sits above group managers, and Guardian users belong to group managers within the Guardian environment. Each Guardian user has owner access to its own SQL objects and restricted access to SQL objects belonging to other users and nodes.
Character Set
SQL/MP supports the following character sets:

• ISO 8859/1 through ISO 8859/9:
>> Single-byte character sets.
>> The standard set of nine single-byte character sets defined by ISO.
>> All nine ISO 8859 sets contain the ASCII characters.
>> ISO 8859/1 is HP’s default character set; the others are used in various geographical regions.
>> A collating sequence can be defined using SQL object collation.
• Kanji:
>> Double-byte character set.
>> Widely used for DOS and Japanese mainframes.
>> Collation is not possible; the sequence depends on the binary values.
• KSC5601 (Korean industrial standard character set):
>> Double-byte character set.
>> Mandated for Korean government and banking sector systems.
>> Collation is not possible; the sequence depends on the binary values.

In general, DB2 supports more character sets than SQL/MP. The following are the most widely used:

• Extended Binary Coded Decimal Interchange Code (EBCDIC):
>> Single-byte or double-byte character set.
>> The double-byte character set is used for storing complex characters such as Chinese, Japanese, etc.
>> A collating sequence can be defined.
• UTF-8 universal character set:
>> Native language characters can be saved to the database.
>> A collating sequence can be defined.

Locking
In complex business scenarios it is possible for different application processes to access or update the same data at the same time. In these conditions, locking is required to maintain data integrity by preventing two processes from contending for the same data at the same time. In other words, application processes require a feature that allows control over the granularity of locking. A generic lock can be held by a process on a subset of the rows in a table, and lock granularity is the size of a lockable unit.

Generic locking provides:

• Improved performance, because the application acquires fewer locks while performing operations.
• The ability to lock large portions of a table with a single lock without acquiring a table lock.
• Reduced risk of a program exceeding the maximum number of locks.

Both SQL/MP and DB2 offer these features, but with conceptual differences.

The following are the key access options available at a high level in SQL/MP to achieve locking:

• Browse access: The application process can read the data without any lock, so it may read inconsistent data. This provides minimum consistency and maximum concurrency.
• Stable access: The application process can lock the data but releases locks on unmodified rows without waiting for the end of the unit of work.
• Repeatable access: The application process can lock the data but releases locks only at the end of the unit of work, irrespective of whether the data was modified or not.

DB2 offers this feature with a different mechanism and terminology. Called isolation levels, they include:

• Uncommitted read: This is equivalent to browse access in SQL/MP and allows the application process to read data at the risk of it being inconsistent or uncommitted.
• Cursor stability: This is equivalent to stable access in SQL/MP; the application holds locks only on its uncommitted changes and on the current row of each of its cursors.
• Read stability: This is equivalent to repeatable access (shared) in SQL/MP; it allows the application process to read rows more than once and prevents other processes from changing the qualified rows, but allows changes to rows that do not qualify.
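A toy lock manager makes the access options and isolation levels concrete. The sketch below is purely illustrative (it is neither product's interface, and the transaction and row names are invented): browse/uncommitted read takes no lock, stable access/cursor stability releases the lock right after reading, and repeatable access holds it until the unit of work ends.

```python
# Toy lock manager sketching the SQL/MP access options and their DB2
# isolation-level equivalents. Not a real database API.

class LockManager:
    def __init__(self):
        self.locks = {}                        # row key -> holding transaction

    def acquire(self, txn, key):
        holder = self.locks.get(key)
        if holder not in (None, txn):
            return False                       # another transaction holds it
        self.locks[key] = txn
        return True

    def release_all(self, txn):                # end of txn's unit of work
        self.locks = {k: h for k, h in self.locks.items() if h != txn}

def read(lm, txn, key, access):
    if access == "browse":                     # no lock: may see dirty data
        return True
    if not lm.acquire(txn, key):               # stable/repeatable: lock first
        return False
    if access == "stable":
        lm.release_all(txn)                    # release right after the read
    return True                                # repeatable: hold to txn end

lm = LockManager()
read(lm, "T1", "row42", "repeatable")          # T1 locks row42 and keeps it
print(read(lm, "T2", "row42", "browse"))       # True: reads without a lock
print(read(lm, "T2", "row42", "stable"))       # False: T1 still holds it
lm.release_all("T1")                           # T1's unit of work ends
print(read(lm, "T2", "row42", "stable"))       # True: lock now available
```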
Figure 5: Locking Mechanisms Supported by NonStop SQL/MP and IBM DB2

NonStop SQL/MP (Key Access Options)    IBM DB2 (Isolation Levels)
Browse access                          Uncommitted read
Stable access                          Cursor stability
Repeatable access (shared)             Read stability
Repeatable access (exclusive)          Repeatable read

• Repeatable read: This is equivalent to repeatable access (exclusive) in SQL/MP and allows the application process to read rows more than once while preventing access by other processes, even to rows that do not qualify.

Other key differences:

• The lock granularity in SQL/MP is at the row level, but in DB2 it is at the page level. That means in DB2 the lock is always on a page containing multiple rows.
• In both databases, each process waits for a default period of time to acquire a lock on locked data. This time period can be manipulated in SQL/MP at the program level through CONTROL statements, but in DB2 the choice always rests with the SQL optimizer.
• In both databases, locks can be escalated at run time. This can be manipulated in SQL/MP at the program level by using CONTROL statements; in DB2 it is achieved by specifying options in the table definition at creation time.

Index
Critical applications where performance matters require a key database feature called indexing. An index is a column or group of columns defined on a base table that can be used for speedy data retrieval when data is looked up in a sequence other than the primary sequence. Both SQL/MP and DB2 support indexes to achieve better performance in critical applications, and both store the index in a separate physical file. In general, the different types of indexes available are:

• Unique index:
>> The column or group of columns defined as a unique index cannot hold the same value for two or more rows in a base table.
>> In SQL/MP, a unique index cannot be created on a column that allows NULL values.
• Partitioned index:
>> These indexes can be physically partitioned.
>> In SQL/MP, the indexes always refer to the base tables rather than the table partitions.
• Clustering index:
>> The index defined as a clustering index determines the physical ordering of the table.
>> In DB2, a user can define an index as clustering; if none is specified, DB2 considers the first index on the table to be the clustering index.
• XML index:
>> The index is defined on an XML column and can be used for efficient evaluation of XPath expressions to improve the performance of queries on XML documents.

In general:

• Both databases support the compression of indexes to reduce the amount of space that an index occupies.
• In SQL/MP, an index can be forced in an application program by using CONTROL statements, bypassing the SQL optimizer’s choice of index. In DB2, the choice of index is always decided by the SQL optimizer.
• DB2 supports padding/non-padding of variable-length indexes, but SQL/MP does not.
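The dual role of a unique index, faster lookups plus rejection of duplicate key values, can be demonstrated with SQLite as a neutral stand-in (table and index names are invented; no SQL/MP or DB2 syntax is implied):

```python
# Sketch of unique-index behaviour: the index both rejects a second row
# with the same key value and gives the optimizer a fast access path.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cust_id INTEGER, tax_ref TEXT)")
conn.execute("CREATE UNIQUE INDEX ix_tax_ref ON customer (tax_ref)")

conn.execute("INSERT INTO customer VALUES (1, 'TR-001')")      # accepted
try:
    conn.execute("INSERT INTO customer VALUES (2, 'TR-001')")  # duplicate key
except sqlite3.IntegrityError:
    print("duplicate rejected by unique index")

# The optimizer can now satisfy equality lookups on tax_ref via the index.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM customer WHERE tax_ref = 'TR-001'"
).fetchall()
print(plan)
```

SQLite's `EXPLAIN QUERY PLAN` plays the role that EXPLAIN output plays in SQL/MP and the explain plan tables play in DB2: it shows whether the index was chosen as the access path.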
Figure 6: Index Types Supported by NonStop SQL/MP and IBM DB2 (unique, partitioned, clustering and XML indexes).

Joins
When an application requires data for processing from multiple tables, the widely used database concept is the join. A join allows an application programmer to combine columns from multiple tables into an intermediary result table. Both SQL/MP and DB2 support joins, but the way the SQL optimizer achieves them is conceptually quite different. In general, the different types of joins are:

• Inner join: This joins multiple tables based on the condition specified in the WHERE clause. The output intermediary result table contains all the rows from the multiple tables that satisfy the condition.
• Outer join: This joins multiple tables, including the rows of the inner join plus missing rows, depending on the variant described below. Null values are displayed for the missing values of a particular column in a table.

In general:

• SQL/MP uses the following four strategies to achieve joins:
>> Nested join.
>> Sort merge join.
>> Key-sequenced merge join.
>> Hash join.
• DB2 achieves joins by:
>> Nested loop join.
>> Merge scan join.
>> Hybrid join.
• Using CONTROL statements, SQL/MP allows the application program to specify the type of algorithm used for a join operation and also the sequence of joins within SELECT statements.
• DB2 allows the merging of data from both columns into a single column, eliminating the null values, by using the COALESCE function.

The access path that the optimizer will choose can be viewed easily in SQL/MP by referring to the EXPLAIN output, but in DB2 one has to refer to the group of tables (e.g., the explain plan table) under CATALOG through utilities.
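The join types and the COALESCE function can be demonstrated with SQLite as a neutral stand-in (the tables are invented; COALESCE is used here to replace the NULL produced by an outer join with a default, a closely related use of the function):

```python
# Sketch of inner vs. left outer join, and COALESCE over the NULLs an
# outer join produces. SQLite syntax, standing in for no real product.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (cust_id INTEGER, name TEXT);
    CREATE TABLE orders   (cust_id INTEGER, amount INTEGER);
    INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders   VALUES (1, 250);          -- Bob has no order
""")

# Inner join: only rows satisfying the join condition appear.
inner = conn.execute("""
    SELECT c.name, o.amount
    FROM customer c JOIN orders o ON c.cust_id = o.cust_id
""").fetchall()
print(inner)                     # [('Ann', 250)]

# Left outer join: Bob is kept, with NULL for the missing amount;
# COALESCE replaces that NULL with a default value.
outer = conn.execute("""
    SELECT c.name, COALESCE(o.amount, 0)
    FROM customer c LEFT JOIN orders o ON c.cust_id = o.cust_id
""").fetchall()
print(outer)                     # [('Ann', 250), ('Bob', 0)]
```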
The outer join variants are:

>> Left outer join: This joins multiple tables, including all the rows in the left table even if there are no matching records in the other tables. Both SQL/MP and DB2 support left outer joins.
>> Right outer join: This joins multiple tables, including all the rows in the right table even if there is no matching record in the other tables.
>> Full outer join: This joins multiple tables, including all the rows in the tables that are joined.

Figure 7: Join Types Available in NonStop SQL/MP and IBM DB2 (inner, left outer, right outer and full outer joins).

Database Utilities
SQL/MP has a command interface utility called SQLCI (SQLCI2), the SQL Conversational Interface, to manage the application database. Through this interface, an application programmer can:

• Enter queries or generate formatted reports.
• Run database utility programs for database administration and monitor application performance.
Figure 8: Diagram Showing the Join Concepts. For each join type (inner, left outer, right outer and full outer), the diagram shades which rows of Table 1 and Table 2 are selected and which are not.
• Enter special commands that manage concurrent access to the database and maintain the latest statistics about the database.
• Create constraints (rules that are applicable to data values that are being added to the database).
• Create physical database design extensions such as distributed databases, tables audited by the Transaction Management Facility (TMF) and non-audited tables.

Fundamentally, SQLCI supports:

• Statements: SQL/MP statements define SQL objects and catalogs, manipulate data within those objects and catalogs, and control various aspects of the processes that perform the data definition and manipulation specified in the statements.
• DCL statements: Data Control Language (DCL) is the set of SQL statements and directives that control parallel processing, name resolution and performance-related considerations such as access paths, join methods, locks and cursors.
• DDL statements: DDL is the set of SQL statements used to create or alter database objects; for example, creating a table, altering a table or dropping a table.
• DML statements: These are used to select, update, insert and delete rows in one or more tables.
• DSL statements: These retrieve status information about the version of the database.
• SQLCI commands: A set of commands supported by SQLCI for database management activities like creating catalogs, etc.
• Utilities: These features are used to retrieve information about the database, the dictionary and application programs.
• Functions: Intrinsic functions are supported by SQL/MP to manipulate the table data. For example, AVG returns the average of a set of numbers and COUNT counts the distinct values in a column.
• System DEFINEs: A System DEFINE is a DEFINE used in Tandem software to identify system defaults or configuration information.

There is a wide range of products available to manage DB2 applications. Mainframe business users choose these products based on their application or GUI requirements. For example:

• DB2I (Database 2 Interactive) or SPUFI (SQL Processing Using File Input): DB2I is a TSO-based DB2 application for the OS/390 environment and the default product provided by IBM for database management. It does not support a GUI and accepts all SQL statements through a command interface.
• DSN: The DSN command processor is a Time Sharing Option (TSO) command processor that runs in the TSO foreground or under TSO in a JES-initiated batch. It uses the TSO attachment
facility to access DB2. The DSN command processor provides an alternative method for running programs that access DB2 in a TSO environment.
• BMC: BMC provides a set of integrated tools and utilities to manage the most complex DB2 Universal Database for z/OS and OS/390 environments.
• DB2/PE: DB2 Performance Expert (PE) for
Multiplatform V2.2 is a workstation-based
performance analysis and tuning tool for
managing a heterogeneous mix of DB2
systems with a single end-user interface. DB2
PE simplifies DB2 performance management
by providing the ability to monitor applications,
system statistics and system parameters using
a single tool.
• OPTIM: IBM’s InfoSphere Optim z/OS solution provides a unique and powerful capacity to browse, edit, move, compare and archive relationally intact sets of DB2 data and move legacy data.
• FM/DB2:
File Manager (FM) is a powerful
set of utility functions for editing, browsing,
printing, copying and maintaining data in files
on any supported device. It provides powerful
functions in one product for use by application
programmers and system programmers.
IBM File Manager/DB2 (FM/DB2) extends this functionality to the DB2 environment by extending:

>> The File Manager browse and editor facilities to apply to DB2 data.
>> The ISPF object list utility to apply to DB2 data.
>> The File Manager data copy facility to include DB2 data as both source and target.
>> The File Manager data create facility to apply to DB2 data.
Among all products, the BMC tool is used primarily
for DB2 access. It is user-friendly and available
with multiple tools for each database function.
Conclusion
Our comparison of SQL/MP and DB2 supports the following conclusions:

• Both can achieve the key database qualities of high availability, scalability, performance and security, albeit in different ways.
• Both relational database management systems support key database features such as security, data integrity, distributed databases, concurrency, locking, recovery, etc.
• Both suit the needs of business-critical applications in domains such as banking, telecom, stock exchanges, etc.

But when it comes to database modernization (be it moving from a legacy system to a new system, or migrating from one legacy system to another), it is critical to study the business requirements of business-critical applications against the key database features assessed in this white paper.
Last but not least, organizations should also consider other key parameters such as:

• Transaction volume of their businesses.
• Environment setup.
• Infrastructure.
• Budget.
References

• HP NonStop Tandem reference manuals:
http://www.cobug.com/cobug/docs/HP-NonStopSQL-MPforCOBOL.pdf
http://h20628.www2.hp.com/km-ext/kmcsdirect/emr_na-c02132125-3.pdf
• IBM z/OS Enterprise reference manuals:
http://pic.dhe.ibm.com/infocenter/dzichelp/v2r2/index.jsp?topic=%2Fcom.ibm.db2z10.doc%2Fsrc%2Falltoc%2Fdb2z_lib.htm