This document discusses various business analysis and decision support tools. It begins by describing five main categories of decision support tools: reporting tools, managed query tools, executive information system tools, online analytical processing (OLAP) tools, and data mining tools. It provides details on the different types of tools within each category. It also discusses the Cognos Impromptu reporting and query tool, including its features and capabilities. Finally, it briefly describes common OLAP operations on multidimensional data like roll-up, drill-down, slice and dice, and pivot.
business analysis-Data warehousing
UNIT II
BUSINESS ANALYSIS
Reporting and Query tools and Applications – Tool Categories – Cognos Impromptu – Online
Analytical Processing (OLAP) – Need – Multidimensional Data Model – OLAP Guidelines –
Multidimensional versus Multirelational OLAP – Categories of Tools.
DECISION SUPPORT TOOLS/Access Tools
Tool Categories
There are five categories of decision support tools, although the lines that separate them
are quickly blurring:
• Reporting
• Managed query
• Executive information systems
• On-line analytical processing
• Data mining
Reporting tools
Reporting tools can be divided into production reporting tools and desktop report writers.
Production reporting tools let companies generate regular operational reports or support
high-volume batch jobs, such as calculating and printing paychecks. Production reporting tools
include third-generation languages such as COBOL; specialized fourth-generation languages,
such as Information Builders, Inc.’s Focus; and high-end client/server tools, such as MITI’s SQR.
Report writers, on the other hand, are inexpensive desktop tools designed for end users.
Products such as Seagate Software’s Crystal Reports let users design and run reports without
having to rely on the IS department. In general, report writers have graphical interfaces and
built-in charting functions. They can pull groups of data from a variety of data sources and
integrate them in a single report. Leading report writers include Crystal Reports, Actuate
Software Corp.’s Actuate Reporting System, IQ Software Corp.’s IQ Objects, and Platinum
Technology, Inc.’s InfoReports. Vendors are trying to increase the scalability of report writers
by supporting three-tiered architectures in which report processing is done on a Windows NT or
UNIX server. Report writers also are beginning to offer object-oriented interfaces for designing
and manipulating reports and modules for performing ad hoc queries and OLAP analysis.
Users and Related Activities
User                Activity            Tools
Clerk               Simple retrieval    4GL*
Executive           Exception reports   EIS
Manager             Simple retrieval    4GL
Business analysts   Complex analysis    Spreadsheets, OLAP, data mining
* Fourth-generation language
Managed query tools
Managed query tools shield end users from the complexities of SQL and database
structure by inserting a metalayer between users and the database. The metalayer is the software that
provides subject-oriented views of a database and supports point-and-click creation of SQL.
Some vendors, such as Business Objects, Inc., call this layer a “universe.” Other vendors, such as
Cognos Corp., call it a “catalog.” Managed query tools have been extremely popular because
they make it possible for knowledge workers to access corporate data without IS intervention.
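The metalayer idea can be illustrated with a minimal sketch. This is not how any vendor's "universe" or "catalog" is implemented; the table names, column names, and the `build_sql` helper below are hypothetical and exist only to show how business terms chosen by a user might be translated into SQL.

```python
# Hypothetical catalog: business terms mapped to physical tables and columns.
CATALOG = {
    "Customer Name": ("cust_mst", "cst_nm"),
    "Region": ("cust_mst", "rgn_cd"),
    "Order Total": ("ord_hdr", "ord_tot_amt"),
}

# Hypothetical join rule that an administrator defines once for all users.
JOINS = {frozenset(["cust_mst", "ord_hdr"]): "cust_mst.cst_id = ord_hdr.cst_id"}


def build_sql(selected_terms):
    """Translate business terms picked by a user into a SQL SELECT string."""
    columns, tables = [], []
    for term in selected_terms:
        table, column = CATALOG[term]
        columns.append(f'{table}.{column} AS "{term}"')
        if table not in tables:
            tables.append(table)
    sql = f"SELECT {', '.join(columns)} FROM {', '.join(tables)}"
    join = JOINS.get(frozenset(tables))
    if join:
        sql += f" WHERE {join}"
    return sql


query = build_sql(["Customer Name", "Order Total"])
print(query)
```

The user only ever sees the terms on the left side of the catalog; the cryptic names and the join condition stay hidden in the metalayer.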
Most managed query tools have embraced three-tiered architecture to improve scalability.
They support asynchronous query execution and integrate with Web servers. Managed query
tool vendors are racing to embed support for OLAP and data mining features. Some tool
makers, such as Business Objects, take an all-in-one approach; Business Objects embeds OLAP
functionality in its core 4.0 product. Other vendors, such as Cognos, Platinum Technology, and Information
Builders, take a best-of-breed approach, offering Microsoft Office-like suites composed of
managed query, OLAP, and data mining tools. Other leading managed query tools are IQ
Software’s IQ Objects, Andyne Computing Ltd.’s GQL, IBM’s Decision Server, Speedware Corp.’s
Esperant (formerly sold by Software AG), and Oracle Corp.’s Discoverer/2000.
Executive information system tools
Executive information system (EIS) tools predate report writers and managed query
tools; they were first deployed on mainframes. EIS tools allow developers to build customized,
graphical decision support applications or “briefing books” that give managers and executives a
high-level view of the business and access to external sources, such as custom, on-line news
feeds. EIS applications highlight exceptions to normal business activity or rules by using color-
coded graphics.
Popular EIS tools include Pilot Software, Inc.’s Lightship, Platinum Technology’s Forest
and Trees, Comshare, Inc.’s Commander Decision, Oracle’s Express Analyzer, and SAS
Institute, Inc.’s SAS/EIS. EIS vendors are moving in two directions. Many are adding managed
query functions to compete head-on with other decision support tools. Others are building
packaged applications that address horizontal functions, such as sales, budgeting, and marketing,
or vertical industries, such as financial services. For example, Platinum Technology offers
RiskAdvisor, a decision support application for the insurance industry that was built with Forest
and Trees. Comshare provides the Arthur family of supply-chain applications for the retail
industry.
OLAP tools
OLAP tools provide an intuitive way to view corporate data. These tools aggregate data
along common business subjects or dimensions and then let users navigate through the
hierarchies and dimensions with the click of a mouse button. Users can drill across or up levels
in each dimension or pivot and swap out dimensions to change their view of the data.
Some tools, such as Arbor Software Corp.’s Essbase and Oracle’s Express, preaggregate
data in special multidimensional databases. Other tools work directly against relational data and
aggregate data on the fly, such as MicroStrategy, Inc.’s DSS Agent or Information Advantage,
Inc.’s Decision Suite. Some tools process OLAP data on the desktop instead of a server.
Desktop OLAP tools include Cognos’ PowerPlay, Brio Technology, Inc.’s BrioQuery, Planning
Sciences, Inc.’s Gentium, and Andyne’s Pablo. Many of the differences between OLAP tools are
fading. Vendors are rearchitecting their products to give users greater control over the tradeoff
between flexibility and performance that is inherent in OLAP tools. Many vendors are rewriting
pieces of their products in Java.
Data mining tools
Data mining tools are becoming hot commodities because they provide insights into
corporate data that aren’t easily discerned with managed query or OLAP tools. Data mining
tools use a variety of statistical and artificial-intelligence (AI) algorithms to analyze the
correlation of variables in the data and ferret out interesting patterns and relationships to
investigate.
Some data mining tools, such as IBM’s Intelligent Miner, are expensive and require
statisticians to implement and manage. But there is a new breed of tools emerging that promises
to take the mystery out of data mining. These tools include DataMind Corp.’s DataMind, Pilot’s
Discovery Server, and tools from Business Objects and SAS Institute. These tools offer simple
user interfaces that plug in directly to existing OLAP tools or databases and can be run
directly against data warehouses.
The end-user tools area spans a number of data warehouse components. For example, all
end-user tools use metadata definitions to obtain access to data stored in the warehouse, and
some of these tools (e.g., OLAP tools) may employ additional or intermediary data stores (e.g.,
data marts, multidimensional databases).
COGNOS IMPROMPTU
Overview:
Impromptu from Cognos Corporation is positioned as an enterprise solution for
interactive database reporting that delivers 1- to 1000+ seat scalability. Impromptu’s object-
oriented architecture ensures control and administrative consistency across all users and reports.
Users access Impromptu through its easy-to-use graphical user interface. Impromptu has been
well received by users because querying and reporting are unified in one interface, and the users
can get meaningful views of corporate data quickly and easily.
Impromptu offers a fast and robust implementation at the enterprise level, and features
full administrative control, ease of deployment, and low cost of ownership. Impromptu is a
database reporting tool that exploits the power of the database, while offering complete control
over all reporting within the enterprise. In terms of scalability, Impromptu can support a single
user reporting on personal data, or thousands of users reporting on data from large data
warehouses.
User acceptance of Impromptu is very high because its user interface looks and feels just
like the Windows products these users already use. With Impromptu users can leverage the
skills they’ve acquired from using today’s popular spreadsheets and word processors. In
addition, Impromptu insulates users from the underlying database technology, which also
reduces the time necessary to learn the tool.
The Impromptu Information Catalog:
Impromptu reporting begins with the Information Catalog, a LAN-based repository of
business knowledge and data access rules. The Catalog insulates users from such technical
aspects of the database as SQL syntax, table joins, and cryptic table and field names. The
Catalog also protects the database from repeated queries and unnecessary processing.
Creating a catalog is a relatively simple task, so that an Impromptu administrator can be
anyone who’s familiar with basic database query functions.
The Catalog presents the database in a way that reflects how the business is organized,
and uses the terminology of the business. Impromptu administrators are free to organize
database items such as tables and fields into Impromptu’s subject-oriented folders, subfolders
and columns. Structuring the data in this way makes it easy for users to navigate within a
database and assemble reports. In addition, users are not restricted to fixed combinations or
predetermined selections; they can select on the finest detail within a database.
Impromptu enables business-relevant reporting through business rules, which can consist of shared
calculations, filters, and ranges for critical success factors. For example, users can create a
report that includes only high-margin sales from the last fiscal year for the eastern region, instead
of having to use complex filter statements.
Reporting
Impromptu is designed to make it easy for users to build and run their own reports. With
ReportWise templates and HeadStarts, users simply apply data to Impromptu to produce reports
rapidly.
Impromptu’s predefined ReportWise templates include templates for mailing labels,
invoices, sales reports, and directories. These templates are complete with formatting, logic,
calculations, and custom automation. Organizations can create templates for standard company
reports, and then deploy them to every user who needs them. The templates are database-
independent; therefore users simply map their data onto the existing placeholders to quickly
create sophisticated reports. Additionally, Impromptu provides users with a variety of page and
screen formats, known as HeadStarts, to create new reports that are visually appealing.
Impromptu offers special reporting options that increase the value of distributed standard
reports.
• Picklists and prompts. Organizations can create standard Impromptu reports for which
users can select from lists of values called picklists. For example, a user can select a
picklist of all sales representatives with a single click of the mouse. For reports
containing too many values for a single variable, Impromptu offers prompts. For
example, a prompt asks the user at run time to supply a value or range for the report data.
Picklists and prompts make a single report flexible enough to serve many users.
• Custom templates. Standard report templates with global calculations and business rules
can be created once and then distributed to users of different databases. Users then can
apply their data to the placeholders contained in the template. A template’s standard
logic, calculations, and layout complete the report automatically in the user’s choice of
format.
• Exception reporting. Exception reporting is the ability to have reports highlight values
that lie outside accepted ranges. Impromptu offers three types of exception reporting
that help managers and business users immediately grasp the status of their business.
⇒Conditional filters. Retrieve only those values that are outside defined
thresholds, or define ranges to organize data for quick evaluation. For example,
a user can set a condition to show only those sales under $10,000.
⇒Conditional highlighting. Create rules for formatting data on the basis of data
values. For example, a user can set a condition that all sales over $10,000
always appear in blue.
⇒Conditional display. Display report objects under certain conditions. For
example, a report will display a regional sales history graph only if the sales are
below a predefined value.
• Interactive reporting. Impromptu unifies querying and reporting in a single interface.
Users can perform both these tasks interfacing with live data in one integrated module.
• Frames. Impromptu offers an interesting frame-based reporting style. Frames are
building blocks that may be used to produce reports that are formatted with fonts,
borders, colors, shading, etc. Frames know about their contents and how to display them.
Frames, or combinations of frames, simplify building even complex reports. Once a
multiframe report is designed, it can be saved as a template and rerun at any time with
other data. The data formats itself according to the type of frame selected by the user.
⇒List frames are used to display detailed information. List frames can contain
calculated columns, data filters, headers and footers, etc.
⇒Form frames offer layout and design flexibility. Form reports can contain
multiple or repeating forms such as mailing labels.
⇒Cross-tab frames are used to show the totals of summarized data at selected
intersections, for example, sales of product by outlet.
⇒Chart frames make it easy for users to see their business data in 2-D and 3-D
displays using line, bar, ribbon, area, and pie charts. Charts can be stand-alone or
attached to other frames in the same report.
⇒Text frames allow users to add descriptive text to reports and display binary
large objects (BLOBs) such as product descriptions or contracts.
⇒Picture frames incorporate bitmaps into reports or specific records, perfect for
visually enhancing reports.
⇒OLE frames make it possible for users to insert any OLE object into a report.
• Impromptu’s design is tightly integrated with the Microsoft Windows environment and
standards, including OLE 2 support. Users can quickly learn Impromptu using its Microsoft
Office-compatible user interface, which is complete with tabbed dialog boxes, bubble Help,
and customizable toolbars. With OLE support, users can produce enhanced reports by simply
placing data or objects in a document, regardless of the application in which it resides. For
example, Impromptu reports can be embedded in spreadsheet files, or placed in a Word document.
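The three exception-reporting options described above (conditional filters, conditional highlighting, and conditional display) can be sketched in a few lines. This is only an illustration of the logic, not Impromptu's own mechanism; the sample rows, function names, and the 30,000 display limit are assumptions, while the $10,000 threshold follows the examples in the text.

```python
# Hypothetical sales rows; the $10,000 threshold follows the examples in the text.
sales = [
    {"rep": "Adams", "amount": 4_500},
    {"rep": "Baker", "amount": 12_000},
    {"rep": "Chen", "amount": 9_999},
]
THRESHOLD = 10_000


def conditional_filter(rows, limit):
    """Keep only the rows whose amount is under the limit."""
    return [row for row in rows if row["amount"] < limit]


def conditional_highlight(rows, limit):
    """Attach a display color: blue for amounts over the limit, else default."""
    return [
        {**row, "color": "blue" if row["amount"] > limit else "default"}
        for row in rows
    ]


def conditional_display(rows, limit):
    """Decide whether an extra report object (e.g. a history graph) is shown."""
    return sum(row["amount"] for row in rows) < limit


low_sales = conditional_filter(sales, THRESHOLD)
highlighted = conditional_highlight(sales, THRESHOLD)
show_graph = conditional_display(sales, 30_000)
```

Here the filter changes which rows are retrieved, the highlight changes only how rows are formatted, and the display rule switches a whole report object on or off.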
OLAP OPERATIONS ON MULTIDIMENSIONAL DATA.
1. Roll-up: The roll-up operation performs aggregation on a data cube, either by climbing-up a
concept hierarchy for a dimension or by dimension reduction. Figure shows the result of a roll-up
operation performed on the central cube by climbing up the concept hierarchy for location. This
hierarchy was defined as the total order street < city < province or state < country.
2. Drill-down: Drill-down is the reverse of roll-up. It navigates from less detailed data to more
detailed data. Drill-down can be realized by either stepping-down a concept hierarchy for a
dimension or introducing additional dimensions. Figure shows the result of a drill-down
operation performed on the central cube by stepping down a concept hierarchy for time defined
as day < month < quarter < year. Drill-down occurs by descending the time hierarchy from the
level of quarter to the more detailed level of month.
3. Slice and dice: The slice operation performs a selection on one dimension of the given cube,
resulting in a subcube. Figure shows a slice operation where the sales data are selected from the
central cube for the dimension time using the criteria time = “Q2”. The dice operation defines a
subcube by performing a selection on two or more dimensions.
4. Pivot (rotate): Pivot is a visualization operation which rotates the data axes in view in order
to provide an alternative presentation of the data. Figure shows a pivot operation where the item
and location axes in a 2-D slice are rotated.
Figure : Examples of typical OLAP operations on multidimensional data.
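The four operations can be sketched on a tiny in-memory cube. The cell values, city names, and helper functions below are hypothetical and use plain Python dictionaries rather than any OLAP product; the sketch only mirrors the definitions given above.

```python
from collections import defaultdict

# Hypothetical cube cells: (city, quarter, item) -> units sold.
cube = {
    ("Vancouver", "Q1", "phone"): 10,
    ("Vancouver", "Q2", "phone"): 14,
    ("Toronto", "Q1", "phone"): 8,
    ("Toronto", "Q2", "computer"): 5,
    ("Chicago", "Q2", "phone"): 7,
    ("Chicago", "Q2", "computer"): 9,
}

# One level of the location concept hierarchy: city < country.
CITY_TO_COUNTRY = {"Vancouver": "Canada", "Toronto": "Canada", "Chicago": "USA"}


def roll_up_location(cells):
    """Roll up by climbing the location hierarchy from city to country."""
    result = defaultdict(int)
    for (city, quarter, item), units in cells.items():
        result[(CITY_TO_COUNTRY[city], quarter, item)] += units
    return dict(result)


def slice_time(cells, quarter):
    """Slice: select on the single dimension time, yielding a 2-D subcube."""
    return {(c, i): u for (c, q, i), u in cells.items() if q == quarter}


def dice(cells, cities, quarters):
    """Dice: select on two dimensions at once (location and time)."""
    return {k: u for k, u in cells.items() if k[0] in cities and k[1] in quarters}


def pivot(table):
    """Pivot: swap the two axes of a 2-D slice."""
    return {(b, a): u for (a, b), u in table.items()}


by_country = roll_up_location(cube)
q2 = slice_time(cube, "Q2")
diced = dice(cube, {"Toronto", "Vancouver"}, {"Q1"})
rotated = pivot(q2)
```

Drill-down is the reverse direction: moving from `by_country` back to the city-level `cube`, which is only possible because the more detailed cells are still stored.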
OLAP GUIDELINES
Multidimensionality is at the core of a number of OLAP systems (databases and front-
end tools) available today. However, the availability of these systems does not eliminate the
need to define a methodology of how to select and use the products. Dr. E.F. Codd, the “father”
of the relational model, has formulated a list of 12 guidelines and requirements as the basis for
selecting OLAP systems. Users should prioritize this suggested list to reflect their business
requirements and consider products that best match those needs.
1. Multidimensional conceptual view. A tool should provide users with a multidimensional
model that corresponds to the business problems and is intuitively analytical and easy to use.
2. Transparency. The OLAP system’s technology, the underlying database and computing
architecture (client/server, mainframe gateways, etc.) and the heterogeneity of input data sources
should be transparent to users to preserve their productivity and proficiency with familiar front-
end environments and tools (e.g., MS Windows, MS Excel).
3. Accessibility. The OLAP system should access only the data actually required to perform the
analysis. Additionally, the system should be able to access data from all heterogeneous
enterprise data sources required for the analysis.
4. Consistent reporting performance. As the number of dimensions and the size of the database
increase, users should not perceive any significant degradation in performance.
5. Client/server architecture. The OLAP system has to conform to client/server architectural
principles for maximum price and performance, flexibility, adaptivity and interoperability.
6. Generic dimensionality. Every data dimension must be equivalent in both structure and
operational capabilities.
7. Dynamic sparse matrix handling. As previously mentioned, the OLAP system has to be able
to adapt its physical schema to the specific analytical model that optimizes sparse matrix
handling to achieve and maintain the required level of performance.
8. Multiuser support. The OLAP system must be able to support a work group of users working
concurrently on a specific model.
9. Unrestricted cross-dimensional operations. The OLAP system must be able to recognize
dimensional hierarchies and automatically perform associated roll-up calculations within and
across dimensions.
10. Intuitive data manipulation. Consolidation path reorientation (pivoting), drill-down and
roll-up, and other manipulations should be accomplished via direct point-and-click, drag-and-drop
actions on the cells of the cube.
11. Flexible reporting. The ability to arrange rows, columns and cells in a fashion that facilitates
analysis by intuitive visual presentation of analytical reports must exist.
12. Unlimited dimensions and aggregation levels. Depending on business requirements, an
analytical model may have a dozen or more dimensions, each having multiple hierarchies. The
OLAP system should not impose any artificial restrictions on the number of dimensions or
aggregation levels.
In addition to these 12 guidelines, a robust production-quality OLAP system should also support:
• Comprehensive database management tools. These tools should function as an
integrated centralized tool and allow for database management for the distributed
enterprise.
• The ability to drill down to detail (source record) level. This means that the tools should
allow for a smooth transition from the multidimensional (preaggregated) database to the
detail record level of the source relational databases.
• Incremental database refresh. Many OLAP databases support only full refresh, and this
presents an operations and usability problem as the size of the database increases.
• Structured Query Language (SQL) interface. This is an important requirement for the OLAP
system to be seamlessly integrated into the existing enterprise environment.
MULTIDIMENSIONAL DATA MODEL
The multidimensional nature of business questions is reflected in the fact that, for
example, marketing managers are no longer satisfied by asking simple one-dimensional
questions such as “How much revenue did the new product generate?” Instead, they ask questions
such as “How much revenue did the new product generate by month, in the northeastern divisions,
broken down by user demographic, by sales office, relative to the previous version of the
product, compared with plan?” – a six-dimensional question. One way to look at the
multidimensional data model is to view it as a cube. The table on the left contains detailed sales
data by product, market, and time. The cube on the right associates sales numbers (units sold)
with dimensions – product type, market, and time – with the UNIT variables organized as cells in
an array. This cube can be expanded to include another array – price – which can be associated
with all or only some dimensions (for example, the unit price of a product may or may not
change with time, or from city to city). The cube supports matrix arithmetic that allows the cube
to present the dollar sales array simply by performing a single matrix operation on all cells of the
array (dollar sales = units * price).
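The cell-by-cell arithmetic dollar sales = units * price can be sketched as follows. The products, markets, and prices are hypothetical, and the price array is assumed here to depend on the product dimension only, which is one of the cases the text allows.

```python
# Hypothetical cells keyed by (product, market, quarter).
units = {
    ("camera", "Boston", "Q1"): 120,
    ("camera", "Denver", "Q1"): 80,
    ("tripod", "Boston", "Q1"): 40,
}

# Price is associated with the product dimension only in this sketch.
price = {"camera": 250.0, "tripod": 35.0}

# dollar sales = units * price, applied to every cell of the units array.
dollar_sales = {
    (product, market, quarter): count * price[product]
    for (product, market, quarter), count in units.items()
}
```

If the price also varied by market or by quarter, the `price` lookup would simply be keyed by those dimensions as well.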
The response time of the multidimensional query still depends on how many cells have to
be added on the fly. The caveat here is that, as the number of dimensions increases, the number
of the cube’s cells increases exponentially. On the other hand, the majority of multidimensional
queries deal with summarized high-level data. Therefore, the solution to building an efficient
multidimensional database is to preaggregate (consolidate) all logical subtotals and totals along
all dimensions. This preaggregation is especially valuable because typical dimensions are
hierarchical in nature. For example, the TIME dimension may contain
hierarchies for years, quarters, months, weeks, and days; GEOGRAPHY may contain country,
state, city, etc. Having the predefined hierarchy within dimensions allows for logical
preaggregation and, conversely, allows for a logical drill-down – from the product group to
individual products, from annual sales to weekly sales, and so on.
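A small worked sketch of the two points above: how the cell count multiplies with each added dimension, and how subtotals can be preaggregated along a hierarchy. All dimension sizes and sales figures are hypothetical.

```python
from math import prod

# Hypothetical number of members in each dimension.
dimension_sizes = {"product": 100, "market": 50, "time": 36}

# Every added dimension multiplies the number of potential cells.
cells_3d = prod(dimension_sizes.values())
cells_4d = cells_3d * 20  # adding a 20-member "channel" dimension

# Preaggregate along one TIME hierarchy: month < quarter.
monthly_sales = {"Jan": 10, "Feb": 12, "Mar": 9, "Apr": 11, "May": 14, "Jun": 13}
MONTH_TO_QUARTER = {"Jan": "Q1", "Feb": "Q1", "Mar": "Q1",
                    "Apr": "Q2", "May": "Q2", "Jun": "Q2"}

quarterly_sales = {}
for month, amount in monthly_sales.items():
    quarter = MONTH_TO_QUARTER[month]
    quarterly_sales[quarter] = quarterly_sales.get(quarter, 0) + amount

# Stored once, these subtotals answer summary queries without touching detail cells.
half_year_total = sum(quarterly_sales.values())
```

With these assumed sizes the three-dimensional cube has 180,000 potential cells and the four-dimensional one 3,600,000, while a quarterly query reads only the two stored subtotals.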
Another way to reduce the size of the cube is to properly handle sparse data. Often, not
every cell has a meaning across all dimensions (many marketing databases may have more than
95 percent of all cells empty or containing 0). Another kind of sparse data is created when many
cells contain duplicate data (i.e., if the cube contains a PRICE dimension, the same price may
apply to all markets and all quarters for the year). The ability of a multidimensional database to
skip empty or repetitive cells can greatly reduce the size of the cube and the amount of
processing.
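One simple way to picture sparse data handling is to store only the populated cells and treat everything else as zero. The sketch below uses a Python dictionary for this; it is an assumption made for illustration, not the storage technique of any particular multidimensional database, and the member names and values are hypothetical.

```python
# Dense storage reserves a slot for every (product, market, quarter) combination.
products = [f"p{i}" for i in range(20)]
markets = [f"m{i}" for i in range(10)]
quarters = ["Q1", "Q2", "Q3", "Q4"]
dense_cell_count = len(products) * len(markets) * len(quarters)

# Sparse storage keeps only the cells that actually hold a value.
sparse_units = {
    ("p0", "m0", "Q1"): 15,
    ("p0", "m3", "Q2"): 7,
    ("p7", "m1", "Q4"): 22,
}


def get_units(cell):
    """Return the stored value, treating a missing cell as 0."""
    return sparse_units.get(cell, 0)


# Repetitive data: one price per product instead of one per market and quarter.
price_by_product = {"p0": 9.5, "p7": 4.0}

fill_ratio = len(sparse_units) / dense_cell_count
```

Storing one price per product rather than repeating it for every market and quarter is the same idea applied to duplicate values.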
Dimensional hierarchy, sparse data management, and preaggregation are the key, since
they can significantly reduce the size of the database and the need to calculate values. Such a
design obviates the need for multitable joins and provides quick and direct access to the arrays of
answers, thus significantly speeding up execution of the multidimensional queries.
Figure: Relational table and multidimensional cubes.
Multidimensional Data Model
The most popular data model for data warehouses is a multidimensional model. This
model can exist in the form of a star schema, a snowflake schema, or a fact constellation schema.
Let's have a look at each of these schema types.
• Star schema: The star schema is a modeling paradigm in which the data warehouse
contains (1) a large central table (fact table), and (2) a set of smaller attendant tables
(dimension tables), one for each dimension. The schema graph resembles a starburst,
with the dimension tables displayed in a radial pattern around the central fact table.
Figure: Star schema of a data warehouse for sales.
• Snowflake schema: The snowflake schema is a variant of the star schema model, where
some dimension tables are normalized, thereby further splitting the data into additional
tables. The resulting schema graph forms a shape similar to a snowflake. The major
difference between the snowflake and star schema models is that the dimension tables of
the snowflake model may be kept in normalized form. Such a table is easy to maintain
and also saves storage space, because a dimension table can become extremely large
when the dimensional structure is included as columns.
Figure: Snowflake schema of a data warehouse for sales.
• Fact constellation: Sophisticated applications may require multiple fact tables to share
dimension tables. This kind of schema can be viewed as a collection of stars, and hence is
called a galaxy schema or a fact constellation.
Figure: Fact constellation schema of a data warehouse for sales and shipping.
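To make the star schema concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table names (`sales_fact`, `time_dim`, `item_dim`, `location_dim`), the reduced column sets, and all rows are assumptions chosen for illustration; only the overall shape, a central fact table with one key per dimension table, comes from the description above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE time_dim (time_key INTEGER PRIMARY KEY, quarter TEXT, year INTEGER);
CREATE TABLE item_dim (item_key INTEGER PRIMARY KEY, item_name TEXT, type TEXT);
CREATE TABLE location_dim (location_key INTEGER PRIMARY KEY, city TEXT, country TEXT);
-- Central fact table: one foreign key per dimension plus the measures.
CREATE TABLE sales_fact (
    time_key INTEGER REFERENCES time_dim(time_key),
    item_key INTEGER REFERENCES item_dim(item_key),
    location_key INTEGER REFERENCES location_dim(location_key),
    dollars_sold REAL,
    units_sold INTEGER
);
INSERT INTO time_dim VALUES (1, 'Q1', 2024), (2, 'Q2', 2024);
INSERT INTO item_dim VALUES (1, 'laptop', 'computer'), (2, 'handset', 'phone');
INSERT INTO location_dim VALUES (1, 'Vancouver', 'Canada'), (2, 'Chicago', 'USA');
INSERT INTO sales_fact VALUES
    (1, 1, 1, 2000.0, 2), (1, 2, 1, 600.0, 3),
    (2, 1, 2, 1000.0, 1), (2, 2, 2, 400.0, 2);
""")

# A typical star join: fact table joined to two dimension tables, then grouped.
rows = conn.execute("""
    SELECT l.country, t.quarter, SUM(f.dollars_sold)
    FROM sales_fact AS f
    JOIN location_dim AS l ON f.location_key = l.location_key
    JOIN time_dim AS t ON f.time_key = t.time_key
    GROUP BY l.country, t.quarter
    ORDER BY l.country, t.quarter
""").fetchall()
```

A snowflake variant would split, for example, `country` out of `location_dim` into its own table, and a fact constellation would add a second fact table that reuses the same dimension tables.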
A Data Mining Query Language, DMQL: Language Primitives
Cube Definition (Fact Table)
define cube <cube_name> [<dimension_list>]: <measure_list>
Dimension Definition (Dimension Table)
define dimension <dimension_name> as (<attribute_or_subdimension_list>)
Special Case (Shared Dimension Tables)
First time as “cube definition”
define dimension <dimension_name> as <dimension_name_first_time> in cube
<cube_name_first_time>
Defining a Star Schema in DMQL
define cube sales_star [time, item, branch, location]:
dollars_sold = sum(sales_in_dollars), avg_sales = avg(sales_in_dollars), units_sold = count(*)
define dimension time as (time_key, day, day_of_week, month, quarter, year)
define dimension item as (item_key, item_name, brand, type, supplier_type)
define dimension branch as (branch_key, branch_name, branch_type)
define dimension location as (location_key, street, city, province_or_state, country)
Defining a Snowflake Schema in DMQL
define cube sales_snowflake [time, item, branch, location]:
dollars_sold = sum(sales_in_dollars), avg_sales = avg(sales_in_dollars), units_sold = count(*)
define dimension time as (time_key, day, day_of_week, month, quarter, year)
define dimension item as (item_key, item_name, brand, type, supplier(supplier_key,
supplier_type))
define dimension branch as (branch_key, branch_name, branch_type)
define dimension location as (location_key, street, city(city_key, province_or_state, country))
Defining a Fact Constellation in DMQL
define cube sales [time, item, branch, location]:
dollars_sold = sum(sales_in_dollars), avg_sales = avg(sales_in_dollars), units_sold = count(*)
define dimension time as (time_key, day, day_of_week, month, quarter, year)
define dimension item as (item_key, item_name, brand, type, supplier_type)
define dimension branch as (branch_key, branch_name, branch_type)
define dimension location as (location_key, street, city, province_or_state, country)
define cube shipping [time, item, shipper, from_location, to_location]:
dollar_cost = sum(cost_in_dollars), unit_shipped = count(*)
define dimension time as time in cube sales
define dimension item as item in cube sales
define dimension shipper as (shipper_key, shipper_name, location as location in cube sales,
shipper_type)
define dimension from_location as location in cube sales
define dimension to_location as location in cube sales
A Concept Hierarchy
Concept hierarchies allow data to be handled at varying levels of abstraction.
CATEGORIZATION OF OLAP TOOLS
On-line analytical processing (OLAP) tools are based on the concepts of
multidimensional databases and allow a sophisticated user to analyze the data using elaborate,
multidimensional, complex views. Typical business applications for these tools include product
performance and profitability, effectiveness of a sales program or a marketing campaign, sales
forecasting, and capacity planning. These tools assume that the data is organized in a
multidimensional model which is supported by a special multidimensional database (MDDB) or
by a relational database designed to enable multidimensional properties (e.g., star schema). A
chart comparing capabilities of these two classes of OLAP tools is shown in the figure.
1. MOLAP
Traditionally, these products utilized specialized data structures [i.e.,
multidimensional database management systems (MDDBMSs)] to organize, navigate, and
analyze data, typically in an aggregated form, and traditionally required a tight coupling with
the application layer and presentation layer. There recently has been a quick movement by
MOLAP vendors to segregate the OLAP through the use of published application programming
interfaces (APIs). Still, there remains the need to store the data in a way similar to the way in
which it will be utilized, to enhance the performance and provide a degree of predictability for
complex analysis queries. Data structures use array technology and, in most cases, provide
improved storage techniques to minimize the disk space requirements through sparse data
management. This architecture enables excellent performance when the data is utilized as
designed, and predictable application response times for applications addressing a narrow
breadth of data for a specific DSS requirement. In addition, some products treat time as a special
dimension (e.g., Pilot Software’s Analysis Server), enhancing their ability to perform time series
analysis. Other products provide strong analytical capabilities (e.g. Oracle’s Express Server)
built into the database.
The area of the circles indicates the data size
Figure: OLAP style comparison.
Applications requiring iterative and comprehensive time series analysis of trends are well
suited for MOLAP technology (e.g., financial analysis and budgeting). Examples include Arbor
Software’s Essbase, Oracle’s Express Server, Pilot Software’s Lightship Server, Sinper’s TM/1,
Planning Sciences’ Gentium, and Kenan Technology’s Multiway.
Several Challenges face users considering the implementation of applications with
MOLAP products. First, there are limitation in the ability of data structures to support multiple
subject areas of data (a common trait of many strategic DSS applications) and the detail data
19. required by many analysis applications. This has a begun to be addressed in some products,
utilizing rudimentary “reach through” mechanisms that enable the MOLAP tools to access detail
data maintained in an RDBMS (as shown in figure). There are also limitations in the way data
can be navigate and analyzed, because the data is structured around the navigation and analysis
requirements known at the time the data structures are built. When the navigation or dimension
requirements change, the data structures may need to be physically reorganized to optimally
support the new requirements. This problem is similar in nature to the older hierarchical and
network DBMS (e.g., IMS, IDMS), where different sets of data had to be created for each
application that used the data in a manner different from the way the data was originally
maintained. Finally, MOLAP products require a different set of skills and tools for the database
administrator to build and maintain the database, thus increasing the cost and complexity of
support.
To address this particular issue, some vendors significantly enhanced their reach-through
capabilities. These hybrid solutions have as their primary characteristics the integration of
specialized multidimensional data storage with RDBMS technology, providing users with a
facility that “tightly” couples the multidimensional data structures (MDDSs) with data maintained
in an RDBMS (see figure left). This allows the MDDSs to dynamically obtain detail data
maintained in an RDBMS, when the application reaches the bottom of the multidimensional cells
during drill-down analysis. This may deliver the best of both worlds, MOLAP and ROLAP.
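The reach-through idea can be sketched as follows. This is a minimal illustration, not a description of any product: the table name sales_detail, the drill_down function, and the sample rows are all assumptions, and Python's built-in sqlite3 module stands in for the RDBMS. Aggregated cells are held in memory, and detail rows are fetched from the relational table only when the user drills below the lowest cell.

```python
import sqlite3

# Detail rows live in a relational table (the RDBMS side of the hybrid).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_detail (product TEXT, region TEXT, order_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales_detail VALUES (?, ?, ?, ?)",
    [
        ("widget", "east", 1, 60.0),
        ("widget", "east", 2, 40.0),
        ("widget", "west", 3, 40.0),
    ],
)

# Pre-aggregated cells (the multidimensional side), built once from the detail.
cube_cells = {
    (product, region): total
    for product, region, total in conn.execute(
        "SELECT product, region, SUM(amount) FROM sales_detail GROUP BY product, region"
    )
}


def drill_down(product, region):
    """Return the cube cell total plus the detail rows behind that cell."""
    total = cube_cells.get((product, region))
    if total is None:
        return None, []
    # Reach through to the relational table only at the bottom of the cube.
    rows = conn.execute(
        "SELECT order_id, amount FROM sales_detail "
        "WHERE product = ? AND region = ? ORDER BY order_id",
        (product, region),
    ).fetchall()
    return total, rows


total, rows = drill_down("widget", "east")
```

Summary questions are answered from the in-memory cells, so the relational table is touched only for the detail request.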
This approach can be very useful for organizations with performance-sensitive
multidimensional analysis requirements and that have built, or are in the process of building, a
data warehouse architecture that contains multiple subject areas. An example would be the
creation of sales data measured by several dimensions (e.g., product and sales region) to be
stored and maintained in a persistent structure. This structure would be provided to reduce the
application overhead of performing calculations and building aggregations during application
initialization. These structures can be automatically refreshed at predetermined intervals
established by an administrator.
Figure: MOLAP architecture
2. ROLAP
This segment constitutes the fastest-growing style of OLAP technology, with new
vendors (e.g., Sagent Technology) entering the market at an accelerating pace. Products in this
group have been engineered from the beginning to support RDBMS products directly through a
dictionary layer of metadata, bypassing any requirement for creating a static multidimensional
data structure (see figure). This enables multiple multidimensional views of the two-dimensional
relational tables to be created without the need to structure the data around the desired view.
Finally, some of the products in this segment have developed strong SQL-generating engines to
support the complexity of multidimensional analysis. This includes the creation of multiple SQL
statements to handle user requests, being “RDBMS-aware,” and providing the capability to
generate the SQL based on the optimizer of the DBMS engine. While flexibility is an attractive
feature of ROLAP products, there are products in this segment that recommend, or require, the
use of highly denormalized database designs (e.g., star schema). The design and performance
issues associated with the star schema have been discussed.
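A rough sketch of the metadata-driven SQL generation described above follows. Everything named here is an assumption made for illustration: the METADATA dictionary, the build_query function, and the sales_fact, product_dim, and region_dim tables are hypothetical, and sqlite3 stands in for the RDBMS. Real ROLAP engines are far more elaborate (multiple statements, optimizer awareness), so this shows only the basic mapping from a multidimensional request to a single star-schema query.

```python
import sqlite3

# Hypothetical metadata dictionary: logical names mapped to star-schema objects.
METADATA = {
    "fact_table": "sales_fact",
    "measures": {"revenue": "SUM(f.amount)"},
    "dimensions": {
        "product": {"table": "product_dim", "key": "product_id", "column": "name"},
        "region": {"table": "region_dim", "key": "region_id", "column": "name"},
    },
}


def build_query(measure, group_by, filters=None):
    """Translate a multidimensional request into one parameterized SQL statement."""
    filters = filters or {}
    dims = METADATA["dimensions"]
    # Every dimension that is grouped on or filtered on needs a join.
    used = list(dict.fromkeys(list(group_by) + list(filters)))
    joins = [
        f"JOIN {dims[d]['table']} {d} ON f.{dims[d]['key']} = {d}.{dims[d]['key']}"
        for d in used
    ]
    group_cols = [f"{d}.{dims[d]['column']}" for d in group_by]
    where = [f"{d}.{dims[d]['column']} = ?" for d in filters]
    params = list(filters.values())

    sql = f"SELECT {', '.join(group_cols + [METADATA['measures'][measure]])}"
    sql += f" FROM {METADATA['fact_table']} f " + " ".join(joins)
    if where:
        sql += " WHERE " + " AND ".join(where)
    if group_cols:
        cols = ", ".join(group_cols)
        sql += f" GROUP BY {cols} ORDER BY {cols}"
    return sql, params


# A tiny star schema with made-up rows to run the generated SQL against.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_dim (product_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE region_dim (region_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE sales_fact (product_id INTEGER, region_id INTEGER, amount REAL);
INSERT INTO product_dim VALUES (1, 'widget'), (2, 'gadget');
INSERT INTO region_dim VALUES (1, 'east'), (2, 'west');
INSERT INTO sales_fact VALUES (1, 1, 100), (1, 2, 40), (2, 1, 25);
""")

# "Revenue by product, for the east region" expressed against the metadata layer.
sql, params = build_query("revenue", ["product"], {"region": "east"})
result = conn.execute(sql, params).fetchall()
```

Changing the view (for example, grouping by region instead of product) only changes the arguments to build_query; the relational tables themselves are not restructured.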
Figure: ROLAP architecture
The ROLAP tools are undergoing some technology realignment. This shift in technology
emphasis is coming in two forms. First is the movement toward pure middleware technology
that provides facilities to simplify development of multidimensional applications. Second, there
continues to be further blurring of the lines that delineate ROLAP and hybrid-OLAP products.
Vendors of ROLAP tools and RDBMS products look to provide an option to create
multidimensional, persistent structures, with facilities to assist in the administration of these
structures. Examples include Information Advantage (Axsys), MicroStrategy (DSS Agent/DSS
Server), Platinum/Prodea Software (Beacon), Informix/Stanford Technology Group (Metacube),
and Sybase (HighGate Project).
3. Managed query environment (MQE)
This style of OLAP, which is beginning to see increased activity, provides users with the
ability to perform limited analysis, either directly against RDBMS products or by
leveraging an intermediate MOLAP server (see figure). Some products (e.g., Andyne’s Pablo)
that have a heritage in ad hoc query have developed features to provide “datacube” and “slice
and dice” analysis capabilities. This is achieved by first developing a query to select data from
the DBMS, which then delivers the requested data to the desktop, where it is placed into a
datacube. This datacube can be stored and maintained locally, to reduce the overhead required to
create the structure each time the query is executed. Once the data is in the datacube, users can
perform multidimensional analysis (i.e., slice, dice, and pivot operations) against it. Alternatively,
these tools can work with MOLAP servers, and the data from the relational DBMS can be
delivered to the MOLAP server, and from there to the desktop.
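The desktop datacube workflow described above (query the DBMS, place the result in a local cube, then slice and pivot) can be sketched in a few lines. This is an illustration only: the function names build_cube, slice_cube, and pivot, and the sample rows, are hypothetical and not taken from any of the products mentioned.

```python
from collections import defaultdict

# Rows as they might arrive on the desktop from an ad hoc query (made-up data).
rows = [
    {"product": "widget", "region": "east", "quarter": "Q1", "sales": 100},
    {"product": "widget", "region": "west", "quarter": "Q1", "sales": 40},
    {"product": "gadget", "region": "east", "quarter": "Q1", "sales": 25},
    {"product": "widget", "region": "east", "quarter": "Q2", "sales": 60},
]
DIMS = ("product", "region", "quarter")


def build_cube(rows, dims, measure):
    """Place query results into a local datacube keyed by dimension members."""
    cube = defaultdict(int)
    for r in rows:
        cube[tuple(r[d] for d in dims)] += r[measure]
    return dict(cube)


def slice_cube(cube, dims, dim, member):
    """Fix one dimension to a single member and drop it from the coordinates."""
    i = dims.index(dim)
    new_dims = dims[:i] + dims[i + 1:]
    sliced = {k[:i] + k[i + 1:]: v for k, v in cube.items() if k[i] == member}
    return new_dims, sliced


def pivot(cube, dims, row_dim, col_dim):
    """Cross-tabulate two dimensions, summing over any remaining ones."""
    r, c = dims.index(row_dim), dims.index(col_dim)
    table = defaultdict(lambda: defaultdict(int))
    for k, v in cube.items():
        table[k[r]][k[c]] += v
    return {row: dict(cols) for row, cols in table.items()}


cube = build_cube(rows, DIMS, "sales")
q1_dims, q1 = slice_cube(cube, DIMS, "quarter", "Q1")
by_product_region = pivot(cube, DIMS, "product", "region")
```

Because the cube is kept locally, repeated slices and pivots do not re-run the query, which is the overhead saving the text refers to.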
The simplicity of the installation and administration of such products makes them
particularly attractive to organizations looking to provide seasoned users with more sophisticated
analysis capabilities, without the significant cost and maintenance of more complex products.
Despite the ease of installation and administration that accompanies the desktop OLAP products,
most of these tools require the datacube to be built and maintained on the desktop or a separate
server. Even with metadata definitions that assist users in retrieving the correct set of data that makes
up the datacube, this method causes considerable data redundancy and strains most network
infrastructures that support many users. Although this mechanism allows each user the flexibility
to build a custom datacube, the lack of data consistency among users and the relatively
small amount of data that can be efficiently maintained are significant challenges facing tool
administrators.
Examples include Cognos Software’s PowerPlay, Andyne Software’s Pablo, Business
Objects’ Mercury Project, Dimensional Insight’s CrossTarget, and Speedware’s Media.
Figure: Hybrid/MQE architecture
By
M.Dhilsath Fathima