This paper details the process of DMC at eight different organizations while capturing the keys to success from each. These case studies were specifically selected to demonstrate several variations on the concept of consolidation. While there is no such thing as a “cookie-cutter” DMC process, there are common best practices and lessons to be shared.
White Paper
Data Mart Consolidation: Repenting for Sins of the Past
William McKnight
McKnight Consulting Group
www.mcknightcg.com
Contents
Part 1 Data Mart Consolidation (DMC): The Business Rationale
  Building the Case for Data Mart Consolidation
  The Benefits of the Program Approach to Data Warehousing
  Desired Outcomes of Data Mart Consolidation
  Approaches to Data Mart Consolidation
Part 2 The Interviews
  3M
    The Pre-Consolidation Environment
    Reasons for Consolidation
    The Consolidation Project
    The Benefits Realized
    The Post-Consolidation Environment
  Delta Air Lines
    The Pre-Consolidation Environment
    Reasons for Consolidation
    The Consolidation Project
    The Benefits Realized
    The Post-Consolidation Environment
  Michigan Department of Community Health
    The Pre-Consolidation Environment
    Reasons for Consolidation
    The Consolidation Project
    The Benefits Realized
    The Post-Consolidation Environment
  Healthcare Insurance Company
    The Pre-Consolidation Environment
    Reasons for Consolidation
    The Consolidation Project
    The Benefits Realized
    The Post-Consolidation Environment
  Royal Bank of Canada
    The Pre-Consolidation Environment
    Reasons for Consolidation
    The Consolidation Project
    The Benefits Realized
    The Post-Consolidation Environment
  Major Telecommunications Company
    The Pre-Consolidation Environment
    Reasons for Consolidation
    The Consolidation Project
    The Benefits Realized
    The Post-Consolidation Environment
  Anthem Blue Cross Blue Shield
    The Pre-Consolidation Environment
    Reasons for Consolidation
    The Consolidation Project
    The Benefits Realized
    The Post-Consolidation Environment
  Sekisui Systems Corporation
    The Pre-Consolidation Environment
    Reasons for Consolidation
    The Consolidation Project
    The Benefits Realized
    The Post-Consolidation Environment
Part 3 Best Practices for Data Mart Consolidation
  Best Practices for DMC
  Customer-Reported Keys to DMC Success
  Author’s Additional Keys to DMC Success
About the Author
Part 1: Data Mart Consolidation (DMC): The Business Rationale
Building the Case for Data Mart Consolidation
For much of the last decade, conventional theories surrounding decision support
architectures have focused more on cost than business benefit. Lack of Return on
Investment (ROI) quantification has resulted in platform selection criteria being focused
on perceived minimization of initial system cost rather than maximizing lasting value to
the enterprise. Often these decisions are made within departmental boundaries without
consideration of an overarching data warehousing strategy.
This reasoning has led many organizations down the eventual path of data mart
proliferation: the creation of non-integrated data sets developed to address
specific application needs, usually with an inflexible design. In the vast majority of
cases, data mart proliferation is not the result of a chosen architectural strategy, but a
consequence of lacking one.
To further complicate matters, the recent economic environment and ensuing budget
reduction cycles have forced IT managers to find ways of squeezing every drop of
performance out of their systems while still managing to meet users’ needs. In other
words, we’re all being asked to do more with less. Wouldn’t it be great to follow in
others’ footsteps and learn from their successes while still being considered a thought
leader?
The good news is that the data warehousing market is now mature enough that there are
successes and best practices to be leveraged. There are proven methods to reduce costs,
gain efficiencies, and increase the value of enterprise data. Pioneering organizations
have found a way to save millions of dollars while providing their users with integrated,
consistent, and timely information. The path that led to these results started with a
rapidly emerging trend in data warehousing today – Data Mart Consolidation (DMC).
I’ve learned that companies worldwide are embracing DMC as a way to save large
amounts of money while still providing high degrees of business value with ROI. DMC
is an answer to the issues many face today. There is a way to cut BI costs and continue
to deliver business value with BI. Others have done it and I’m going to share how they
did it in this paper.
This paper details the process of DMC at eight different organizations while capturing
the keys to success from each. These case studies were specifically selected to
demonstrate several variations on the concept of consolidation. While there is no such thing as
a “cookie-cutter” DMC process, there are common best practices and lessons to be shared.
The Benefits of the Program Approach to Data Warehousing
Tenets of sound business practices apply to data warehousing. One of these is the
necessity to accomplish an objective in the most efficient manner. What is the most
efficient way to accomplish data warehousing objectives?
It’s the way that builds a data warehouse to solve specific needs, but does so in a manner
that leverages previous investment in the architecture, tools, processes, and people and
does not prohibit future growth. This enables an efficient, programmatic approach to
data warehousing created to serve information to the enterprise. By leveraging an
integrated data warehousing approach you will realize efficiencies generated by
economies of scale.
Efficiency as it relates to DMC comes in three primary forms. First, there are true cost
efficiencies: lower hardware, software and personnel carrying costs for the
environment, and a shift of those costs to a more manageable expense stream.
Many in this study referred to these as “IT benefits” but lower Total Cost of Ownership
(TCO) and economies of scale are business benefits as well. With one data warehousing
program as opposed to many, fewer resources and processes need to be supported in an
enterprise.
Secondly, there are efficiencies associated with having a “single version of the truth”
to reference as opposed to engaging in internal “data warfare” or spending most of the
“analysis” time searching for data or “making do” with undesirable, outdated data.
As the interviews will attest, many companies were engaged in “data warfare,” but it’s
not simply a matter of whose data is better. In many organizations, the best data is not
accessible or the users are not trained on the access method. A central warehouse helps
set aside the politics of whose data is better by establishing a consistent, trustworthy
source of information. Creating a “single version of the truth” drives internal
efficiencies by focusing resources on the value-added activities of business rather than data
gathering activities.
Thirdly, there are system efficiencies to be gained by eliminating redundant processes.
For example, although many are using the file delivery capabilities of operational
systems to feed data to their data warehousing environment, getting data out of the
source is still one of the most difficult tasks in data warehousing. Usually the first
extract request is not met with “open arms.” A second or third one can be impossible.
This leads many to a “single extract, many load” architecture which solves some
problems but not others.
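
To make the “single extract, many load” idea concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not a description of any interviewed company's system: the function names, the two in-memory marts, and the sample rows are all hypothetical.

```python
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]


def extract_once(source: Iterable[Row]) -> List[Row]:
    """Read the operational source a single time and stage a snapshot."""
    return [dict(row) for row in source]


def load_many(staged: List[Row],
              loaders: Dict[str, Callable[[List[Row]], int]]) -> Dict[str, int]:
    """Feed the same staged snapshot to every target; return rows written per target."""
    return {name: loader(staged) for name, loader in loaders.items()}


# Two hypothetical marts, each keeping its own copy in its own shape.
sales_mart: List[Row] = []
finance_mart: List[Row] = []


def load_sales(rows: List[Row]) -> int:
    # The sales mart keeps customer-level detail.
    sales_mart.extend({"customer": r["customer"], "amount": r["amount"]} for r in rows)
    return len(rows)


def load_finance(rows: List[Row]) -> int:
    # The finance mart keeps only regional totals.
    totals: Dict[str, float] = {}
    for r in rows:
        totals[r["region"]] = totals.get(r["region"], 0) + r["amount"]
    finance_mart.extend({"region": k, "total": v} for k, v in sorted(totals.items()))
    return len(totals)


# Hypothetical source rows standing in for one operational extract.
source_rows = [
    {"customer": "A", "region": "east", "amount": 10},
    {"customer": "B", "region": "west", "amount": 5},
    {"customer": "C", "region": "east", "amount": 7},
]

staged = extract_once(source_rows)  # the source is read only once
counts = load_many(staged, {"sales": load_sales, "finance": load_finance})
print(counts)
```

In this sketch the source system is asked for data only once, which addresses the extract burden. Each mart still stores and reshapes its own copy, so the data remains duplicated and the copies can still drift apart. That is one plausible reading of why this architecture leaves other problems unsolved.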
Fortunately for those who have met the challenges, data warehousing has proved itself
time and time again as a valid conduit for delivering data and data analysis into business
processes and thereby improving them while helping the company achieve its stated
goals. DMC allows organizations to reap the benefits of integrated, centralized data
warehousing while delivering significant cost savings through internal efficiencies.
In essence, it is the grand slam of IT initiatives.
Desired Outcomes of Data Mart Consolidation
Data warehousing is a process, not a project, and a journey rather than a destination.
This applies to DMC as well. The case studies below represent several forms that
DMC can take including merging data marts into a new warehouse, picking an existing
warehouse/mart and merging other warehouse/marts into it, and moving analytical
functionality from other databases onto a data warehouse. The consolidation itself can
leverage existing designs and re-route Extract Transform & Load (ETL) processes into
the consolidated warehouse or consolidate designs as well as the platform.
This paper provides a framework of DMC reference points, lays out options for DMC
and provides best practices for those considering, planning or doing some form of DMC.
Approaches to Data Mart Consolidation
Approaches and steps to DMC as well as maturity levels with DMC emerged from
the interviews.
1. Rehosting – The process of picking up database designs and ETL “lock, stock
and barrel” and moving them to a different platform in an effort to gain either
performance or cost advantages. Often the rehosting will be done onto a platform
with existing data constructs, thereby expanding the utility of the platform.
2. Rearchitecting – The process of merging database designs and therefore the
data acquisition strategy for the data as well. Rearchitecting may involve picking
the best model components from various models and/or it may involve more
zero-based approaches, starting from scratch, that use requirements as the basis
for the new model.
Part 2: The Interviews
3M
The Pre-Consolidation Environment
3M is a multi-faceted company that had a data mart environment which represented its
diversity. Before the consolidation, they had 40 major data marts, several smaller ones,
and some previous failed attempts at a data warehouse in the environment. Previous
attempts at a more encompassing data warehouse had proved to be too constrained and
inflexible to make it very far so a data mart environment had perpetuated over the years.
The marts were solving numerous business objectives including decision support,
financial and sales reporting. There were 25 different platforms in place, “just about
everything” according to Al Messerli, the former Director of the Enterprise Information
Management Group at 3M and now with Allen Messerli Enterprise Systems, LLC. This
included many UNIX, some Windows NT and some mainframe systems.
All together, it was many terabytes and the environment had grown tremendously over
30 years so obviously it began well prior to market acceptance of data warehousing.
This was a firmly entrenched environment many years in the making and it was going
to be a challenge to consolidate it!
ETL was being done mostly through “pushes” from the operational environments with
data pickup and movement through proprietary methods. There were also all kinds of
data access tools and methods deployed in the pre-consolidated environment.
As a result, major subject areas of sales, product, and customer were duplicated across
these data marts and not in a consistent manner. As a matter of fact, the main reason for
the consolidation was the inconsistent results and inability to get a corporate-wide view
of customers, which was creating enormous business pain. Without this one face to the
customer, 3M was unable to get complete customer information due to the distributed
nature of the data.
Reasons for Consolidation
Another major reason for consolidation was a very large opportunity to reduce
environment carrying costs by eliminating data marts. 3M did a complete financial impact
analysis of the consolidation. ROI was expected to be $20M per year! Indirect expense reductions from
internal efficiencies were also projected to accrue. In addition, the consolidated warehouse
was expected to help meet market and customer penetration as well as sales growth goals.
The project was made very visible to the user community. “Everybody knew” according
to Messerli. The idea for consolidation was primarily from certain individuals in IT.
Cultural resistance was faced and a year-long sell cycle from the C-level throughout
the organization was required.
Ironically, most of the resistance was from others in IT. The business saw the benefits
more readily. Culture needed to be substantially changed to make it work for the
enterprise and this required lots of selling “from the top-down and bottom-up”.
This took the form of chalk talks, hands-on sessions, user groups and data trusteeship
(a form of data stewardship). With 40+ data marts, there was a huge need to provide
many mart users with a comfort level around a data warehouse environment and the
concept of data sharing.
Data security would actually be improved with the ability to apply a consistent security
policy at the data warehouse level and implement business unit specific security around
subject areas.
The Consolidation Project
Since it was impossible to pick one from among the 40 marts to use as the conduit for
the data warehouse, 3M built a brand new data warehouse from scratch to accomplish
its consolidation objectives. It was going to be totally comprehensive, with atomic level
detail on all business subject areas and constructs from the existing marts incorporated
over time so the mart platforms could be retired.
The ETL was completely redesigned. In building the new warehouse, 3M made sure the
new environment would include all the old functionality and then some. They did some
zero-based analysis around business needs for a warehouse and how to construct the
warehouse. It turned out that no pre-existing subject area in any mart was selected to
move verbatim into the new warehouse.
The extract load on the source systems was not materially affected by the DMC since
those systems had mostly been programmed to push ample data out previously and this
was not changed. Furthermore, they were previously extracting detailed data so that
was maintained.
3M normalized the new data model. The data warehouse team did extensive data
comparisons between the legacy marts and the new warehouse to demonstrate that the
data warehouse was correct (or if the numbers were different, that the data warehouse
was “better”).
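
The kind of comparison described above can be sketched in a few lines of Python. This is a hypothetical illustration only: the source does not say how 3M performed its comparisons, and the function names, the `product` and `sales` fields, the tolerance, and the sample rows are all assumptions.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def totals_by_key(rows: Iterable[dict], key: str, measure: str) -> Dict[str, float]:
    """Sum one measure per key value, e.g. sales per product."""
    totals: Dict[str, float] = {}
    for row in rows:
        totals[row[key]] = totals.get(row[key], 0.0) + row[measure]
    return totals


def reconcile(legacy_rows: Iterable[dict], warehouse_rows: Iterable[dict],
              key: str, measure: str,
              tolerance: float = 0.01) -> List[Tuple[str, Optional[float], Optional[float]]]:
    """Return (key, legacy_total, warehouse_total) for every key that disagrees.

    A key missing on one side is reported with None for that side.
    """
    legacy = totals_by_key(legacy_rows, key, measure)
    warehouse = totals_by_key(warehouse_rows, key, measure)
    mismatches = []
    for k in sorted(set(legacy) | set(warehouse)):
        old, new = legacy.get(k), warehouse.get(k)
        if old is None or new is None or abs(old - new) > tolerance:
            mismatches.append((k, old, new))
    return mismatches


# Hypothetical sample data: a summarized legacy mart and a detailed warehouse.
legacy_mart = [
    {"product": "tape", "sales": 100.0},
    {"product": "film", "sales": 50.0},
]
new_warehouse = [
    {"product": "tape", "sales": 60.0},
    {"product": "tape", "sales": 40.0},
    {"product": "film", "sales": 55.0},
    {"product": "abrasives", "sales": 20.0},
]

mismatches = reconcile(legacy_mart, new_warehouse, key="product", measure="sales")
for product, old, new in mismatches:
    print(product, old, new)
```

Each reported row is a starting point for investigation rather than a verdict. Someone still has to decide whether the warehouse figure is wrong or, as the text puts it, “better” than the legacy number.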
Each migration was a separate project and in total it took several years to get the
functionality of all 40 into the warehouse. All 40 marts are now gone. Data outages
were managed with parallel runs, causing only glitches in a very complex undertaking.
The team had top-down support after the year-long sell cycle for the effort and a “no
choice” budget allocation back to business units.
The Benefits Realized
The benefits indeed were “many and large” and exceeded investment by quite a few
times over. Benefits came in many business areas including procurement, finance, sales,
marketing, supply chain and e-business.
The Post-Consolidation Environment
The consolidation is now complete. 3M chose Teradata for the data warehouse platform.
Teradata was deemed to be the only solution that scaled to the eventual size and users
they would have in a consolidated environment comprising hundreds of source systems,
5,000 tables, and 20,000 daily users. Scalability was the major driver behind this decision.
They now have a consolidated and manageable set of data access tools and do ETL “one
way.” The data warehouse is now 15 TB of total disk space and has over 10,000 users.
The marts were eliminated. Many of their platforms were obsolete according to Messerli.
The environment continues to evolve with more business functions, subject areas, users,
and subsidiaries coming on board. The new warehouse environment has opened up the
data to channel partners and customers on a self-service basis.
Corporate mandates support the shared, centralized warehouse concept now and 100%
of ongoing data warehouse efforts go into the centralized, mission-critical data warehouse.
Top 3 keys to DMC success:
1. Getting complete buy-in from executives and throughout
the organization
2. Good data standardization and a good data model
3. Good user tools to help facilitate user buy-in
Delta Air Lines
The Pre-Consolidation Environment
Delta had three databases called data warehouses by their users. All three were on Teradata
and served Financial, Marketing and Flight data interests, respectively. There were only
50 users in total for all the warehouses.
The Financial warehouse was used for financial analysis. The 12 users primarily accessed
the 100 GB warehouse with a modern data access tool. The Flight data warehouse
supported revenue management – the effectiveness and profitability of flights. Its 12 users
accessed the 700 GB warehouse primarily through a data mining tool.
The largest of the warehouses was Marketing. It was used to look at frequent flyer
information in order to adjust and judge the effectiveness of marketing programs. The
500 GB were accessed with both a modern data access tool and a data mining tool.
None of the warehouses leveraged a packaged ETL tool.
Reasons for Consolidation
Ticket, flight and financial data were duplicated in the pre-consolidation environment
and they were materially inconsistent in their representation of this data. This approach
didn’t provide an accurate, consistent view of the same subject. This was not specifically
traced to negative ROI impact but there was a general feeling of dissatisfaction and data
disagreement within the user community.
There were separate staffs for each warehouse. A goal of consolidation was to bring
the warehouse under one group, which caused consternation. Typically IT groups
were functionally aligned and were the single points of contact for the business units.
Consolidating caused different groups (functional and warehouse) to be making contact
with the users and this had to be managed. Additionally, there was conflict over which
tool to use and when to use it. There was a desire to get to a standard tool set and
develop a training program to help the casual user.
The main reason for consolidation was not cost savings, but was to get to an “enterprise
view – a single source of the truth” according to Wayne Hyde, former IT Vice President
at Delta Air Lines and now with Reflection Technologies. This would eliminate
competition regarding whose data is best, which was previously left to IT to figure out. The
consolidated warehouse would help put people on a common goal instead of being
in competition.
Bottom line improvement had to be demonstrated by getting data in the hands of lots
of people besides the financial analysts. “If only 60 people have access, they will be
overworked. But get the data to hundreds of thousands of people who can engage the
data in an adhoc fashion at the time they are performing business processes, they can
exploit the data to perform better and impact costs, processes, fraud and recover
revenue” according to Hyde.
IT did the analysis of corporate pain points and decided on DMC. The stated goal of the
project was not the end-all data warehouse, but focused on consolidating the 3 existing
warehouses and “let the future chips fall where they may.”
Several “IT” benefits were also expected including saving machine cycles by loading
one copy of the data (vs. many), redeploying people to more productive value-adding
work as opposed to redundant work, and better leveraging machine capacity. For example,
during the DMC process, it was determined that different groups were trying to perform
the same analysis!
In order to get DMC going, Delta took an ROI view of inefficiencies, redundancies,
and software licenses. They did not establish quantifiable business ROI objectives for
the initial transition, but asked the business for the ROI when determining what priority
to train users for the new warehouse.
The Consolidation Project
Pleased with Teradata to-date, Delta stuck with Teradata for the consolidated warehouse.
The initial step was to consolidate platforms and copy the data warehouse designs for
Flight and Financial data onto Marketing’s platform.
Once standard tools were selected, the team used zero-based analysis of business
requirements to define data warehousing needs. The users overwhelmed the data
warehouse team with demand.
However, according to Hyde, “Replatforming reports is like trading cars but still using
the car for the same routes. It might be a nicer car but it does nothing for ROI – just
psychological benefits. You’ve got to provide some kind of incremental capability. One
is changing the dimension of timeliness. There are some benefits from data marts but
you still have different business units making decisions with different People need to
look at the negative impacts of data marts.”
Delta ended up with multiple development teams organized under a central data
warehouse team. They had a business specific team that did specific reports, adhoc
analysis and dashboard building. The platform consolidation took 18-24 months and
yielded 60% - 70% of the enterprise view, the rest of which would be added over time.
Interestingly, they did not do parallel runs with the older warehouses. They just cut
over after the platform movement and dealt with any issues. Extract loads on the source
systems actually increased over time since the new data warehouse identified needs over
and above those that the previous warehouses uncovered.
The Benefits Realized
There are numerous benefits cited for the consolidation but a good example is in
Revenue Management. Delta Air Lines was able to contest tens of millions of
promotional dollars that were claimed by travel agents. This analysis was made possible
through a consolidated environment with a common view of the data. However, the real
value was giving access to data to hundreds of thousands of people, not just a select few.
The Post-Consolidation Environment
The consolidation is complete and the two warehouses that were consolidated are
history. The DMC of the three warehouses also led to a total of 27 marts being
eliminated. Delta Air Lines is focused on its architected data warehouse now, which is 4 TB
usable data on Teradata and uses an ETL tool in places with an entirely different data
access tool than before.
Users were consolidated from the Finance and Flight data warehouses and the user base
has grown over time to 4,000 users.
Top 3 keys to DMC success:
1. Having a strategic vision of where you are going from an
enterprise view of the data
2. Having a delivery of new capabilities, not just the old.
Need NEW capabilities to establish new points of
memorable value to be tied to the effort.
3. Senior level understanding of the vision (sponsorship)
“Miss any one and you can be dead. If you have the
strategy without the sponsorship, you can get started but
not finish. If you have strategy without delivery, you’ll be
condemned” according to Hyde.
Michigan Department of Community Health (DCH)
The Pre-Consolidation Environment
Starting in 1994, DCH began storing Medicaid paid claims on their data warehouse,
which maintains 5 years’ worth of paid claims. They have 1.2 million Medicaid recipients
and the majority of claims are paid through managed care. In 1998, they also started
receiving encounter data and accumulated 66 million encounter data records to date,
which are records of interactions between members and care providers.
David McLaury is the Director for Project Development and Implementation. The
Department of Community Health represents the largest user of the data warehouse
environment in Michigan.
The State of Michigan operates an enterprise data warehouse, which multiple state
agencies utilize. It is a Teradata implementation. The department also operates a number
of Oracle operational databases that were being used for analytical work in addition to
operational needs. Users did not and could not have robust data access tools due to how
the tools would interfere with the system’s primary operational purpose.
Reasons for Consolidation
The main reason that these operational databases were consolidated into the data
warehouse was to provide better query and analytical capabilities. By consolidating
these databases onto the warehouse, they are now also able to move information onto
a new data mart, which uses a MedStat schema and is also run by the department.
The idea for the DMC came from business needs. McLaury chairs a departmental
committee that oversees the project and approved the DMC. The user community was
actively involved in the consolidation, including acquiring the necessary federal funding
to support the project.
One goal for the DMC was to create an integrated data warehouse environment that they
could manageably add onto over time and was available to department managers for all
kinds of programs, not just those known initially.
There was concern about losing control and especially about security. Data owners must
sign off on new users and these users must sign usage agreements. These programs
helped assure that owners still felt like owners and alleviated cultural resistance.
The Consolidation Project
DCH created new data flows from the databases to the data warehouse. The additional
data and emphasis on the data warehouse supported additional data cleansing activity.
Data requirements were re-gathered and analysis was done on the requirements to
understand what operational data was required for analytical purposes.
Some database redesign was necessary in the move but some legacy designs were good
enough, even for analytical purposes. The consolidation will take 2 years and is being
done by stepwise movement of the data from the operational databases into the data
warehouse.
The Benefits Realized
The benefits of DMC have been broad-based, especially in analytical areas. An example
is the ability to cross-compare Medicaid paid claims and encounter data to other
departmental data sets.
The users are still adjusting to having access to more data. While many still take the
approach of accessing the same data as before only in a different database, access to
multi-source data for program purposes will over time provide the biggest benefits of
the DMC. The more data is added, the more benefits will grow. This will include data
such as long-term care, nursing facilities, mental health services, substance abuse
services, and dental services over the next year.
The Post-Consolidation Environment
The Oracle operational databases were and are still available, but they are not nearly
as attractive for analytical purposes because the data warehouse is now available with
clean, integrated, and historical data modeled for access and analytics. Reporting is also
being moved to the data warehouse. The data warehouse (combined with the MedStat
data mart) is 500 GB with 270 users.
BULL is the state’s contracted entity for the Teradata warehouse. This decision was
originally made through competitive bids. As a scalable platform available to a variety
of leading tools, Teradata was kept in place for the added data the DMC brought into
the data warehouse.
Top 3 keys to DMC Success:
1. Leadership and agreement that you have to do DMC
2. Show the ROI for DMC before and after
3. Have sufficient funding for the effort
Healthcare Insurance Company
The Pre-Consolidation Environment
Before the merger of the two companies that formed this health care insurance company,
there was a mainframe data warehouse at one and a Teradata data warehouse at the
other. The Teradata data warehouse actually acted more like an Operational Data Store
(ODS) in that its data was immediately available to users after the data was generated
in the operational systems. After the merger, this Teradata data warehouse became the
feeder system for the mainframe data warehouse.
Eventually, both of the systems gave way to a new Teradata data warehouse – one
destined to be this company’s consolidated data warehouse. In addition, there is still
another data warehouse in the environment that is not yet part of the consolidation effort.
Prior to any consolidation, this company had three different ETL processes, three sets
of definitions, some of the same data in three places, some critical data missing from
the warehouses, and customer tracking being done in multiple data warehouses.
There was “extra everybody effort with extra cost on users – joining data from different
systems and learning different systems” according to the Director of the Data Warehouse.
Each data warehouse had different reconciliation masters (one to the general ledger, one
to invoices and one to cash). So the data did not easily reconcile and there was a cost
associated with bringing it all together.
In most cases, the atomic level detail was captured everywhere although the summaries
and some minor aspects were different between the warehouses. For example, the
Financial data warehouse has 90% currency-type fields so there are shorter records but
it is still detailed. There were also homonyms and synonyms if you looked across the
warehouse environment, which created confusion for the users who frequently had to
access data across different warehouses to accomplish a business objective.
Reasons for Consolidation
While direct carrying cost reduction was expected, this was not the most important or
the largest benefit. Although originally perceived as IT cost savings (because IT came
up with the project idea), DMC was positioned to provide business benefit. IT savings
alone would not have justified it.
The architectural goal of an enterprise-wide data warehouse was made very visible to
the user community. They had a business owner of the project and a steering committee.
Interestingly, there were more privacy issues when the data resided on 3 different data
warehouses than there were after the initial consolidation was completed!
The Consolidation Project
The DMC thus far has consisted of rehosting the (former) mainframe data warehouse to
Teradata for performance reasons and also feeding a separate schema from the ODS-like
data warehouse – two separate schemas for the two pre-merged organizations – but at
least sitting in the same Teradata instance. This allowed the pre-merger Teradata data
warehouse to focus solely on the organization’s needs for an ODS.
The parties involved recommended benchmarking to make sure the chosen DMC
environment would perform as advertised. Although they’d had Teradata for almost
10 years, they ran benchmarks prior to confirming its selection for the consolidated
environment. Teradata solved an immediate pain point by delivering a 5-fold performance
increase compared to the mainframe data warehouse.
Moving the existing ETL streams, access environments and database designs to the
consolidated platform was the first step of the DMC. Most of their data transformation
13. 10
Data Mar t Consolidation (DMC)
happens in the mainframe operational environment anyway, so the Extract and Transformation
stayed the same. Only the Load changed for the DMC. The number of extracts has been
reduced, however, based on the consolidation. To ensure integrity, a parallel run of about
three months for each pre-consolidated warehouse occurred.
The consolidation of the third data warehouse (previously mentioned as outside the
scope of consolidation thus far) and the redesign of the schemas remain to be accomplished.
So, while many of the challenges in the pre-consolidation environment have
been met by the DMC efforts to date, there is still much work to be done.
The Benefits Realized
The larger benefits for this DMC came from the business perspective, specifically more
timely data to make better decisions and turn around requests quickly by not having to
reconcile data and prove use of the “right” data. An example of this is profiling providers
and determining whether members are being treated appropriately. This was improved
upon by consolidating the data warehouse environment.
The Post-Consolidation Environment
The mainframe cycles were re-dedicated to OLTP-type work. The warehouse is providing
detailed data to support complex and diverse user queries in a manageable way. There
will be more to this DMC story since it is not complete. Stay tuned.
Top 3 keys to DMC Success:
1. High levels of business customer support – it’s not all IT
2. Know going into a DMC that you are fixing a business
problem
3. Benchmark to determine the best platform to use
Royal Bank of Canada (RBC) Financial Group
The Pre-Consolidation Environment
RBC Financial Group had a 2.5 TB data warehouse along with several predominant data
marts, some of which pre-dated the data warehouse. These marts ran on heterogeneous
database platforms. These data marts were loaded from a combination of source systems,
flat files and the enterprise data warehouse. There were numerous ways to load and
access data, different staffs for the different marts and the warehouse, and a variety
of vendor tools deployed to access the data.
Systems and Technology within RBC Financial Group conducted a health check on
the data warehouse environment. As a result, the decision was made to transform to a
hub and spoke environment, which would result in simplifying the ETL and processing,
as well as optimize resource utilization. According to Mohammad Rifaie, the Group
Manager of Information Resource Management at the RBC Financial Group, “Data
integration is absolutely critical to create a ‘Single Version of the Truth’ whereby all
business information/data is unified and shared across all functional departments. This
enterprise-wide view of our customer behavior along with operational data will allow
for analysis and insight that was not possible before. A consistent, single view of our
data should improve sales, reduce operational costs, increase customer retention and
satisfaction, and ultimately lead to maximized profitability.”
Reasons for Consolidation
Technological constraints imposed by existing multiple processing platforms made it
very difficult to share data. As a result, much data was replicated. This also resulted in
duplication in resources and processes, which led to a higher cost of ownership and a
greater potential for inconsistency. An impartial assessment of the data warehouse
environment by an analyst group advocated consolidation onto a Teradata platform if
RBC Financial Group was to reduce costs, improve the effectiveness of the environment
and realize their strategic objectives.
Besides prohibitively higher operating costs, different processing environments prevented
RBC Financial Group from leveraging all sources of information. The data
stored in independent data marts usually encompassed one or two subject areas (sales,
marketing, customer service) and failed to provide an integrated environment that
allowed the various pockets of information to be shared and leveraged across the
organization. RBC Financial Group had top-down support and strong executive
sponsorship for the effort. Both were cited as keys to success.
Although it was not stated that the new data warehouse would be the final architecture
for data warehousing, that’s how it worked out. RBC Financial Group now will only
have a physical mart for geographical purposes. “DMC is like having a rearview mirror
AND a front windshield” according to Rifaie.
The Consolidation Project
The first step was to port the data to the single platform, then “rationalize” the data,
removing duplicate data and unneeded ETL. They did not redesign initially – they
“forklifted” the existing designs. Then they redesigned and rearchitected. They’ve just
finished the redesign of the client subject area, which is the most widely used and are
now re-doing the ETL to load the new tables and removing the legacy constructs.
Arrangement, Product and other subject areas will be done this way as well.
RBC Financial Group chose an existing platform to consolidate onto. They had analyst help
in choosing the solution for their DMC and they chose Teradata due to multiple areas of
savings and benefits including high availability and reliability. They had 99.995% availability
in the 7 previous years with Teradata, which Rifaie says is “built for data warehousing and
they have compression and economical indexing. TCO is low for Teradata.” Cultural
challenges were overcome by keeping the focus on TCO and nothing else.
The technical re-porting took 4 months (with one more mart to go). The team had to
“steal machine cycles whenever they could – after midnight, weekends, etc.” to keep
from impacting user environments. There was no impact on the source systems for
DMC since the systems put out files for data mart/warehouse environment pick-up
(both before and after the DMC).
Parallel runs with the legacy marts and warehouse lasted 1 month after the queries were
converted to the new warehouse, during which time they were able to procure a written
sign off of every client in the data warehouse. The nodes, disk, and software that the
marts and warehouse resided on were then deployed elsewhere.
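A parallel run of this kind amounts to executing the same query on the legacy and consolidated platforms and confirming that the results agree before sign off. The following is a minimal Python sketch of that comparison, not RBC's actual procedure; the function name compare_parallel_run and the sample rows are hypothetical, and in practice the rows would come from the two database connections.

```python
from collections import Counter


def compare_parallel_run(legacy_rows, consolidated_rows):
    """Compare two query result sets as multisets of rows.

    Returns (missing, extra): rows the legacy system returned that the
    consolidated warehouse did not, and rows only the consolidated
    warehouse returned, each with its count. Row order is ignored.
    """
    legacy = Counter(map(tuple, legacy_rows))
    consolidated = Counter(map(tuple, consolidated_rows))
    missing = legacy - consolidated  # present in the legacy result only
    extra = consolidated - legacy    # present in the consolidated result only
    return dict(missing), dict(extra)


# Hypothetical results of one query run on both platforms.
legacy_result = [("C001", 1200.00), ("C002", 310.50), ("C003", 75.25)]
new_result = [("C002", 310.50), ("C001", 1200.00), ("C003", 75.20)]

missing, extra = compare_parallel_run(legacy_result, new_result)
print("missing from consolidated:", missing)
print("unexpected in consolidated:", extra)
```

Two empty dictionaries mean the result sets match; any entry is a discrepancy to resolve before the legacy platform is retired.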
The Benefits Realized
“Data Warehousing is about repenting for the sins of the past” according to Rifaie. “The
data warehouse is corporate memory. Redundant data is difficult to control. In a data
mart, the primary key might be a numeric identifier column but it might be different
in another mart where it might be dual-columns. It will be problematic to join the data
from these two.”
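The key mismatch Rifaie describes can be illustrated with a small Python sketch. It assumes a crosswalk table that maps one mart's two-column key to the other mart's numeric identifier; the mart contents, column meanings and the join_marts function are hypothetical and only show why such a join needs explicit reconciliation.

```python
# Mart A keys clients by a single numeric identifier (hypothetical data).
mart_a = {
    1001: {"name": "A. Client"},
    1002: {"name": "B. Client"},
}

# Mart B keys the same clients by a two-column key, here (branch, account).
mart_b = {
    ("0042", "77810"): {"balance": 2500.0},
    ("0042", "77811"): {"balance": 90.0},
    ("0107", "10003"): {"balance": 15.0},
}

# Assumed crosswalk from Mart B's composite key to Mart A's identifier.
crosswalk = {
    ("0042", "77810"): 1001,
    ("0042", "77811"): 1002,
}


def join_marts(mart_a, mart_b, crosswalk):
    """Join the two marts through the crosswalk.

    Returns (joined, unmatched): merged rows keyed by Mart A's identifier,
    and the Mart B keys that could not be resolved.
    """
    joined, unmatched = {}, []
    for composite_key, b_row in mart_b.items():
        client_id = crosswalk.get(composite_key)
        if client_id is None or client_id not in mart_a:
            unmatched.append(composite_key)
            continue
        joined[client_id] = {**mart_a[client_id], **b_row}
    return joined, unmatched


joined, unmatched = join_marts(mart_a, mart_b, crosswalk)
print(joined)
print("unresolved keys:", unmatched)
```

Without the crosswalk there is no reliable way to relate the two keys, which is the control problem that redundant data creates.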
For example, once the Business and Personal Marketing data marts are on a single
Teradata platform with the EDW, there will be additional revenue and cost-avoidance
opportunities. This will be followed by a subsequent data rationalization project to eliminate
unnecessary data and process duplication between the EDW and the data marts.
DMC also positions RBC Financial Group to handle new data and business requirements
more effectively. These include business centricity, effortless scalability, high
user concurrency, ease of access, complex and ad hoc query performance, data centralization,
fast fail-safe data load utilities, capability to handle multiple subject areas, open
access, integrated metadata, generic modelling, data-source neutrality, and software
addressing all critical components of the architecture.
By consolidating data marts and the enterprise data warehouse onto the same platform
RBC Financial Group has been able to improve overall profitability by:
• Lowering the total cost to own, operate, and expand the data warehouse
environment
• Reducing the requirement for scarce and expensive skill sets
• Enabling data integration across functional areas
• Improving efficiency in making data available to meet changing business
requirements
• Providing an enterprise-wide “single version of the truth” spanning from customer
information to actionable data
• Facilitating easier implementation of Data Governance and Privacy and
Confidentiality
• Shortening the supply chain for data access so they can see a client’s complete
relationship to the bank in one place, which has helped improve client relationships
The Post-Consolidation Environment
There is one ETL tool with one way to do ETL now. The data warehouse is 3 TB and
supports 2,500 – 3,000 users.
Top 3 keys to DMC Success:
1. Base a business case on real savings. Architecture
doesn’t sell. Avoid technical terms. Build the case on
savings of FTEs, operations and strict TCO.
2. Make sure to obtain support of business partners at the
highest level.
3. Make sure to communicate with users about schedules
and changes. Do hand-holding. You may need to change
their queries for them. Get sign off. Have communication
surveys and parallel runs.
Major Telecommunications Company
The Pre-Consolidation Environment
The pre-consolidation environment had over 70 “reporting systems” which served
business unit-specific purposes. There was nothing that, from a 10,000-foot perspective,
resembled a data warehouse. None of the marts had enough of a footprint to be considered
“major” from the big picture perspective.
Each mart had a “handful” of users and people used what they informally learned had
the data they needed and that they could get access to. Their choices were not always
best for their needs, but without an organized approach, this was the environment.
The reasons for the marts were as vast and numerous as the platforms. The environment
had grown over “centuries” and it was difficult without central management to tell how
vast and large the environment really was, although it could easily be surmised to be
multiple terabytes. Only when an inventory was done for evaluating DMC opportunities
did this company realize the extent of the problems.
ETL was hand-coded since individual projects could not justify the purchase of a
tool. Data access tools and methods were numerous and not very robust. Additionally,
support staffs were numerous and not dedicated since they resided in business areas.
Major subject areas were duplicated across the pre-consolidation environment but more
importantly business functions were also duplicated. Not only were they duplicated, as
you would imagine with so many marts, they were inconsistently duplicated. The main
problems were data duplication and confusion as opposed to inconsistent representation
problems.
Reasons for Consolidation
The idea for DMC actually came from the business side. This project had to focus on
direct expense reduction as the main key to success. This meant achieving the goal of
reducing technical support and maintenance requirements. Top-down support for the
anticipated savings helped them deal with cultural resistance, which can be the age-old
conflict between centralization and decentralization. Interestingly, privacy was no more
an issue in the new environment than it was in the old environment.
The Consolidation Project
They built a new data warehouse on Teradata to absorb all the data marts. They rerouted
the existing streams but added others that were necessary for the consolidation. They’ve
done “a little redesign as we go and we’ll see when we’re done if more is necessary”
according to the leader of the effort.
The ETLs and database design were changed, but like several other DMCs, they relied
on source system file outputs so the extract load on the source systems was not affected
much. For a few of the mart consolidations, parallel runs of 1 month were done.
The complete consolidation of all the marts, which is still occurring, will take approximately
three years total. Phase 1 delivered 22 of the data marts and took two years, which
included scoping, planning and financial analysis. They have picked up momentum and
experience and anticipate finishing the remaining 48+ in the next year.
The Benefits Realized
The users have acknowledged that the new tools are better and there are many benefits,
especially in the financial reporting environment. They get financial insight they never
had before, especially into their quote-to-cash cycle. They can now make more immediate
decisions with consistent data usage, less data latency and less redundancy. Overall,
it’s a more efficient business operation.
The Post-Consolidation Environment
The Teradata warehouse and standardized ETL and data access tools dominate the new
environment, which was consolidated around this one set of tools. There are 2.3 TB of
usable disk, which will be doubled by project’s end. There are 5,000 users and the data-
using community has greatly expanded with this project.
Teradata was chosen for the DMC since it was already being used for some internal
applications and they were not sure how well alternative products would hold up.
Scalability was “Very important. With Teradata you know you can easily keep adding
nodes, but with SMP, you can add CPUs but you get to diminishing returns. They start
battling among themselves if you get too many of them”.
Normal growth has occurred in the warehouse since it went into production, but they
are more focused on bringing the other marts in, not advancing what they’ve already
brought in. While some of the marts still exist, they serve no purpose. All will be gone
soon. The success of this DMC is assured.
Top 3 keys to DMC Success:
1. Executive buy-in
2. Proper data management and data modeling techniques
3. A team that is knowledgeable with the chosen toolsets
Anthem Blue Cross Blue Shield
The Pre-Consolidation Environment
Anthem’s consolidation was a result of the merger of Blue Cross Blue Shield companies
in Indiana, Kentucky, Ohio, and Connecticut in the period 1993-97. There were three
incompatible data warehouses that needed to be brought together to provide a consolidated
view of the business.
Each Blue Cross Blue Shield plan used their data warehouse for pricing, understanding
treatments, some fraud and abuse, group reporting, utilization, underwriting, provider
contracting and affairs management, Exposure, mandatory government reporting, and
experience analysis. They were initially implemented in the several years prior to the
merger, some as far back as 1990.
Two were mainframe warehouses and one was on Teradata. Each was hundreds of
gigabytes in size. “In a way we were fortunate that each state had a data warehouse.
Everyone was used to using a data warehouse and there being a data warehouse around,
but also each state had its own representations of data, its own technologies and its own
way of doing things” according to the author.
ETL was done with COBOL code and there were numerous data access tools and
methods. Many of these methods were of the heavy lifting, programming variety such
as Visual Basic, Microsoft Access, Q&E, Powerbuilder, and CLIST applications.
Major subject areas were duplicated in the environment since each data warehouse
was developed with not only different staffs, but different staffs in different companies.
The models were vastly different.
For example, the representation of a customer was by policy in one, customer ID in
another and something else in the other. This inconsistent representation worked for
each independent company but when the companies got together, this presented
problems.
Reasons for Consolidation
DMC was sold mainly on the idea of integrating data to get cross-company views from
which Anthem would have much richer data for doing functions like fraud detection and
claims re-routing to best-of-breed providers. The idea came from a combination of IT,
the business, and Teradata. Each state was vested in their representations and the use
of their data. Each was a $1B+ organization so this was not a small effort.
The project was made very visible to the user community. The chief actuary of the
consolidated company, who represented much of the usage, was also the executive
sponsor.
Still, there was cultural resistance that was appeased by keeping their data warehouses
alive while the new data warehouse was built. “There are always privacy issues when
dealing with healthcare data. Some state specific data is not accessible by all – only by
personnel in that state. Even though the companies were merged, it would take quite
some time to merge all business processes” per the author.
The Consolidation Project
The goal of the DMC was to establish one data warehouse that would, by default,
receive data from all data warehouses in Blue Cross Blue Shield plans that were being
merged with Anthem.
Ohio’s was the most recently built data warehouse. It was built on Teradata and the chief
sponsor of the project was Ohio’s chief actuary in the pre-consolidation environment.
For this reason, and the good experience to date with Teradata, it was selected as the
platform for the ADW (Anthem data warehouse). Anthem began bringing data in from
the other data warehouses. New streams were created for the Indiana and Kentucky plan
data and each subject area went through the process of data element comparison, logical
modeling, database design, code value comparison, data transformation, and implementation
for the design.
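As an illustration of the code value comparison step, the Python sketch below checks the codes observed in each source against a mapping to a shared standard and reports any code that has no agreed equivalent. The element, the code values and the function unmapped_codes are hypothetical examples chosen for illustration and are not Anthem's actual codes or tooling.

```python
# Assumed mappings from each source's local codes to one standard value set.
mappings = {
    "indiana": {"M": "MALE", "F": "FEMALE", "U": "UNKNOWN"},
    "kentucky": {"1": "MALE", "2": "FEMALE"},
}

# Hypothetical code values actually observed in each source's data.
observed = {
    "indiana": ["M", "F", "F", "U"],
    "kentucky": ["1", "2", "9", "1"],
}


def unmapped_codes(observed, mappings):
    """Return, per source, the observed codes with no standard equivalent."""
    return {
        source: sorted(set(values) - set(mappings[source]))
        for source, values in observed.items()
    }


def transform(source, value, mappings):
    """Translate one local code to the standard value for the merged model."""
    return mappings[source][value]


gaps = unmapped_codes(observed, mappings)
print(gaps)
print(transform("kentucky", "2", mappings))
```

Any code reported as unmapped needs a decision from the data stewards before the transformation for that subject area is implemented.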
The database design, ETL processes, and access environments were changed. Each
subject area was redesigned to represent the single version of the truth. Anthem wanted
one scalable data model for absorption of new data sources or new Anthem, Inc.
acquisitions. “Being able to access, review, analyze and share data across the company
made all the difference between success and failure” according to the author.
The consolidation took about one year although this included other normal development
and creation of value-added functionality. Top-down support combined with parallel
runs to make it smooth helped overcome cultural resistance.
The Benefits Realized
There were many benefits of the consolidated data warehouse. Some were related to
the fact that there was consolidation of the prior data warehouses and some are related
to the ongoing developments on the data warehouse.
The DMC helped Anthem win new business because of the flexibility and reporting
capabilities generating income. The cost of care was lowered by $250M annually by
using the ADW to identify patterns in the data that allowed Anthem to build better
networks and craft the network reimbursement arrangements in different ways. The
ADW was instrumental in reducing the cost of products for policyholders and members
(i.e., pay VALID bills ONCE).
The ADW is used to ensure practitioners are licensed to perform and Anthem was able
to craft lower costs from providers by dealing with them based on their profitability
as determined by the data warehouse. Anthem was also able to reduce the Caesarian
section rates and improve the results from coronary bypass surgeries and improve staff
productivity. “I don’t think much of this could have been accomplished without a single
version of the truth” according to the author.
The Post-Consolidation Environment
The initial consolidation of the 3 states into 1 data warehouse is complete.
The multi-terabyte Teradata warehouse has hundreds of users. The former warehouses
were still in place well after the DMC given their new role of feeding the ADW. Other
Blue Cross Blue Shield plans still need to be brought in so this is a work-in-progress.
Teradata was chosen because of Ohio Blue Cross Blue Shield’s good experience with
Teradata and its known scalability. If the chosen solution was unable to handle the large
workload, the shared concept would have died and Anthem would have stayed with
separate data warehouses which means they wouldn’t have gained half of what they did
with the ADW – and would have wasted millions!
Top 3 Keys to DMC Success:
1. Strong, active executive sponsorship keeping the project
out of internal politics
2. Source the data warehouse from operational systems,
not existing data warehouse/data marts
3. Create a program with standards and processes
Sekisui Systems Corp.
The Pre-Consolidation Environment
Sekisui had seven data marts distributed to branch offices fed from a central data
warehouse. These supported a variety of business functions such as increasing the
frequency of effective customer calls by saving time to create meeting materials, by
automating the sales cycle, and by providing information directly to selected customers.
The environment had grown over time in both data size and number of users. Before the DMC,
the marts had 110 GB of total disk space with 66 GB used. There were 400 users.
Despite the number of marts, they managed to keep consistency among the DBMS,
the ETL, and data access for all of them. They also had only 1 DBA for all seven marts,
but there were still efficiencies to be gained from DMC.
Reasons for Consolidation
One anticipated benefit was cost reduction by consolidating the machines from seven
branch offices. It was time to replace some of these anyway due to obsolescence, further
opening the door to DMC.
Another benefit was to unify the system operation and further standardize the operating
skill of the enterprise to the platform they could grow with. “Since the Teradata warehouse
was already constructed, we wanted to standardize the operating skill on Teradata by
consolidating the data marts to Teradata” according to Masaaki Kondo, Director of the
Corporate Group Systems Division at Sekisui.
The ROI was estimated by comparing the costs of continuing to license the branch
office machines to the cost of a consolidated approach. The System Operating
Department Manager (IT) came up with the idea for DMC at Sekisui. There wasn’t
any cultural resistance.
The Consolidation Project
Sekisui consolidated onto the existing Teradata data warehouse by redesigning the entire
system. Extract loads on the operational systems were reduced with the DMC. The
project took 6 months, just as expected.
The Benefits Realized
The project is complete and the data marts for the sales database systems are consolidated.
The planned benefit, direct expense reduction, was achieved. Support costs were
reduced and all access is now against the data warehouse.
The Post-Consolidation Environment
Scalability was crucial for data expansion. Concurrency was also immensely important
since usage concentrated around 9 a.m. system-wide.
If the DBMS were unable to handle the workload, Sekisui would be isolated from
information on member daily sales activities and division managers’ sales results. All
organizations in Sekisui group using the system would be affected.
Top 3 keys to DMC Success:
1. Create an organized data warehouse (not data marts)
which is best suited to your goals
2. Educate the end users on the project and secure their
agreement
3. Unify codes and subjects in a consolidated environment
Part 3: Best Practices for DMC
Key Findings from the Interviews:
1. The number of marts/warehouses consolidated ranged
from 3 to 70 with a median of 7.5.
2. The majority of environments had duplicate and inconsistent
data across the pre-consolidation environment.
3. The primary reason for DMC varied, with very strong opinions
for the reasons cited! Five quoted business rationale
such as creating a consolidated view of customers as the
main reason while three quoted IT cost reductions as the
main reason.
4. All performed at least some manner of rearchitecting
although several made this a later stage step that came
after rehosting.
5. Except for the case where the consolidated databases
had operational functions to perform in the environment
as well, only one kept the consolidated marts/warehouses
in the environment after the DMC. The old platforms were
redeployed to other uses or, in most cases, eliminated.
6. Every DMC was made very visible to the user community.
These projects required a great deal of support which
most received from the highest levels of the organization.
It was not possible to accomplish DMC objectives in a
skunkworks manner.
7. Very few user data access outages were reported.
Most DMC programs took great caution to transition
users smoothly to the new environment.
8. 5 programs credited IT with the idea for the DMC. The
other 3 cited the business with the initiative.
9. All said scalability was important to the data warehousing
decision. Many referenced the sudden increase
in data and users that the warehouse would be taking
on after the DMC as putting scalability on the top of
the criteria list.
10. Almost every DMC faced some degree of cultural resistance
to the idea of consolidating and centralizing. Most
of this was adeptly dealt with through attaining top-down
support and cultivating user interests throughout the
project. The majority of resistance went away as soon
as early benefits of the DMC were realized.
11. DMC efforts caused little change in the impact on
operational systems.
DMC can be used to put in place a scalable, integrated, multi-application data warehouse
that absorbs all analytical-type activity in an organization or it can be used to
“simply” get an antiquated system out of the environment by moving its function to
a system still under support from its vendor.
Regardless of the ambition, many DMC efforts eventually lead to the first goal. The act
of initiating the consolidation idea within an organization seems to spawn more and
more consolidation.
For those organizations that are considering DMC and will have opportunity to plan its
success, some best practices emerged from the interviews as well as anecdotal evidence.
The keys are also applicable to newer data warehouse efforts or those being revamped to
a centralized data warehouse environment.
Customer-Reported Keys to DMC Success
1. Get top down support. This was cited as the #1 key to success in 5 of the cases
and was a top 3 key in all but one case.
2. Fix a problem. Whether you justify on cost savings or a business benefit (or
both), the DMC should fix a major, known problem that can be quantified in
business terms.
3. Have data standards and a sound data model.
4. Pick the right tools and platform. Put DMC on a scalable platform. Your data
volume managed within a single database will explode immediately with
DMC. Future efforts will continue to grow the environment. Note also that
many took the opportunity of changing platforms to change their
data access and ETL tools as well.
5. Set expectations and communicate with users. There is no such thing as
overcommunication in a DMC project. This is about the users and care needs to be
taken to migrate the users without any disruption in their ability to access data.
Author’s Additional Keys to DMC Success
1. Don’t just rehost, rearchitect. This time of transition is also an opportunity to
reevaluate the data warehouse program according to established best practices –
a time to evaluate what is and isn’t working and fully take advantage of the new
platform and the migration process.
2. Starve the pre-consolidated marts of attention and resources. Negotiate the
condition for user signoff prior to DMC. Make sure all utility is removed from
the marts.
3. Justify on either platform cost savings, business benefits or both. The larger the
project, the more DMC is a difficult technical challenge and the platform cost
savings more evident. It is always easiest to justify on cost savings but business
benefit based on delivering new capabilities can be significant.
4. Expect and plan for cultural resistance. Ownership, as a concept in the former
environment, may now be designated at a subject area level as opposed to a data
mart level. Carry forward security and stewardship designations and responsibilities
to the consolidated data warehouse. This may even be a time to improve
these programs.
5. Consolidate ETL and access tools too. Re-gathering requirements for a DMC is an
opportunity to ensure that tools are still compatible with the new platform and are
the most fit-for-purpose.
About the Author
William McKnight is founder and president of McKnight Consulting Group, a consulting
firm specializing in data warehousing solutions. William is an internationally recognized
expert in data warehousing and MDM with more than 15 years of experience architecting
and managing information and technology services for G2000 organizations.
William is a frequent and highly rated speaker at major worldwide conferences and
private events, providing instruction on customer intimacy, return-on-investment,
architecture, business integration, and other business intelligence strategic and architecture
issues. He is a well-published author and a columnist in Information Management
for the column “Information Management Leadership”.
A regularly featured expert on data warehouse/business intelligence and MDM at major
conferences, William is widely quoted on data warehousing and has been featured on
several prominent expert panels. An expert witness, skills evaluation author and a judge
for best practices competitions, William is the former executive of a recognized best
practices information management program.
5960 West Parker Road
Suite 278, #133
Plano, TX 75093
(214) 514-1444
www.mcknightcg.com