The Briefing Room with Barry Devlin and WhereScape
Live Webcast on June 10, 2014
Watch the archive:
https://bloorgroup.webex.com/bloorgroup/lsr.php?RCID=5230c31ab287778c73b56002bc2c51a
The data warehouse is intended to support analysis by making the right data available to the right people in a timely fashion. But conditions change all the time, and when data doesn’t keep up with the business, analysts quickly turn to workarounds. This leads to ungoverned and largely unmanaged side projects, which trade short-term wins for long-term trouble. One way to keep everyone happy is to create an integrated environment that pulls data from all sources and can automate both model development and the delivery of analyst-ready data.
Register for this episode of The Briefing Room to hear data warehousing pioneer and analyst Barry Devlin explain the critical components of a successful data warehouse environment, and how traditional approaches must be augmented to keep up with the times. He’ll be briefed by WhereScape CEO Michael Whitehead, who will showcase his company’s data warehousing automation solutions. Whitehead will discuss how a fast, well-managed and automated infrastructure is the key to empowering faster, smarter, repeatable decision making.
Visit InsideAnalysis.com for more information.
3. Twitter Tag: #briefr
The Briefing Room
Welcome
Host:
Eric Kavanagh
eric.kavanagh@bloorgroup.com
@eric_kavanagh
4. Mission
• Reveal the essential characteristics of enterprise software, good and bad
• Provide a forum for detailed analysis of today’s innovative technologies
• Give vendors a chance to explain their product to savvy analysts
• Allow audience members to pose serious questions... and get answers!
5. Topics
This Month: ANALYTICS & MACHINE LEARNING
July: INNOVATIVE TECHNOLOGY
August: BIG DATA ECOSYSTEM
2014 Editorial Calendar at
www.insideanalysis.com/webcasts/the-briefing-room
7. Analyst: Barry Devlin
Dr. Barry Devlin is among the foremost authorities on business
insight and one of the founders of data warehousing, having
published the first architectural paper on the topic in 1988.
With over 30 years of IT experience, he is a widely respected
analyst, consultant, lecturer and author. His 2013 book,
“Business unIntelligence—Insight and Innovation beyond
Analytics and Big Data,” is available as hardcopy and e-book.
Barry is founder and principal of 9sight Consulting. He
specializes in the human, organizational and IT implications of
deep business insight solutions that combine operational,
informational and collaborative environments. A regular
contributor to BeyeNETWORK and TDWI, Barry is based in Cape
Town, South Africa and operates worldwide.
8. WhereScape
• WhereScape is a data warehousing software company
• It offers WhereScape 3D, software for planning and reality-testing data warehousing and business intelligence projects; and WhereScape RED, an integrated development environment used for building, deploying and managing data warehouses and data marts.
• WhereScape RED allows developers to automate the data warehousing life cycle
9. Guest: Michael Whitehead
A data warehousing industry veteran,
Michael Whitehead has spent more
than a decade designing and building
commercial data warehouses for
customers in a wide variety of
industries. Prior to founding
WhereScape, Michael had Asia Pacific
responsibilities for data warehousing
for Sequent Computer Systems, Inc.
11. Why were sales down this week versus last year?
Grocery Store with Class, Walter Watzpatzkowski, 15/1/09
12. We promoted ice cream but the weather was unreasonably cold
Grocery Store with Class, Walter Watzpatzkowski, 15/1/09
13. Our competitor ran a better promotion
Grocery Store with Class, Walter Watzpatzkowski, 15/1/09
14. 1990s - Decision support system
(For the time) large amounts of data, stored in various inscrutable file formats and database management systems.
Want actionable information?
Write a program.
One program per analytical problem….
Reporting bureaus
This model’s dysfunctions created the need for data warehousing…
15. 2000s - Enterprise data warehousing
Separate the refinement of raw data – regardless of the source – from the delivery of subsets of that data, to various decision-making constituencies.
Build a solid, scalable information delivery infrastructure for the corporation.
Support variability, and change, at both ends.
Apply appropriate governance, risk management, compliance mechanisms.
[And stabilize the supply side of the market, in the process…]
A design pattern for stable, operationalized information refining and delivery
16. The economic conditions led to a change in demographics of the people walking past my store
Grocery Store with Class, Walter Watzpatzkowski, 15/1/09
17. 2014 - big data technologies
Large amounts of data, stored in various inscrutable file formats and database management systems.
Want actionable information?
Write a program.
One program per analytical problem….
Oh, and batch-oriented.
And integrate-it-yourself.
Instead of JCL, Pig.
Instead of CICS and Comshare, Cloudera.
In what way is this model a leap forward?
19. People built data warehouses that don’t support analytics
Grocery Store with Class, Walter Watzpatzkowski, 15/1/09
20. 2014 – “self service” technologies
Large amounts of data, stored in various inscrutable file formats AND data warehouses.
Want actionable information?
Create a dataset.
One dataset per analytical problem….
The newer tech is great.
Is the way it is used a leap forward?
21. Automation is key for better support of analytics
Smith Cannery: Extension and Experiment Station Communications Photograph Collection (p120)
22. STEPS
1. Identify attributes
2. Identify business key
3. Index business key and add a unique constraint
4. Create surrogate key with auto sequence generation
5. Index surrogate key
6. Insert zero surrogate key row with values set for each attribute
7. Add a modified timestamp column
8. Write the SQL code to insert new business keys or update existing business key rows. Maintain the modified timestamp
9. Create any other indexes required for querying
10. Decide best practice for index maintenance during load. Keep in situ or drop and recreate after load.
11. Document procedure
Etc., etc.
23. Really?
1. Identify attributes
2. Identify business key
3. Index business key and add a unique constraint
4. Create surrogate key with auto sequence generation
5. Index surrogate key
6. Insert zero surrogate key row with values set for each attribute
7. Add a modified timestamp column
8. Write the SQL code to insert new business keys or update existing business key rows. Maintain the modified timestamp
9. Create any other indexes required for querying
10. Decide best practice for index maintenance during load. Keep in situ or drop and recreate after load.
11. Document procedure
Etc., etc.
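To make the hand-built steps above concrete, here is a minimal sketch of what they amount to for one small dimension table. It is not taken from the webcast: it assumes SQLite through Python's standard sqlite3 module, and the table dim_customer, its columns, and the helper functions are hypothetical names chosen for illustration.

```python
import sqlite3

# Hypothetical customer dimension; every name here is illustrative only.
DDL = """
CREATE TABLE dim_customer (
    customer_key  INTEGER PRIMARY KEY AUTOINCREMENT,  -- steps 4-5: surrogate key, indexed as primary key
    customer_code TEXT NOT NULL,                      -- step 2: business key
    customer_name TEXT NOT NULL,                      -- step 1: attribute
    region        TEXT NOT NULL,                      -- step 1: attribute
    modified_at   TEXT NOT NULL                       -- step 7: modified timestamp
);
-- step 3: index the business key and enforce uniqueness
CREATE UNIQUE INDEX ux_dim_customer_code ON dim_customer (customer_code);
-- step 6: zero surrogate key row with a value set for each attribute
INSERT INTO dim_customer VALUES (0, 'N/A', 'Unknown', 'Unknown', CURRENT_TIMESTAMP);
"""


def build_dimension(conn):
    """Create the table, its business key index and the zero key row."""
    conn.executescript(DDL)


def upsert_customer(conn, code, name, region):
    """Step 8: update an existing business key row, or insert a new one."""
    cur = conn.execute(
        "UPDATE dim_customer "
        "SET customer_name = ?, region = ?, modified_at = CURRENT_TIMESTAMP "
        "WHERE customer_code = ?",
        (name, region, code),
    )
    if cur.rowcount == 0:  # no row matched, so this business key is new
        conn.execute(
            "INSERT INTO dim_customer "
            "(customer_code, customer_name, region, modified_at) "
            "VALUES (?, ?, ?, CURRENT_TIMESTAMP)",
            (code, name, region),
        )


conn = sqlite3.connect(":memory:")
build_dimension(conn)
upsert_customer(conn, "C001", "Acme", "North")
upsert_customer(conn, "C001", "Acme Ltd", "North")  # same business key: update
upsert_customer(conn, "C002", "Globex", "South")    # new business key: insert
rows = conn.execute(
    "SELECT customer_key, customer_code, customer_name "
    "FROM dim_customer ORDER BY customer_key"
).fetchall()
```

Even in this toy form, most of the code is boilerplate that would be repeated, with small variations, for every dimension in the warehouse.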
24. What can be automated?
• Profiling
• Model conversion
• Object creation
• Code generation
• Indexing
• Impact analysis
• Documentation
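As a rough illustration of the "Object creation" and "Code generation" items above, the sketch below derives the DDL for a dimension table from a small piece of metadata instead of writing it by hand. It is an assumption-laden toy, not WhereScape's implementation: the function generate_dimension_ddl, its parameters, and the dim_product example are hypothetical, and SQLite is used only so the output can be executed.

```python
import sqlite3


def generate_dimension_ddl(table, business_key, attributes):
    """Return DDL statements for a simple dimension table.

    table        -- hypothetical table name, e.g. "dim_product"
    business_key -- name of the business key column
    attributes   -- list of (column_name, sql_type) pairs
    """
    columns = [
        f"{table}_key INTEGER PRIMARY KEY AUTOINCREMENT",  # surrogate key
        f"{business_key} TEXT NOT NULL",                   # business key
    ]
    columns += [f"{name} {sql_type}" for name, sql_type in attributes]
    columns.append("modified_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP")
    column_sql = ",\n    ".join(columns)
    return [
        f"CREATE TABLE {table} (\n    {column_sql}\n)",
        f"CREATE UNIQUE INDEX ux_{table}_{business_key} "
        f"ON {table} ({business_key})",
    ]


# Generate the statements from metadata and run them against SQLite.
statements = generate_dimension_ddl(
    "dim_product",
    "product_code",
    [("product_name", "TEXT"), ("category", "TEXT")],
)
conn = sqlite3.connect(":memory:")
for statement in statements:
    conn.execute(statement)
column_names = [row[1] for row in conn.execute("PRAGMA table_info(dim_product)")]
```

Once the table definition lives in metadata, the same description could in principle also drive indexing, impact analysis and documentation, which is the point the slide is making.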
26. The new data warehouse
Five Key Changes
Pooling – new types of data, staged differently than we’ve staged pampered data, in the past.
A multi-engine “logical” data warehouse: NoSQL → Not Only SQL
Support for discovery, prototyping and evaluation of analytics
Support for continuing data integration, through to the “end use” tier
Automation of the data warehousing platform’s core functionality
Back to best-of-breed, customer-specific integration models
27. Conclusion
Let’s not stuff it up (again)
• Data people – challenge ourselves to do more, faster
• Analysts – don’t give up on the data people
28. Perceptions & Questions
Analyst: Barry Devlin