What Is In Your Data Warehouse?
© Copyright 2022 by Peter Aiken Slide # 1
peter.aiken@anythingawesome.com +1.804.382.5957 Peter Aiken, PhD
https://anythingawesome.com
Program
Of What Do Your
Data
Warehousing
Operations
Consist?
• Definitions
– Data Warehousing
– DM BoK guidance
– Legacy to digital conversion (cloud)
• Integration
– Emphasize engineering talent
– Incorporate leading not bleeding edge
– Requires an adaptive rather than a prescriptive approach
• Preparation
– Emphasis on storytelling first and visualization second
– Analytics is both ubiquitous and not well understood
– Keep improvements practically focused by strategy
• Best Practices
– Cannot use what is not understood-understand what you have
– PDCA (plan, do, check, act)
– Iteratively refine/cull with respect to strategic direction
• Take Aways/References/Q&A
Process
Data Warehousing IPO Diagram
Inputs Outputs
Warehoused
Data
Marts
Diffuse Applications
ETL-Extract Transform Load
A major source of data structure/
transformation knowledge
Metadata
Management
Data Management Body of Knowledge (DM BoK V2)
from The DAMA Guide to the Data Management Body of Knowledge 2E © 2017 by DAMA International
11 Practice Areas
Data Warehousing & Business Intelligence Management
Defining Data Warehousing, BI/Analytics
• Data Warehousing
– A technology solution supporting … business capabilities
such as: query, analysis, reporting and development
of these capabilities
– Analysis of information not previously integrated
– Another, often new, set of organizational capabilities
• Business Intelligence (aka. decision support)
– Dates at least to 1958
– Support better business decision making
– Technologies, applications and practices for the collection, integration, analysis, and
presentation of business information
– Understanding historical patterns in data to improve future performance
– Use of mathematics in business
• Analytics (aka enterprise decision management, marketing analytics,
predictive science, strategy science, credit risk analysis, fraud analytics):
often based on computational modeling
• Reframing the question …
– From: what data warehouse should we build?
– To: how can data warehouse-based integration address challenges?
From The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
Payroll Application
(3rd GL)
Payroll Data
(database)
R & D Applications
(researcher supported, no documentation)
R & D
Data
(raw)
Mfg. Data
(home grown
database)
Mfg. Applications
(contractor
supported)
Finance
Data
(indexed)
Finance Application
(3rd GL, batch
system, no source)
Marketing Application
(4th GL, query facilities,
no reporting, very large)
Marketing Data
(external database)
Personnel App.
(20 years old,
un-normalized data)
Personnel Data
(database)
Typical System Evolution
Multiple Sources of
(for example)
Customer Data
Payroll Data
(database)
R & D
Data
(raw)
Mfg. Data
(home grown
database)
Finance
Data
(indexed)
Marketing Data
(external database)
Personnel Data
(database)
... Then Integrate
Organizational
Data
Payroll Data
(database)
R & D
Data
(raw)
Mfg. Data
(home grown
database)
Finance
Data
(indexed)
Marketing Data
(external database)
Personnel Data
(database)
... Then Re-architect
Organizational
Data
Linked Data
Linked Data is about using the Web to connect related data
that wasn't previously linked, or using the Web to lower the
barriers to linking data currently linked using other methods.
More specifically, Wikipedia defines Linked Data as "a term
used to describe a recommended best practice for exposing,
sharing, and connecting pieces of data, information, and
knowledge on the Semantic Web using URIs and RDF."
linkeddata.org
• EII (Enterprise Information Integration)
– Sits between ETL and EAI: delivers tailored views of information to users at the time they are required
MetaMatrix Integration Example
AWS Redshift
Cloud Definition
• Location-independent computing, whereby
shared servers provide resources, software,
and data to computers and other devices on
demand, as with the electricity grid.
• Cloud computing is a natural evolution of the
widespread adoption of virtualization,
service-oriented architecture and utility
computing.
• Details are abstracted from consumers, who no
longer have need for expertise in, or control
over, the technology infrastructure "in the cloud"
that supports them.
Cloud Computing
Cloud Scalability
Cisco's Ladder to the Cloud
Cloud Options
Data in the cloud should have three attributes that data outside the cloud/warehouse should not have. It should be:
• More shareable
• Cleaner
• Smaller
Transform
Problems with forklifting
1. no basis for decisions made
2. no inclusion of architecture/engineering concepts
3. no idea that these concepts are missing from the process
4. 80% of organizational data is ROT (redundant, obsolete, or trivial)
Less
Cleaner
More shareable
... data
Making Warehousing Successful
Cloud
Fixing Data in the Cloud is Like Using A Glovebox
Two Basic Warehouse Purposes
• Integration
– Of disparate data sources for the purpose of potential subsequent analyses
– Most organizational data is never analyzed
– Same type of inputs as outputs
– Downstream knowledge is incorporated upstream
• Preparation
– Seen as the last mile (preparation) of the data before presentation as part of decision-making activities
– Closed-ended activities
– Final possible application of programmatic quality measures
Integration
Choose
One
Only!
How many interfaces are required to solve this integration problem?
Application 4 Application 5 Application 6
Application 1 Application 2 Application 3
RBC: 200 applications - 4900 batch interfaces
Application 4 Application 5 Application 6
Application 1 Application 2 Application 3
15 Interfaces
(N*(N-1))/2
The rapidly increasing cost of complexity
[Chart: worst-case number of interconnections (y-axis, 0 to 20,000) versus number of silos (x-axis, 1 to 201)]
Silos / Interconnections
• 6 / 15
• 60 / 1,770
• 600 / 179,700
• 200 / 19,900
• 200 / 5,000 (actual)
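
The silo counts above follow directly from the (N*(N-1))/2 formula. A minimal sketch that reproduces them is shown here; the function names are illustrative, and the hub-and-spoke comparison (one interface per silo into a shared warehouse) is an assumption added for contrast, not a figure from the slides.

```python
def worst_case_interfaces(n: int) -> int:
    """Point-to-point interfaces needed to connect every pair of n silos."""
    return n * (n - 1) // 2


def hub_interfaces(n: int) -> int:
    """Assumed alternative: each silo feeds one shared hub (e.g. a warehouse)."""
    return n


if __name__ == "__main__":
    for silos in (6, 60, 200, 600):
        print(
            f"{silos:>4} silos: "
            f"{worst_case_interfaces(silos):>7,} point-to-point, "
            f"{hub_interfaces(silos):>3} via a hub"
        )
```

The quadratic growth is the point: going from 60 to 600 silos multiplies the silo count by 10 but the worst-case interface count by roughly 100.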
X=5,000
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
Corporate Information Factory Architecture
Warehousing applied to a specific challenge
http://www.infosys.com/industries/healthcare/industryofferings/Pages/healthcare-data-warehousing.aspx
OPERATIONAL SYSTEM
OPERATIONAL SYSTEM
FLAT FILES
SUMMARY DATA
RAW DATA
METADATA
PURCHASING
SALES
INVENTORY
ANALYSIS
REPORTING
MINING
Inmon Implementation/3NF
"A subject oriented, integrated, time variant, and non-volatile collection of summary and detailed
historical data used to support the strategic decision-making processes of the organization."
Third Normal Form
• Each attribute in the relation is a fact about a key
– Highly normalized structure
• Use Cases
– Transactional Systems
– Operational Data Stores
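
As a rough illustration of a normalized structure, the sketch below uses Python's built-in sqlite3 module. The department/employee tables and their columns are hypothetical and not taken from the slides; they only show each non-key attribute living in the table whose key it describes, with referential integrity enforced and a join needed to reassemble the data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE department (
    dept_id   INTEGER PRIMARY KEY,
    dept_name TEXT NOT NULL          -- a fact about the department key only
);
CREATE TABLE employee (
    emp_id   INTEGER PRIMARY KEY,
    emp_name TEXT NOT NULL,          -- a fact about the employee key only
    dept_id  INTEGER NOT NULL REFERENCES department(dept_id)
);
""")
conn.execute("INSERT INTO department VALUES (1, 'Finance')")
conn.execute("INSERT INTO employee VALUES (10, 'Ada', 1)")


def insert_orphan() -> bool:
    """Try to add an employee in a non-existent department; report success."""
    try:
        conn.execute("INSERT INTO employee VALUES (11, 'Bob', 99)")
        return True
    except sqlite3.IntegrityError:
        return False


# The department name is stored once, so reading it back requires a join.
rows = conn.execute("""
    SELECT e.emp_name, d.dept_name
    FROM employee e JOIN department d ON d.dept_id = e.dept_id
""").fetchall()
```

The rejected orphan row shows the referential-integrity benefit, while the join in the final query is the cost that grows with the number of tables involved.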
Third Normal Form: Pros and Cons
Neo4j.com
• Pros
– Easily understood by business and end users
– Reduced data redundancy
– Enforced referential integrity
– Indexed attributes/flexible querying
• Cons
– Joins can be expensive
– Does not scale
Engineering
Architecture
Engineering/Architecting
Relationship
• Architecting is used to create and build systems too complex to be treated by engineering analysis alone
– Require technical details as the exception
• Engineers develop the technical designs
– Engineering/Crafts-persons deliver components supervised by:
• Manufacturer
• Building Contractor
• Solution design is not based on semantic understanding
• Focus is often critical due to speed issues
• AI/ML focus on learning how the system performs and how to improve it
Emphasize Engineering Talent
Target Isn't Just Predicting Pregnancies
http://rmportal.performedia.com/node/1373 and http://www.predictiveanalyticsworld.com/patimes/target-really-predict-teens-pregnancy-inside-story/ http://rmportal.performedia.com/rm/paw10/gallery_01#1373
Where To Focus Limited Analytics Resources?
Warehoused
Data
Marts
Diffuse Applications
Data Management Practices
Duplicated but ETLed Data
(quality & transformations applied)
Learning/
Feedback
Analytics Practices
Mission
• Advance change in America by ensuring equitable access to nutritious food for all in partnership with food banks, policymakers, supporters, and the communities we serve
• Great understanding and skill executing rapid response to local needs
• What else could be done with their data?
• Rerouting buses = eliminating food deserts for thousands of residents
Business
Intelligence
• Users can "drill" anywhere
• Entire collection "cube" is accessible
• Summaries to transaction-level detail
Basics
Sample questions …
Cancer patient revenue across all facilities
Revenue for diseases this year versus last year in the NE region
Total costs and revenue at top 10 facilities
• Emphasis on the "cube"
• 'N' dimensions
• Permits different users to "slice and dice" subsets of data
• Viewing from different perspectives
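
A rough sketch of "slice and dice" over an N-dimensional cube, written in plain Python. The fact rows, dimension names, and revenue figures are all hypothetical; they are shaped only to echo the sample questions (revenue by disease, facility, region, and year).

```python
from collections import defaultdict

# Hypothetical fact rows: (region, facility, disease, year, revenue)
FACTS = [
    ("NE", "Facility A", "cancer", 2021, 120.0),
    ("NE", "Facility A", "cancer", 2022, 150.0),
    ("NE", "Facility B", "diabetes", 2022, 80.0),
    ("SW", "Facility C", "cancer", 2022, 200.0),
]
DIMENSIONS = ("region", "facility", "disease", "year")


def roll_up(facts, by, **filters):
    """Slice on `filters`, then sum revenue grouped by the `by` dimensions."""
    totals = defaultdict(float)
    for row in facts:
        record = dict(zip(DIMENSIONS, row[:-1]))
        if all(record[dim] == value for dim, value in filters.items()):
            totals[tuple(record[dim] for dim in by)] += row[-1]
    return dict(totals)


# "Cancer patient revenue across all facilities"
cancer_by_facility = roll_up(FACTS, by=("facility",), disease="cancer")

# "Revenue ... this year versus last year in the NE region"
ne_by_year = roll_up(FACTS, by=("year",), region="NE")
```

Each question is the same cube viewed from a different perspective: the filter fixes a slice, and the `by` tuple chooses which dimensions remain visible.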
Example: Set Analysis
Portfolio Analysis
• Bank accounts are of
varying value and risk
• Cube by
– Social status
– Geographical location
– Net value, etc.
• Strategy: balance return on the loan with risk of default
• How to evaluate the portfolio as a whole?
– Least risk loan may be to the very wealthy, but there are a very limited number
– Many poor customers, but greater risk
• Solution may combine types of analyses
– When to lend?
– Interest rate charged?
Kimball Implementation/Dimensional
"A copy of transaction data specifically structured for query and analysis."
• Composed of “fact tables” that contain quantitative data, and any number of adjoining “dimension” tables
• Optimized for business reporting
• Use Cases
– OLAP (Online Analytical Processing)
– BI
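
To make the fact/dimension split concrete, here is a minimal star schema sketch using Python's built-in sqlite3 module. The table names (fact_sales, dim_product, dim_date), columns, and rows are hypothetical examples, not drawn from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables hold descriptive context.
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE dim_date    (date_key    INTEGER PRIMARY KEY, year INTEGER);

-- The fact table holds quantitative measures plus keys into each dimension.
CREATE TABLE fact_sales (
    product_key INTEGER REFERENCES dim_product(product_key),
    date_key    INTEGER REFERENCES dim_date(date_key),
    amount      REAL
);

INSERT INTO dim_product VALUES (1, 'Grocery'), (2, 'Apparel');
INSERT INTO dim_date    VALUES (20210101, 2021), (20220101, 2022);
INSERT INTO fact_sales  VALUES (1, 20210101, 10.0),
                               (1, 20220101, 15.0),
                               (2, 20220101, 40.0);
""")

# A typical reporting query: one join per dimension, then aggregate.
rows = conn.execute("""
    SELECT d.year, p.category, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_product p ON p.product_key = f.product_key
    JOIN dim_date d    ON d.date_key = f.date_key
    GROUP BY d.year, p.category
    ORDER BY d.year, p.category
""").fetchall()
```

Every reporting question takes the same shape (join the fact table to the dimensions of interest, then group), which is why this layout is described as optimized for business reporting.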
Star Schema
Wikipedia
Star Schema Pros and Cons
• Pros
– Simple Design
– Fast Queries
– Most major DBMS are optimized for Star Schema Designs
• Cons
– Questions must be built into the design
– Data marts are often centralized on one fact table
• Solution design is based on semantic understanding
• Broad category ranging from CRM to analytics
• AI/ML focus on data exploration
Emphasize Data Architecture Talent
Supply/demand for data talent
https://www.logianalytics.com/bi-trends/3-keys-understanding-data/
Growth of Data vs. Growth of Data Analysts
• Stored data accumulating at a 28% annual growth rate
• Data analysts in the workforce growing at a 5.7% growth rate
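
A short worked calculation shows how quickly these two rates diverge. It assumes both figures are annual rates that compound steadily, which the slide states only for the data figure; the ten-year horizon is chosen purely for illustration.

```python
DATA_GROWTH = 0.28      # stored data, per year (from the slide)
ANALYST_GROWTH = 0.057  # data analysts, assumed to be per year as well


def growth_factor(rate: float, years: int) -> float:
    """Compound growth multiple after `years` at a constant `rate`."""
    return (1 + rate) ** years


def data_per_analyst(years: int) -> float:
    """How much more data each analyst faces, relative to today."""
    return growth_factor(DATA_GROWTH, years) / growth_factor(ANALYST_GROWTH, years)


if __name__ == "__main__":
    years = 10  # illustrative horizon
    print(f"data:             x{growth_factor(DATA_GROWTH, years):.1f}")
    print(f"analysts:         x{growth_factor(ANALYST_GROWTH, years):.1f}")
    print(f"data per analyst: x{data_per_analyst(years):.1f}")
```

Under these assumptions, a decade leaves roughly twelve times the data but fewer than twice the analysts, so the data each analyst faces grows nearly sevenfold.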
[Chart: Data Growth versus Data Analysis Capabilities Growth]
[Chart: share of effort, Data Analysis 20% versus Data Preparation 80%]
Everyone wants to do better data analysis …
• Some data preparation is inevitable
• What would a 'good' ratio be?
• "Everyone knows"
2020 American Airlines market value ~ $6b
AAdvantage valued between $19.5-$31.5b
2020 United market value ~ $9b
MileagePlus ~ $22b
https://www.forbes.com/sites/advisor/2020/07/15/how-airlines-make-billions-from-monetizing-frequent-flyer-programs/?sh=66da87a614e9
Simple Business Case
• "Knowledge workers"
– The Landmarks of Tomorrow (1957) by Peter Drucker
– Examples include programmers, physicians, pharmacists, architects, engineers, scientists, design thinkers, public accountants, lawyers, and academics, and any other white-collar workers whose line of work requires one to "think for a living"
– Think about inputs and make decisions based on the data and their processes
When asked to incorporate data
• Data appreciation isn't translating into employee adoption
– 48% frequently make gut decisions
– 66% for C-suite executives
• Lack of data skills is limiting workplace productivity
– 36% said they would find an alternative method to complete the task without using data
– 14% would avoid the task entirely
http://TheDataLiteracyProject.org
[Chart: when asked to incorporate data: Incorporate Data 50%, Find Alt. Method 36%, Avoid Task Entirely 14%]
Too many organizations have simply put data in the hands of employees and expected them to make a success of it
[Chart: all employees: Incorporate Data 52%, Defer to Gut 48%]
[Chart: C-suite executives: Incorporate Data 34%, Defer to Gut 66%]
Data is not broadly or widely understood
adapted from: http://www.dailymirror.lk/print/opinion/editorial-we-need-to-become-channels-of-peace/172-27164
It is like a fan!
It is like a snake!
It is like a wall!
It is like a rope!
It is like a tree!
Blind Persons and the Elephant
It is like a story!
It is like a dashboard!
It is like pipes!
It is like a warehouse!
It is like statistics!
Data Literacy Levels
Level 5: Data professional (Acumen-DP)
Level 4: Data teacher (Acumen-DT)
Level 3: Knowledge worker (Acumen-KW)
Level 2: Adult data spreader (Proficient)
Level 1: Mobile data spreader (Literate)
Everything here is
required of
data scientists and
cyber professionals
1,000,000,000
Level 3–KW–Citizen Data Knowledge Areas
• Elevator story
• Data stewardship
• Demonstrating value
• Currency
• Fiduciary responsibilities
• Shared fate
– https://www.techtarget.com/searchenterpriseai/definition/data-scientist
Data
Things
Happen
Organizational
Things
Happen
More work required to incorporate greater focus
What is Strategy?
• Current use derived from military
- a pattern in a stream of decisions
[Henry Mintzberg]
A thing
Every Day
Low Price
Former Walmart Business Strategy
Wayne
Gretzky’s
Strategy
He skates to where he
thinks the puck will be ...
Strategy Example 3
Good Guys
(Us)
Bad Guys
(Them)
Strategy Guides Workgroup Activities
A pattern
in a stream
of decisions
Your Data Strategy
• Highest level data guidance available ...
• Focusing data activities on business-goal achievement ...
• Providing guidance when faced with a stream of decisions or uncertainties
• Data strategy most usefully articulates how data can be best used to support organizational strategy
• This usually involves a balance of remediation and proactive measures
From: How shall we build this data warehouse?
• (or worse) … What should go into this warehouse?
To: How can warehousing capabilities solve this business challenge?
• (better still) … How can warehousing capabilities
solve this class of business challenges?
Other examples
• Are you ready for warehousing?
– Foundational practices
– Project deliverables
• Will you get it right the first time?
– Is the business environment constantly evolving?
– Do you have an agreed upon enterprise-wide vocabulary?
• Is your data warehouse intended to be the
enterprise audit-able system of record?
– Extract, transform and load requirements
– Data transformation requirements
– How fast do you need results?
– Performance of inserts vs reads
Reframing the question
Analytics
Business Analytics, or simply analytics:
• is the use of data, information technology,
statistical analysis, quantitative methods,
and mathematical or computer-based
models to help managers gain improved
insight about their business operations and
make better, fact-based decisions.
• is “a process of transforming data into
actions through analysis and insights in the
context of organizational decision making
and problem solving.”
• supported by various tools such as
Microsoft Excel and various Excel add-ins,
commercial statistical software package
such as SAS or Minitab and more complete
business intelligence suites that integrate
data with analytical software
Business Analytics: Methods, Models, and Decisions, Second Edition by James R. Evans, page 4.
Analytics
• is the use of data, hardware/software,
statistical analysis, quantitative methods,
and mathematical or computer-based
models to help managers gain improved
insight about their business operations and
make better, fact-based decisions.
• is “a process of transforming data into
actions through analysis and insights in the
context of organizational decision making
and problem solving.”
• supported by various tools such as
Microsoft Excel and various Excel add-ins,
commercial statistical software package
such as SAS or Minitab and more complete
business intelligence suites that integrate
data with analytical software
Analytics
• is the use of data, hardware/software,
quantitative methods, and
mathematical or computer-based models
to help managers gain improved insight
about their business operations and
make better, fact-based decisions.
• is “a process of transforming data into
actions through analysis and insights in
the context of organizational decision
making and problem solving.”
• supported by various tools such as
Microsoft Excel and various Excel add-
ins, commercial statistical software
package such as SAS or Minitab and
more complete business intelligence
suites that integrate data with analytical
software
Analytics
• is the use of data, hardware/software,
quantitative methods, and models
to help managers gain improved insight
about their business operations and
make better, fact-based decisions.
• is “a process of transforming data into
actions through analysis and insights in
the context of organizational decision
making and problem solving.”
• supported by various tools such as
Microsoft Excel and various Excel add-
ins, commercial statistical software
package such as SAS or Minitab and
more complete business intelligence
suites that integrate data with analytical
software
Analytics
• is the use of data, hardware/software,
quantitative methods, and models
to help managers make better
decisions.
• is “a process of transforming data into
actions through analysis and insights in
the context of organizational decision
making and problem solving.”
• supported by various tools such as
Microsoft Excel and various Excel add-
ins, commercial statistical software
package such as SAS or Minitab and
more complete business intelligence
suites that integrate data with analytical
software
Analytics
• is the use of data, hardware/software,
quantitative methods, and models
to help managers make better decisions.
• the process of transforming data into
actions through analysis to solve problems
• supported by various tools such as
Microsoft Excel and various Excel add-ins,
commercial statistical software package
such as SAS or Minitab and more complete
business intelligence suites that integrate
data with analytical software
Analytics (31 words almost ⅓ of original)
• is the use of data, hardware/software,
quantitative methods, and models
to help managers make better decisions-
the process of transforming data into
actions through analysis to solve problems
using tools
Data Action
Analysis
Business context
is missing
Data
Analysis
Understanding/Data Architectures: here, whether you like it or not
deviantart.com
• All organizations have architectures
– Business
– Process
– Systems
– Security
– Technical
– Data/Information
• Some are better understood and documented (and therefore more useful to the organization) than others
• A specific definition
– 'Understanding an architecture'
– Documented and articulated as a (digital) blueprint illustrating the commonalities and interconnections among the architectural components
– Ideally the understanding is shared by systems and humans
Data Vault Implementation
This is always the best starting architecture, especially if the ultimate outcome is uncertain
Data Vault
Bukhantsov.org
• Designed to facilitate long-term historical storage, focusing on
ease of implementation
• Retains data lineage information (source/date)
• “All the data, all the time” - hybrid approach of Inmon and Kimball.
• Comprised of
- Hubs (which contain a list of business keys that do not change)
- Links (Associations/transactions between hubs)
- Satellites (descriptive attributes associated with hubs and links)
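
The hub/link/satellite split can be sketched with Python's built-in sqlite3 module. All table names, column names, and rows below are hypothetical; the sketch only illustrates stable business keys in hubs, associations in links, and dated descriptive attributes in satellites, each carrying lineage columns (record source and load date).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hubs: business keys that do not change, plus lineage.
CREATE TABLE hub_customer (
    customer_hk   TEXT PRIMARY KEY,
    customer_no   TEXT NOT NULL UNIQUE,
    record_source TEXT NOT NULL,
    load_date     TEXT NOT NULL
);
CREATE TABLE hub_account (
    account_hk    TEXT PRIMARY KEY,
    account_no    TEXT NOT NULL UNIQUE,
    record_source TEXT NOT NULL,
    load_date     TEXT NOT NULL
);
-- Link: an association between hubs.
CREATE TABLE link_customer_account (
    customer_hk   TEXT NOT NULL REFERENCES hub_customer(customer_hk),
    account_hk    TEXT NOT NULL REFERENCES hub_account(account_hk),
    record_source TEXT NOT NULL,
    load_date     TEXT NOT NULL,
    PRIMARY KEY (customer_hk, account_hk)
);
-- Satellite: descriptive attributes, one row per load (history is kept).
CREATE TABLE sat_customer (
    customer_hk   TEXT NOT NULL REFERENCES hub_customer(customer_hk),
    load_date     TEXT NOT NULL,
    record_source TEXT NOT NULL,
    city          TEXT,
    PRIMARY KEY (customer_hk, load_date)
);

INSERT INTO hub_customer VALUES ('c1', 'CUST-001', 'crm', '2022-01-01');
INSERT INTO hub_account  VALUES ('a1', 'ACCT-900', 'billing', '2022-01-01');
INSERT INTO link_customer_account VALUES ('c1', 'a1', 'billing', '2022-01-01');
INSERT INTO sat_customer VALUES ('c1', '2022-01-01', 'crm', 'Richmond');
INSERT INTO sat_customer VALUES ('c1', '2022-06-01', 'crm', 'Norfolk');
""")

# A change arrives as a new satellite row; earlier rows are never overwritten.
history = conn.execute(
    "SELECT load_date, city FROM sat_customer "
    "WHERE customer_hk = 'c1' ORDER BY load_date"
).fetchall()
current_city = history[-1][1]
```

Because changes are appended rather than updated in place, the full history and its source stay queryable, at the cost of more tables and joins on the read side.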
Data Vault Pros and Cons
• Pros
- Simple integration
- Houses immense amounts of data with excellent performance
- Full data lineage captured
• Cons
- Complication is pushed to the “back end”
- Can be difficult to set up for many data workers
- No widespread ETL tool support yet
Initial Design
Starting Point
Comparison
Source:http://dmreview.com/article_sub.cfm?articleID=1000941
Meta Data Models
Marco & Jennings's Metadata Model
ETL Metadata Data Model
SCREEN ELEMENT: screen element id #, data item id #, screen element descr.
INTERFACE ELEMENT: interface element id #, data item id #, interface element descr.
INPUT ELEMENT: input element id #, data item id #, input element descr.
OUTPUT ELEMENT: output element id #, data item id #, output element descr.
MODEL VIEW: model view element id #, data item id #, model view element des.
DEPENDENCY: dependency elem id #, data item id #, process id #, dependency description
CODE: code id #, data item id #, stored data item #, code location
INFORMATION: information id #, data item id #, information descr., information request
PROCESS: process id #, data item id #, process description
USER TYPE: user type id #, data item id #, information id #, user type description
LOCATION: location id #, information id #, printout element id #, process id #, stored data items id #, user type id #, location description
PRINTOUT ELEMENT: printout element id #, data item id #, printout element descr.
STORED DATA ITEM: stored data item id #, data item id #, location id #, stored data description
DATA ITEM: data item id #, data item description
https://anythingawesome.com/reverseengineeringofdata.html
Warehouse Management: Warehouse Process, Warehouse Operation
Analysis: Transformation, OLAP, Data Mining, Information Visualization, Business Nomenclature
Resources: Object-Oriented (ObjectModel), Relational, Record-Oriented, Multi Dimensional, XML
Foundation: Business Information, Data Types, Expressions, Keys Index, Type Mapping, Software Deployment
ObjectModel (Core, Behavioral, Relationships, Instance)
Overview of CWM Metamodel
http://www.omg.org/technology/documents/modeling_spec_catalog.htm
https://giphy.com/gifs/xUPGcoTKhfDvStjxMA/html5
Focus on Knowledge Worker Productivity
1. Remove 1 click from some repetitive process
2. Tally the number of clicks
3. Accurately describe the impact of the potential improvement
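
The three steps above amount to simple arithmetic. A minimal sketch follows; every input (seconds per click, runs per day, number of workers, workdays per year) is a hypothetical placeholder to be replaced with an organization's own tallies.

```python
def annual_hours_saved(
    clicks_removed: int,
    seconds_per_click: float,
    runs_per_day: int,
    workers: int,
    workdays_per_year: int = 250,  # assumed working calendar
) -> float:
    """Hours saved per year by removing clicks from a repetitive process."""
    seconds = (
        clicks_removed * seconds_per_click * runs_per_day
        * workers * workdays_per_year
    )
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical tally: 1 click of about 2 seconds, performed 50 times
    # a day by 1,000 knowledge workers.
    hours = annual_hours_saved(1, 2.0, 50, 1000)
    print(f"{hours:,.0f} hours per year")
```

Stating the result in hours (or in salary cost) is one way to describe the impact of the improvement accurately, as the third step asks.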
https://giphy.com/gifs/latenightseth-lol-seth-meyers-lnsm-l0XtbC8EniiuwAEOQn
https://tenor.com/search/counting-money-gifs
Theory of Constraints (TOC)
• A management paradigm that views any manageable system as being limited in achieving more of its goals by a small number of constraints (Eliyahu M. Goldratt)
• There is always at least one constraint, and TOC uses a focusing process to identify the constraint and restructure the rest of the organization to address it
• TOC adopts the common idiom "a chain is no stronger than its weakest link": processes, organizations, etc., are vulnerable because the weakest component can damage or break them or at least adversely affect the outcome
https://en.wikipedia.org/wiki/Theory_of_constraints
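The weakest-link idea can be illustrated with a few lines of code. The following is a minimal sketch, assuming a hypothetical four-stage warehouse load whose stage names and rows-per-second capacities are invented for illustration. End-to-end throughput equals the capacity of the slowest stage, so only improving that stage helps.

```python
def find_constraint(capacity: dict[str, float]) -> tuple[str, float]:
    """Return the stage with the lowest capacity and that capacity."""
    stage = min(capacity, key=capacity.get)
    return stage, capacity[stage]


# Hypothetical rows-per-second capacities for a four-stage warehouse load.
pipeline = {"extract": 1200.0, "transform": 300.0, "load": 900.0, "report": 700.0}
constraint, throughput = find_constraint(pipeline)

# Improving a non-constraint stage leaves end-to-end throughput unchanged.
faster_load = {**pipeline, "load": 2000.0}
_, throughput_after_load_fix = find_constraint(faster_load)

# Improving the constraint raises throughput until the next-weakest stage binds.
faster_transform = {**pipeline, "transform": 1000.0}
new_constraint, throughput_after_transform_fix = find_constraint(faster_transform)

print(constraint, throughput)
print(throughput_after_load_fix)
print(new_constraint, throughput_after_transform_fix)
```

Once the first constraint is relieved, a different stage becomes the constraint, which is why the focusing process is repeated.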
The PDCA Cycle
• Deming cycle
• "Plan-do-study-act" or "plan-do-check-act"
– Identifying data issues that are critical to the achievement of business objectives
– Defining business requirements for data
– Identifying key data items
– Defining business rules critical to ensuring correct data
https://deming.org/explore/pdsa/
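One way to picture the cycle applied to a single business rule is sketched below. This is a minimal illustration, not a prescribed method: the rule (`qty` must be positive), the fix (take the absolute value), and the function names are all hypothetical.

```python
from typing import Callable

Record = dict[str, int]


def violation_rate(records: list[Record], rule: Callable[[Record], bool]) -> float:
    """Check: share of records that break the business rule."""
    if not records:
        return 0.0
    return sum(1 for r in records if not rule(r)) / len(records)


def run_pdca(records: list[Record], rule: Callable[[Record], bool],
             fix: Callable[[Record], Record], target: float = 0.0,
             max_cycles: int = 3) -> tuple[list[Record], list[float]]:
    history = [violation_rate(records, rule)]                   # Plan: measure the baseline
    for _ in range(max_cycles):
        if history[-1] <= target:
            break                                               # Act: target met, adopt the fix
        records = [r if rule(r) else fix(r) for r in records]   # Do: apply the fix
        history.append(violation_rate(records, rule))           # Check: measure again
        if history[-1] >= history[-2]:
            break                                               # Act: no gain, go back to planning
    return records, history


# Hypothetical order lines; the rule and fix are assumptions for illustration.
orders = [{"qty": 5}, {"qty": -3}, {"qty": 0}, {"qty": 2}]
cleaned, history = run_pdca(orders,
                            rule=lambda r: r["qty"] > 0,
                            fix=lambda r: {**r, "qty": abs(r["qty"])})
print(history)
```

In this example the fix repairs the negative quantity but not the zero, so the violation rate stalls and the loop hands the problem back to planning.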
How to Tell Your Story with Data
https://visage.co/how-to-tell-a-story-with-data/
1. Find an irresistible story
2. Remember your audience
3. Research to find data
4. Vet your data sources
5. Filter your findings
6. Decide on a data visualization
7. Craft the story
8. Gather feedback
Data Mapping
Suicide Analysis
• Data objects: Soldier, Mental illness, Deployments, Work History, Legal Issues, Abuse
• Sources: FAP, DMSS, G1, DMDC, CID, MDR
• Data objects complete?
• All sources identified?
• Best source for each object?
• How to reconcile differences between sources?
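The last two questions on this slide (best source per object, reconciling differences) can be sketched as a lookup with a conflict flag. This is a minimal sketch under stated assumptions: the source names are taken from the slide, but the priority order, the sample values, and the function `reconcile` are hypothetical and do not describe how the real systems were ranked.

```python
from typing import Optional


def reconcile(values_by_source: dict[str, Optional[int]],
              priority: list[str]) -> tuple[Optional[str], Optional[int], bool]:
    """Pick the value from the highest-priority source that has one.

    Returns (chosen source, chosen value, conflict flag). The flag is True
    when the sources that do hold a value disagree with each other.
    """
    present = {s: v for s, v in values_by_source.items() if v is not None}
    chosen = next((s for s in priority if s in present), None)
    conflict = len(set(present.values())) > 1
    return chosen, present.get(chosen), conflict


# Hypothetical: number of deployments recorded for one soldier in three sources.
deployments = {"DMDC": 2, "G1": 3, "MDR": None}
# Hypothetical ranking of sources for this data object.
source, value, conflict = reconcile(deployments, priority=["G1", "DMDC", "MDR"])
print(source, value, conflict)
```

The conflict flag matters as much as the chosen value: it marks the records that stewards need to look at when deciding how to reconcile the sources.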
Senior Army Official
• Room full of Stewards
• A very heavy dose of management support
• Advised the group of his opinion on the matter
• Any questions as to future direction
– "They should make an appointment to speak directly with me!"
• Empower the team
– The conversation turned from "can this be done?" to "how are we going to accomplish this?"
– Mistakes along the way would be tolerated
– Implement a workable solution in prototype form
6 Jan
Parler Users at the Capitol (Photo metadata)
https://gizmodo.com/parler-users-breached-deep-inside-u-s-capitol-building-1846042905?rev=1610480731991
Parler Users at the Capitol (Multimedia)
https://thepatr10t.github.io/yall-Qaeda/
Videos Timeline
https://projects.propublica.org/parler-capitol-videos/
Illegally Obtained Data
https://www.nytimes.com/2021/02/05/opinion/capitol-attack-cellphone-data.html
https://www.washingtonpost.com/politics/2021/01/12/cybersecurity-202-parler-scrape-puts-some-capitol-rioters-legal-jeopardy/
• As of Oct 2022
– 880 charged
– 272 assaulting or impeding officers
– 95 charged with using a deadly or dangerous weapon
– 294 felony obstruction of an official proceeding
– 100 have pleaded guilty to felonies
– 313 have pleaded guilty to misdemeanors
– 21 found guilty at trial
– 360 not yet found
https://abcnews.go.com/Politics/jan-prosecutions-numbers-21-months-880-charged/story?id=91391596
(16) Causes of Data Warehousing Failure
• The project is over budget
• Slipped schedule
• Unimplemented functions and capabilities
• Unhappy users
• Unacceptable performance
• Poor availability
• Inability to expand
• Poor quality data/reports
• Too complicated for users
• Project not cost justified
• Poor quality data
• Many more values of gender code than (M/F)
• Incorrectly structured data
• Provides correct answer to wrong question
• Bad warehouse design
• Overly complex
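The gender code item in the list above is the kind of problem a simple profile exposes before data reaches the warehouse. The following is a minimal sketch, assuming a hypothetical column of values and an expected domain of M and F; the sample data and the function `profile_codes` are illustrative.

```python
from collections import Counter
from typing import Hashable, Iterable


def profile_codes(values: Iterable[Hashable],
                  expected: set) -> tuple[Counter, dict]:
    """Count each distinct code and separate out those not in the expected domain."""
    counts = Counter(values)
    unexpected = {code: n for code, n in counts.items() if code not in expected}
    return counts, unexpected


# Hypothetical extract of a gender code column from a source system.
gender_column = ["M", "F", "m", "F", "U", "", "M", "Male", None]
counts, unexpected = profile_codes(gender_column, expected={"M", "F"})

print("distinct codes:", len(counts))
print("unexpected codes:", unexpected)
```

Each unexpected code is a decision for the business: map it to an existing value, extend the domain, or send it back to the source for correction.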
from The Data Administration Newsletter, www.tdan.com and The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
Warehouse Requirements are Largely Use Case Driven
Use Case
• A usage scenario for a piece of software; often used in the plural to suggest situations where a piece of software may be useful.
• A potential scenario in which a system receives an external request (such as user input) and responds to it.
https://en.wikipedia.org/wiki/Use_case
• Difficult to holistically evaluate without integrated glossary
• Unable to capture non-functional requirements
• The plan for implementing non-functional requirements is detailed in the system architecture, because they are usually architecturally significant requirements.
• The average data warehouse is rebuilt 7 times before it is considered useful
Trusted Catalog/Glossary/Dictionary/Encyclopedia
Upcoming Events
A 12-Step Program for Improving Organizational Data Literacy
13 December 2022 (08:30 AM - 11:45 AM)
Data Management Best Practices
13 December 2022
Data Strategy Best Practices
10 January 2023
Data Modeling Fundamentals
14 February 2023
Brought to you by:
Time: 19:00 UTC (2:00 PM NYC) | Presented by: Peter Aiken, PhD
Event Pricing
• 20% off directly from the publisher on select titles
• My Book Store @ http://anythingawesome.com
• Enter the code "anythingawesome" at the Technics bookstore checkout where it says to "Apply Coupon"
Peter.Aiken@AnythingAwesome.com +1.804.382.5957
Thank You!
Book a call with Peter to discuss anything - https://anythingawesome.com/OfficeHours.html
Critical Design Review?
Hiring Assistance?
Reverse Engineering Expertise?
Executive Data Literacy Training?
Mentoring?
Tool/automation evaluation?
Use your data more strategically?
Independent Verification & Validation
Collaboration?

What’s in Your Data Warehouse?

  • 1. What Is In Your Dataaa Warehouse? © Copyright 2022 by Peter Aiken Slide # 1 peter.aiken@anythingawesome.com +1.804.382.5957 Peter Aiken, PhD © Copyright 2022 by Peter Aiken Slide # https://anythingawesome.com Program 2 Of What Do Your Data Warehousing Operations Consist? • Definitions – Data Warehousing – DM BoK guidance – Legacy to digital conversion (cloud) • Integration – Emphasize engineering talent – Incorporate leading not bleeding edge – Requires an adaptive rather than a prescriptive approach • Preparation – Emphasis on storytelling first and visualization second – Analytics is both ubiquitous and not well understood – Keep improvements practically focused by strategy • Best Practices – Cannot use what is not understood-understand what you have – PDCA (plan, do, check, act) – Iteratively refine/cull with respect to strategic direction • Take Aways/References/Q&A
  • 2. Process Data Warehousing IPO Diagram © Copyright 2022 by Peter Aiken Slide # 3 https://anythingawesome.com Inputs Outputs Warehoused Data Marts Diffuse Applications ETL-Extract Transform Load A major source of data structure/ transformation knowledge © Copyright 2022 by Peter Aiken Slide # Metadata Management 4 https://anythingawesome.com Data Management Body of Knowledge (DM BoK V2) from The DAMA Guide to the Data Management Body of Knowledge 2E © 2017 by DAMA International 11 Practice Areas
  • 3. Data Warehousing & Business Intelligence Management © Copyright 2022 by Peter Aiken Slide # 5 https://anythingawesome.com Defining Data Warehousing, BI/Analytics • Data Warehousing – A technology solution supporting … business capabilities such as: query, analysis, reporting and development of these capabilities – Analysis of information not previously integrated – Another, often new, set of organizational capabilities • Business Intelligence (aka. decision support) – Dates at least to 1958 – Support better business decision making – Technologies, applications and practices for the collection, integration, analysis, and presentation of business information – Understanding historical patterns in data to improve future performance – Use of mathematics in business • Analytics (aka.) enterprise decision management, marketing analytics, predictive science, strategy science, credit risk analysis. fraud analytics - often based on computational modeling • Reframing the question … – From: what data warehouse should we build? – To: how can data warehouse-based integration address challenges? © Copyright 2022 by Peter Aiken Slide # 6 https://anythingawesome.com From The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
  • 4. Payroll Application (3rd GL) Payroll Data (database) R& D Applications (researcher supported, no documentation) R & D Data (raw) Mfg. Data (home grown database) Mfg. Applications (contractor supported) Finance Data (indexed) Finance Application (3rd GL, batch system, no source) Marketing Application (4rd GL, query facilities, no reporting, very large) Marketing Data (external database) Personnel App. (20 years old, un-normalized data) Personnel Data (database) Typical System Evolution © Copyright 2022 by Peter Aiken Slide # 7 https://anythingawesome.com Multiple Sources of (for example) Customer Data Payroll Data (database) R & D Data (raw) Mfg. Data (home grown database) Finance Data (indexed) Marketing Data (external database) Personnel Data (database) ... Then Integrate © Copyright 2022 by Peter Aiken Slide # 8 https://anythingawesome.com Organizational Data
  • 5. Payroll Data (database) R & D Data (raw) Mfg. Data (home grown database) Finance Data (indexed) Marketing Data (external database) Personnel Data (database) ... Then Re-architect © Copyright 2022 by Peter Aiken Slide # 9 https://anythingawesome.com Organizational Data Linked Data © Copyright 2022 by Peter Aiken Slide # 10 https://anythingawesome.com Linked Data is about using the Web to connect related data that wasn't previously linked, or using the Web to lower the barriers to linking data currently linked using other methods. More specifically, Wikipedia defines Linked Data as "a term used to describe a recommended best practice for exposing, sharing, and connecting pieces of data, information, and knowledge on the Semantic Web using URIs and RDF." linkeddata.org
  • 6. © Copyright 2022 by Peter Aiken Slide # 11 https://anythingawesome.com • EII Enterprise Information Integration – between ETL and EAI - delivers tailored views of information to users at the time that it is required MetaMatrix Integration Example AWS Redshift © Copyright 2022 by Peter Aiken Slide # 12 https://anythingawesome.com
  • 7. Cloud Definition • Location-independent computing, whereby shared servers provide resources, software, and data to computers and other devices on demand, as with the electricity grid. • Cloud computing is a natural evolution of the widespread adoption of virtualization, service-oriented architecture and utility computing. • Details are abstracted from consumers, who no longer have need for expertise in, or control over, the technology infrastructure "in the cloud" that supports them. • Location-independent computing, whereby shared servers provide resources, software, and data to computers and other devices on demand, as with the electricity grid. • Cloud computing is a natural evolution of the widespread adoption of virtualization, service-oriented architecture and utility computing. • Details are abstracted from consumers, who no longer have need for expertise in, or control over, the technology infrastructure "in the cloud" that supports them. © Copyright 2022 by Peter Aiken Slide # 13 https://anythingawesome.com Cloud Computing Cloud Scalability © Copyright 2022 by Peter Aiken Slide # 14 https://anythingawesome.com
  • 8. Cisco's Ladder to the Cloud © Copyright 2022 by Peter Aiken Slide # 15 https://anythingawesome.com Cloud Options © Copyright 2022 by Peter Aiken Slide # 16 https://anythingawesome.com
  • 9. Data in the cloud should have three attributes that data outside the cloud/warehouse need not have. It should be: • More shareable • Cleaner • Smaller Transform Problems with forklifting 1. No basis for decisions made 2. No inclusion of architecture/engineering concepts 3. No idea that these concepts are missing from the process 4. 80% of organizational data is ROT (redundant, obsolete, or trivial) Making Warehousing Successful (Cloud): less, cleaner, more shareable data
  • 10. Fixing Data in the Cloud is Like Using A Glovebox
  • 11. Two Basic Warehouse Purposes • Integration – Of disparate data sources for the purpose of potential subsequent analyses – Most organizational data is never analyzed – Same type of inputs as outputs – Downstream knowledge is incorporated upstream • Preparation – Seen as the last mile (preparation) of the data before presentation as part of decision-making activities – Closed-ended activities – Final possible application of programmatic quality measures Integration Choose One Only!
  • 12. How many interfaces are required to solve this integration problem? [Diagram: Applications 1 through 6, each connected to every other] 15 Interfaces: (N*(N-1))/2 RBC: 200 applications - 4900 batch interfaces The rapidly increasing cost of complexity [Chart: worst case number of interconnections (0 to 20,000) versus number of silos (1 to 201)] Silos / Interconnections • 6 / 15 • 60 / 1,770 • 600 / 179,700 • 200 / 19,900 • 200 / 5,000 (actual)
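The worst-case figures on this slide follow directly from (N*(N-1))/2 and can be checked with a few lines of Python. This is a minimal sketch: the function names are illustrative, and the hub comparison rests on an assumption the slides do not state, namely that each application needs exactly one interface to a shared integration point.

```python
def point_to_point_interfaces(n_silos: int) -> int:
    """Worst-case pairwise interfaces among n_silos applications: N*(N-1)/2."""
    if n_silos < 0:
        raise ValueError("n_silos must be non-negative")
    return n_silos * (n_silos - 1) // 2


def hub_interfaces(n_silos: int) -> int:
    """Assumption (not from the slides): one interface per application to a shared hub."""
    if n_silos < 0:
        raise ValueError("n_silos must be non-negative")
    return n_silos


if __name__ == "__main__":
    # Silo counts taken from the slide's "Silos / Interconnections" list.
    for n in (6, 60, 200, 600):
        print(f"{n} silos: {point_to_point_interfaces(n):,} point-to-point, "
              f"{hub_interfaces(n):,} via a hub")
```

The pairwise count grows quadratically while the hub count grows linearly, which is one way to read the slide's "rapidly increasing cost of complexity".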
  • 13. © Copyright 2022 by Peter Aiken Slide # 25 https://anythingawesome.com from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International Corporate Information Factory Architecture Warehousing applied to a specific challenge © Copyright 2022 by Peter Aiken Slide # 26 https://anythingawesome.com http://www.infosys.com/industries/healthcare/industryofferings/Pages/healthcare-data-warehousing.aspx
  • 14. Inmon Implementation/3NF [Diagram labels: Operational System, Operational System, Flat Files; Summary Data, Raw Data, Metadata; Purchasing, Sales, Inventory; Analysis, Reporting, Mining] "A subject oriented, integrated, time variant, and non-volatile collection of summary and detailed historical data used to support the strategic decision-making processes of the organization." Third Normal Form • Each attribute in the relationship is a fact about a key – Highly normalized structure • Use Cases – Transactional Systems – Operational Data Stores
  • 15. Third Normal Form: Pros and Cons © Copyright 2022 by Peter Aiken Slide # 29 https://anythingawesome.com Neo4j.com • Pros – Easily understood by business and end users – Reduced data redundancy – Enforced referential integrity – Indexed attributes/flexible querying • Cons – Joins can be expensive – Does not scale Engineering Architecture Engineering/Architecting Relationship • Architecting is used to create and build systems too complex to be treated by engineering analysis alone – Require technical details as the exception • Engineers develop the technical designs – Engineering/Crafts-persons deliver components supervised by: • Manufacturer • Building Contractor © Copyright 2022 by Peter Aiken Slide # 30 https://anythingawesome.com
  • 16. © Copyright 2022 by Peter Aiken Slide # 31 https://anythingawesome.com • Solution design is not based on semantic understanding • Focus is often critical due to speed issues • AI/ML focus on learning how the system performs and how to improve it Emphasize Engineering Talent Target Isn't Just Predicting Pregnancies © Copyright 2022 by Peter Aiken Slide # 32 https://anythingawesome.com http://rmportal.performedia.com/node/1373 and http://www.predictiveanalyticsworld.com/patimes/target-really-predict-teens-pregnancy-inside-story/ http://rmportal.performedia.com/rm/paw10/gallery_01#1373
  • 17. Where To Focus Limited Analytics Resources? © Copyright 2022 by Peter Aiken Slide # 33 https://anythingawesome.com Warehoused Data Marts Diffuse Applications Data Management Practices Duplicated but ETLed Data (quality & transformations applied) Learning/ Feedback Analytics Practices Mission • Advance change in America by ensuring equitable access to nutritious food for all in partnership with food banks, policymakers, supporters, and the communities we serve • Great understanding and skill executing rapid response to local needs • What else could be done with their data? • Rerouting busses = eliminating food deserts for thousands of residents © Copyright 2022 by Peter Aiken Slide # 34 https://anythingawesome.com
  • 18. Business Intelligence
  • 19. © Copyright 2022 by Peter Aiken Slide # 37 https://anythingawesome.com • Users can "drill" anywhere • Entire collection "cube" is accessible • Summaries to transaction-level detail Basics Sample questions … © Copyright 2022 by Peter Aiken Slide # 38 https://anythingawesome.com Cancer patient revenue across all facilities Revenue for diseases this year versus last year in the NE region Total costs and revenue at top 10 facilities • Emphasis on the "cube" • 'N' dimensions • Permits different users to "slice and dice" subsets of data • Viewing from different perspectives
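The "slice and dice" idea can be illustrated without any BI tooling. The sketch below is an assumption-laden toy: the fact rows, dimension names, and the `roll_up` helper are all hypothetical and only loosely mirror the sample questions on the slide (revenue by disease, region, and year).

```python
from collections import defaultdict

# Hypothetical fact rows: (facility, region, disease, year, revenue)
FACTS = [
    ("Facility A", "NE", "cancer", 2021, 120.0),
    ("Facility A", "NE", "cancer", 2022, 150.0),
    ("Facility B", "NE", "cardiac", 2022, 80.0),
    ("Facility C", "SW", "cancer", 2022, 60.0),
]
DIMENSIONS = ("facility", "region", "disease", "year")


def roll_up(facts, by, where=None):
    """Sum revenue grouped by the dimensions in `by`,
    after filtering ("slicing") on the dimension values in `where`."""
    where = where or {}
    totals = defaultdict(float)
    for row in facts:
        record = dict(zip(DIMENSIONS, row[:4]))
        if all(record[dim] == value for dim, value in where.items()):
            totals[tuple(record[dim] for dim in by)] += row[4]
    return dict(totals)


if __name__ == "__main__":
    # Cancer patient revenue across all facilities
    print(roll_up(FACTS, by=("disease",), where={"disease": "cancer"}))
    # Revenue this year versus last year in the NE region
    print(roll_up(FACTS, by=("year",), where={"region": "NE"}))
```

Each question differs only in which dimensions are grouped and which are filtered, which is the sense in which different users view the same cube from different perspectives.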
  • 20. Example: Set Analysis © Copyright 2022 by Peter Aiken Slide # 39 https://anythingawesome.com Portfolio Analysis • Bank accounts are of varying value and risk • Cube by – Social status – Geographical location – Net value, etc. • Strategy: balance return on the loan with risk of default • How to evaluate the portfolio as a whole? – Least risk loan may be to the very wealthy, but there are a very limited number – Many poor customers, but greater risk • Solution may combine types of analyses – When to lend? – Interest rate charged? © Copyright 2022 by Peter Aiken Slide # 40 https://anythingawesome.com
  • 21. Kimball Implementation/Dimensional © Copyright 2022 by Peter Aiken Slide # 41 https://anythingawesome.com "A copy of transaction data specifically structured for query and analysis." • Comprised of “fact tables” that contain quantitative data, and any number of adjoining “dimension” tables • Optimized for business reporting • Use Cases – OLAP (Online Analytic Processing) – BI Star Schema © Copyright 2022 by Peter Aiken Slide # 42 https://anythingawesome.com Wikipedia
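A star schema of the kind described above can be sketched with Python's built-in `sqlite3` module. The table names, columns, and rows here are hypothetical and chosen only for illustration: one fact table holds the quantitative data (revenue) and two dimension tables describe it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: descriptive context for the facts (hypothetical).
CREATE TABLE dim_facility (facility_id INTEGER PRIMARY KEY, name TEXT, region TEXT);
CREATE TABLE dim_date     (date_id INTEGER PRIMARY KEY, year INTEGER);

-- Fact table: quantitative data keyed by the dimensions.
CREATE TABLE fact_revenue (
    facility_id INTEGER REFERENCES dim_facility(facility_id),
    date_id     INTEGER REFERENCES dim_date(date_id),
    revenue     REAL
);

INSERT INTO dim_facility VALUES (1, 'Facility A', 'NE'), (2, 'Facility B', 'SW');
INSERT INTO dim_date     VALUES (1, 2021), (2, 2022);
INSERT INTO fact_revenue VALUES (1, 1, 100.0), (1, 2, 150.0), (2, 2, 70.0);
""")


def revenue_by_region_and_year(connection):
    """Join the fact table to its dimensions and aggregate revenue."""
    query = """
        SELECT f.region, d.year, SUM(r.revenue)
        FROM fact_revenue AS r
        JOIN dim_facility AS f ON f.facility_id = r.facility_id
        JOIN dim_date     AS d ON d.date_id = r.date_id
        GROUP BY f.region, d.year
        ORDER BY f.region, d.year
    """
    return connection.execute(query).fetchall()


if __name__ == "__main__":
    for row in revenue_by_region_and_year(conn):
        print(row)
```

Every reporting query has the same shape: the fact table joined to whichever dimensions the question needs. A question that needs a dimension not modeled here would require a schema change.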
  • 22. Star Schema Pros and Cons • Pros – Simple Design – Fast Queries – Most major DBMS are optimized for Star Schema Designs • Cons – Questions must be built into the design – Data marts are often centralized on one fact table © Copyright 2022 by Peter Aiken Slide # 43 https://anythingawesome.com © Copyright 2022 by Peter Aiken Slide # 44 https://anythingawesome.com • Solution design is based on semantic understanding • Broad category ranging from CRM to analytics • AI/ML focus on data exploration Emphasize Data Architecture Talent
  • 23. Supply/demand for data talent https://www.logianalytics.com/bi-trends/3-keys-understanding-data/ Growth of Data vs. Growth of Data Analysts • Stored data accumulating at 28% annual growth rate • Data analysts in workforce growing at 5.7% growth rate [Chart: Data Growth versus Data Analysis Capabilities Growth] Everyone wants to do better data analysis … [Chart: Data Preparation 80%, Data Analysis 20%] • Some data preparation is inevitable • What would a 'good' ratio be? • "Everyone knows"
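To see what the two growth rates imply together, the sketch below compounds them and reports how stored data per analyst changes over time. It assumes both rates (28% and 5.7%, from the slide) stay constant and compound annually, which is a simplification; the function names are illustrative.

```python
def data_per_analyst_ratio(years: int,
                           data_growth: float = 0.28,
                           analyst_growth: float = 0.057) -> float:
    """Relative change in stored data per analyst after `years`,
    assuming constant annual compound growth for both quantities."""
    return ((1 + data_growth) / (1 + analyst_growth)) ** years


def years_until_ratio_exceeds(target: float,
                              data_growth: float = 0.28,
                              analyst_growth: float = 0.057) -> int:
    """Smallest whole number of years after which data per analyst exceeds `target` times today's level."""
    if data_growth <= analyst_growth:
        raise ValueError("data must grow faster than analysts for the ratio to rise")
    years = 0
    while data_per_analyst_ratio(years, data_growth, analyst_growth) <= target:
        years += 1
    return years


if __name__ == "__main__":
    for y in range(0, 6):
        print(y, round(data_per_analyst_ratio(y), 2))
    print("Years until data per analyst doubles:", years_until_ratio_exceeds(2.0))
```

Under these assumptions the amount of stored data per analyst roughly doubles within four years.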
  • 24. © Copyright 2022 by Peter Aiken Slide # 47 https://anythingawesome.com 2020 American Airlines market value ~ $6b AAdvantage valued between $19.5-$31.5b 2020 United market value ~ $9b MileagePlus ~ $22b https://www.forbes.com/sites/advisor/2020/07/15/how-airlines-make-billions-from-monetizing-frequent-flyer-programs/?sh=66da87a614e9 Simple Business Case • "Knowledge workers" – The Landmarks of Tomorrow (1957) by Peter Drucker – Examples include programmers, physicians, pharmacists, architects, engineers, scientists, design thinkers, public accountants, lawyers, and academics, and any other white-collar workers, whose line of work requires one to "think for a living". – Think about inputs and make decisions based on the data and their processes © Copyright 2022 by Peter Aiken Slide # 48 https://anythingawesome.com
  • 25. When asked to incorporate data • Data appreciation isn't translating into employee adoption – 48% frequently make gut decisions – 66% for C-suite executives • Lack of data skills is limiting workplace productivity – 36% said they would find an alternative method to complete the task without using data – 14% would avoid the task entirely http://TheDataLiteracyProject.org [Chart: Incorporate Data 50%, Find Alt. Method 36%, Avoid Task Entirely 14%] Too many organizations have simply put data in the hands of employees and expected them to make a success of it [Charts: Incorporate Data 52%, Defer to Gut 48%; Incorporate Data 34%, Defer to Gut 66%] Data is not broadly or widely understood adapted from: http://www.dailymirror.lk/print/opinion/editorial-we-need-to-become-channels-of-peace/172-27164 Blind Persons and the Elephant It is like a fan! It is like a snake! It is like a wall! It is like a rope! It is like a tree! It is like a story! It is like a dashboard! It is like pipes! It is like a warehouse! It is like statistics!
  • 26. Data Literacy Levels • Level 5: Data professional (Acumen-DP) • Level 4: Data teacher (Acumen-DT) • Level 3: Knowledge worker (Acumen-KW) • Level 2: Adult data spreader (Proficient) • Level 1: Mobile data spreader (Literate) Everything here is required of data scientists and cyber professionals 1,000,000,000 Level 3–KW–Citizen Data Knowledge Areas • Elevator story • Data stewardship • Demonstrating value • Currency • Fiduciary responsibilities • Shared fate – https://www.techtarget.com/searchenterpriseai/definition/data-scientist
  • 27. Data Things Happen Organizational Things Happen More work required to incorporate greater focus What is Strategy? • Current use derived from military - a pattern in a stream of decisions [Henry Mintzberg] A thing
  • 28. Every Day Low Price Former Walmart Business Strategy © Copyright 2022 by Peter Aiken Slide # 55 https://anythingawesome.com Wayne Gretzky’s Strategy © Copyright 2022 by Peter Aiken Slide # He skates to where he thinks the puck will be ... 56 https://anythingawesome.com
  • 29. Strategy Example 3 Good Guys (Us) Bad Guys (Them)
  • 30. Strategy Guides Workgroup Activities A pattern in a stream of decisions
  • 31. Your Data Strategy • Highest level data guidance available ... • Focusing data activities on business-goal achievement ... • Providing guidance when faced with a stream of decisions or uncertainties • Data strategy most usefully articulates how data can be best used to support organizational strategy • This usually involves a balance of remediation and proactive measures
  • 32. Reframing the question From: How shall we build this data warehouse? • (or worse) … What should go into this warehouse? To: How can warehousing capabilities solve this business challenge? • (better still) … How can warehousing capabilities solve this class of business challenges? Other examples • Are you ready for warehousing? – Foundational practices – Project deliverables • Will you get it right the first time? – Is the business environment constantly evolving? – Do you have an agreed-upon enterprise-wide vocabulary? • Is your data warehouse intended to be the enterprise auditable system of record? – Extract, transform and load requirements – Data transformation requirements – How fast do you need results? – Performance of inserts vs reads Analytics
  • 33. Business Analytics, or simply analytics: • is the use of data, information technology, statistical analysis, quantitative methods, and mathematical or computer-based models to help managers gain improved insight about their business operations and make better, fact-based decisions. • is “a process of transforming data into actions through analysis and insights in the context of organizational decision making and problem solving.” • is supported by various tools such as Microsoft Excel and various Excel add-ins, commercial statistical software packages such as SAS or Minitab, and more complete business intelligence suites that integrate data with analytical software Business Analytics: Methods, Models, and Decisions, Second Edition by James R. Evans, page 4.
  • 34. Analytics • is the use of data, hardware/software, statistical analysis, quantitative methods, and mathematical or computer-based models to help managers gain improved insight about their business operations and make better, fact-based decisions. • is “a process of transforming data into actions through analysis and insights in the context of organizational decision making and problem solving.” • supported by various tools such as Microsoft Excel and various Excel add-ins, commercial statistical software package such as SAS or Minitab and more complete business intelligence suites that integrate data with analytical software Analytics • is the use of data, hardware/software, quantitative methods, and mathematical or computer-based models to help managers gain improved insight about their business operations and make better, fact-based decisions. • is “a process of transforming data into actions through analysis and insights in the context of organizational decision making and problem solving.” • supported by various tools such as Microsoft Excel and various Excel add-ins, commercial statistical software package such as SAS or Minitab and more complete business intelligence suites that integrate data with analytical software Business Analytics: Methods, Models, and Decisions, Second Edition by James R. Evans, page 4.
  • 35. Analytics • is the use of data, hardware/software, quantitative methods, and models to help managers gain improved insight about their business operations and make better, fact-based decisions. • is “a process of transforming data into actions through analysis and insights in the context of organizational decision making and problem solving.” • supported by various tools such as Microsoft Excel and various Excel add-ins, commercial statistical software package such as SAS or Minitab and more complete business intelligence suites that integrate data with analytical software Analytics • is the use of data, hardware/software, quantitative methods, and models to help managers make better decisions. • is “a process of transforming data into actions through analysis and insights in the context of organizational decision making and problem solving.” • supported by various tools such as Microsoft Excel and various Excel add-ins, commercial statistical software package such as SAS or Minitab and more complete business intelligence suites that integrate data with analytical software Business Analytics: Methods, Models, and Decisions, Second Edition by James R. Evans, page 4.
  • 36. Analytics • is the use of data, hardware/software, quantitative methods, and models to help managers make better decisions. • the process of transforming data into actions through analysis to solve problems • supported by various tools such as Microsoft Excel and various Excel add-ins, commercial statistical software package such as SAS or Minitab and more complete business intelligence suites that integrate data with analytical software © Copyright 2022 by Peter Aiken Slide # 71 https://anythingawesome.com Business Analytics: Methods, Models, and Decisions, Second Edition by James R. Evans, page 4. Analytics (31 words almost ⅓ of original) • is the use of data, hardware/software, quantitative methods, and models to help managers make better decisions- the process of transforming data into actions through analysis to solve problems using tools © Copyright 2022 by Peter Aiken Slide # 72 https://anythingawesome.com Data Action Analysis Business context is missing
  • 37. © Copyright 2022 by Peter Aiken Slide # 73 https://anythingawesome.com Data Analysis Understanding/Data Architectures: here, whether you like it or not © Copyright 2022 by Peter Aiken Slide # 74 https://anythingawesome.com deviantart.com • All organizations have architectures – Business – Process – Systems – Security – Technical – Data/Information • Some are better understood and documented (and therefore more useful to the organization) than others • A specific definition – 'Understanding an architecture' – Documented and articulated as a (digital) blueprint illustrating the commonalities and interconnections among the architectural components – Ideally the understanding is shared by systems and humans
  • 38. Data Vault Implementation © Copyright 2022 by Peter Aiken Slide # 75 https://anythingawesome.com This is always the best starting architecture- especially if the ultimate outcome is uncertain Data Vault © Copyright 2022 by Peter Aiken Slide # 76 https://anythingawesome.com Bukhantsov.org • Designed to facilitate long-term historical storage, focusing on ease of implementation • Retains data lineage information (source/date) • “All the data, all the time” - hybrid approach of Inmon and Kimball. • Comprised of - Hubs (which contain a list of business keys that do not change) - Links (Associations/transactions between hubs) - Satellites (descriptive attributes associated with hubs and links)
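The hub, link, and satellite roles listed above can be sketched as tables using Python's built-in `sqlite3` module. Everything named here (tables, columns, keys, rows) is hypothetical, and the sketch simplifies real Data Vault practice by using plain business keys rather than hash keys. Each row carries a load date and record source, reflecting the lineage point on the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hubs: lists of business keys that do not change.
CREATE TABLE hub_customer (
    customer_key  TEXT PRIMARY KEY,
    load_date     TEXT NOT NULL,
    record_source TEXT NOT NULL
);
CREATE TABLE hub_account (
    account_key   TEXT PRIMARY KEY,
    load_date     TEXT NOT NULL,
    record_source TEXT NOT NULL
);

-- Link: an association between hubs.
CREATE TABLE link_customer_account (
    customer_key  TEXT NOT NULL REFERENCES hub_customer(customer_key),
    account_key   TEXT NOT NULL REFERENCES hub_account(account_key),
    load_date     TEXT NOT NULL,
    record_source TEXT NOT NULL,
    PRIMARY KEY (customer_key, account_key)
);

-- Satellite: descriptive attributes, one row per load (history is kept).
CREATE TABLE sat_customer_details (
    customer_key  TEXT NOT NULL REFERENCES hub_customer(customer_key),
    load_date     TEXT NOT NULL,
    record_source TEXT NOT NULL,
    city          TEXT,
    PRIMARY KEY (customer_key, load_date)
);

INSERT INTO hub_customer VALUES ('C-001', '2022-01-01', 'crm');
INSERT INTO hub_account  VALUES ('A-100', '2022-01-01', 'core_banking');
INSERT INTO link_customer_account VALUES ('C-001', 'A-100', '2022-01-01', 'core_banking');
INSERT INTO sat_customer_details VALUES ('C-001', '2022-01-01', 'crm', 'Richmond');
INSERT INTO sat_customer_details VALUES ('C-001', '2022-06-01', 'crm', 'Norfolk');
""")


def customer_history(connection, customer_key):
    """Return every recorded version of a customer's attributes, oldest first."""
    return connection.execute(
        "SELECT load_date, record_source, city FROM sat_customer_details "
        "WHERE customer_key = ? ORDER BY load_date",
        (customer_key,),
    ).fetchall()


if __name__ == "__main__":
    for version in customer_history(conn, "C-001"):
        print(version)
```

A changed attribute is inserted as a new satellite row rather than updated in place, so the earlier value and its source remain queryable.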
  • 39. Data Vault Pros and Cons © Copyright 2022 by Peter Aiken Slide # 77 https://anythingawesome.com • Pros - Simple integration - Houses immense amounts of data with excellent performance - Full data lineage captured • Cons - Complication is pushed to the “back end” - Can be difficult to setup for many data workers - No widespread support for ETL tools yet Initial Design Starting Point Comparison © Copyright 2022 by Peter Aiken Slide # 78 https://anythingawesome.com
  • 40. Source:http://dmreview.com/article_sub.cfm?articleID=1000941 Meta Data Models © Copyright 2022 by Peter Aiken Slide # 79 https://anythingawesome.com Marco & Jennings's Metadata Model © Copyright 2022 by Peter Aiken Slide # 80 https://anythingawesome.com
  • 41. ETL Metadata Data Model https://anythingawesome.com/reverseengineeringofdata.html • SCREEN ELEMENT (screen element id #, data item id #, screen element descr.) • INTERFACE ELEMENT (interface element id #, data item id #, interface element descr.) • INPUT ELEMENT (input element id #, data item id #, input element descr.) • OUTPUT ELEMENT (output element id #, data item id #, output element descr.) • MODEL VIEW (model view element id #, data item id #, model view element des.) • DEPENDENCY (dependency elem id #, data item id #, process id #, dependency description) • CODE (code id #, data item id #, stored data item #, code location) • INFORMATION (information id #, data item id #, information descr., information request) • PROCESS (process id #, data item id #, process description) • USER TYPE (user type id #, data item id #, information id #, user type description) • LOCATION (location id #, information id #, printout element id #, process id #, stored data items id #, user type id #, location description) • PRINTOUT ELEMENT (printout element id #, data item id #, printout element descr.) • STORED DATA ITEM (stored data item id #, data item id #, location id #, stored data description) • DATA ITEM (data item id #, data item description) Overview of CWM Metamodel http://www.omg.org/technology/documents/modeling_spec_catalog.htm • Warehouse Management: Warehouse Process, Warehouse Operation • Analysis: Transformation, OLAP, Data Mining, Information Visualization, Business Nomenclature • Resources: Object-Oriented (ObjectModel), Relational, Record-Oriented, Multi Dimensional, XML • Foundation: Business Information, Data Types, Expressions, Keys Index, Software Deployment, Type Mapping • ObjectModel (Core, Behavioral, Relationships, Instance)
  • 42. Focus on Knowledge Worker Productivity 1. Remove 1 click from some repetitive process 2. Tally the number of clicks 3. Accurately describe the impact of the potential improvement Theory of Constraints (TOC) https://en.wikipedia.org/wiki/Theory_of_constraints • A management paradigm that views any manageable system as being limited in achieving more of its goals by a small number of constraints (Eliyahu M. Goldratt) • There is always at least one constraint, and TOC uses a focusing process to identify the constraint and restructure the rest of the organization to address it • TOC adopts the common idiom "a chain is no stronger than its weakest link": processes, organizations, etc., are vulnerable because the weakest component can damage or break them or at least adversely affect the outcome
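The three productivity steps on this slide (remove a click, tally the clicks, describe the impact) amount to simple arithmetic. The sketch below is one hypothetical way to express it; the parameter names, the 250-workday year, and the example figures are all assumptions, not numbers from the presentation.

```python
def annual_hours_saved(clicks_removed: int,
                       seconds_per_click: float,
                       runs_per_day: int,
                       workers: int,
                       workdays_per_year: int = 250) -> float:
    """Estimate hours saved per year by removing clicks from a repetitive process.

    All inputs are assumptions supplied by the analyst who tallied the clicks.
    """
    seconds = (clicks_removed * seconds_per_click * runs_per_day
               * workers * workdays_per_year)
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical tally: 1 click removed, 2 seconds each,
    # 100 repetitions per day, 500 knowledge workers.
    hours = annual_hours_saved(1, 2.0, 100, 500)
    print(f"Estimated saving: {hours:,.0f} hours per year")
```

Multiplying the result by a loaded hourly cost would turn it into the dollar impact that step 3 asks for.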
  • 43. The PDCA Cycle © Copyright 2022 by Peter Aiken Slide # 85 https://anythingawesome.com • Deming cycle • "Plan-do-study-act" or "plan-do-check-act" – Identifying data issues that are critical to the achievement of business objectives – Defining business requirements for data – Identifying key data items – Defining business rules critical to ensuring correct data https://deming.org/explore/pdsa/ How to Tell Your Story with Data © Copyright 2022 by Peter Aiken Slide # 86 https://anythingawesome.com https://visage.co/how-to-tell-a-story-with-data/ 1. Find an irresistible story 2. Remember your audience 3. Research to find data 4. Vet your data sources 5. Filter your findings 6. Decide on a data visualization 7. Craft the story 8. Gather feedback
  • 44. Data Mapping [Diagram: data objects (Soldier, Mental illness, Deployments, Work History, Legal Issues, Abuse) mapped from sources (FAP, DMSS, G1, DMDC, CID, MDR) for Suicide Analysis] • Data objects complete? • All sources identified? • Best source for each object? • How reconcile differences between sources?
  • 46. Senior Army Official • Room full of Stewards • A very heavy dose of management support • Advised the group of his opinion on the matter • Any questions as to future direction – "They should make an appointment to speak directly with me!" • Empower the team – The conversation turned from "can this be done?" to "how are we going to accomplish this?" – Mistakes along the way would be tolerated – Implement a workable solution in prototype form © Copyright 2022 by Peter Aiken Slide # 91 https://anythingawesome.com © Copyright 2022 by Peter Aiken Slide # 92 https://anythingawesome.com 6 Jan
  • 47. Parler Users at the Capitol (Photo metadata) © Copyright 2022 by Peter Aiken Slide # 93 https://anythingawesome.com https://gizmodo.com/parler-users-breached-deep-inside-u-s-capitol-building-1846042905?rev=1610480731991 Parler Users at the Capitol (Multimedia) © Copyright 2022 by Peter Aiken Slide # 94 https://anythingawesome.com https://thepatr10t.github.io/yall-Qaeda/
  • 48. Videos Timeline © Copyright 2022 by Peter Aiken Slide # 95 https://anythingawesome.com https://projects.propublica.org/parler-capitol-videos/ Illegally Obtained Data © Copyright 2022 by Peter Aiken Slide # 96 https://anythingawesome.com https://www.nytimes.com/2021/02/05/opinion/capitol-attack-cellphone-data.html https://www.washingtonpost.com/politics/2021/01/12/cybersecurity-202-parler-scrape-puts-some-capitol-rioters-legal-jeopardy/
  • 49. • As of Oct 2022 – 880 charged – 272 assaulting or impeding officers – 95 charged with using a deadly or dangerous weapon – 294 felony obstruction of an official proceeding – 100 have pleaded guilty to felonies – 313 have pleaded guilty to misdemeanors – 21 found guilty at trial – 360 not yet found https://abcnews.go.com/Politics/jan-prosecutions-numbers-21-months-880-charged/story?id=91391596 (16) Causes of Data Warehousing Failure • The project is over budget • Slipped schedule • Unimplemented functions and capabilities • Unhappy users • Unacceptable performance • Poor availability • Inability to expand • Poor quality data/reports • Too complicated for users • Project not cost justified • Poor quality data • Many more values of gender code than (M/F) • Incorrectly structured data • Provides correct answer to wrong question • Bad warehouse design • Overly complex from The Data Administration Newsletter, www.tdan.com and The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
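One failure cause in the list above, "many more values of gender code than (M/F)", is the kind of problem a simple profiling pass can surface before data is loaded. The sketch below is illustrative only: the function name, the allowed set, and the sample values are assumptions.

```python
from collections import Counter


def unexpected_codes(values, allowed):
    """Count every value that falls outside the documented code set."""
    counts = Counter(values)
    return {code: n for code, n in counts.items() if code not in allowed}


if __name__ == "__main__":
    # Hypothetical column extract; the documented domain is {"M", "F"}.
    column = ["M", "F", "m", "U", "", "F", "U"]
    print(unexpected_codes(column, allowed={"M", "F"}))
```

Whether values such as a lowercase "m" or an empty string are errors or undocumented business meaning is a question for the data steward. The profile only shows that the documented domain and the actual data disagree.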
  • 50. Warehouse Requirements are Largely Use Case Driven Use Case • A usage scenario for a piece of software; often used in the plural to suggest situations where a piece of software may be useful. • A potential scenario in which a system receives an external request (such as user input) and responds to it. https://en.wikipedia.org/wiki/Use_case • Difficult to holistically evaluate without integrated glossary • Unable to capture non-functional requirements • The plan for implementing non-functional requirements is detailed in the system architecture, because they are usually architecturally significant requirements. • The average data warehouse is rebuilt 7 times before it is considered useful Trusted Catalog/Glossary/Dictionary/Encyclopedia Upcoming Events • A 12-Step Program for Improving Organizational Data Literacy, 13 December 2022 (08:30 AM - 11:45 AM) • Data Management Best Practices, 13 December 2022 • Data Strategy Best Practices, 10 January 2023 • Data Modeling Fundamentals, 14 February 2023 Time: 19:00 UTC (2:00 PM NYC) | Presented by: Peter Aiken, PhD
  • 51. Event Pricing © Copyright 2022 by Peter Aiken Slide # 101 https://anythingawesome.com • 20% off directly from the publisher on select titles • My Book Store @ http://anythingawesome.com • Enter the code "anythingawesome" at the Technics bookstore checkout where it says to "Apply Coupon" anythingawesome Peter.Aiken@AnythingAwesome.com +1.804.382.5957 Thank You! © Copyright 2022 by Peter Aiken Slide # 102 Book a call with Peter to discuss anything - https://anythingawesome.com/OfficeHours.html Critical Design Review? Hiring Assistance? Reverse Engineering Expertise? Executive Data Literacy Training? Mentoring? Tool/automation evaluation? Use your data more strategically? Independent Verification & Validation Collaboration?