Agile Data Warehouse Design with Big Data

John DiPietro & Jim Stagnitto
!1
Agenda
• Introduction / a2c Overview
• Modeling for End Users
• Role of Dimensional Models in Big Data
• Example: eCommerce
  • Structured Data: Sales
  • Semi-structured Data: Clickstream
• Agile Dimensional Modeling Overview
• Case Study Review
• Q&A

!2
Introduction
• a2c
  • Boutique EDM (Enterprise Data Management) consultancy firm:
    • Data Warehousing
    • Master Data Management
    • Closed Loop Analytics and Visualization
    • Data & Application Architecture
• John DiPietro
  • Principal, Chief Technology Officer
• Jim Stagnitto
  • Data Warehouse & MDM Architect

!3
a2c Corporate Overview
& Industry Experience

!4
Company Overview
• Technology Solution Consultancy headquartered in Philadelphia with regional offices in New York and Boston
• Servicing Healthcare, Life Science, Telecom and Financial Services industries, with a recently obtained GSA schedule to pursue Federal Government opportunities
• Consultant base of over 2500 proven IT professionals throughout the Northeast region, with a recruiting network that provides national coverage
• Flexible approach to helping our clients with their initiatives
  • Project-based Solutions
  • Staff Augmentation
  • Managed Service Offerings – “On-Shore QA, Development & Application Support”
  • Executive & Professional Search

!5
Competitive Advantage
• Founders of a2c were part of the fastest growing privately held IT consulting and staff augmentation firm in the US from 1994-2002. Our Executive Management Team has over 100 years of collective experience and has been responsible for delivering over a half-billion dollars of IT consulting and staff augmentation revenue from 1994 through to the present day.
• a2c’s Recruiting Engine and Methodology is one of the best in the industry, capable of producing quality results on demand for our clients
  • Resource Managers continually “Silo” disciplines with available candidates who have proven their abilities with us over the last 10 years
  • Our solutions organization is instrumentally involved during the screening and selection process to ensure that candidates submitted to our clients are an ideal match
• a2c’s Culture provides an ability to attract and retain the best talent in the industry and fosters creativity, integrity, growth and teamwork
• a2c provides our clients with an alternative solution to a “Big 4” consultancy at substantial savings for projects that are between $500K and $5M, due to our flexibility, agility and focus

!6
Representative Clients

03/19/12

!7
a2c Solution Engagement Structures
• Technology Strategy & Roadmap Formulation
• Needs & Readiness Assessment
• Package & Platform Selections
• Proof of Concept Implementation
• Requirements Discovery & Specifications
• Program/Project Management
• Full Life Cycle & Application Development
• Infrastructure & Facilities Initiatives
• Managed Services & Maintenance Support

!8
a2c Solutions Capabilities
• Enterprise Data Management Practice helps clients manage their complete Information Lifecycle, from their On-line Transactional systems to their Data Warehousing, Enterprise Reporting, Data Migration, Back-Up and Recovery Strategies (See Slide 7)
• Business Architecture & Optimization Practice utilizes “Six Sigma Lean” methodologies to analyze, re-engineer and automate our clients’ business processes, leveraging human workflow and business rules engine technologies to create efficiencies and provide business unit owners with the necessary metrics to continually improve performance
• Program Management Office oversees all aspects of solutions planning and delivery across client engagement teams and provides the methodology and frameworks, which are based on PMI® industry standards
• Application Development & Managed Services Practice helps clients architect, implement and deploy the latest Microsoft and Enterprise Java based applications, which are built on proven frameworks and architectures for the enterprise
• a2c's SDLC Delivery Model comprises over 20 years of collective best practices and industry-proven methodologies that allow our delivery teams to rapidly design, develop and implement solutions. Our SDLC model has been designed to complement our project management methodology, utilizing iterative development cycles that enable project teams to provide consistently high quality, on-time deliverables, regardless of technology platform
!9
Agile DW Design
Overview

!10
Modeling for End Users
• How to Design to Answer Business Questions?
  • Think about how questions are articulated
  • And how the answers should be delivered
• Identify a common question framework
• Design an architecture that embraces and leverages this common question framework
• Utilize the best designs and technologies to:
  • (a) derive the answers
  • (b) present them in compelling ways that lead to the next interesting question!
!11
How Do We Ask Questions?
Who

What

When

“How do this quarter’s sales by sales rep of
electronic products that we promoted to retail
customers in the east compare with last year’s?”

What

Who

Where

Why

!12

When
How Do We Ask Questions?
• Events / Transactions
  • e.g. Sale
  • an immutable "fact" that occurs at a time and (typically) a place
• Interrogatives:
  • Who, What, When, Where, Why
  • Descriptive context that fully describes the event
  • a set of "dimensions" that describe events
!13
Dimensional Value Proposition
•

It makes sense to present answers to people using the same
taxonomy of events and interrogatives (aka: facts and dimensions
- dimensional structure) that they use when forming questions

•

Events are instances of processes

•

It’s best to present information to people who will ask the system
questions in dimensional form

•

This is true regardless of the type of information being
interrogated, its source, or IT details (like the database technologies
utilized)

•

It’s best to model this presentation layer based on the events (aka:
business processes) that underlie the questions

!14
[Diagram: the 7Ws: Who, What, When, Where, Why, How, How Many]

!15
Scenarios
•

A brief discussion of how and where
dimensional modeling and/or
databases fit within common and
emerging “big data” data
warehousing architectures

!16
Kimball Dimensional DW
Dimensional BI Semantic Layer
Dimensional Data Warehouse
Data Movement / Integration
Source Data
(Structured)

!17
Kimball with Big Data
Dimensional BI Semantic Layer
Dimensional Data Warehouse

Big Data
Capture

Big Data
Discovery

(e.g. HDFS)

(e.g. MR)

Data Movement / Integration Tier

Data Movement / Integration Tier

Source Data Tier

Source Data Tier

(Un/Semi-Structured)

(Structured)

!18
Corporate Information Factory (CIF)
Dimensional BI Semantic Layer
Dimensional Tier
(Virtual or Physical)

Corporate Information Factory 3NF DW

Data Movement / Integration
Source Data
(Structured)

!19
CIF with Big Data
Dimensional BI Semantic Layer
Dimensional Tier
(Virtual or Physical)

Big Data
Capture

Big Data
Discovery

(e.g. HDFS)

(e.g. MR)

Corporate Information
Factory 3NF DW

Data Movement / Integration Tier

Data Movement / Integration Tier

Source Data Tier

Source Data Tier

(Un/Semi-Structured)

(Structured)

!20
Data Vault
Dimensional BI Semantic Layer
Dimensional Tier
(Virtual or Physical)

Data Vault

Data Movement / Integration
Source Data
(Structured)

!21
Data Vault with Big Data
Dimensional BI Semantic Layer
Dimensional Tier
(Virtual or Physical)

Big Data
Capture

Big Data
Discovery

(e.g. HDFS)

(e.g. MR)

Data Vault

Data Movement / Integration Tier

Data Movement / Integration Tier

Source Data Tier

Source Data Tier

(Un/Semi-Structured)

(Structured)

!22
Etc.

!23
Common Framework
Dimensional BI Semantic Layer
Dimensional Tier
[Physical (Kimball) or Virtual (CIF or Data Vault)]
Persistent Un/
Semi-Structured
Staging Area

Unstructured ->
Structured
Data Discovery
Processing

Persistent Structured Data
Repository
(not needed for Kimball)

Un/Semi-Structured Data
Movement

Structured Data Movement

Un/Semi-Structured Source Data

Structured Source Data
(Structured)
!24

Insight
Generation /
Data Mining
Common Framework
Dining Room
Readily Accessible to End Users
(and BI Developers)
Safe, Hospitable Environment
Data Assets “Ready for Primetime”
Dimensionally Structured

Dimensional BI Semantic Layer
Dimensional Tier
[Physical (Kimball) or Virtual (CIF or Data Vault)]

Persistent Un/
Semi-Structured
Staging Area

Unstructured ->
Structured Data
Discovery
Processing

Persistent Structured Data
Repository

Kitchen

(not needed for Kimball)

Un/Semi-Structured Data Movement

Structured Data Movement

Un/Semi-Structured Source Data

Structured Source Data
(Structured)

Clickstream Data

Off Limits to End Users
Data Professionals Only Please
Dangerous / Inhospitable Environment
Data Assets “Not Ready for Primetime”
Structured Variably For Data Processing

eCommerce Sale

eCommerce Example

!25
eCommerce Example: Clickstream
Semi-Structured
Recording of every page request made by a user
Includes some structural elements, such as when the request was made and who the user is
Requires significant prep work in order to fit into a traditional row-based relational database
Apples and Oranges: Pre-Sessionized Page Visits, Detailed Product Views, Catalogue Requests, Shopping Cart Adds / Deletes / Abandons, etc.
Needs to be converted into separate-but-relatable dimensional facts, with many shared (conformed) dimensions
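The "prep work" mentioned above can be illustrated with a minimal sessionization sketch. It is not taken from the deck: the `PageRequest` and `PageVisitFact` types, the `sessionize` helper, the session-id format, and the 30-minute inactivity timeout are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

# Assumption: 30 minutes of inactivity ends a session (a common convention,
# not something the slides specify).
SESSION_TIMEOUT = timedelta(minutes=30)


@dataclass
class PageRequest:
    """One raw clickstream record (hypothetical shape)."""
    user_id: str
    url: str
    requested_at: datetime


@dataclass
class PageVisitFact:
    """One row of a page-visit fact, keyed by who / where / when."""
    user_id: str
    session_id: str                 # hypothetical format: "<user>-<n>"
    url: str
    view_start: datetime
    view_end: Optional[datetime]    # None for the last page of a session


def sessionize(requests: List[PageRequest]) -> List[PageVisitFact]:
    """Group each user's requests into sessions and emit page-visit facts."""
    facts: List[PageVisitFact] = []
    last_seen = {}  # user_id -> (session number, most recent fact)
    for req in sorted(requests, key=lambda r: (r.user_id, r.requested_at)):
        previous = last_seen.get(req.user_id)
        if previous is None:
            session_no = 1
        else:
            prev_no, prev_fact = previous
            if req.requested_at - prev_fact.view_start > SESSION_TIMEOUT:
                session_no = prev_no + 1          # gap too long: new session
            else:
                session_no = prev_no
                prev_fact.view_end = req.requested_at  # next click ends the view
        fact = PageVisitFact(req.user_id, f"{req.user_id}-{session_no}",
                             req.url, req.requested_at, None)
        facts.append(fact)
        last_seen[req.user_id] = (session_no, fact)
    return facts
```

Other clickstream facts (product views, shopping cart activity) could be derived the same way and would share the same user and time dimensions.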
!26

Raw Clickstream Data!
25 52 164 240 274 328 368 448 538 561 630 687 730 775 825
834
39 120 124 205 401 581 704 814 825 834
35 249 674 712 733 759 854 950
39 422 449 704 825 857 895 937 954 964
15 229 262 283 294 352 381 708 738 766 853 883 966 978
26 104 143 320 569 620 798
7 185 214 350 529 658 682 782 809 849 883 947 970 979
227 390
71 192 208 272 279 280 300 333 496 529 530 597 618 674 675
720 855 914 932
183 193 217 256 276 277 374 474 483 496 512 529 626 653 706
878 939
161 175 177 424 490 571 597 623 766 795 853 910 960
125 130 327 698 699 839
392 461 569 801 862
27 78 104 177 733 775 781 845 900 921 938
101 147 229 350 411 461 572 579 657 675 778 803 842 903
71 208 217 266 279 290 458 478 523 614 766 853 888 944 969
43 70 176 204 227 334 369 480 513 703 708 835 874 895
25 52 278 730
151 432 504 830 890
71 73 118 274 310 327 388 419 449 469 484 706 722 795 810
844 846 918
130 274 432 528 967
188 307 326 381 403 523 526 722 774 788 789 834 950 975
89 116 198 201 333 395 653 720 846
70 171 227 289 462 538 541 623 674 701 805 946 964
143 192 317 471 487 631 638 640 678 735 780 865 888 935
17 242 471 758 763 837 956
52 145 161 283 375 385 676 721 731 790 792 885
182 229 276 529
43 522 565 617 859
Typical Clickstream “Page View” Dimensional
Model
What

When

What

Who

Why

!27
eCommerce Example: Web Sales
• Fully Structured
• The Sale Transaction typically carries all fundamental dimensions:
  • Time
  • Customer
  • Product
  • Referring URL / Search Phrase
  • Promotion / Campaign
  • Purchase and/or Shipment (Geo or URL) Locations
  • Etc.
• And “How Many” Measures
  • Unit and Price Quantities / Amounts
  • Discount Amounts
  • Etc.

!28

eCommerce Dimensionality
Facts (below) &
Dimensions (right)
Page Visit
Detailed Product
View
Shopping Cart
Activity

Time!
(When)
View Start
View End
Session
Start
Session End
View Start
View End
Session
Start
Session End
Activity Start
Activity End

Customer! Web Page!
(Who)
(Where)

Visitor

Current

Previous
Next

Prospect

Current

Previous
Next

Product!
(What)

Referring
URL!
(Where)

Promotion
/
Campaign
(Why)

Activity
Type
(How)

✔

✔

✔

Prospect

✔

✔

✔

✔

✔

✔

✔

Sale (Checkout)

Sale Start
Sale End

Customer

✔

Shipment / Delivery

Shipment
Delivery

Customer
Delivery
Recipient

✔

!29
Agile DW Design
Overview

!30
The first dimensional modeler:

Rudyard Kipling
Ralph Kimball?
R.K.

!31
I keep six honest serving-men

(They taught me all I knew);

Their names are What and Why and When 

And How and Where and Who…
–Rudyard Kipling

!32

Who
!33
What
!34
When
!35
Where
!36
Why
!37
How
!38
How Many
!39
The 7Ws Framework
[Diagram: How Many, Why, Where, How, When, Who, What]
How did we get here?
DW Architectures: A Brief History
Corporate Information Factory: Data-Driven Analysis
Undisciplined Dimensional: Report-Driven Analysis
Dimensional Bus Architecture: Process-Driven Analysis
7Ws Dimensional Model
When: Time, Day, Month, Fiscal Period
Who: Customer, Employee, Third Party, Organization
Where: Location, Store, Ship To, Hospital, Geographic
What: Product, Service, Transactions
Why: Causal, Promotion, Reason, Weather, Competition
How – Facts: Much, Many, Often, £$€, ??
BEAM
Business Event Analysis & Modeling
[Diagram: the 7Ws: How Many, Who, What, Why, Where, How, When]
How do you design a data warehouse?
Tech Design Artifacts?
CALENDAR
Key: Date Key
Attributes: Date, Day, Day in Week, Day in Month, Day in Qtr, Day in Year, Month, Qtr, Year, Weekday Flag, Holiday Flag

PRODUCT
Key: Product Key
Attributes: Product Code, Product Description, Product Type, Brand, Subcategory, Category

STORE
Key: Store Key
Attributes: Store Code, Store Name, URL, Store Manager, Region, Country

PROMOTION
Key: Promotion Key
Attributes: Promotion Code, Promotion Name, Promotion Type, Discount Type, Ad Type

SALES FACT
Keys: Date Key, Product Key, Store Key, Promotion Key
Measures: Quantity Sold, Revenue, Cost, Basket Count
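As a rough illustration of how such a star schema is queried, the sketch below builds a cut-down version of it in an in-memory SQLite database and totals revenue by region and quarter. Only a subset of the slide's columns is included, the snake_case names are an adaptation, and every data row is invented for the example.

```python
import sqlite3

# Subset of the columns on the slide; names are snake_cased for SQL.
SCHEMA = """
CREATE TABLE calendar  (date_key INTEGER PRIMARY KEY, date TEXT, month TEXT, qtr TEXT, year INTEGER);
CREATE TABLE product   (product_key INTEGER PRIMARY KEY, product_code TEXT, brand TEXT, category TEXT);
CREATE TABLE store     (store_key INTEGER PRIMARY KEY, store_code TEXT, store_name TEXT, region TEXT, country TEXT);
CREATE TABLE promotion (promotion_key INTEGER PRIMARY KEY, promotion_code TEXT, promotion_type TEXT);
CREATE TABLE sales_fact (
    date_key      INTEGER REFERENCES calendar(date_key),
    product_key   INTEGER REFERENCES product(product_key),
    store_key     INTEGER REFERENCES store(store_key),
    promotion_key INTEGER REFERENCES promotion(promotion_key),
    quantity_sold INTEGER,
    revenue       REAL,
    cost          REAL
);
"""


def build_demo_warehouse() -> sqlite3.Connection:
    """Create the schema and load a few invented rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO calendar VALUES (?, ?, ?, ?, ?)", [
        (1, "2024-01-15", "Jan", "Q1", 2024),
        (2, "2024-04-10", "Apr", "Q2", 2024),
    ])
    conn.execute("INSERT INTO product VALUES (1, 'P-100', 'Acme', 'Electronics')")
    conn.executemany("INSERT INTO store VALUES (?, ?, ?, ?, ?)", [
        (1, "S-01", "Boston", "East", "US"),
        (2, "S-02", "Denver", "West", "US"),
    ])
    conn.execute("INSERT INTO promotion VALUES (1, 'PR-0', 'None')")
    conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?, ?, ?, ?, ?)", [
        (1, 1, 1, 1, 2, 200.0, 120.0),
        (1, 1, 2, 1, 1, 100.0, 60.0),
        (2, 1, 1, 1, 3, 300.0, 180.0),
    ])
    return conn


def revenue_by_region_and_quarter(conn: sqlite3.Connection):
    """Join the fact to two dimensions and sum an additive measure."""
    query = """
        SELECT s.region, c.qtr, SUM(f.revenue)
        FROM sales_fact AS f
        JOIN store    AS s ON s.store_key = f.store_key
        JOIN calendar AS c ON c.date_key  = f.date_key
        GROUP BY s.region, c.qtr
        ORDER BY s.region, c.qtr
    """
    return conn.execute(query).fetchall()
```

The query follows the shape of a dimensional question: the fact supplies the "how many", and each joined dimension supplies one of the interrogatives (here "where" and "when").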
OK, Now Validate with
Why Agile Data Warehousing?
Waterfall BI/DW
[Timeline diagram, This Year to Next Year. Phases: Analysis, Design, Development, Test, Release. Labels: Limited Stakeholder interaction; Stakeholder Input; Requirements; BDUF; Data Model; ETL; BI; DATA; VALUE?]
Agile DW/BI Development
[Timeline diagram, This Year to Next Year. Iteration 1, Iteration 2, Iteration 3, Iteration …, Iteration n. Labels: Stakeholder interaction; JEDUF; ADM; BI Prototyping; ETL; BI; Review; Release; DATA. Value labels run from VALUE? through VALUE to VALUE!]
State of The
DW Field
Solid:
Dimensional Data Warehouse Design is Mature
Proven Design Patterns Exist for Common
Requirements
Hit or Miss:
Collecting Unambiguous and Thorough
Requirements
Slotting Requirements into Proven Design
Patterns
End-User Ownership and Validation
Too Often: Snatching Defeat from the Jaws of
Victory
!52
Modelstorming
Quick

Inclusive

Data

Modeler

Interactive

BI Stakeholders

Fun
BEAM✲ Methodology
Structured, non-technical, collaborative working
conversation directly with BI Users

BEAM✲
• BI User’s Business Process, Organizational, Hierarchical, and Data Knowledge
• Focused Data Profiling

Data Modeler

BI Stakeholders

• Logical and Physical (Kimball-esque) Dimensional Data Models
• Example data
• Detailed and Testable ETL Specification
• Instantiated DW Prototype
Requirements =
Design
55
Collaboration at Every
Step
Agile Data Modeling Requirements
•

Techniques for encouraging interaction

•

Must use simple, inclusive notation and tools

•

Must be quick: hours rather than days – modelstorming

•

Balance ‘just in time’ (JIT) and ‘just enough design up
front’ (JEDUF) to reduce design rework

•

DW designers must embrace data model change, allow models
to evolve, avoid generic data models; need design patterns they
can trust to represent tomorrow’s BI requirements tomorrow

•

ETL and BI developers must embrace database change; need
tool support
!57
What kind of model?
CALENDAR
Key: Date Key
Attributes: Date, Day, Day in Week, Day in Month, Day in Qtr, Day in Year, Month, Qtr, Year, Weekday Flag, Holiday Flag

PRODUCT
Key: Product Key
Attributes: Product Code, Product Description, Product Type, Brand, Subcategory, Category

STORE
Key: Store Key
Attributes: Store Code, Store Name, URL, Store Manager, Region, Country

PROMOTION
Key: Promotion Key
Attributes: Promotion Code, Promotion Name, Promotion Type, Discount Type, Ad Type

SALES FACT
Keys: Date Key, Product Key, Store Key, Promotion Key
Measures: Quantity Sold, Revenue, Cost, Basket Count

Customer Type

Holiday Type

Month

Country
Calendar

Customer
Sales Fact

Store

Product

City

Category
Store Type

Product Type
Modeling by Abstraction
Modeling by Example
Agile DW Design
Process
64
Collaborative / Conversational Design

Who does what?
“Customers buy products”
BEAM✲
Modeler

Subjects Verb Objects

BI Users
Design Using Natural Language
•

Verbs – Events – Relationships – Fact Tables

•

Nouns – Details – Entities – Dimensions

•

Main Clause – Subject-Verb-Object

•

Prepositions – connect additional details to the
main clause

•

Interrogatives – The 7Ws – Dimension Types

•

Business Vocabulary - no IT-Speak
!66
“Spreadsheet”-like Models
Event Table Name (filled in later)

Subject Column Name
Verb
Object Column Name

Interrogative

Details
Example Data (4-6
rows)
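One way to picture such a spreadsheet-like event table is as a small data structure. This is only a sketch: the `EventDetail` and `EventTable` classes are hypothetical names invented here, not part of BEAM✲ or any tool, and the example row in the usage note is made up.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# The seven interrogatives used as dimension types.
SEVEN_WS = ("who", "what", "when", "where", "how many", "why", "how")


@dataclass
class EventDetail:
    """One column of the event table, named in business vocabulary."""
    name: str               # e.g. "CUSTOMER"
    w_type: str             # one of SEVEN_WS
    preposition: str = ""   # e.g. "on", "at"; links the detail to the main clause

    def __post_init__(self):
        if self.w_type not in SEVEN_WS:
            raise ValueError(f"unknown interrogative: {self.w_type!r}")


@dataclass
class EventTable:
    """Subject-verb-object main clause plus further details and example rows."""
    subject: EventDetail
    verb: str
    obj: EventDetail
    details: List[EventDetail] = field(default_factory=list)
    examples: List[Dict[str, str]] = field(default_factory=list)
    name: str = ""          # event table name, filled in later

    def columns(self) -> List[str]:
        return [self.subject.name, self.obj.name] + [d.name for d in self.details]

    def add_example(self, *values: str) -> None:
        """Record one example row; it must supply a value for every column."""
        cols = self.columns()
        if len(values) != len(cols):
            raise ValueError(f"expected {len(cols)} values, got {len(values)}")
        self.examples.append(dict(zip(cols, values)))
```

For instance, "Customers buy products" could be captured as `EventTable(EventDetail("CUSTOMER", "who"), "buys", EventDetail("PRODUCT", "what"))`, with a "when" detail appended and a handful of example rows added as the conversation continues.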
Straightforward Methodology
• Subject-Verb-Object
• Who
• What
• When
• Declare Event Type
• Where
• How (many)
• Why
• Sufficient Detail Fact Granularity
• How
• Initial Data Examples
• Quantities - Facts
Capture Example Data
verb

on/at/every

SUBJECT

OBJECT

EVENT 

DATE

[who]

[what]

[when]

[where]

[how many]

[why]

[how]

Typical

Typical/Popular

Typical

Typical

Typical/Average

Typical/Normal

Typical/Normal

Different

Different

Different

Different

Different

Different

Different

Repeat

Repeat

Repeat

Repeat

Repeat

Repeat

Repeat

Missing

Missing

Missing

Missing

Missing

Missing

Missing

Group

Multiple/Bundle

Old, Low

Old, Low Value

Oldest needed

Near

Min, Negative, 0

New, High

New, High

Most Recent, Future

Far

Max, Precision

Multi-Level

Engage business users
Clarify definitions / Conform Dimensions
Illustrate exceptions
Drive out uniqueness
“Show and tell”

Multiple Values

Exceptional

Exceptional
Thoughtful Example Data

Detailed ETL
Specification
Identify Event Type Early
Adjust Conversation Based on Event Type
• Discrete Event - Transaction
  • Instantaneous/short duration, irregularly occurring events or transactions
• Recurring Event - Periodic Snapshot – measurement
  • Regularly occurring events, ongoing processes, typically used to measure the cumulative effect of discrete events
• Evolving Event - Accumulating Snapshot – timeline
  • Non-instantaneous/longer duration, irregularly occurring events or transactions
  • Represents current status - reflects adjustments
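The relationship between discrete events and a periodic snapshot can be sketched as follows. The account-balance scenario, the `daily_balance_snapshot` name, and the tuple layout are assumptions made for illustration; the slides do not prescribe an implementation.

```python
from collections import defaultdict
from datetime import date
from typing import Iterable, List, Tuple

Transaction = Tuple[str, date, float]   # (account, transaction date, amount)
SnapshotRow = Tuple[str, date, float]   # (account, snapshot date, balance)


def daily_balance_snapshot(transactions: Iterable[Transaction],
                           days: List[date]) -> List[SnapshotRow]:
    """Roll discrete transactions up into one balance row per account per day.

    `days` must be sorted ascending; every account gets a row for every day,
    whether or not it had activity, which is what makes the snapshot periodic.
    """
    by_account = defaultdict(list)
    for account, day, amount in transactions:
        by_account[account].append((day, amount))

    snapshot: List[SnapshotRow] = []
    for account, txns in sorted(by_account.items()):
        txns.sort()
        balance = 0.0
        i = 0
        for day in days:
            # Apply every transaction dated on or before this snapshot day.
            while i < len(txns) and txns[i][0] <= day:
                balance += txns[i][1]
                i += 1
            snapshot.append((account, day, balance))
    return snapshot
```

The transaction rows stay immutable; the snapshot is a derived, regularly spaced measurement of their cumulative effect.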

!72
Capture When Details
When do Customers order Products?

BEAM✲
Modeler

“On the Order Date”
BI Users
Any other Whens?
Any other Whos?
And so on...
Model How Many Measures
• Additive – can be summed over any combination of dimensions. No special rules
• Non-additive – cannot be summed over any dimension, e.g. unit price or temperature
  • Must be aggregated in other ways, e.g. average, min, max
  • Degenerate Dimensions – transaction #, timestamps, flags
• Semi-additive – cannot be summed across at least one dimension, e.g. balances cannot be summed over time
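A small worked example may make the three categories concrete. The rows and function names below are invented for illustration only.

```python
from collections import defaultdict
from statistics import mean

# Invented sales rows: (day, product, quantity, unit_price)
SALES = [
    ("Mon", "A", 2, 10.0),
    ("Mon", "B", 1, 20.0),
    ("Tue", "A", 3, 10.0),
]

# Invented end-of-day balances: (day, account, balance)
BALANCES = [
    ("Mon", "acct1", 100.0),
    ("Mon", "acct2", 50.0),
    ("Tue", "acct1", 120.0),
    ("Tue", "acct2", 50.0),
]


def total_quantity(sales):
    """Additive: quantity can be summed across any dimension."""
    return sum(quantity for _, _, quantity, _ in sales)


def average_unit_price(sales):
    """Non-additive: summing unit prices is meaningless, so average instead."""
    return mean(price for _, _, _, price in sales)


def balance_by_day(balances):
    """Semi-additive: balances may be summed across accounts within a day..."""
    totals = defaultdict(float)
    for day, _, balance in balances:
        totals[day] += balance
    return dict(totals)


def closing_balance(balances, last_day):
    """...but across time, take the balance at the period end, not a sum."""
    return sum(balance for day, _, balance in balances if day == last_day)
```

Summing every row of `BALANCES` would give 320.0, a figure that corresponds to nothing in the business, which is why the time dimension needs a different aggregation rule.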
!77
Modeling Dimensions
Annotate w Targeted Data Profiling
Proceed Through the Business Process Value Chain
Collaborative Dimension Conformance

Sales

Campaigns

Plant

Response

Product

Promotion

Dimensions

Customer

Shipper

Time
Identify Hierarchy Types
Balanced

Simple

Complex

Ragged

Variable
Depth
Graphically Depict Hierarchies
Visualize The Hierarchies
Paint The Organization
Prototype! Not “Data Model Review”
Recap
• Collaborative and Agile
  • Data Modeling
  • Data Sourcing
  • Data Conformance
• Requirements = Design
  • Slots directly into proven and mature dimensional data warehousing design patterns
• Validation through Prototyping
  • Semi-automated build of dimensional data warehouse
  • Perfect complement to Agile BI Tools and Methods (e.g. Pentaho)

!87
If you have been affected by any of the issues raised in this presentation:

Agile Data Warehouse Design
Lawrence Corr, Jim Stagnitto, Decision Press, November 2011

Questions / Comments

Más contenido relacionado

La actualidad más candente

Data Warehouse or Data Lake, Which Do I Choose?
Data Warehouse or Data Lake, Which Do I Choose?Data Warehouse or Data Lake, Which Do I Choose?
Data Warehouse or Data Lake, Which Do I Choose?DATAVERSITY
 
Data Mesh Part 4 Monolith to Mesh
Data Mesh Part 4 Monolith to MeshData Mesh Part 4 Monolith to Mesh
Data Mesh Part 4 Monolith to MeshJeffrey T. Pollock
 
Data Architecture, Solution Architecture, Platform Architecture — What’s the ...
Data Architecture, Solution Architecture, Platform Architecture — What’s the ...Data Architecture, Solution Architecture, Platform Architecture — What’s the ...
Data Architecture, Solution Architecture, Platform Architecture — What’s the ...DATAVERSITY
 
Business Intelligence & Data Analytics– An Architected Approach
Business Intelligence & Data Analytics– An Architected ApproachBusiness Intelligence & Data Analytics– An Architected Approach
Business Intelligence & Data Analytics– An Architected ApproachDATAVERSITY
 
You Need a Data Catalog. Do You Know Why?
You Need a Data Catalog. Do You Know Why?You Need a Data Catalog. Do You Know Why?
You Need a Data Catalog. Do You Know Why?Precisely
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceDatabricks
 
Enterprise Data Architecture Deliverables
Enterprise Data Architecture DeliverablesEnterprise Data Architecture Deliverables
Enterprise Data Architecture DeliverablesLars E Martinsson
 
Modern Data architecture Design
Modern Data architecture DesignModern Data architecture Design
Modern Data architecture DesignKujambu Murugesan
 
Data Quality Best Practices
Data Quality Best PracticesData Quality Best Practices
Data Quality Best PracticesDATAVERSITY
 
Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?DATAVERSITY
 
Databricks Fundamentals
Databricks FundamentalsDatabricks Fundamentals
Databricks FundamentalsDalibor Wijas
 
The ABCs of Treating Data as Product
The ABCs of Treating Data as ProductThe ABCs of Treating Data as Product
The ABCs of Treating Data as ProductDATAVERSITY
 
Architecting Agile Data Applications for Scale
Architecting Agile Data Applications for ScaleArchitecting Agile Data Applications for Scale
Architecting Agile Data Applications for ScaleDatabricks
 
To mesh or mess up your data organisation - Jochem van Grondelle (Prosus/OLX ...
To mesh or mess up your data organisation - Jochem van Grondelle (Prosus/OLX ...To mesh or mess up your data organisation - Jochem van Grondelle (Prosus/OLX ...
To mesh or mess up your data organisation - Jochem van Grondelle (Prosus/OLX ...Jochem van Grondelle
 
Enterprise Architecture vs. Data Architecture
Enterprise Architecture vs. Data ArchitectureEnterprise Architecture vs. Data Architecture
Enterprise Architecture vs. Data ArchitectureDATAVERSITY
 
Data Catalogs Are the Answer – What Is the Question?
Data Catalogs Are the Answer – What Is the Question?Data Catalogs Are the Answer – What Is the Question?
Data Catalogs Are the Answer – What Is the Question?DATAVERSITY
 
Data Catalog for Better Data Discovery and Governance
Data Catalog for Better Data Discovery and GovernanceData Catalog for Better Data Discovery and Governance
Data Catalog for Better Data Discovery and GovernanceDenodo
 
Denodo: Enabling a Data Mesh Architecture and Data Sharing Culture at Landsba...
Denodo: Enabling a Data Mesh Architecture and Data Sharing Culture at Landsba...Denodo: Enabling a Data Mesh Architecture and Data Sharing Culture at Landsba...
Denodo: Enabling a Data Mesh Architecture and Data Sharing Culture at Landsba...Denodo
 
Building a modern data warehouse
Building a modern data warehouseBuilding a modern data warehouse
Building a modern data warehouseJames Serra
 

La actualidad más candente (20)

Data Warehouse or Data Lake, Which Do I Choose?
Data Warehouse or Data Lake, Which Do I Choose?Data Warehouse or Data Lake, Which Do I Choose?
Data Warehouse or Data Lake, Which Do I Choose?
 
Data Mesh Part 4 Monolith to Mesh
Data Mesh Part 4 Monolith to MeshData Mesh Part 4 Monolith to Mesh
Data Mesh Part 4 Monolith to Mesh
 
Data Architecture, Solution Architecture, Platform Architecture — What’s the ...
Data Architecture, Solution Architecture, Platform Architecture — What’s the ...Data Architecture, Solution Architecture, Platform Architecture — What’s the ...
Data Architecture, Solution Architecture, Platform Architecture — What’s the ...
 
Business Intelligence & Data Analytics– An Architected Approach
Business Intelligence & Data Analytics– An Architected ApproachBusiness Intelligence & Data Analytics– An Architected Approach
Business Intelligence & Data Analytics– An Architected Approach
 
Snowflake Datawarehouse Architecturing
Snowflake Datawarehouse ArchitecturingSnowflake Datawarehouse Architecturing
Snowflake Datawarehouse Architecturing
 
You Need a Data Catalog. Do You Know Why?
You Need a Data Catalog. Do You Know Why?You Need a Data Catalog. Do You Know Why?
You Need a Data Catalog. Do You Know Why?
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data Science
 
Enterprise Data Architecture Deliverables
Enterprise Data Architecture DeliverablesEnterprise Data Architecture Deliverables
Enterprise Data Architecture Deliverables
 
Modern Data architecture Design
Modern Data architecture DesignModern Data architecture Design
Modern Data architecture Design
 
Data Quality Best Practices
Data Quality Best PracticesData Quality Best Practices
Data Quality Best Practices
 
Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?
 
Databricks Fundamentals
Databricks FundamentalsDatabricks Fundamentals
Databricks Fundamentals
 
The ABCs of Treating Data as Product
The ABCs of Treating Data as ProductThe ABCs of Treating Data as Product
The ABCs of Treating Data as Product
 
Architecting Agile Data Applications for Scale
Architecting Agile Data Applications for ScaleArchitecting Agile Data Applications for Scale
Architecting Agile Data Applications for Scale
 
To mesh or mess up your data organisation - Jochem van Grondelle (Prosus/OLX ...
To mesh or mess up your data organisation - Jochem van Grondelle (Prosus/OLX ...To mesh or mess up your data organisation - Jochem van Grondelle (Prosus/OLX ...
To mesh or mess up your data organisation - Jochem van Grondelle (Prosus/OLX ...
 
Enterprise Architecture vs. Data Architecture
Enterprise Architecture vs. Data ArchitectureEnterprise Architecture vs. Data Architecture
Enterprise Architecture vs. Data Architecture
 
Data Catalogs Are the Answer – What Is the Question?
Data Catalogs Are the Answer – What Is the Question?Data Catalogs Are the Answer – What Is the Question?
Data Catalogs Are the Answer – What Is the Question?
 
Data Catalog for Better Data Discovery and Governance
Data Catalog for Better Data Discovery and GovernanceData Catalog for Better Data Discovery and Governance
Data Catalog for Better Data Discovery and Governance
 
Denodo: Enabling a Data Mesh Architecture and Data Sharing Culture at Landsba...
Denodo: Enabling a Data Mesh Architecture and Data Sharing Culture at Landsba...Denodo: Enabling a Data Mesh Architecture and Data Sharing Culture at Landsba...
Denodo: Enabling a Data Mesh Architecture and Data Sharing Culture at Landsba...
 
Building a modern data warehouse
Building a modern data warehouseBuilding a modern data warehouse
Building a modern data warehouse
 

Destacado

Aeromexico and Adyen - Transformation of E-Commerce Payments
Aeromexico and Adyen - Transformation of E-Commerce PaymentsAeromexico and Adyen - Transformation of E-Commerce Payments
Aeromexico and Adyen - Transformation of E-Commerce PaymentsBrian Gross
 
Product Brochure: Adyen Company Profile 2015: Online Payment Services
Product Brochure: Adyen Company Profile 2015: Online Payment ServicesProduct Brochure: Adyen Company Profile 2015: Online Payment Services
Product Brochure: Adyen Company Profile 2015: Online Payment ServicesyStats.com
 
A Partnership with Adyen is Equal to Exponential Growth: 17 Payments Experts ...
A Partnership with Adyen is Equal to Exponential Growth: 17 Payments Experts ...A Partnership with Adyen is Equal to Exponential Growth: 17 Payments Experts ...
A Partnership with Adyen is Equal to Exponential Growth: 17 Payments Experts ...Marcos Ortiz Valmaseda
 
Improving the customer experience using big data customer-centric measurement...
Improving the customer experience using big data customer-centric measurement...Improving the customer experience using big data customer-centric measurement...
Improving the customer experience using big data customer-centric measurement...Vishal Kumar
 
Big Data has Big Implications for Customer Experience Management
Big Data has Big Implications for Customer Experience ManagementBig Data has Big Implications for Customer Experience Management
Big Data has Big Implications for Customer Experience ManagementVishal Kumar
 
Adyen - NOAH15 Berlin
Adyen - NOAH15 BerlinAdyen - NOAH15 Berlin
Adyen - NOAH15 BerlinNOAH Advisors
 
Adyen mobile payment - Mobile First event
Adyen mobile payment - Mobile First event Adyen mobile payment - Mobile First event
Adyen mobile payment - Mobile First event Mobylizr
 
Bobhayestcebigdatawebinar03272013 130417142258-phpapp01
Bobhayestcebigdatawebinar03272013 130417142258-phpapp01Bobhayestcebigdatawebinar03272013 130417142258-phpapp01
Bobhayestcebigdatawebinar03272013 130417142258-phpapp01Vishal Kumar
 
Sample Report: Adyen Company Profile 2015: Online Payment Services
Sample Report: Adyen Company Profile 2015: Online Payment ServicesSample Report: Adyen Company Profile 2015: Online Payment Services
Sample Report: Adyen Company Profile 2015: Online Payment ServicesyStats.com
 
Customer Experience Management for Startups
Customer Experience Management for StartupsCustomer Experience Management for Startups
Customer Experience Management for StartupsVishal Kumar
 
Linkedin Series B Pitch Deck August 2004
Linkedin Series B Pitch Deck August 2004Linkedin Series B Pitch Deck August 2004
Linkedin Series B Pitch Deck August 2004Vishal Kumar
 
Adyen - NOAH16 Berlin
Adyen - NOAH16 BerlinAdyen - NOAH16 Berlin
Adyen - NOAH16 BerlinNOAH Advisors
 
Big Data Introduction to D3
Big Data Introduction to D3Big Data Introduction to D3
Big Data Introduction to D3Vishal Kumar
 
Dropbox: Building Business Through Lean Startup Principles
Dropbox: Building Business Through Lean Startup PrinciplesDropbox: Building Business Through Lean Startup Principles
Dropbox: Building Business Through Lean Startup PrinciplesVishal Kumar
 
The Best Startup Investor Pitch Deck & How to Present to Angels & Venture Cap...
The Best Startup Investor Pitch Deck & How to Present to Angels & Venture Cap...The Best Startup Investor Pitch Deck & How to Present to Angels & Venture Cap...
The Best Startup Investor Pitch Deck & How to Present to Angels & Venture Cap...J. Skyler Fernandes
 

Agile Data Warehouse Design for Big Data Analytics

  • 1. Agile Data Warehouse Design with Big Data • John DiPietro & Jim Stagnitto
  • 2. Agenda • Introduction / a2c Overview • Modeling for End Users • Role of Dimensional Models in Big Data • Example: eCommerce • Structured Data: Sales • Semi-structured Data: Clickstream • Agile Dimensional Modeling Overview • Case Study Review • Q&A
  • 3. Introduction • a2c: Boutique EDM (Enterprise Data Management) consultancy firm • Data Warehousing • Master Data Management • Closed Loop Analytics and Visualization • Data & Application Architecture • John DiPietro: Principal, Chief Technology Officer • Jim Stagnitto: Data Warehouse & MDM Architect
  • 4. a2c Corporate Overview & Industry Experience
  • 5. Company Overview • Technology Solution Consultancy headquartered in Philadelphia with regional offices in New York and Boston • Servicing Healthcare, Life Science, Telecom and Financial Services industries, having recently obtained our GSA schedule to pursue Federal Government opportunities • Consultant base of over 2500 proven IT professionals throughout the Northeast region with a recruiting network which provides national coverage • Flexible approach to helping our clients with their initiatives • Project-based Solutions • Staff Augmentation • Managed Service Offerings – “On-Shore QA, Development & Application Support” • Executive & Professional Search
  • 6. Competitive Advantage • Founders of a2c were part of the fastest growing privately held IT consulting and staff augmentation firm in the US from 1994-2002. Our Executive Management Team has over 100 years of collective experience and has been responsible for delivering over a half-billion dollars of IT Consulting and staff augmentation revenue from 1994 through to the present day. • a2c’s Recruiting Engine and Methodology is one of the best in the industry, capable of producing quality results on demand for our clients • Resource Managers continually “Silo” disciplines with available candidates who have proven their abilities with us over the last 10 years • Our solutions organization is instrumentally involved during the screening and selection process to ensure that candidates submitted to our clients are an ideal match • a2c’s Culture provides an ability to attract and retain the best talent in the industry and fosters creativity, integrity, growth and teamwork • a2c provides our clients with an alternative solution to a “Big 4” consultancy at substantial savings for projects that are between $500K and $5M due to our flexibility, agility and focus
  • 8. a2c Solution Engagement Structures • Technology Strategy & Roadmap Formulation • Needs & Readiness Assessment • Package & Platform Selections • Proof of Concept Implementation • Requirements Discovery & Specifications • Program/Project Management • Full Life Cycle & Application Development • Infrastructure & Facilities Initiatives • Managed Services & Maintenance Support
  • 9. a2c Solutions Capabilities • Enterprise Data Management Practice helps clients manage their complete Information Lifecycle from their On-line Transactional systems to their Data Warehousing, Enterprise Reporting, Data Migration, Back-Up and Recovery Strategies (See Slide 7) • Business Architecture & Optimization Practice utilizes “Six Sigma Lean” methodologies to analyze, re-engineer and automate our clients’ business processes, leveraging human workflow and business rules engine technologies to create efficiencies and provide business unit owners with the necessary metrics to continually improve performance • Program Management Office oversees all aspects of solutions planning and delivery across client engagement teams and provides the methodology and frameworks, which are based on PMI® industry standards • Application Development & Managed Services Practice helps clients architect, implement and deploy the latest Microsoft and Enterprise Java based applications, which are built on proven frameworks and architectures for the enterprise • a2c's SDLC Delivery Model comprises over 20 years of collective best practices and industry proven methodologies that allow our delivery teams to rapidly design, develop and implement solutions. Our SDLC model has been designed to complement our project management methodology, utilizing iterative development cycles that enable project teams to provide consistently high quality, on-time deliverables, regardless of technology platform
  • 11. Modeling for End Users • How to Design to Answer Business Questions? • Think about how questions are articulated • And how the answers should be delivered • Identify a common question framework • Design an architecture that embraces and leverages this common question framework • Utilize the best designs and technologies to: • (a) derive the answers • (b) present them in compelling ways that lead to the next interesting question!
  • 12. How Do We Ask Questions? • “How do this quarter’s [When] sales [What] by sales rep [Who] of electronic products [What] that we promoted [Why] to retail customers [Who] in the east [Where] compare with last year’s [When]?”
  • 13. How Do We Ask Questions? • Events / Transactions • e.g. Sale • an immutable "fact" that occurs at a time and (typically) a place • Interrogatives: • Who, What, When, Where, Why • Descriptive context that fully describes the event • a set of "dimensions" that describe events
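
As a rough illustration of the event/interrogative framing above, the sketch below models a sale as an immutable fact whose fields are dimension values plus one measure, then answers a simplified version of the sales-by-rep question from slide 12. Every name and number here (`SaleFact`, `sales_by_rep`, the sample rows) is hypothetical and invented for illustration; none of it comes from the presentation.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: a recorded event is an immutable fact
class SaleFact:
    sales_rep: str         # who (seller)
    customer_type: str     # who (buyer)
    product_category: str  # what
    year: int              # when
    quarter: int           # when
    region: str            # where
    promotion: str         # why ("none" means not promoted)
    amount: float          # the measure


def sales_by_rep(facts, *, year, quarter, category, customer_type, region):
    """Sum promoted sale amounts per sales rep for one slice of the dimensions."""
    totals = defaultdict(float)
    for f in facts:
        if ((f.year, f.quarter) == (year, quarter)
                and f.product_category == category
                and f.customer_type == customer_type
                and f.region == region
                and f.promotion != "none"):
            totals[f.sales_rep] += f.amount
    return dict(totals)


# Hypothetical sample data, invented for illustration only.
FACTS = [
    SaleFact("Ann", "retail", "electronics", 2013, 3, "east", "fall-promo", 100.0),
    SaleFact("Ann", "retail", "electronics", 2012, 3, "east", "fall-promo", 80.0),
    SaleFact("Bob", "retail", "electronics", 2013, 3, "east", "fall-promo", 50.0),
    SaleFact("Bob", "wholesale", "electronics", 2013, 3, "east", "fall-promo", 999.0),
    SaleFact("Ann", "retail", "electronics", 2013, 3, "west", "fall-promo", 70.0),
]

SLICE = dict(quarter=3, category="electronics", customer_type="retail", region="east")
this_year = sales_by_rep(FACTS, year=2013, **SLICE)
last_year = sales_by_rep(FACTS, year=2012, **SLICE)
```

Each interrogative in the question becomes either a filter on a dimension or the grouping key, while the measure is what gets summed; comparing `this_year` with `last_year` only changes the "when" filter.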
  • 14. Dimensional Value Proposition • It makes sense to present answers to people using the same taxonomy of events and interrogatives (aka: facts and dimensions - dimensional structure) that they use when forming questions • Events are instances of processes • It’s best to present information to people who will ask the system questions in dimensional form • This is true regardless of the type of information being interrogated, its source, or IT stuff (like database technologies utilized) • It’s best to model this presentation layer based on the events (aka: business processes) that underlie the questions
  • 16. Scenarios • A brief discussion of how and where dimensional modeling and/or databases fit within common and emerging “big data” data warehousing architectures
  • 17. Kimball Dimensional DW • Dimensional BI Semantic Layer • Dimensional Data Warehouse • Data Movement / Integration • Source Data (Structured)
  • 18. Kimball with Big Data • Dimensional BI Semantic Layer • Dimensional Data Warehouse • Big Data Capture (e.g. HDFS) • Big Data Discovery (e.g. MR) • Data Movement / Integration Tier • Data Movement / Integration Tier • Source Data Tier (Un/Semi-Structured) • Source Data Tier (Structured)
  • 19. Corporate Information Factory (CIF) • Dimensional BI Semantic Layer • Dimensional Tier (Virtual or Physical) • Corporate Information Factory 3NF DW • Data Movement / Integration • Source Data (Structured)
  • 20. CIF with Big Data • Dimensional BI Semantic Layer • Dimensional Tier (Virtual or Physical) • Big Data Capture (e.g. HDFS) • Big Data Discovery (e.g. MR) • Corporate Information Factory 3NF DW • Data Movement / Integration Tier • Data Movement / Integration Tier • Source Data Tier (Un/Semi-Structured) • Source Data Tier (Structured)
  • 21. Data Vault • Dimensional BI Semantic Layer • Dimensional Tier (Virtual or Physical) • Data Vault • Data Movement / Integration • Source Data (Structured)
  • 22. Data Vault with Big Data • Dimensional BI Semantic Layer • Dimensional Tier (Virtual or Physical) • Big Data Capture (e.g. HDFS) • Big Data Discovery (e.g. MR) • Data Vault • Data Movement / Integration Tier • Data Movement / Integration Tier • Source Data Tier (Un/Semi-Structured) • Source Data Tier (Structured)
  • 24. Common Framework • Dimensional BI Semantic Layer • Dimensional Tier [Physical (Kimball) or Virtual (CIF or Data Vault)] • Persistent Un/Semi-Structured Staging Area • Unstructured -> Structured Data Discovery Processing • Persistent Structured Data Repository (not needed for Kimball) • Un/Semi-Structured Data Movement • Structured Data Movement • Un/Semi-Structured Source Data • Structured Source Data • Insight Generation / Data Mining
  • 25. Common Framework • Dining Room: Readily Accessible to End Users (and BI Developers) • Safe, Hospitable Environment • Data Assets “Ready for Primetime” • Dimensionally Structured • Dimensional BI Semantic Layer • Dimensional Tier [Physical (Kimball) or Virtual (CIF or Data Vault)] • Kitchen: Off Limits to End Users • Data Professionals Only Please • Dangerous / Inhospitable Environment • Data Assets “Not Ready for Primetime” • Structured Variably For Data Processing • Persistent Un/Semi-Structured Staging Area • Unstructured -> Structured Data Discovery Processing • Persistent Structured Data Repository (not needed for Kimball) • Un/Semi-Structured Data Movement • Structured Data Movement • Un/Semi-Structured Source Data • Structured Source Data • eCommerce Example: Clickstream Data • eCommerce Sale
  • 26. eCommerce Example: Clickstream. Semi-Structured. Recording of every page request made by a user. Includes some structural elements, such as when the request was made and who the user is. Requires significant prep work in order to fit into a traditional row-based relational database. Apples and Oranges: PreSessionized Page Visits, Detailed Product Views, Catalogue Requests, Shopping Cart Adds / Deletes / Abandons, etc. Needs to be converted into separate-but-relatable dimensional facts, with many shared (conformed) dimensions. [Figure: Raw Clickstream Data, a dense block of space-separated integers]
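The "significant prep work" on this slide usually starts with sessionizing the raw page requests. The following is a minimal sketch, not the presenters' method: the `(user_id, ts, url)` record shape, the `sessionize` helper, and the 30-minute inactivity timeout are all assumptions made for illustration.

```python
from itertools import groupby
from operator import itemgetter

# Assumed inactivity threshold (seconds); the slides do not specify one.
SESSION_TIMEOUT_S = 30 * 60


def sessionize(page_requests, timeout_s=SESSION_TIMEOUT_S):
    """Assign a per-user session number to each (user_id, ts, url) request."""
    rows = []
    # Sort by user, then by time, so each user's requests are contiguous.
    ordered = sorted(page_requests, key=itemgetter(0, 1))
    for user_id, requests in groupby(ordered, key=itemgetter(0)):
        session_no = 0
        previous_ts = None
        for _, ts, url in requests:
            # A first request, or a long gap, starts a new session.
            if previous_ts is None or ts - previous_ts > timeout_s:
                session_no += 1
            rows.append(
                {"user_id": user_id, "session_no": session_no, "ts": ts, "url": url}
            )
            previous_ts = ts
    return rows


# Hypothetical raw requests: (user_id, seconds since midnight, url).
raw = [
    ("u1", 0, "/home"),
    ("u1", 60, "/product/42"),
    ("u1", 4000, "/cart"),
    ("u2", 10, "/home"),
]
page_views = sessionize(raw)
```

Each output row is a candidate "Page Visit" fact row whose user, session, and page values can later be swapped for conformed dimension keys.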
  • 27. Typical Clickstream “Page View” Dimensional Model What When What Who Why !27
  • 28. eCommerce Example: Web Sales • Fully Structured • The Sale Transaction typically carries all fundamental dimensions: Time; Customer; Referring URL / Search Phrase; Promotion / Campaign; Purchase and/or Shipment (Geo or URL) Locations; Etc. • And “How Many” Measures: Product Unit and Price Quantities / Amounts; Discount Amounts; Etc.
  • 29. eCommerce Dimensionality: Facts (below) & Dimensions (right). Dimensions: Time (When); Customer (Who); Web Page (Where); Product (What); Referring URL (Where); Promotion / Campaign (Why); Activity Type (How). Facts: Page Visit [Time: View Start, View End, Session Start, Session End; Customer: Visitor; Web Page: Current, Previous, Next]; Detailed Product View [Time: View Start, View End, Session Start, Session End; Customer: Prospect; Web Page: Current, Previous, Next]; Shopping Cart Activity [Time: Activity Start, Activity End; Customer: Prospect]; Sale (Checkout) [Time: Sale Start, Sale End; Customer: Customer]; Shipment / Delivery [Time: Shipment, Delivery; Customer: Customer, Delivery Recipient]. Other applicable dimensions are marked ✔ on the slide.
  • 31. The first dimensional modeler: Rudyard Kipling Ralph Kimball? R.K. !31
  • 32. I keep six honest serving-men
 (They taught me all I knew);
 Their names are What and Why and When 
 And How and Where and Who… –Rudyard Kipling
  • 42. How did we get here?
  • 43. DW Architectures: A Brief History. Corporate Information Factory: Data-Driven Analysis; Undisciplined Dimensional: Report-Driven Analysis; Dimensional Bus Architecture: Process-Driven Analysis
  • 44. 7Ws Dimensional Model. When: Time, Day, Month, Fiscal Period. Who: Customer, Employee, Third Party, Organization. Where: Location, Store, Ship To, Hospital, Geographic. What: Product, Service, Transactions. Why: Causal, Promotion, Reason, Weather, Competition. How – Facts: Much, Many, Often, £$€. ??
  • 46. How do you design a data warehouse?
  • 47. Tech Design Artifacts? CALENDAR (Date Key, Date, Day, Day in Week, Day in Month, Day in Qtr, Day in Year, Month, Qtr, Year, Weekday Flag, Holiday Flag); PRODUCT (Product Key, Product Code, Product Description, Product Type, Brand, Subcategory, Category); SALES FACT (Date Key, Product Key, Store Key, Promotion Key, Quantity Sold, Revenue, Cost, Basket Count); STORE (Store Key, Store Code, Store Name, URL, Store Manager, Region, Country); PROMOTION (Promotion Key, Promotion Code, Promotion Name, Promotion Type, Discount Type, Ad Type)
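To make the star schema on this slide concrete, here is a small sketch using Python's built-in `sqlite3` module. It is illustrative only: the snake_case table and column names are adapted from the slide, only a subset of the slide's columns is kept, and every sample row is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Four dimensions and one fact table, with a reduced column set.
conn.executescript("""
CREATE TABLE calendar  (date_key INTEGER PRIMARY KEY, date TEXT, month TEXT, year INTEGER);
CREATE TABLE product   (product_key INTEGER PRIMARY KEY, product_code TEXT, brand TEXT, category TEXT);
CREATE TABLE store     (store_key INTEGER PRIMARY KEY, store_code TEXT, region TEXT, country TEXT);
CREATE TABLE promotion (promotion_key INTEGER PRIMARY KEY, promotion_code TEXT, promotion_type TEXT);
CREATE TABLE sales_fact (
    date_key      INTEGER REFERENCES calendar(date_key),
    product_key   INTEGER REFERENCES product(product_key),
    store_key     INTEGER REFERENCES store(store_key),
    promotion_key INTEGER REFERENCES promotion(promotion_key),
    quantity_sold INTEGER,
    revenue       REAL,
    cost          REAL
);
""")

# Made-up sample rows, for illustration only.
conn.execute("INSERT INTO calendar VALUES (20130101, '2013-01-01', 'January', 2013)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?, ?)",
    [(1, "P-1", "Acme", "Books"), (2, "P-2", "Acme", "Toys")],
)
conn.execute("INSERT INTO store VALUES (1, 'S-1', 'North East', 'USA')")
conn.execute("INSERT INTO promotion VALUES (1, 'NONE', 'No Promotion')")
conn.executemany(
    "INSERT INTO sales_fact VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (20130101, 1, 1, 1, 2, 20.0, 12.0),
        (20130101, 1, 1, 1, 3, 30.0, 18.0),
        (20130101, 2, 1, 1, 1, 15.0, 9.0),
    ],
)

# A typical dimensional query: slice additive facts by a dimension attribute.
revenue_by_category = conn.execute("""
    SELECT p.category, SUM(f.quantity_sold), SUM(f.revenue)
    FROM sales_fact AS f
    JOIN product AS p ON p.product_key = f.product_key
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()
```

The fact table holds only keys and measures; descriptive attributes such as `category` live in the dimensions, which is why the query needs a single join per dimension it slices by.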
  • 50. Waterfall BI/DW Limited Stakeholder interaction Analysis Design Development This Year BDUF Stakeholder Requirements Input Data Model Next Year Test Release ETL BI DATA VALUE?
  • 51. Agile DW/BI Development Stakeholder interaction ? JEDUF BI Prototyping ETL Review Release This Year Next Year Iteration 1 VALUE? Iteration 2 ETL BI Iteration 3Rev ADM VALUE Iteration … VALUE! DATA Iteration n VALUE! VALUE!
  • 52. State of The DW Field. Solid: Dimensional Data Warehouse Design is Mature; Proven Design Patterns Exist for Common Requirements. Hit or Miss: Collecting Unambiguous and Thorough Requirements; Slotting Requirements into Proven Design Patterns; End-User Ownership and Validation. Too Often: Snatching Defeat from the Jaws of Victory
  • 54. BEAM✲ Methodology: Structured, non-technical, collaborative working conversation directly with BI Users. Participants: Data Modeler; BI Stakeholders. BEAM✲ draws on: BI User’s Business Process, Organizational, Hierarchical, and Data Knowledge • Focused Data Profiling. BEAM✲ produces: • Logical and Physical (Kimball-esque) Dimensional Data Models • Example data • Detailed and Testable ETL Specification • Instantiated DW Prototype
  • 57. Agile Data Modeling Requirements • Techniques for encouraging interaction • Must use simple, inclusive notation and tools • Must be quick: hours rather than days – modelstorming • Balance ‘just in time’ (JIT) and ‘just enough design up front’ (JEDUF) to reduce design rework • DW designers must embrace data model change, allow models to evolve, avoid generic data models; need design patterns they can trust to represent tomorrow’s BI requirements tomorrow • ETL and BI developers must embrace database change; need tool support !57
  • 60. CALENDAR (Date Key, Date, Day, Day in Week, Day in Month, Day in Qtr, Day in Year, Month, Qtr, Year, Weekday Flag, Holiday Flag); PRODUCT (Product Key, Product Code, Product Description, Product Type, Brand, Subcategory, Category); SALES FACT (Date Key, Product Key, Store Key, Promotion Key, Quantity Sold, Revenue, Cost, Basket Count); STORE (Store Key, Store Code, Store Name, URL, Store Manager, Region, Country); PROMOTION (Promotion Key, Promotion Code, Promotion Name, Promotion Type, Discount Type, Ad Type)
  • 61. Customer Type Holiday Type Month Country Calendar Customer Sales Fact Store Product City Category Store Type Product Type
  • 65. Collaborative / Conversational Design Who does what? “Customers buy products” BEAM✲ Modeler Subjects Verb Objects BI Users
  • 66. Design Using Natural Language • Verbs – Events – Relationships – Fact Tables • Nouns – Details – Entities – Dimensions • Main Clause – Subject-Verb-Object • Prepositions – connect additional details to the main clause • Interrogatives – The 7Ws – Dimension Types • Business Vocabulary - no IT-Speak !66
  • 67. “Spreadsheet”-like Models Event Table Name (filled in later) Subject Column Name Verb Object Column Name Interrogative Details Example Data (4-6 rows)
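One way to picture the spreadsheet-like event table is as a small data structure whose columns are tagged with a 7W interrogative. This is a hedged sketch: the `EventColumn` and `EventTable` classes, the sample "CUSTOMER ORDERS" event, and the rule that only "how many" columns become facts are illustrative assumptions, not part of the BEAM✲ notation itself.

```python
from dataclasses import dataclass, field


@dataclass
class EventColumn:
    name: str           # business-vocabulary column name
    interrogative: str  # one of the 7Ws, e.g. "who", "what", "how many"
    examples: list = field(default_factory=list)  # example data values


@dataclass
class EventTable:
    verb: str
    columns: list

    def candidate_dimensions(self):
        # Assumption: every non-"how many" detail is a candidate dimension.
        return [c.name for c in self.columns if c.interrogative != "how many"]

    def candidate_facts(self):
        # "How many" details are the quantities that become facts.
        return [c.name for c in self.columns if c.interrogative == "how many"]


# Hypothetical event modeled from "Customers order products".
customer_orders = EventTable(
    verb="orders",
    columns=[
        EventColumn("CUSTOMER", "who", ["J. Smith", "A. Jones"]),
        EventColumn("PRODUCT", "what", ["Widget", "Gadget"]),
        EventColumn("ORDER DATE", "when", ["2013-01-01", "2013-01-02"]),
        EventColumn("QUANTITY", "how many", [1, 12]),
    ],
)

dimensions = customer_orders.candidate_dimensions()
facts = customer_orders.candidate_facts()
```

Keeping example values next to each column mirrors the slide's emphasis on example data: the same rows can later seed a prototype or an ETL test.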
  • 68. Straightforward Methodology: 1. Subject-Verb-Object; 2. Declare Event Type; 3. Who; 4. What; 5. When; followed by Where, How (many), Why, and How. Annotations: Initial Data Examples; Quantities - Facts; Sufficient Detail; Fact Granularity
  • 69. Capture Example Data. Event table layout: SUBJECT [who], verb, OBJECT [what], on/at/every EVENT DATE [when], then [where], [how many], [why], [how]. Example rows by theme: Typical (Typical; Typical/Popular; Typical; Typical; Typical/Average; Typical/Normal; Typical/Normal); Different (every column); Repeat (every column); Missing (every column); Group, Multiple/Bundle, Multi-Level, Multiple Values; Old, Low (Old, Low Value; Oldest needed; Near; Min, Negative, 0); New, High (New, High; Most Recent, Future; Far; Max, Precision); Exceptional. Purpose: Engage business users; Clarify definitions / Conform Dimensions; Illustrate exceptions; Drive out uniqueness; “Show and tell”
  • 70. Thoughtful Example Data Detailed ETL Specification
  • 72. Adjust Conversation Based on Event Type • Discrete Event - Transaction: Instantaneous/short duration, irregularly occurring events or transactions • Recurring Event - Periodic Snapshot – measurement: Regularly occurring events, ongoing processes, typically used to measure the cumulative effect of discrete events • Evolving Event - Accumulating Snapshot – timeline: Non-instantaneous/longer duration, irregularly occurring events or transactions; represents current status and reflects adjustments
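The three event types can be contrasted on one tiny data set. The sketch below is an assumption-laden illustration: the order milestones, the dates, and the choice of "orders placed to date" as the periodic measurement are all invented for the example.

```python
# Discrete events (transactions): one row per thing that happened.
# Hypothetical rows: (order_id, milestone, date).
events = [
    ("o1", "ordered", "2013-01-01"),
    ("o2", "ordered", "2013-01-02"),
    ("o1", "shipped", "2013-01-03"),
    ("o1", "delivered", "2013-01-05"),
]

# Recurring event (periodic snapshot): one row per regular period,
# measuring the cumulative effect of the discrete events so far.
snapshot_days = [
    "2013-01-01", "2013-01-02", "2013-01-03", "2013-01-04", "2013-01-05",
]
orders_placed_to_date = {
    day: sum(1 for _, kind, d in events if kind == "ordered" and d <= day)
    for day in snapshot_days
}

# Evolving event (accumulating snapshot): one row per order, updated
# in place as each milestone arrives, so it shows current status.
timeline = {}
for order_id, kind, day in events:
    row = timeline.setdefault(
        order_id, {"ordered": None, "shipped": None, "delivered": None}
    )
    row[kind] = day
```

Note how the snapshot has a row for 2013-01-04 even though nothing happened that day, while the timeline for `o2` keeps empty milestones until they occur.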
  • 73. Capture When Details When do Customers order Products? BEAM✲ Modeler “On the Order Date” BI Users
  • 77. Model How Many Measures • Additive – can be summed over any combination of dimensions; no special rules • Non-additive – cannot be summed over any dimension, e.g. unit price or temperature; must be aggregated in other ways, e.g. average, min, max • Degenerate Dimensions – transaction #, timestamps, flags • Semi-additive – cannot be summed across at least one dimension, e.g. balances cannot be summed over time
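The additivity rules can be checked with a few lines of arithmetic. This is a minimal sketch over made-up rows; the `total_balance_on` and `closing_balance` helpers are hypothetical names introduced only for this example.

```python
from statistics import mean

# Hypothetical sales rows: (date, product, quantity, unit_price).
sales = [
    ("2013-01-01", "P-1", 2, 10.0),
    ("2013-01-02", "P-1", 3, 20.0),
]

# Additive: quantity can be summed across any dimension.
total_quantity = sum(qty for _, _, qty, _ in sales)

# Non-additive: summing unit prices is meaningless; average them instead.
average_unit_price = mean([price for _, _, _, price in sales])

# Hypothetical daily balance snapshots: (date, account, balance).
balances = [
    ("2013-01-01", "A", 100), ("2013-01-01", "B", 50),
    ("2013-01-02", "A", 120), ("2013-01-02", "B", 40),
]


def total_balance_on(day):
    """Semi-additive: balances CAN be summed across accounts for one day."""
    return sum(bal for d, _, bal in balances if d == day)


def closing_balance(account):
    """Across time, take the latest balance instead of summing."""
    rows = [(d, bal) for d, acct, bal in balances if acct == account]
    return max(rows)[1]  # ISO dates sort chronologically as strings


# Summing over time double counts: account A never held 220.
misleading_sum_over_time = sum(bal for _, acct, bal in balances if acct == "A")
```

Recording each measure's additivity alongside the model lets BI tools pick a safe default aggregation rather than blindly applying `SUM`.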
  • 79. Annotate w Targeted Data Profiling
  • 80. Proceed Through the Business Process Value Chain
  • 86. Prototype! Not “Data Model Review”
  • 87. Recap • Collaborative and Agile: Data Modeling; Data Sourcing; Data Conformance • Requirements = Design: Slots directly into proven and mature dimensional data warehousing design patterns • Validation through Prototyping: Semi-automated build of dimensional data warehouse; Perfect complement to Agile BI Tools and Methods (e.g. Pentaho)
  • 88. If you have been affected by
 any of the issues raised
 in this presentation
  • 89. Agile Data Warehouse Design, Lawrence Corr, Jim Stagnitto, DecisionOne Press, November 2011