SlideShare una empresa de Scribd logo
1 de 25
The Application of Data Vault to DW2.0 © Dan Linstedt, 2011-2012 all rights reserved
A bit about me… 2 Author, Inventor, Speaker – and part time photographer… 25+ years in the IT industry Worked in DoD, US Gov’t, Fortune 50, and so on… Find out more about the Data Vault: http://www.youtube.com/LearnDataVault http://LearnDataVault.com Full profile on http://www.LinkedIn.com/dlinstedt
Agenda Defining The Needs for the Data Vault DW2.0 Architecture DW2.0 Drivers for Data Modeling Divergence of Data Models over Time Data Vault in DW2.0 Defining the Data Vault What does one look like? Modeling in DW2.0 Applying Data Vault to Global DW2.0 Applying Data Vault to Time-Value DW2.0 Compliance in DW2.0 Applying Data Vault to System of Record The Paradox of DW2.0 Volume, Latency, Complexity,Normalization andTransformation ability 10/5/2011 Do Not Duplicate Without Written Permission 3
DW2.0 Architecture 10/5/2011 Do Not Duplicate Without Written Permission 4 Enterprise Service Bus ESB Connectivity: ,[object Object]
EII
ETL / ELT
Web ServicesCube  Processing Temporal Indexing Semantic Management Active  Data Mining Transformation Active Cleansing Unstructured Data: ,[object Object]
Plain Text
Word Docs
ImagesM E T A D A T A Interactive Tactical Data Models Must be consistently applied throughout all layers. Integrated Strategic ESB Management: ,[object Object]
Email
Spread Sheets
Transaction
Structured InformationNear-Line Extended Archival Historical Enterprise Data Warehouse
DW2.0 Drivers for Data Modeling 10/5/2011 Do Not Duplicate Without Written Permission 5 Technical Drivers Business Drivers Flexibility Compliance Volume Frequency Data Model Data Model Understandability Granularity Data Models are one of the main integration points between Technical and Business drivers. Business Keys drive understandability, and granularity Normalization drives flexibility, and frequency of load Raw data sets in the EDW/ADW drive compliance and volume
Divergence of Data Models over Time Data models (both logical and physical) have diverged from business drivers and direction over time. The Data Models have driven towards physical improvements instead of towards business improvements. The Data Vault Architecture drives data modeling back to the business sides of the house. 10/5/2011 Do Not Duplicate Without Written Permission 6
Agenda Defining The Needs for the Data Vault DW2.0 Architecture DW2.0 Drivers for Data Modeling Divergence of Data Models over Time Data Vault in DW2.0 Defining the Data Vault What does one look like? Modeling in DW2.0 Applying Data Vault to Global DW2.0 Applying Data Vault to Time-Value DW2.0 Compliance in DW2.0 Applying Data Vault to System of Record The Paradox of DW2.0 Volume, Latency, Complexity,Normalization andTransformation ability 10/5/2011 Do Not Duplicate Without Written Permission 7 Image is from - What The Bleep Do We Know?
Defining the Data Vault 10/5/2011 Do Not Duplicate Without Written Permission 8 The Data Vault is a detail oriented, historical tracking and uniquely linked set of normalized tables that support one or more functional areas of business.  It is a hybrid approach encompassing the best of breed between 3rd normal form (3NF) and star schema. The design is flexible, scalable, consistent and adaptable to the needs of the enterprise. It is a data model that is architected specifically to meet the needs of today’s enterprise data warehouses. Defining the Data Vault TDAN.com Article
What Does One Look Like? 10/5/2011 Do Not Duplicate Without Written Permission 9 Records a history of the interaction Account Information Sat Sat Sat Link Account F(x) F(x) Sat Sat Invoice ID Sat F(x) Sat Invoice / Billing Information Customer Information Sat Elements: ,[object Object]
Link
SatelliteSat Customer F(x) Sat The impact of linking disparate systems together, is inside the shaded area.
Modeling in DW2.0 Bill Says: DW2.0 must be brought down to a very finite level of detail. The starting point for DW2.0 is the modeling process. The data model applies to the integrated sector, the near line sector, and the archival sector. The way that data warehouses are built is in an incremental manner The Data Vault specializes in: Providing finite grain at the lowest level possible, Mapping business process models to data models Existing in all sectors simultaneously without changes. Flexibility and managing change so that impacts are not a mile-wide and 10 miles deep. 10/5/2011 Do Not Duplicate Without Written Permission 10
Elements in a Data Vault Hub Unique List of Business Keys, tracked by the first time the warehouse saw them appear. Link Relationships between business keys, also representing a grain shift, or a hierarchical roll-up. Satellite Data over time, granular, and descriptive about the business key.  Also setup according to type of information, and rate of change. 10/5/2011 Do Not Duplicate Without Written Permission 11
Applying the Data Vault to Global DW2.0 10/5/2011 Do Not Duplicate Without Written Permission 12 Manufacturing EDW  in China Planning in Brazil Hub Hub Link Sat Sat Link Sat Sat Link Hub Link Hub Hub Sat Sat Sat Sat Sat Sat Sat Sat Base EDW Created in Corporate Financials in USA
Applying the Data Vault to Time-Value DW2.0 10/5/2011 Do Not Duplicate Without Written Permission 13 Satellite Data Over Time Row 1 Row 2 Row 3 Row 4 Satellite entities in the Data Vault house data over time.  They are split by type of information and rate of change.  This is an example set of data for a customer name satellite.

Más contenido relacionado

La actualidad más candente

Building an Effective Data Warehouse Architecture
Building an Effective Data Warehouse ArchitectureBuilding an Effective Data Warehouse Architecture
Building an Effective Data Warehouse ArchitectureJames Serra
 
Intro to Data Vault 2.0 on Snowflake
Intro to Data Vault 2.0 on SnowflakeIntro to Data Vault 2.0 on Snowflake
Intro to Data Vault 2.0 on SnowflakeKent Graziano
 
Agile Data Mining with Data Vault 2.0 (english)
Agile Data Mining with Data Vault 2.0 (english)Agile Data Mining with Data Vault 2.0 (english)
Agile Data Mining with Data Vault 2.0 (english)Michael Olschimke
 
Building a modern data warehouse
Building a modern data warehouseBuilding a modern data warehouse
Building a modern data warehouseJames Serra
 
Data Warehousing Trends, Best Practices, and Future Outlook
Data Warehousing Trends, Best Practices, and Future OutlookData Warehousing Trends, Best Practices, and Future Outlook
Data Warehousing Trends, Best Practices, and Future OutlookJames Serra
 
Data Warehouse or Data Lake, Which Do I Choose?
Data Warehouse or Data Lake, Which Do I Choose?Data Warehouse or Data Lake, Which Do I Choose?
Data Warehouse or Data Lake, Which Do I Choose?DATAVERSITY
 
Modern Data architecture Design
Modern Data architecture DesignModern Data architecture Design
Modern Data architecture DesignKujambu Murugesan
 
Date warehousing concepts
Date warehousing conceptsDate warehousing concepts
Date warehousing conceptspcherukumalla
 
Data Catalog for Better Data Discovery and Governance
Data Catalog for Better Data Discovery and GovernanceData Catalog for Better Data Discovery and Governance
Data Catalog for Better Data Discovery and GovernanceDenodo
 
Modern Data Warehousing with the Microsoft Analytics Platform System
Modern Data Warehousing with the Microsoft Analytics Platform SystemModern Data Warehousing with the Microsoft Analytics Platform System
Modern Data Warehousing with the Microsoft Analytics Platform SystemJames Serra
 
Data Warehouse Agility Array Conference2011
Data Warehouse Agility Array Conference2011Data Warehouse Agility Array Conference2011
Data Warehouse Agility Array Conference2011Hans Hultgren
 
Agile BI via Data Vault and Modelstorming
Agile BI via Data Vault and ModelstormingAgile BI via Data Vault and Modelstorming
Agile BI via Data Vault and ModelstormingDaniel Upton
 
Data Catalog as the Platform for Data Intelligence
Data Catalog as the Platform for Data IntelligenceData Catalog as the Platform for Data Intelligence
Data Catalog as the Platform for Data IntelligenceAlation
 
Cloud Data Warehouses
Cloud Data WarehousesCloud Data Warehouses
Cloud Data WarehousesAsis Mohanty
 
Modern Data Architecture
Modern Data Architecture Modern Data Architecture
Modern Data Architecture Mark Hewitt
 
Introduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse ArchitectureIntroduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse ArchitectureDatabricks
 
Modernizing to a Cloud Data Architecture
Modernizing to a Cloud Data ArchitectureModernizing to a Cloud Data Architecture
Modernizing to a Cloud Data ArchitectureDatabricks
 
(OTW13) Agile Data Warehousing: Introduction to Data Vault Modeling
(OTW13) Agile Data Warehousing: Introduction to Data Vault Modeling(OTW13) Agile Data Warehousing: Introduction to Data Vault Modeling
(OTW13) Agile Data Warehousing: Introduction to Data Vault ModelingKent Graziano
 

La actualidad más candente (20)

Building an Effective Data Warehouse Architecture
Building an Effective Data Warehouse ArchitectureBuilding an Effective Data Warehouse Architecture
Building an Effective Data Warehouse Architecture
 
Intro to Data Vault 2.0 on Snowflake
Intro to Data Vault 2.0 on SnowflakeIntro to Data Vault 2.0 on Snowflake
Intro to Data Vault 2.0 on Snowflake
 
Agile Data Mining with Data Vault 2.0 (english)
Agile Data Mining with Data Vault 2.0 (english)Agile Data Mining with Data Vault 2.0 (english)
Agile Data Mining with Data Vault 2.0 (english)
 
Building a modern data warehouse
Building a modern data warehouseBuilding a modern data warehouse
Building a modern data warehouse
 
Data Warehousing Trends, Best Practices, and Future Outlook
Data Warehousing Trends, Best Practices, and Future OutlookData Warehousing Trends, Best Practices, and Future Outlook
Data Warehousing Trends, Best Practices, and Future Outlook
 
Data Warehouse or Data Lake, Which Do I Choose?
Data Warehouse or Data Lake, Which Do I Choose?Data Warehouse or Data Lake, Which Do I Choose?
Data Warehouse or Data Lake, Which Do I Choose?
 
Modern Data architecture Design
Modern Data architecture DesignModern Data architecture Design
Modern Data architecture Design
 
Date warehousing concepts
Date warehousing conceptsDate warehousing concepts
Date warehousing concepts
 
Data Catalog for Better Data Discovery and Governance
Data Catalog for Better Data Discovery and GovernanceData Catalog for Better Data Discovery and Governance
Data Catalog for Better Data Discovery and Governance
 
Modern Data Warehousing with the Microsoft Analytics Platform System
Modern Data Warehousing with the Microsoft Analytics Platform SystemModern Data Warehousing with the Microsoft Analytics Platform System
Modern Data Warehousing with the Microsoft Analytics Platform System
 
Data Warehouse Agility Array Conference2011
Data Warehouse Agility Array Conference2011Data Warehouse Agility Array Conference2011
Data Warehouse Agility Array Conference2011
 
Agile BI via Data Vault and Modelstorming
Agile BI via Data Vault and ModelstormingAgile BI via Data Vault and Modelstorming
Agile BI via Data Vault and Modelstorming
 
Data Catalog as the Platform for Data Intelligence
Data Catalog as the Platform for Data IntelligenceData Catalog as the Platform for Data Intelligence
Data Catalog as the Platform for Data Intelligence
 
Data mesh
Data meshData mesh
Data mesh
 
Data Mesh 101
Data Mesh 101Data Mesh 101
Data Mesh 101
 
Cloud Data Warehouses
Cloud Data WarehousesCloud Data Warehouses
Cloud Data Warehouses
 
Modern Data Architecture
Modern Data Architecture Modern Data Architecture
Modern Data Architecture
 
Introduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse ArchitectureIntroduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse Architecture
 
Modernizing to a Cloud Data Architecture
Modernizing to a Cloud Data ArchitectureModernizing to a Cloud Data Architecture
Modernizing to a Cloud Data Architecture
 
(OTW13) Agile Data Warehousing: Introduction to Data Vault Modeling
(OTW13) Agile Data Warehousing: Introduction to Data Vault Modeling(OTW13) Agile Data Warehousing: Introduction to Data Vault Modeling
(OTW13) Agile Data Warehousing: Introduction to Data Vault Modeling
 

Destacado

IRM UK - 2009: DV Modeling And Methodology
IRM UK - 2009: DV Modeling And MethodologyIRM UK - 2009: DV Modeling And Methodology
IRM UK - 2009: DV Modeling And MethodologyEmpowered Holdings, LLC
 
Présentation data vault et bi v20120508
Présentation data vault et bi v20120508Présentation data vault et bi v20120508
Présentation data vault et bi v20120508Empowered Holdings, LLC
 
Best Practices: Data Admin & Data Management
Best Practices: Data Admin & Data ManagementBest Practices: Data Admin & Data Management
Best Practices: Data Admin & Data ManagementEmpowered Holdings, LLC
 
Oracle Database Vault
Oracle Database VaultOracle Database Vault
Oracle Database VaultKhalid ALLILI
 
Data vault seminar May 5-6 Dommel - The factory and the workshop
Data vault seminar May 5-6 Dommel - The factory and the workshopData vault seminar May 5-6 Dommel - The factory and the workshop
Data vault seminar May 5-6 Dommel - The factory and the workshopjohannesvdb
 
Atul Randive CV_IKnowSolutions_ENv2
Atul Randive CV_IKnowSolutions_ENv2Atul Randive CV_IKnowSolutions_ENv2
Atul Randive CV_IKnowSolutions_ENv2atul randive
 
Data Vault ReConnect Speed Presenting PM Part Four
Data Vault ReConnect Speed Presenting PM Part FourData Vault ReConnect Speed Presenting PM Part Four
Data Vault ReConnect Speed Presenting PM Part FourHans Hultgren
 
Lean Data Warehouse via Data Vault
Lean Data Warehouse via Data VaultLean Data Warehouse via Data Vault
Lean Data Warehouse via Data VaultDaniel Upton
 
Data Vault ReConnect Speed Presenting AM Part One
Data Vault ReConnect Speed Presenting AM Part OneData Vault ReConnect Speed Presenting AM Part One
Data Vault ReConnect Speed Presenting AM Part OneHans Hultgren
 
Data Vault ReConnect Speed Presenting AM Part Two
Data Vault ReConnect Speed Presenting AM Part TwoData Vault ReConnect Speed Presenting AM Part Two
Data Vault ReConnect Speed Presenting AM Part TwoHans Hultgren
 
Data Vault ReConnect Speed Presenting PM Part Three
Data Vault ReConnect Speed Presenting PM Part ThreeData Vault ReConnect Speed Presenting PM Part Three
Data Vault ReConnect Speed Presenting PM Part ThreeHans Hultgren
 
Guru4Pro Data Vault Best Practices
Guru4Pro Data Vault Best PracticesGuru4Pro Data Vault Best Practices
Guru4Pro Data Vault Best PracticesCGI
 
Metadaten und Data Vault (Meta Vault)
Metadaten und Data Vault (Meta Vault)Metadaten und Data Vault (Meta Vault)
Metadaten und Data Vault (Meta Vault)Andreas Buckenhofer
 
CDC und Data Vault für den Aufbau eines DWH in der Automobilindustrie
CDC und Data Vault für den Aufbau eines DWH in der AutomobilindustrieCDC und Data Vault für den Aufbau eines DWH in der Automobilindustrie
CDC und Data Vault für den Aufbau eines DWH in der AutomobilindustrieAndreas Buckenhofer
 
Data Vault: Data Warehouse Design Goes Agile
Data Vault: Data Warehouse Design Goes AgileData Vault: Data Warehouse Design Goes Agile
Data Vault: Data Warehouse Design Goes AgileDaniel Upton
 
Agile Data Warehouse Design for Big Data Presentation
Agile Data Warehouse Design for Big Data PresentationAgile Data Warehouse Design for Big Data Presentation
Agile Data Warehouse Design for Big Data PresentationVishal Kumar
 

Destacado (20)

Data vault what's Next: Part 2
Data vault what's Next: Part 2Data vault what's Next: Part 2
Data vault what's Next: Part 2
 
IRM UK - 2009: DV Modeling And Methodology
IRM UK - 2009: DV Modeling And MethodologyIRM UK - 2009: DV Modeling And Methodology
IRM UK - 2009: DV Modeling And Methodology
 
Data vault: What's Next
Data vault: What's NextData vault: What's Next
Data vault: What's Next
 
Présentation data vault et bi v20120508
Présentation data vault et bi v20120508Présentation data vault et bi v20120508
Présentation data vault et bi v20120508
 
Best Practices: Data Admin & Data Management
Best Practices: Data Admin & Data ManagementBest Practices: Data Admin & Data Management
Best Practices: Data Admin & Data Management
 
Visual Data Vault
Visual Data VaultVisual Data Vault
Visual Data Vault
 
Oracle Database Vault
Oracle Database VaultOracle Database Vault
Oracle Database Vault
 
Data vault seminar May 5-6 Dommel - The factory and the workshop
Data vault seminar May 5-6 Dommel - The factory and the workshopData vault seminar May 5-6 Dommel - The factory and the workshop
Data vault seminar May 5-6 Dommel - The factory and the workshop
 
Atul Randive CV_IKnowSolutions_ENv2
Atul Randive CV_IKnowSolutions_ENv2Atul Randive CV_IKnowSolutions_ENv2
Atul Randive CV_IKnowSolutions_ENv2
 
Data Vault ReConnect Speed Presenting PM Part Four
Data Vault ReConnect Speed Presenting PM Part FourData Vault ReConnect Speed Presenting PM Part Four
Data Vault ReConnect Speed Presenting PM Part Four
 
Lean Data Warehouse via Data Vault
Lean Data Warehouse via Data VaultLean Data Warehouse via Data Vault
Lean Data Warehouse via Data Vault
 
Data Vault ReConnect Speed Presenting AM Part One
Data Vault ReConnect Speed Presenting AM Part OneData Vault ReConnect Speed Presenting AM Part One
Data Vault ReConnect Speed Presenting AM Part One
 
Data Vault ReConnect Speed Presenting AM Part Two
Data Vault ReConnect Speed Presenting AM Part TwoData Vault ReConnect Speed Presenting AM Part Two
Data Vault ReConnect Speed Presenting AM Part Two
 
Data Vault ReConnect Speed Presenting PM Part Three
Data Vault ReConnect Speed Presenting PM Part ThreeData Vault ReConnect Speed Presenting PM Part Three
Data Vault ReConnect Speed Presenting PM Part Three
 
Guru4Pro Data Vault Best Practices
Guru4Pro Data Vault Best PracticesGuru4Pro Data Vault Best Practices
Guru4Pro Data Vault Best Practices
 
Metadaten und Data Vault (Meta Vault)
Metadaten und Data Vault (Meta Vault)Metadaten und Data Vault (Meta Vault)
Metadaten und Data Vault (Meta Vault)
 
CDC und Data Vault für den Aufbau eines DWH in der Automobilindustrie
CDC und Data Vault für den Aufbau eines DWH in der AutomobilindustrieCDC und Data Vault für den Aufbau eines DWH in der Automobilindustrie
CDC und Data Vault für den Aufbau eines DWH in der Automobilindustrie
 
Data Vault: Data Warehouse Design Goes Agile
Data Vault: Data Warehouse Design Goes AgileData Vault: Data Warehouse Design Goes Agile
Data Vault: Data Warehouse Design Goes Agile
 
Big Data Modeling
Big Data ModelingBig Data Modeling
Big Data Modeling
 
Agile Data Warehouse Design for Big Data Presentation
Agile Data Warehouse Design for Big Data PresentationAgile Data Warehouse Design for Big Data Presentation
Agile Data Warehouse Design for Big Data Presentation
 

Similar a Data Vault and DW2.0

Data Virtualization: From Zero to Hero
Data Virtualization: From Zero to HeroData Virtualization: From Zero to Hero
Data Virtualization: From Zero to HeroDenodo
 
Data Virtualization: An Introduction
Data Virtualization: An IntroductionData Virtualization: An Introduction
Data Virtualization: An IntroductionDenodo
 
Logical Data Warehouse and Data Lakes
Logical Data Warehouse and Data Lakes Logical Data Warehouse and Data Lakes
Logical Data Warehouse and Data Lakes Denodo
 
Introduction to data vault ilja dmitrijev
Introduction to data vault   ilja dmitrijevIntroduction to data vault   ilja dmitrijev
Introduction to data vault ilja dmitrijevIlja Dmitrijevs
 
Introduction to Modern Data Virtualization 2021 (APAC)
Introduction to Modern Data Virtualization 2021 (APAC)Introduction to Modern Data Virtualization 2021 (APAC)
Introduction to Modern Data Virtualization 2021 (APAC)Denodo
 
Data Ninja Webinar Series: Accelerating Business Value with Data Virtualizati...
Data Ninja Webinar Series: Accelerating Business Value with Data Virtualizati...Data Ninja Webinar Series: Accelerating Business Value with Data Virtualizati...
Data Ninja Webinar Series: Accelerating Business Value with Data Virtualizati...Denodo
 
Data Ninja Webinar Series: Realizing the Promise of Data Lakes
Data Ninja Webinar Series: Realizing the Promise of Data LakesData Ninja Webinar Series: Realizing the Promise of Data Lakes
Data Ninja Webinar Series: Realizing the Promise of Data LakesDenodo
 
Data API as a Foundation for Systems of Engagement
Data API as a Foundation for Systems of EngagementData API as a Foundation for Systems of Engagement
Data API as a Foundation for Systems of EngagementVictor Olex
 
Data warehouse 2.0 and sql server architecture and vision
Data warehouse 2.0 and sql server architecture and visionData warehouse 2.0 and sql server architecture and vision
Data warehouse 2.0 and sql server architecture and visionKlaudiia Jacome
 
¿Cómo modernizar una arquitectura de TI con la virtualización de datos?
¿Cómo modernizar una arquitectura de TI con la virtualización de datos?¿Cómo modernizar una arquitectura de TI con la virtualización de datos?
¿Cómo modernizar una arquitectura de TI con la virtualización de datos?Denodo
 
Traditional BI vs. Business Data Lake – A Comparison
Traditional BI vs. Business Data Lake – A ComparisonTraditional BI vs. Business Data Lake – A Comparison
Traditional BI vs. Business Data Lake – A ComparisonCapgemini
 
Data Virtualization: An Introduction
Data Virtualization: An IntroductionData Virtualization: An Introduction
Data Virtualization: An IntroductionDenodo
 
The technology of the business data lake
The technology of the business data lakeThe technology of the business data lake
The technology of the business data lakeCapgemini
 
Webinar future dataintegration-datamesh-and-goldengatekafka
Webinar future dataintegration-datamesh-and-goldengatekafkaWebinar future dataintegration-datamesh-and-goldengatekafka
Webinar future dataintegration-datamesh-and-goldengatekafkaJeffrey T. Pollock
 
Why Data Virtualization? An Introduction
Why Data Virtualization? An IntroductionWhy Data Virtualization? An Introduction
Why Data Virtualization? An IntroductionDenodo
 
Data Vault 2.0 Demystified: East Coast Tour
Data Vault 2.0 Demystified: East Coast TourData Vault 2.0 Demystified: East Coast Tour
Data Vault 2.0 Demystified: East Coast TourWhereScape
 
Sql server briefing sept
Sql server briefing septSql server briefing sept
Sql server briefing septMark Kromer
 
An Overview of Data Lake
An Overview of Data LakeAn Overview of Data Lake
An Overview of Data LakeIRJET Journal
 
Logical Data Warehouse: How to Build a Virtualized Data Services Layer
Logical Data Warehouse: How to Build a Virtualized Data Services LayerLogical Data Warehouse: How to Build a Virtualized Data Services Layer
Logical Data Warehouse: How to Build a Virtualized Data Services LayerDataWorks Summit
 

Similar a Data Vault and DW2.0 (20)

Data vault
Data vaultData vault
Data vault
 
Data Virtualization: From Zero to Hero
Data Virtualization: From Zero to HeroData Virtualization: From Zero to Hero
Data Virtualization: From Zero to Hero
 
Data Virtualization: An Introduction
Data Virtualization: An IntroductionData Virtualization: An Introduction
Data Virtualization: An Introduction
 
Logical Data Warehouse and Data Lakes
Logical Data Warehouse and Data Lakes Logical Data Warehouse and Data Lakes
Logical Data Warehouse and Data Lakes
 
Introduction to data vault ilja dmitrijev
Introduction to data vault   ilja dmitrijevIntroduction to data vault   ilja dmitrijev
Introduction to data vault ilja dmitrijev
 
Introduction to Modern Data Virtualization 2021 (APAC)
Introduction to Modern Data Virtualization 2021 (APAC)Introduction to Modern Data Virtualization 2021 (APAC)
Introduction to Modern Data Virtualization 2021 (APAC)
 
Data Ninja Webinar Series: Accelerating Business Value with Data Virtualizati...
Data Ninja Webinar Series: Accelerating Business Value with Data Virtualizati...Data Ninja Webinar Series: Accelerating Business Value with Data Virtualizati...
Data Ninja Webinar Series: Accelerating Business Value with Data Virtualizati...
 
Data Ninja Webinar Series: Realizing the Promise of Data Lakes
Data Ninja Webinar Series: Realizing the Promise of Data LakesData Ninja Webinar Series: Realizing the Promise of Data Lakes
Data Ninja Webinar Series: Realizing the Promise of Data Lakes
 
Data API as a Foundation for Systems of Engagement
Data API as a Foundation for Systems of EngagementData API as a Foundation for Systems of Engagement
Data API as a Foundation for Systems of Engagement
 
Data warehouse 2.0 and sql server architecture and vision
Data warehouse 2.0 and sql server architecture and visionData warehouse 2.0 and sql server architecture and vision
Data warehouse 2.0 and sql server architecture and vision
 
¿Cómo modernizar una arquitectura de TI con la virtualización de datos?
¿Cómo modernizar una arquitectura de TI con la virtualización de datos?¿Cómo modernizar una arquitectura de TI con la virtualización de datos?
¿Cómo modernizar una arquitectura de TI con la virtualización de datos?
 
Traditional BI vs. Business Data Lake – A Comparison
Traditional BI vs. Business Data Lake – A ComparisonTraditional BI vs. Business Data Lake – A Comparison
Traditional BI vs. Business Data Lake – A Comparison
 
Data Virtualization: An Introduction
Data Virtualization: An IntroductionData Virtualization: An Introduction
Data Virtualization: An Introduction
 
The technology of the business data lake
The technology of the business data lakeThe technology of the business data lake
The technology of the business data lake
 
Webinar future dataintegration-datamesh-and-goldengatekafka
Webinar future dataintegration-datamesh-and-goldengatekafkaWebinar future dataintegration-datamesh-and-goldengatekafka
Webinar future dataintegration-datamesh-and-goldengatekafka
 
Why Data Virtualization? An Introduction
Why Data Virtualization? An IntroductionWhy Data Virtualization? An Introduction
Why Data Virtualization? An Introduction
 
Data Vault 2.0 Demystified: East Coast Tour
Data Vault 2.0 Demystified: East Coast TourData Vault 2.0 Demystified: East Coast Tour
Data Vault 2.0 Demystified: East Coast Tour
 
Sql server briefing sept
Sql server briefing septSql server briefing sept
Sql server briefing sept
 
An Overview of Data Lake
An Overview of Data LakeAn Overview of Data Lake
An Overview of Data Lake
 
Logical Data Warehouse: How to Build a Virtualized Data Services Layer
Logical Data Warehouse: How to Build a Virtualized Data Services LayerLogical Data Warehouse: How to Build a Virtualized Data Services Layer
Logical Data Warehouse: How to Build a Virtualized Data Services Layer
 

Último

International Business Environments and Operations 16th Global Edition test b...
International Business Environments and Operations 16th Global Edition test b...International Business Environments and Operations 16th Global Edition test b...
International Business Environments and Operations 16th Global Edition test b...ssuserf63bd7
 
Intermediate Accounting, Volume 2, 13th Canadian Edition by Donald E. Kieso t...
Intermediate Accounting, Volume 2, 13th Canadian Edition by Donald E. Kieso t...Intermediate Accounting, Volume 2, 13th Canadian Edition by Donald E. Kieso t...
Intermediate Accounting, Volume 2, 13th Canadian Edition by Donald E. Kieso t...ssuserf63bd7
 
Buy gmail accounts.pdf Buy Old Gmail Accounts
Buy gmail accounts.pdf Buy Old Gmail AccountsBuy gmail accounts.pdf Buy Old Gmail Accounts
Buy gmail accounts.pdf Buy Old Gmail AccountsBuy Verified Accounts
 
Organizational Structure Running A Successful Business
Organizational Structure Running A Successful BusinessOrganizational Structure Running A Successful Business
Organizational Structure Running A Successful BusinessSeta Wicaksana
 
Call US-88OO1O2216 Call Girls In Mahipalpur Female Escort Service
Call US-88OO1O2216 Call Girls In Mahipalpur Female Escort ServiceCall US-88OO1O2216 Call Girls In Mahipalpur Female Escort Service
Call US-88OO1O2216 Call Girls In Mahipalpur Female Escort Servicecallgirls2057
 
8447779800, Low rate Call girls in Tughlakabad Delhi NCR
8447779800, Low rate Call girls in Tughlakabad Delhi NCR8447779800, Low rate Call girls in Tughlakabad Delhi NCR
8447779800, Low rate Call girls in Tughlakabad Delhi NCRashishs7044
 
1911 Gold Corporate Presentation Apr 2024.pdf
1911 Gold Corporate Presentation Apr 2024.pdf1911 Gold Corporate Presentation Apr 2024.pdf
1911 Gold Corporate Presentation Apr 2024.pdfShaun Heinrichs
 
Flow Your Strategy at Flight Levels Day 2024
Flow Your Strategy at Flight Levels Day 2024Flow Your Strategy at Flight Levels Day 2024
Flow Your Strategy at Flight Levels Day 2024Kirill Klimov
 
Church Building Grants To Assist With New Construction, Additions, And Restor...
Church Building Grants To Assist With New Construction, Additions, And Restor...Church Building Grants To Assist With New Construction, Additions, And Restor...
Church Building Grants To Assist With New Construction, Additions, And Restor...Americas Got Grants
 
8447779800, Low rate Call girls in Dwarka mor Delhi NCR
8447779800, Low rate Call girls in Dwarka mor Delhi NCR8447779800, Low rate Call girls in Dwarka mor Delhi NCR
8447779800, Low rate Call girls in Dwarka mor Delhi NCRashishs7044
 
Memorándum de Entendimiento (MoU) entre Codelco y SQM
Memorándum de Entendimiento (MoU) entre Codelco y SQMMemorándum de Entendimiento (MoU) entre Codelco y SQM
Memorándum de Entendimiento (MoU) entre Codelco y SQMVoces Mineras
 
Kenya Coconut Production Presentation by Dr. Lalith Perera
Kenya Coconut Production Presentation by Dr. Lalith PereraKenya Coconut Production Presentation by Dr. Lalith Perera
Kenya Coconut Production Presentation by Dr. Lalith Pereraictsugar
 
Market Sizes Sample Report - 2024 Edition
Market Sizes Sample Report - 2024 EditionMarket Sizes Sample Report - 2024 Edition
Market Sizes Sample Report - 2024 EditionMintel Group
 
Entrepreneurship lessons in Philippines
Entrepreneurship lessons in  PhilippinesEntrepreneurship lessons in  Philippines
Entrepreneurship lessons in PhilippinesDavidSamuel525586
 
Fordham -How effective decision-making is within the IT department - Analysis...
Fordham -How effective decision-making is within the IT department - Analysis...Fordham -How effective decision-making is within the IT department - Analysis...
Fordham -How effective decision-making is within the IT department - Analysis...Peter Ward
 
8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCR
8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCR8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCR
8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCRashishs7044
 
Horngren’s Financial & Managerial Accounting, 7th edition by Miller-Nobles so...
Horngren’s Financial & Managerial Accounting, 7th edition by Miller-Nobles so...Horngren’s Financial & Managerial Accounting, 7th edition by Miller-Nobles so...
Horngren’s Financial & Managerial Accounting, 7th edition by Miller-Nobles so...ssuserf63bd7
 
Send Files | Sendbig.comSend Files | Sendbig.com
Send Files | Sendbig.comSend Files | Sendbig.comSend Files | Sendbig.comSend Files | Sendbig.com
Send Files | Sendbig.comSend Files | Sendbig.comSendBig4
 
Chapter 9 PPT 4th edition.pdf internal audit
Chapter 9 PPT 4th edition.pdf internal auditChapter 9 PPT 4th edition.pdf internal audit
Chapter 9 PPT 4th edition.pdf internal auditNhtLNguyn9
 

Último (20)

International Business Environments and Operations 16th Global Edition test b...
International Business Environments and Operations 16th Global Edition test b...International Business Environments and Operations 16th Global Edition test b...
International Business Environments and Operations 16th Global Edition test b...
 
Intermediate Accounting, Volume 2, 13th Canadian Edition by Donald E. Kieso t...
Intermediate Accounting, Volume 2, 13th Canadian Edition by Donald E. Kieso t...Intermediate Accounting, Volume 2, 13th Canadian Edition by Donald E. Kieso t...
Intermediate Accounting, Volume 2, 13th Canadian Edition by Donald E. Kieso t...
 
Buy gmail accounts.pdf Buy Old Gmail Accounts
Buy gmail accounts.pdf Buy Old Gmail AccountsBuy gmail accounts.pdf Buy Old Gmail Accounts
Buy gmail accounts.pdf Buy Old Gmail Accounts
 
Organizational Structure Running A Successful Business
Organizational Structure Running A Successful BusinessOrganizational Structure Running A Successful Business
Organizational Structure Running A Successful Business
 
Call US-88OO1O2216 Call Girls In Mahipalpur Female Escort Service
Call US-88OO1O2216 Call Girls In Mahipalpur Female Escort ServiceCall US-88OO1O2216 Call Girls In Mahipalpur Female Escort Service
Call US-88OO1O2216 Call Girls In Mahipalpur Female Escort Service
 
8447779800, Low rate Call girls in Tughlakabad Delhi NCR
8447779800, Low rate Call girls in Tughlakabad Delhi NCR8447779800, Low rate Call girls in Tughlakabad Delhi NCR
8447779800, Low rate Call girls in Tughlakabad Delhi NCR
 
1911 Gold Corporate Presentation Apr 2024.pdf
1911 Gold Corporate Presentation Apr 2024.pdf1911 Gold Corporate Presentation Apr 2024.pdf
1911 Gold Corporate Presentation Apr 2024.pdf
 
No-1 Call Girls In Goa 93193 VIP 73153 Escort service In North Goa Panaji, Ca...
No-1 Call Girls In Goa 93193 VIP 73153 Escort service In North Goa Panaji, Ca...No-1 Call Girls In Goa 93193 VIP 73153 Escort service In North Goa Panaji, Ca...
No-1 Call Girls In Goa 93193 VIP 73153 Escort service In North Goa Panaji, Ca...
 
Flow Your Strategy at Flight Levels Day 2024
Flow Your Strategy at Flight Levels Day 2024Flow Your Strategy at Flight Levels Day 2024
Flow Your Strategy at Flight Levels Day 2024
 
Church Building Grants To Assist With New Construction, Additions, And Restor...
Church Building Grants To Assist With New Construction, Additions, And Restor...Church Building Grants To Assist With New Construction, Additions, And Restor...
Church Building Grants To Assist With New Construction, Additions, And Restor...
 
8447779800, Low rate Call girls in Dwarka mor Delhi NCR
8447779800, Low rate Call girls in Dwarka mor Delhi NCR8447779800, Low rate Call girls in Dwarka mor Delhi NCR
8447779800, Low rate Call girls in Dwarka mor Delhi NCR
 
Memorándum de Entendimiento (MoU) entre Codelco y SQM
Memorándum de Entendimiento (MoU) entre Codelco y SQMMemorándum de Entendimiento (MoU) entre Codelco y SQM
Memorándum de Entendimiento (MoU) entre Codelco y SQM
 
Kenya Coconut Production Presentation by Dr. Lalith Perera
Kenya Coconut Production Presentation by Dr. Lalith PereraKenya Coconut Production Presentation by Dr. Lalith Perera
Kenya Coconut Production Presentation by Dr. Lalith Perera
 
Market Sizes Sample Report - 2024 Edition
Market Sizes Sample Report - 2024 EditionMarket Sizes Sample Report - 2024 Edition
Market Sizes Sample Report - 2024 Edition
 
Entrepreneurship lessons in Philippines
Entrepreneurship lessons in  PhilippinesEntrepreneurship lessons in  Philippines
Entrepreneurship lessons in Philippines
 
Fordham -How effective decision-making is within the IT department - Analysis...
Fordham -How effective decision-making is within the IT department - Analysis...Fordham -How effective decision-making is within the IT department - Analysis...
Fordham -How effective decision-making is within the IT department - Analysis...
 
8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCR
8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCR8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCR
8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCR
 
Horngren’s Financial & Managerial Accounting, 7th edition by Miller-Nobles so...
Horngren’s Financial & Managerial Accounting, 7th edition by Miller-Nobles so...Horngren’s Financial & Managerial Accounting, 7th edition by Miller-Nobles so...
Horngren’s Financial & Managerial Accounting, 7th edition by Miller-Nobles so...
 
Send Files | Sendbig.comSend Files | Sendbig.com
Send Files | Sendbig.comSend Files | Sendbig.comSend Files | Sendbig.comSend Files | Sendbig.com
Send Files | Sendbig.comSend Files | Sendbig.com
 
Chapter 9 PPT 4th edition.pdf internal audit
Chapter 9 PPT 4th edition.pdf internal auditChapter 9 PPT 4th edition.pdf internal audit
Chapter 9 PPT 4th edition.pdf internal audit
 

Data Vault and DW2.0

  • 1. The Application of Data Vault to DW2.0 © Dan Linstedt, 2011-2012 all rights reserved
  • 2. A bit about me… 2 Author, Inventor, Speaker – and part time photographer… 25+ years in the IT industry Worked in DoD, US Gov’t, Fortune 50, and so on… Find out more about the Data Vault: http://www.youtube.com/LearnDataVault http://LearnDataVault.com Full profile on http://www.LinkedIn.com/dlinstedt
  • 3. Agenda Defining The Needs for the Data Vault DW2.0 Architecture DW2.0 Drivers for Data Modeling Divergence of Data Models over Time Data Vault in DW2.0 Defining the Data Vault What does one look like? Modeling in DW2.0 Applying Data Vault to Global DW2.0 Applying Data Vault to Time-Value DW2.0 Compliance in DW2.0 Applying Data Vault to System of Record The Paradox of DW2.0 Volume, Latency, Complexity,Normalization andTransformation ability 10/5/2011 Do Not Duplicate Without Written Permission 3
  • 4.
  • 5. EII
  • 7.
  • 10.
  • 11. Email
  • 14. Structured InformationNear-Line Extended Archival Historical Enterprise Data Warehouse
  • 15. DW2.0 Drivers for Data Modeling 10/5/2011 Do Not Duplicate Without Written Permission 5 Technical Drivers Business Drivers Flexibility Compliance Volume Frequency Data Model Data Model Understandability Granularity Data Models are one of the main integration points between Technical and Business drivers. Business Keys drive understandability, and granularity Normalization drives flexibility, and frequency of load Raw data sets in the EDW/ADW drive compliance and volume
  • 16. Divergence of Data Models over Time Data models (both logical and physical) have diverged from business drivers and direction over time. The Data Models have driven towards physical improvements instead of towards business improvements. The Data Vault Architecture drives data modeling back to the business sides of the house. 10/5/2011 Do Not Duplicate Without Written Permission 6
  • 17. Agenda Defining The Needs for the Data Vault DW2.0 Architecture DW2.0 Drivers for Data Modeling Divergence of Data Models over Time Data Vault in DW2.0 Defining the Data Vault What does one look like? Modeling in DW2.0 Applying Data Vault to Global DW2.0 Applying Data Vault to Time-Value DW2.0 Compliance in DW2.0 Applying Data Vault to System of Record The Paradox of DW2.0 Volume, Latency, Complexity,Normalization andTransformation ability 10/5/2011 Do Not Duplicate Without Written Permission 7 Image is from - What The Bleep Do We Know?
  • 18. Defining the Data Vault 10/5/2011 Do Not Duplicate Without Written Permission 8 The Data Vault is a detail oriented, historical tracking and uniquely linked set of normalized tables that support one or more functional areas of business. It is a hybrid approach encompassing the best of breed between 3rd normal form (3NF) and star schema. The design is flexible, scalable, consistent and adaptable to the needs of the enterprise. It is a data model that is architected specifically to meet the needs of today’s enterprise data warehouses. Defining the Data Vault TDAN.com Article
  • 19.
  • 20. Link
  • 21. SatelliteSat Customer F(x) Sat The impact of linking disparate systems together, is inside the shaded area.
  • 22. Modeling in DW2.0 Bill Says: DW2.0 must be brought down to a very finite level of detail. The starting point for DW2.0 is the modeling process. The data model applies to the integrated sector, the near line sector, and the archival sector. The way that data warehouses are built is in an incremental manner The Data Vault specializes in: Providing finite grain at the lowest level possible, Mapping business process models to data models Existing in all sectors simultaneously without changes. Flexibility and managing change so that impacts are not a mile-wide and 10 miles deep. 10/5/2011 Do Not Duplicate Without Written Permission 10
  • 23. Elements in a Data Vault Hub Unique List of Business Keys, tracked by the first time the warehouse saw them appear. Link Relationships between business keys, also representing a grain shift, or a hierarchical roll-up. Satellite Data over time, granular, and descriptive about the business key. Also setup according to type of information, and rate of change. 10/5/2011 Do Not Duplicate Without Written Permission 11
  • 24. Applying the Data Vault to Global DW2.0 10/5/2011 Do Not Duplicate Without Written Permission 12 Manufacturing EDW in China Planning in Brazil Hub Hub Link Sat Sat Link Sat Sat Link Hub Link Hub Hub Sat Sat Sat Sat Sat Sat Sat Sat Base EDW Created in Corporate Financials in USA
  • 25. Applying the Data Vault to Time-Value DW2.0 10/5/2011 Do Not Duplicate Without Written Permission 13 Satellite Data Over Time Row 1 Row 2 Row 3 Row 4 Satellite entities in the Data Vault house data over time. They are split by type of information and rate of change. This is an example set of data for a customer name satellite.
  • 26. Batch and Real-Time Data Arrival 10/5/2011 Do Not Duplicate Without Written Permission 14 All Inserts All the time Transaction ID Date Stamp Customer Account # Amount Sat Transaction Type Hub Customer Link Transaction Hub Acct Sat Customer Sat Acct 3, 6 or 12 Hr Load Window Batch Load Customer Info Acct Data
  • 27. Star Schema Real-Time Data Issues 10/5/2011 Do Not Duplicate Without Written Permission 15 Updates are REQUIRED! Transaction ID Date Stamp Customer Account # Amount Type 3, 6 or 12 Hr Load Window Dimension Customer Fact Transaction Dimension Account Batch Load Customer Info Acct Data Cleansing & Quality must occur before the data can reach the target tables, cleansing and quality introduce unwanted latency!
  • 28. Compliance in DW2.0 10/5/2011 Do Not Duplicate Without Written Permission 16 Changes to Source Information Source Systems EDW / ADW Data Vault Data Marts Data Delivery Raw Detail = auditable Loads in Real-Time or in Batch Integrated by Business Key Flexible, allows business changes (with little to no impact) No delay in loading data Data type conformity Semantic Integration True Marts Raw Integration Business Rules User or Auditor Continuous Data Improvement Error Mart Quality Direction of Information Flow Master Data (Operational)
  • 29. Applying the Data Vault to System Of Record 10/5/2011 Do Not Duplicate Without Written Permission 17 Master Data or Conformed Dimensions Normalized EDW Source Systems SOR Definition 2 SOR Definition 3 SOR Definition 1 SOR 1 Data Capture, Data Produced by system algorithms SOR 2 Raw Detailed Integrated Data over time, Integrated by Horizontal (functional) Business Key. Auditable. SOR 3 Current view of the business, merged, quality cleansed, single copy, single source, feeds operational systems.
  • 30. DW2.0 Paradoxes DW2.0 incorporates: Unstructured, Semi-Structured, Real-Time, and Batch Data Global views All of which drive volumes of data. Volume causes latency in transformation. Volume is directly proportional to transformation complexity. Real-Time data arrival is inversely proportional to complexity and volume. Time for “quality, cleansing, and transformation” on the way in to the EDW diminishes as near-real-time is approached, or massive volumes of batch data are found within a shrinking batch window. Transformation can destroy data audit ability and compliance of the EDW / ADW. 10/5/2011 Do Not Duplicate Without Written Permission 18
  • 31. DW2.0 Paradoxes - Imagery 10/5/2011 Do Not Duplicate Without Written Permission 19 Drives DW2.0 Real-Time Transactions Unstructured Data Low-Level Grain Pushes Increases Low Latency Volume Fights Requires Merging, Quality, Cleansing Fights Data Model Denormalization Fights Data Model Normalization & Raw Details Inhibits Requires Inhibits Auditability & Compliance Provides
  • 32. DW2.0 Paradox Hypothesis As we reach near-real time, the ability to transform data and “wait” for parent dependencies directly decreases, the data decay rates increase, and therefore can cause data death if not processed in time. Normalization of the data model increases flexibility, and scalability. The closer we get to near-real-time, the more normalized the data model in the EDW/ADW must become. In order to process high volumes of batch data extremely fast, the “business transformations” must be removed from the load stream of the EDW. 10/5/2011 Do Not Duplicate Without Written Permission 20
  • 33. Data Vault Volumetrics 10/5/2011 Do Not Duplicate Without Written Permission 21 Volumetrics (10% null Data) Upon Initial Investigation, the 12 month growth rate for new customers is 197.4 MB per year…. Now let’s factor in the DELTA’s.
  • 34. Data Vault Growth 10/5/2011 Do Not Duplicate Without Written Permission 22 Volumetrics (10% null Data) – Delta Growth Only Original Dimension: 497.16 MB per Year New Data Vault:317.03 MB Per Year
  • 35. Data Vault VS Dimension Growth 10/5/2011 Do Not Duplicate Without Written Permission 23 How does the extensive growth rate affect queries?
  • 36. Summarization Business: Lack of a single view of a customer, product, service, etc... Lack of visibility into ALL information across the enterprise. Competition does it better, faster, cheaper. Unable to identify and forecast business trends and their impacts. WHERE’S THE KNOWLEDGE? OR IS IT JUST ALL DATA? 10/5/2011 Do Not Duplicate Without Written Permission 24 Technical: Near-Real-Time (Active) Huge Data Volumes Massive Data Dis-Integration Spread-Marts Convergence of Operational and Strategic Questions Duplication of data in the ODS, Warehouse, and Data Marts! Dimension-itis!! ODS Ulcer! Fact Table Granularity JUNK tables, Helper Tables
  • 37. Where To Learn More The Technical Modeling Book: http://LearnDataVault.com The Discussion Forums: & eventshttp://LinkedIn.com – Data Vault Discussions Contact me:http://DanLinstedt.com - web siteDanLinstedt@gmail.com - email World wide User Group (Free)http://dvusergroup.com 25