BIG DATA MODELING
Hans Hultgren
RMDC Fall 2016
Welcome
AGENDA
1) Big Data
2) Data Modeling
3) Big Data Modeling
Session Objectives
• Big Data Fundamentals
– Components of Big Data
– Structure & Schemas
– Tools & Architecture
• Data Modeling
– Integration & History
– Data Warehousing & BI
– Conceptual to Physical
• Big Data Modeling
– Focus on Meaning
• Ensemble Modeling
– The Blended Architecture
BIG DATA
Big Data
“Huge” Data Volumes
n-Structured & Very Complex
Streaming & Shape-Shifting
[Figure: Typical Data compared with Big Data (panels A, B, C)]
Big Data
• Volume
Huge Volumes of Data
• Velocity
Drinking from a Fire Hose
• Variety
n-Structured Data
• Veracity
Quality, Accuracy, Reliability, Trustworthiness
• Value
Business Value and Value Potential
Big Data Architecture
• To deal with the features of Big Data,
supporting architectural components are
based on:
–Data distribution, and
–Late Binding of Schemas
KVP
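As a minimal sketch of the two ideas above, the example below spreads key-value pairs (KVP) across nodes by hashing the key and leaves every value opaque, so no schema is bound at load time. The function names, the key format, and the choice of SHA-256 are assumptions made for illustration; the slides do not prescribe a distribution scheme.

```python
import hashlib

def node_for(key: str, node_count: int) -> int:
    """Map a key to a node index with a stable hash (hypothetical scheme)."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % node_count

def distribute(pairs, node_count):
    """Spread (key, value) pairs over node_count buckets.

    Values are stored untouched: nothing is parsed or validated here,
    which is what leaves the schema to be bound later.
    """
    nodes = [[] for _ in range(node_count)]
    for key, value in pairs:
        nodes[node_for(key, node_count)].append((key, value))
    return nodes

# Illustrative data: the values are JSON, free text, and XML, all stored as-is.
pairs = [
    ("cust:1001", '{"name": "Ada"}'),
    ("cust:1002", "free text note"),
    ("sale:77", "<sale/>"),
]
nodes = distribute(pairs, 3)
```

The same key always lands on the same node, so a later lookup only has to ask one node.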
Modeling and Understanding
• Schema on Write
• Schema on Read
• Dismantled Schema on Write
• Schema on Focus
• Schema on Leverage
[Diagram: LOAD, MODEL, APPLY, EXPLORE]
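The first two bullets can be contrasted in a short sketch. Schema on Write checks a record against a schema before storing it; Schema on Read stores raw documents and applies the fields of interest only when reading. The schema, field names, and helper functions are hypothetical, and the sketch does not attempt the other three terms, which the slides name but do not define.

```python
import json

# Hypothetical schema used only for this illustration.
CUSTOMER_SCHEMA = {"customer_id": int, "name": str}

def write_with_schema(record: dict, table: list) -> None:
    """Schema on Write: reject records that do not match before storing."""
    for field, ftype in CUSTOMER_SCHEMA.items():
        if not isinstance(record.get(field), ftype):
            raise ValueError(f"field {field!r} missing or not {ftype.__name__}")
    table.append(record)

def read_with_schema(raw_lines: list, fields: list) -> list:
    """Schema on Read: raw documents were stored as-is; project fields now."""
    rows = []
    for line in raw_lines:
        doc = json.loads(line)
        rows.append({f: doc.get(f) for f in fields})
    return rows

table = []
write_with_schema({"customer_id": 1, "name": "Ada"}, table)

# Raw documents of differing shape, loaded without any check.
raw = ['{"customer_id": 2, "name": "Bo", "tier": "gold"}', '{"customer_id": 3}']
rows = read_with_schema(raw, ["customer_id", "tier"])
```

In the read path a missing field simply comes back as `None`; the cost of interpretation moves from load time to query time.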
Modeling and Understanding
• Big Data Possibilities
[Diagram: LOAD, MODEL, APPLY, EXPLORE]
Inconvenient Truth about BIG DATA
http://community.embarcadero.com/blogs/entry/the-hidden-elephant-in-big-data-modeling
DATA MODELING
Data Modeling
Man's Search for Meaning…
• Conceptual Modeling
• Logical Modeling
• Information Modeling
• Physical Data Modeling
Ensemble Modeling™
All the parts of a thing taken together, so that
each part is considered only in relation to the whole.
• The constellation of component parts acts as a whole.
• With Ensemble Modeling the Core Business Concepts that we define and
model are represented as a whole – an ensemble – including all of the
component parts. An Ensemble is typically based on all things defining a
Core Business Concept that can be uniquely and specifically said for one
instance of that Concept.
Forms of Modeling & Ensemble
[Diagram: Ensemble forms (Anchor, Focal Point, Data Vault, DV2.0, 2G, Hyper Agility, Temporal, 6NF, Matter, etc.); EDW; Data Marts (Dimensional); ERP, Acctg, Sales (3NF)]
The Data Vault Ensemble
• The Data Vault Ensemble conforms to a single key – embodied
in the Hub construct.
• The component parts for the Data Vault Ensemble include:
– Hub: The Natural Business Key
– Link: The Natural Business Relationships
– Satellite: All Context, Descriptive Data and History
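The three component parts can be sketched as plain data structures. This is an illustrative model only: the class fields, the sample keys, and the `current_context` helper are assumptions, not a physical Data Vault table design.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class Hub:
    """One instance of a Core Business Concept, identified by its business key."""
    concept: str
    business_key: str

@dataclass(frozen=True)
class Link:
    """A relationship between two or more Hubs."""
    name: str
    hubs: tuple

@dataclass
class Satellite:
    """Context and descriptive data for a Hub or Link, kept with history."""
    parent: object
    loaded_at: datetime
    attributes: dict

def current_context(hub, satellites):
    """Return the most recently loaded attributes for a Hub, or None."""
    rows = [s for s in satellites if s.parent == hub]
    return max(rows, key=lambda s: s.loaded_at).attributes if rows else None

customer = Hub("Customer", "C-1001")
store = Hub("Store", "S-07")
shops_at = Link("CustomerStore", (customer, store))

# Two Satellite rows for the same Hub: history is appended, not overwritten.
history = [
    Satellite(customer, datetime(2016, 9, 1), {"name": "Ada", "city": "Denver"}),
    Satellite(customer, datetime(2016, 10, 1), {"name": "Ada", "city": "Boulder"}),
]
latest = current_context(customer, history)
```

The Hub carries nothing but the key, so context can change or grow without touching it.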
Ensemble means thinking differently
[Diagram: Customer]
• The minimal construct then for an “entity”
such as “Customer” is now (in data vault) a
Hub with a set of Satellites
Applying data vault modeling pattern
Data Vault Ensemble Modeling Process
1) Identify and Model the Core Business Concepts
• Business interviews are at the heart of this step:
What do you do? What are the main things you work with?
• Find best/target Natural Business Key
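Once a Natural Business Key has been chosen, step 1 could be sketched as collecting the distinct keys seen across sources into one Hub. The source names, field names, and the trim-and-uppercase normalization rule are assumptions for illustration; the right normalization depends on the business key itself.

```python
def normalize_key(raw: str) -> str:
    """Hypothetical normalization: trim whitespace and uppercase."""
    return raw.strip().upper()

def load_hub(existing: set, source_rows: list, key_field: str) -> set:
    """Return the Hub key set extended with keys found in source_rows."""
    hub = set(existing)
    for row in source_rows:
        value = row.get(key_field)
        if value:  # skip rows with a missing or empty key
            hub.add(normalize_key(value))
    return hub

# Two illustrative sources that name the same key differently.
crm_rows = [{"cust_no": " c-1001 "}, {"cust_no": "C-1002"}]
billing_rows = [{"customer": "c-1001"}, {"customer": None}]

customer_hub = load_hub(set(), crm_rows, "cust_no")
customer_hub = load_hub(customer_hub, billing_rows, "customer")
```

Both sources contribute to the same Hub, which is what lets the key act as the integration point.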
Data Vault Ensemble Modeling Process
2) Identify and Model the Natural Business Relationships
• Specific Unique Relationships
• Consider the Unit of Work and Grain
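A hedged sketch of step 2: each Link row records one unique combination of business keys, and keys that arrive together in one source event are kept together in one Link. Reading "Unit of Work" this way is an interpretation, and the field names and sample rows are hypothetical.

```python
def load_link(existing: set, source_rows: list, key_fields: tuple) -> set:
    """Return the Link extended with each unique key combination."""
    link = set(existing)
    for row in source_rows:
        combo = tuple(row.get(f) for f in key_fields)
        if all(combo):  # every participating key must be present
            link.add(combo)
    return link

sale_rows = [
    {"customer": "C-1001", "store": "S-07", "sale": "SALE-1"},
    {"customer": "C-1001", "store": "S-07", "sale": "SALE-1"},  # repeat delivery
    {"customer": "C-1002", "store": "S-07", "sale": "SALE-2"},
    {"customer": None, "store": "S-07", "sale": "SALE-3"},      # incomplete
]

# One Link at the grain of (customer, store, sale), matching the source event.
sale_link = load_link(set(), sale_rows, ("customer", "store", "sale"))
```

Splitting the same event into pairwise Links (customer-store, store-sale) would lose which customer belonged to which sale, which is why the grain is chosen first.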
Data Vault Ensemble Modeling Process
3) Analyze and Design the Context Satellites
• Consider Rate of Change, Type of Data, and the Sources
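One way to act on Rate of Change is to split attributes into a slow-changing and a fast-changing Satellite and append a row only when something in that Satellite changed. The field split, the sample snapshots, and the hash-based change check are assumptions for illustration; the hash comparison is a common Data Vault practice rather than something these slides specify.

```python
import hashlib
import json

SLOW_FIELDS = ("name", "birth_date")   # hypothetical slow-changing attributes
FAST_FIELDS = ("loyalty_points",)      # hypothetical fast-changing attribute

def hash_diff(attributes: dict) -> str:
    """Stable fingerprint of an attribute set, used to detect change."""
    payload = json.dumps(attributes, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def append_if_changed(history: list, attributes: dict, loaded_at: str) -> bool:
    """Append a Satellite row only if it differs from the latest row."""
    new_hash = hash_diff(attributes)
    if history and history[-1]["hash_diff"] == new_hash:
        return False
    history.append({"loaded_at": loaded_at, "hash_diff": new_hash, **attributes})
    return True

def pick(row: dict, fields: tuple) -> dict:
    """Select the attributes that belong to one Satellite."""
    return {f: row.get(f) for f in fields}

snapshots = [
    ("2016-09-01", {"name": "Ada", "birth_date": "1990-01-01", "loyalty_points": 10}),
    ("2016-09-02", {"name": "Ada", "birth_date": "1990-01-01", "loyalty_points": 25}),
]

sat_slow, sat_fast = [], []
for loaded_at, row in snapshots:
    append_if_changed(sat_slow, pick(row, SLOW_FIELDS), loaded_at)
    append_if_changed(sat_fast, pick(row, FAST_FIELDS), loaded_at)
```

With this split, the daily change in points adds a row only to the fast Satellite, and the slow one keeps a single row.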
BIG DATA MODELING
Logical business model
• Leveraged for all logical model needs including the data warehouse, big data lake, master data management (MDM) and operational integration initiatives
• Closely aligned to DV physical model
Ensemble Logical Form (ELF)
[Diagram: Customer, Region, Store, Sale, Vendor, Product, Sale LI, Employee]
Ensemble Logical Form
[Diagram: Customer, Region, Store, Sale, Vendor, Product, Sale LI, Employee]
ELF Modeling maintained in:
* Metadata
* Logical Data Model
* Data Modeling Tools
* Virtual Schemas
* Other Tools or Artifacts
Map to Context Data stored in:
* JSON Docs
* XML (w/ XSD or Not)
* Blobs (Free Form Text)
* Big Data Platforms
* Hadoop
* In the Cloud
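A minimal sketch of the mapping described above, assuming the ELF backbone is held as metadata (concepts, keys, and relationships) while the context sits elsewhere as JSON documents. The structure names, the sample keys, and the `context_for` helper are hypothetical.

```python
import json

# ELF backbone kept as metadata: concepts with their keys, plus relationships.
backbone = {
    "concepts": {"Customer": {"C-1001"}, "Store": {"S-07"}},
    "relationships": {("Customer", "C-1001", "Store", "S-07")},
}

# Context stored outside the model as JSON documents, addressed by
# (concept, business key). The documents need not share a shape.
context_store = {
    ("Customer", "C-1001"): [
        '{"name": "Ada", "tags": ["vip"]}',
        '{"note": "prefers email"}',
    ],
}

def context_for(concept: str, key: str) -> list:
    """Resolve a backbone key to its parsed context documents."""
    if key not in backbone["concepts"].get(concept, set()):
        raise KeyError(f"{concept} {key!r} is not in the backbone")
    return [json.loads(doc) for doc in context_store.get((concept, key), [])]

customer_docs = context_for("Customer", "C-1001")
store_docs = context_for("Store", "S-07")
```

The backbone answers "what exists and how is it related"; the documents are interpreted only when someone asks for context.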
Three Paths for Modeling
Structured / Known
• CBC
• NBR
• Attribution
• Columns
Results in a backbone
model with attributes
in defined columns
N-Structured / NVP
• CBC
• NBR
• Attribution
Results in a backbone
model with
known/expected
attribute names/tags
N-Structured / KVP
• CBC
• NBR
Results in a backbone
model with capacity
to capture unknown
attribution either
named/tagged or not
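The three paths can be contrasted on one Customer, reading CBC as Core Business Concept, NBR as Natural Business Relationship, NVP as name-value pair, and KVP as key-value pair. The class, the expected tag list, and the row layouts below are assumptions for illustration only.

```python
from dataclasses import dataclass

# Path 1, Structured / Known: attributes live in defined columns.
@dataclass
class StructuredCustomer:
    customer_key: str
    name: str
    city: str

# Path 2, N-Structured / NVP: attribute names are known or expected,
# but values arrive as name-value pairs rather than columns.
EXPECTED_TAGS = {"name", "city"}

def nvp_rows(customer_key: str, pairs: dict) -> list:
    unknown = set(pairs) - EXPECTED_TAGS
    if unknown:
        raise ValueError(f"unexpected tags: {sorted(unknown)}")
    return [(customer_key, name, value) for name, value in sorted(pairs.items())]

# Path 3, N-Structured / KVP: the backbone key points at whatever arrives,
# tagged or not.
def kvp_rows(customer_key: str, payload) -> list:
    if isinstance(payload, dict):
        return [(customer_key, k, v) for k, v in sorted(payload.items())]
    return [(customer_key, None, payload)]  # untagged content kept whole

structured = StructuredCustomer("C-1001", "Ada", "Denver")
named = nvp_rows("C-1001", {"name": "Ada", "city": "Denver"})
tagged = kvp_rows("C-1001", {"shoe_size": 38})
untagged = kvp_rows("C-1001", "called about a late delivery")
```

In all three paths the backbone key is the same; only the amount of agreed attribution differs.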
APPLYING THE ENSEMBLE
Integration across Platforms
Expanded Applications
[Diagram: Customer, Region, Store, Sale, Vendor, Product, Sale LI, Employee]
Summary
Ensemble in the Big Data World
• Conceptual Modeling
• Logical Modeling
• Information Modeling
• Physical Data Modeling
• Integration Platform
Links and Information
CDVDM Training & Certification
www.GeneseeAcademy.com
gohansgo
Hans@GeneseeAcademy.com
HansHultgren.WordPress.com
HansHultgren
Online, On-Demand Video Lessons
DataVaultAcademy.com
DataVaultAcademy
Book and e-Book: Modeling the Agile Data Warehouse with Data Vault
