SlideShare una empresa de Scribd logo
1 de 63
Chapter 1: Data Warehousing 1.Basic Concepts of data warehousing 2.Data warehouse architectures 3.Some characteristics of data warehouse data 4.The reconciled data layer 5.Data transformation 6.The derived data layer 7. The user interface HCMC UT, 2008
Motivation ,[object Object],[object Object],[object Object],[object Object]
Definition ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Data Warehouse—Subject-Oriented ,[object Object],[object Object],[object Object]
Data Warehouse - Integrated ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Data Warehouse -Time Variant ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Data Warehouse - Non Updatable ,[object Object],[object Object],[object Object],[object Object],[object Object]
Need for Data Warehousing ,[object Object],[object Object],Table 11-1: comparison of operational and informational systems
Need to separate operational and information systems ,[object Object],[object Object],[object Object],[object Object]
Data Warehouse Architectures ,[object Object],[object Object],[object Object],[object Object],[object Object],All involve some form of  extraction ,  transformation  and  loading  ( ETL )
Figure 11-2: Generic two-level architecture E T L One, company-wide warehouse Periodic extraction    data is not completely current in warehouse
Figure 11-3: Independent Data Mart Data marts: Mini-warehouses, limited in scope E T L Separate ETL for each  independent  data mart Data access complexity due to  multiple  data marts
Independent Data mart ,[object Object]
Figure 11-4:  Dependent  data mart with  operational data store E T L Single ETL for  enterprise data warehouse (EDW) Simpler data access ODS  provides option for obtaining  current  data Dependent  data marts loaded from EDW
Dependent data mart-  Operational data store ,[object Object],[object Object]
Figure 11-5:  Logical data mart and @ctive data warehouse E T L Near real-time ETL for  @active Data Warehouse ODS  and  data warehouse  are one and the same Data marts are NOT separate databases, but logical  views  of the data warehouse    Easier to create new data marts
@ctive data warehouse ,[object Object]
Table 11-2: Data Warehouse vs. Data Mart Source : adapted from Strange (1997).
Figure 11-6: Three-layer architecture
Three-layer architecture   Reconciled and derived data ,[object Object],[object Object],[object Object]
Data Characteristics Status vs. Event Data Figure 11-7:  Example of  DBMS log entry Event =  a database action (create/update/delete) that results from a transaction Status Status
Data Characteristics Transient vs. Periodic Data Figure 11-8:  Transient operational data Changes to existing records are written over previous records, thus destroying the previous data content
Data Characteristics Transient vs. Periodic Data Figure 11-9:  Periodic warehouse data Data are never physically altered or deleted once they have been added to the store
Other data warehouse changes ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Data Reconciliation ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
The ETL Process ,[object Object],[object Object],[object Object],[object Object],ETL = Extract, transform, and load
Figure 11-10: Steps in data reconciliation Static extract  = capturing a snapshot of the source data at a point in time Incremental extract  = capturing changes that have occurred since the last static extract Capture = extract…obtaining a snapshot of a chosen subset of the source data for loading into the data warehouse
Figure 11-10: Steps in data reconciliation (continued) Scrub = cleanse…uses pattern recognition and AI techniques to upgrade data quality Fixing errors:  misspellings, erroneous dates, incorrect field usage, mismatched addresses, missing data, duplicate data, inconsistencies Also:  decoding, reformatting, time stamping, conversion, key generation, merging, error detection/logging, locating missing data
Figure 11-10: Steps in data reconciliation (continued) Transform = convert data from format of operational system to format of data warehouse Record-level: Selection  – data partitioning Joining  – data combining Aggregation  – data summarization Field-level:   single-field  – from one field to one field multi-field  – from many fields to one, or one field to many
Figure 11-10: Steps in data reconciliation (continued) Load/Index= place transformed data into the warehouse and create indexes Refresh mode:  bulk rewriting of target data at periodic intervals Update mode:  only changes in source data are written to data warehouse
Data Transformation ,[object Object],[object Object],[object Object],[object Object],[object Object]
Record-level functions &  Field-level functions ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Figure 11-11: Single-field transformation In general – some transformation function translates data from old form to new form Algorithmic  transformation uses a formula or logical expression Table   lookup  – another approach
Figure 11-12: Multifield transformation M:1 –from many source fields to one target field 1:M –from one source field to many target fields
Derived Data ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Most common data model =  star schema (also called “dimensional model”)
The Star Schema ,[object Object],[object Object]
Figure 11-13: Components of a  star schema Fact tables  contain factual or quantitative data Dimension tables  contain descriptions about the subjects of the business  1:N relationship between dimension tables and fact tables  Excellent for ad-hoc queries,  but bad for online transaction processing Dimension tables are denormalized to maximize performance
Figure 11-14: Star schema example Fact table  provides statistics for sales broken down by product, period and store dimensions
Figure 11-15: Star schema with sample data
Issues Regarding Star Schema ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Figure 11-16: Modeling dates Fact tables contain time-period data    Date dimensions are important
Variations of the Star Schema ,[object Object],[object Object],[object Object],[object Object]
Multiple Fact tables ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Factless Fact Tables ,[object Object],[object Object],[object Object],[object Object]
Factless fact table showing occurrence of an event.
Factless fact table showing coverage
Normalizing dimension tables ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Multivalued dimension
Snowflake schema ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Example of snowflake schema Sales Fact Table time_key item_key branch_key location_key units_sold dollars_sold avg_sales Measures time_key day day_of_the_week month quarter year time location_key street city_key location item_key item_name brand type supplier_key item branch_key branch_name branch_type branch supplier_key supplier_type supplier city_key city province_or_street country city
The User Interface ,[object Object],[object Object],[object Object],[object Object],[object Object]
Role of Metadata (data catalog) ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Querying Tools ,[object Object],[object Object],[object Object],[object Object]
On-Line Analytical Processing (OLAP) ,[object Object],[object Object],[object Object],[object Object],[object Object]
From tables to data cubes ,[object Object],[object Object],[object Object],[object Object]
MOLAP Operations ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Figure 11-22: Slicing a data cube
Figure 11-23:  Example of drill-down Summary report Drill-down with color added
Data Mining ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Data Visualization ,[object Object]
OLAP tool Vendors ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

Más contenido relacionado

La actualidad más candente

Introduction To Analytics
Introduction To AnalyticsIntroduction To Analytics
Introduction To AnalyticsAlex Meadows
 
Introduction to data science
Introduction to data scienceIntroduction to data science
Introduction to data scienceSampath Kumar
 
DATA WAREHOUSING
DATA WAREHOUSINGDATA WAREHOUSING
DATA WAREHOUSINGKing Julian
 
Introduction to Business Intelligence
Introduction to Business IntelligenceIntroduction to Business Intelligence
Introduction to Business IntelligenceAlmog Ramrajkar
 
Business intelligence- Components, Tools, Need and Applications
Business intelligence- Components, Tools, Need and ApplicationsBusiness intelligence- Components, Tools, Need and Applications
Business intelligence- Components, Tools, Need and Applicationsraj
 
Big data lecture notes
Big data lecture notesBig data lecture notes
Big data lecture notesMohit Saini
 
Introduction to Data Warehouse
Introduction to Data WarehouseIntroduction to Data Warehouse
Introduction to Data WarehouseShanthi Mukkavilli
 
Database auditing essentials
Database auditing essentialsDatabase auditing essentials
Database auditing essentialsCraig Mullins
 
Five Things to Consider About Data Mesh and Data Governance
Five Things to Consider About Data Mesh and Data GovernanceFive Things to Consider About Data Mesh and Data Governance
Five Things to Consider About Data Mesh and Data GovernanceDATAVERSITY
 
Big data visualization
Big data visualizationBig data visualization
Big data visualizationAnurag Gupta
 
Introduction to Business Intelligence
Introduction to Business IntelligenceIntroduction to Business Intelligence
Introduction to Business IntelligenceRonan Soares
 
Data platform architecture
Data platform architectureData platform architecture
Data platform architectureSudheer Kondla
 
Designing the business process dimensional model
Designing the business process dimensional modelDesigning the business process dimensional model
Designing the business process dimensional modelGersiton Pila Challco
 
Data mining & data warehousing (ppt)
Data mining & data warehousing (ppt)Data mining & data warehousing (ppt)
Data mining & data warehousing (ppt)Harish Chand
 
Big Data, Business Intelligence and Data Analytics
Big Data, Business Intelligence and Data AnalyticsBig Data, Business Intelligence and Data Analytics
Big Data, Business Intelligence and Data AnalyticsSystems Limited
 

La actualidad más candente (20)

Introduction To Analytics
Introduction To AnalyticsIntroduction To Analytics
Introduction To Analytics
 
Introduction to data science
Introduction to data scienceIntroduction to data science
Introduction to data science
 
Data warehousing
Data warehousingData warehousing
Data warehousing
 
DATA WAREHOUSING
DATA WAREHOUSINGDATA WAREHOUSING
DATA WAREHOUSING
 
Data warehouse
Data warehouseData warehouse
Data warehouse
 
Introduction to Business Intelligence
Introduction to Business IntelligenceIntroduction to Business Intelligence
Introduction to Business Intelligence
 
Business intelligence- Components, Tools, Need and Applications
Business intelligence- Components, Tools, Need and ApplicationsBusiness intelligence- Components, Tools, Need and Applications
Business intelligence- Components, Tools, Need and Applications
 
Big data lecture notes
Big data lecture notesBig data lecture notes
Big data lecture notes
 
Introduction to Data Warehouse
Introduction to Data WarehouseIntroduction to Data Warehouse
Introduction to Data Warehouse
 
Database auditing essentials
Database auditing essentialsDatabase auditing essentials
Database auditing essentials
 
Five Things to Consider About Data Mesh and Data Governance
Five Things to Consider About Data Mesh and Data GovernanceFive Things to Consider About Data Mesh and Data Governance
Five Things to Consider About Data Mesh and Data Governance
 
Big data visualization
Big data visualizationBig data visualization
Big data visualization
 
Data Mining
Data MiningData Mining
Data Mining
 
Introduction to Business Intelligence
Introduction to Business IntelligenceIntroduction to Business Intelligence
Introduction to Business Intelligence
 
Ppt
PptPpt
Ppt
 
Active database system
Active database systemActive database system
Active database system
 
Data platform architecture
Data platform architectureData platform architecture
Data platform architecture
 
Designing the business process dimensional model
Designing the business process dimensional modelDesigning the business process dimensional model
Designing the business process dimensional model
 
Data mining & data warehousing (ppt)
Data mining & data warehousing (ppt)Data mining & data warehousing (ppt)
Data mining & data warehousing (ppt)
 
Big Data, Business Intelligence and Data Analytics
Big Data, Business Intelligence and Data AnalyticsBig Data, Business Intelligence and Data Analytics
Big Data, Business Intelligence and Data Analytics
 

Destacado

Data Warehouse Modeling
Data Warehouse ModelingData Warehouse Modeling
Data Warehouse Modelingvivekjv
 
Dimensional Modeling Basic Concept with Example
Dimensional Modeling Basic Concept with ExampleDimensional Modeling Basic Concept with Example
Dimensional Modeling Basic Concept with ExampleSajjad Zaheer
 
Data warehouse concepts
Data warehouse conceptsData warehouse concepts
Data warehouse conceptsobieefans
 
Introduction to Data Warehousing
Introduction to Data WarehousingIntroduction to Data Warehousing
Introduction to Data WarehousingJason S
 
Data base management system
Data base management systemData base management system
Data base management systemouvesh
 
Dwdm unit 1-2016-Data ingarehousing
Dwdm unit 1-2016-Data ingarehousingDwdm unit 1-2016-Data ingarehousing
Dwdm unit 1-2016-Data ingarehousingDhilsath Fathima
 
Group Presentation on Bussiness Intelligence
Group Presentation on Bussiness IntelligenceGroup Presentation on Bussiness Intelligence
Group Presentation on Bussiness IntelligenceGaurav Paliwal
 
Data Warehousing & Basic Architectural Framework
Data Warehousing & Basic Architectural FrameworkData Warehousing & Basic Architectural Framework
Data Warehousing & Basic Architectural FrameworkDr. Sunil Kr. Pandey
 
Data Base Management ! Batra Computer Centre
Data Base Management ! Batra Computer Centre Data Base Management ! Batra Computer Centre
Data Base Management ! Batra Computer Centre jatin batra
 
Etmam logistics & Riyadh Warehouse Presentation
Etmam logistics & Riyadh Warehouse PresentationEtmam logistics & Riyadh Warehouse Presentation
Etmam logistics & Riyadh Warehouse PresentationNabeel Ahmed
 
Lecture 04 - Granularity in the Data Warehouse
Lecture 04 - Granularity in the Data WarehouseLecture 04 - Granularity in the Data Warehouse
Lecture 04 - Granularity in the Data Warehousephanleson
 
Bài thuyết trình quản trị cung ứng đề tài maersk logistics quốc tế và việt nam
Bài thuyết trình quản trị cung ứng   đề tài maersk logistics quốc tế và việt namBài thuyết trình quản trị cung ứng   đề tài maersk logistics quốc tế và việt nam
Bài thuyết trình quản trị cung ứng đề tài maersk logistics quốc tế và việt namhttps://www.facebook.com/garmentspace
 
Dimensional Modeling
Dimensional ModelingDimensional Modeling
Dimensional ModelingSunita Sahu
 
Warehouse management and operations rfid
Warehouse management and operations rfidWarehouse management and operations rfid
Warehouse management and operations rfidSopagna Chan
 

Destacado (20)

Data Warehouse Modeling
Data Warehouse ModelingData Warehouse Modeling
Data Warehouse Modeling
 
DATA WAREHOUSING
DATA WAREHOUSINGDATA WAREHOUSING
DATA WAREHOUSING
 
Dimensional Modeling Basic Concept with Example
Dimensional Modeling Basic Concept with ExampleDimensional Modeling Basic Concept with Example
Dimensional Modeling Basic Concept with Example
 
Data warehouse concepts
Data warehouse conceptsData warehouse concepts
Data warehouse concepts
 
Data Warehouse 101
Data Warehouse 101Data Warehouse 101
Data Warehouse 101
 
Introduction to Data Warehousing
Introduction to Data WarehousingIntroduction to Data Warehousing
Introduction to Data Warehousing
 
Data warehousing
Data warehousingData warehousing
Data warehousing
 
Data base management system
Data base management systemData base management system
Data base management system
 
Dwdm unit 1-2016-Data ingarehousing
Dwdm unit 1-2016-Data ingarehousingDwdm unit 1-2016-Data ingarehousing
Dwdm unit 1-2016-Data ingarehousing
 
Group Presentation on Bussiness Intelligence
Group Presentation on Bussiness IntelligenceGroup Presentation on Bussiness Intelligence
Group Presentation on Bussiness Intelligence
 
Data Warehousing & Basic Architectural Framework
Data Warehousing & Basic Architectural FrameworkData Warehousing & Basic Architectural Framework
Data Warehousing & Basic Architectural Framework
 
Data Base Management ! Batra Computer Centre
Data Base Management ! Batra Computer Centre Data Base Management ! Batra Computer Centre
Data Base Management ! Batra Computer Centre
 
Quản trị cung ứng công ty coca cola việt nam
Quản trị cung ứng công ty coca cola việt namQuản trị cung ứng công ty coca cola việt nam
Quản trị cung ứng công ty coca cola việt nam
 
Etmam logistics & Riyadh Warehouse Presentation
Etmam logistics & Riyadh Warehouse PresentationEtmam logistics & Riyadh Warehouse Presentation
Etmam logistics & Riyadh Warehouse Presentation
 
Lecture 04 - Granularity in the Data Warehouse
Lecture 04 - Granularity in the Data WarehouseLecture 04 - Granularity in the Data Warehouse
Lecture 04 - Granularity in the Data Warehouse
 
Bài thuyết trình quản trị cung ứng đề tài maersk logistics quốc tế và việt nam
Bài thuyết trình quản trị cung ứng   đề tài maersk logistics quốc tế và việt namBài thuyết trình quản trị cung ứng   đề tài maersk logistics quốc tế và việt nam
Bài thuyết trình quản trị cung ứng đề tài maersk logistics quốc tế và việt nam
 
Dimensional Modeling
Dimensional ModelingDimensional Modeling
Dimensional Modeling
 
Data ware housing- Introduction to data ware housing
Data ware housing- Introduction to data ware housingData ware housing- Introduction to data ware housing
Data ware housing- Introduction to data ware housing
 
Data-ware Housing
Data-ware HousingData-ware Housing
Data-ware Housing
 
Warehouse management and operations rfid
Warehouse management and operations rfidWarehouse management and operations rfid
Warehouse management and operations rfid
 

Similar a Data Warehouse

Data Mining Concept & Technique-ch04.ppt
Data Mining Concept & Technique-ch04.pptData Mining Concept & Technique-ch04.ppt
Data Mining Concept & Technique-ch04.pptMutiaSari53
 
Chapter 4. Data Warehousing and On-Line Analytical Processing.ppt
Chapter 4. Data Warehousing and On-Line Analytical Processing.pptChapter 4. Data Warehousing and On-Line Analytical Processing.ppt
Chapter 4. Data Warehousing and On-Line Analytical Processing.pptSubrata Kumer Paul
 
Data warehousing and online analytical processing
Data warehousing and online analytical processingData warehousing and online analytical processing
Data warehousing and online analytical processingVijayasankariS
 
The Database Environment Chapter 11
The Database Environment Chapter 11The Database Environment Chapter 11
The Database Environment Chapter 11Jeanie Arnoco
 
11667 Bitt I 2008 Lect4
11667 Bitt I 2008 Lect411667 Bitt I 2008 Lect4
11667 Bitt I 2008 Lect4ambujm
 
1.4 data warehouse
1.4 data warehouse1.4 data warehouse
1.4 data warehouseKrish_ver2
 
11666 Bitt I 2008 Lect3
11666 Bitt I 2008 Lect311666 Bitt I 2008 Lect3
11666 Bitt I 2008 Lect3ambujm
 
Data Mining: Concepts and Techniques (3rd ed.) — Chapter _04 olap
Data Mining:  Concepts and Techniques (3rd ed.)— Chapter _04 olapData Mining:  Concepts and Techniques (3rd ed.)— Chapter _04 olap
Data Mining: Concepts and Techniques (3rd ed.) — Chapter _04 olapSalah Amean
 
Module 1_Data Warehousing Fundamentals.pptx
Module 1_Data Warehousing Fundamentals.pptxModule 1_Data Warehousing Fundamentals.pptx
Module 1_Data Warehousing Fundamentals.pptxnikshaikh786
 
Datawarehousing
DatawarehousingDatawarehousing
Datawarehousingsumit621
 
UNIT-5 DATA WAREHOUSING.docx
UNIT-5 DATA WAREHOUSING.docxUNIT-5 DATA WAREHOUSING.docx
UNIT-5 DATA WAREHOUSING.docxDURGADEVIL
 
20IT501_DWDM_PPT_Unit_I.ppt
20IT501_DWDM_PPT_Unit_I.ppt20IT501_DWDM_PPT_Unit_I.ppt
20IT501_DWDM_PPT_Unit_I.pptPalaniKumarR2
 
Unit-IV-Introduction to Data Warehousing .pptx
Unit-IV-Introduction to Data Warehousing .pptxUnit-IV-Introduction to Data Warehousing .pptx
Unit-IV-Introduction to Data Warehousing .pptxHarsha Patel
 

Similar a Data Warehouse (20)

Data warehouse
Data warehouseData warehouse
Data warehouse
 
Data Mining Concept & Technique-ch04.ppt
Data Mining Concept & Technique-ch04.pptData Mining Concept & Technique-ch04.ppt
Data Mining Concept & Technique-ch04.ppt
 
Chapter 4. Data Warehousing and On-Line Analytical Processing.ppt
Chapter 4. Data Warehousing and On-Line Analytical Processing.pptChapter 4. Data Warehousing and On-Line Analytical Processing.ppt
Chapter 4. Data Warehousing and On-Line Analytical Processing.ppt
 
Data warehousing and online analytical processing
Data warehousing and online analytical processingData warehousing and online analytical processing
Data warehousing and online analytical processing
 
DW 101
DW 101DW 101
DW 101
 
The Database Environment Chapter 11
The Database Environment Chapter 11The Database Environment Chapter 11
The Database Environment Chapter 11
 
11667 Bitt I 2008 Lect4
11667 Bitt I 2008 Lect411667 Bitt I 2008 Lect4
11667 Bitt I 2008 Lect4
 
1.4 data warehouse
1.4 data warehouse1.4 data warehouse
1.4 data warehouse
 
11666 Bitt I 2008 Lect3
11666 Bitt I 2008 Lect311666 Bitt I 2008 Lect3
11666 Bitt I 2008 Lect3
 
Chpt2.ppt
Chpt2.pptChpt2.ppt
Chpt2.ppt
 
Chapter 2
Chapter 2Chapter 2
Chapter 2
 
Unit 1
Unit 1Unit 1
Unit 1
 
Data Mining: Concepts and Techniques (3rd ed.) — Chapter _04 olap
Data Mining:  Concepts and Techniques (3rd ed.)— Chapter _04 olapData Mining:  Concepts and Techniques (3rd ed.)— Chapter _04 olap
Data Mining: Concepts and Techniques (3rd ed.) — Chapter _04 olap
 
GROPSIKS.pptx
GROPSIKS.pptxGROPSIKS.pptx
GROPSIKS.pptx
 
Module 1_Data Warehousing Fundamentals.pptx
Module 1_Data Warehousing Fundamentals.pptxModule 1_Data Warehousing Fundamentals.pptx
Module 1_Data Warehousing Fundamentals.pptx
 
Datawarehousing
DatawarehousingDatawarehousing
Datawarehousing
 
UNIT-5 DATA WAREHOUSING.docx
UNIT-5 DATA WAREHOUSING.docxUNIT-5 DATA WAREHOUSING.docx
UNIT-5 DATA WAREHOUSING.docx
 
20IT501_DWDM_PPT_Unit_I.ppt
20IT501_DWDM_PPT_Unit_I.ppt20IT501_DWDM_PPT_Unit_I.ppt
20IT501_DWDM_PPT_Unit_I.ppt
 
Unit-IV-Introduction to Data Warehousing .pptx
Unit-IV-Introduction to Data Warehousing .pptxUnit-IV-Introduction to Data Warehousing .pptx
Unit-IV-Introduction to Data Warehousing .pptx
 
Data Warehouse
Data WarehouseData Warehouse
Data Warehouse
 

Más de Samir Sabry

Keyboard symbols
Keyboard symbolsKeyboard symbols
Keyboard symbolsSamir Sabry
 
2010 Calendriersexy
2010 Calendriersexy2010 Calendriersexy
2010 CalendriersexySamir Sabry
 
Sample Test Word Intermediate Mulitple Choice
Sample Test Word Intermediate Mulitple ChoiceSample Test Word Intermediate Mulitple Choice
Sample Test Word Intermediate Mulitple ChoiceSamir Sabry
 
Computer Fundamentals Test
Computer Fundamentals TestComputer Fundamentals Test
Computer Fundamentals TestSamir Sabry
 
Database Management System And Design Questions
Database Management System And Design QuestionsDatabase Management System And Design Questions
Database Management System And Design QuestionsSamir Sabry
 
Digital Image Processing
Digital Image ProcessingDigital Image Processing
Digital Image ProcessingSamir Sabry
 

Más de Samir Sabry (15)

Mapping rules
Mapping rulesMapping rules
Mapping rules
 
Mapping example
Mapping exampleMapping example
Mapping example
 
Mapping example
Mapping exampleMapping example
Mapping example
 
Normalization
NormalizationNormalization
Normalization
 
Keyboard symbols
Keyboard symbolsKeyboard symbols
Keyboard symbols
 
Xhtml
XhtmlXhtml
Xhtml
 
Normlaization
NormlaizationNormlaization
Normlaization
 
Mapping
MappingMapping
Mapping
 
Data mining
Data miningData mining
Data mining
 
2010 Calendriersexy
2010 Calendriersexy2010 Calendriersexy
2010 Calendriersexy
 
Sample Test Word Intermediate Mulitple Choice
Sample Test Word Intermediate Mulitple ChoiceSample Test Word Intermediate Mulitple Choice
Sample Test Word Intermediate Mulitple Choice
 
Computer Fundamentals Test
Computer Fundamentals TestComputer Fundamentals Test
Computer Fundamentals Test
 
Database Management System And Design Questions
Database Management System And Design QuestionsDatabase Management System And Design Questions
Database Management System And Design Questions
 
Test In Word
Test In WordTest In Word
Test In Word
 
Digital Image Processing
Digital Image ProcessingDigital Image Processing
Digital Image Processing
 

Último

VIP Kolkata Call Girl Howrah 👉 8250192130 Available With Room
VIP Kolkata Call Girl Howrah 👉 8250192130  Available With RoomVIP Kolkata Call Girl Howrah 👉 8250192130  Available With Room
VIP Kolkata Call Girl Howrah 👉 8250192130 Available With Roomdivyansh0kumar0
 
Sales & Marketing Alignment: How to Synergize for Success
Sales & Marketing Alignment: How to Synergize for SuccessSales & Marketing Alignment: How to Synergize for Success
Sales & Marketing Alignment: How to Synergize for SuccessAggregage
 
Cash Payment 9602870969 Escort Service in Udaipur Call Girls
Cash Payment 9602870969 Escort Service in Udaipur Call GirlsCash Payment 9602870969 Escort Service in Udaipur Call Girls
Cash Payment 9602870969 Escort Service in Udaipur Call GirlsApsara Of India
 
Russian Faridabad Call Girls(Badarpur) : ☎ 8168257667, @4999
Russian Faridabad Call Girls(Badarpur) : ☎ 8168257667, @4999Russian Faridabad Call Girls(Badarpur) : ☎ 8168257667, @4999
Russian Faridabad Call Girls(Badarpur) : ☎ 8168257667, @4999Tina Ji
 
Progress Report - Oracle Database Analyst Summit
Progress  Report - Oracle Database Analyst SummitProgress  Report - Oracle Database Analyst Summit
Progress Report - Oracle Database Analyst SummitHolger Mueller
 
VIP Call Girls In Saharaganj ( Lucknow ) 🔝 8923113531 🔝 Cash Payment (COD) 👒
VIP Call Girls In Saharaganj ( Lucknow  ) 🔝 8923113531 🔝  Cash Payment (COD) 👒VIP Call Girls In Saharaganj ( Lucknow  ) 🔝 8923113531 🔝  Cash Payment (COD) 👒
VIP Call Girls In Saharaganj ( Lucknow ) 🔝 8923113531 🔝 Cash Payment (COD) 👒anilsa9823
 
GD Birla and his contribution in management
GD Birla and his contribution in managementGD Birla and his contribution in management
GD Birla and his contribution in managementchhavia330
 
M.C Lodges -- Guest House in Jhang.
M.C Lodges --  Guest House in Jhang.M.C Lodges --  Guest House in Jhang.
M.C Lodges -- Guest House in Jhang.Aaiza Hassan
 
Boost the utilization of your HCL environment by reevaluating use cases and f...
Boost the utilization of your HCL environment by reevaluating use cases and f...Boost the utilization of your HCL environment by reevaluating use cases and f...
Boost the utilization of your HCL environment by reevaluating use cases and f...Roland Driesen
 
Regression analysis: Simple Linear Regression Multiple Linear Regression
Regression analysis:  Simple Linear Regression Multiple Linear RegressionRegression analysis:  Simple Linear Regression Multiple Linear Regression
Regression analysis: Simple Linear Regression Multiple Linear RegressionRavindra Nath Shukla
 
A DAY IN THE LIFE OF A SALESMAN / WOMAN
A DAY IN THE LIFE OF A  SALESMAN / WOMANA DAY IN THE LIFE OF A  SALESMAN / WOMAN
A DAY IN THE LIFE OF A SALESMAN / WOMANIlamathiKannappan
 
Event mailer assignment progress report .pdf
Event mailer assignment progress report .pdfEvent mailer assignment progress report .pdf
Event mailer assignment progress report .pdftbatkhuu1
 
Mysore Call Girls 8617370543 WhatsApp Number 24x7 Best Services
Mysore Call Girls 8617370543 WhatsApp Number 24x7 Best ServicesMysore Call Girls 8617370543 WhatsApp Number 24x7 Best Services
Mysore Call Girls 8617370543 WhatsApp Number 24x7 Best ServicesDipal Arora
 
Monte Carlo simulation : Simulation using MCSM
Monte Carlo simulation : Simulation using MCSMMonte Carlo simulation : Simulation using MCSM
Monte Carlo simulation : Simulation using MCSMRavindra Nath Shukla
 
Grateful 7 speech thanking everyone that has helped.pdf
Grateful 7 speech thanking everyone that has helped.pdfGrateful 7 speech thanking everyone that has helped.pdf
Grateful 7 speech thanking everyone that has helped.pdfPaul Menig
 
BEST ✨ Call Girls In Indirapuram Ghaziabad ✔️ 9871031762 ✔️ Escorts Service...
BEST ✨ Call Girls In  Indirapuram Ghaziabad  ✔️ 9871031762 ✔️ Escorts Service...BEST ✨ Call Girls In  Indirapuram Ghaziabad  ✔️ 9871031762 ✔️ Escorts Service...
BEST ✨ Call Girls In Indirapuram Ghaziabad ✔️ 9871031762 ✔️ Escorts Service...noida100girls
 
Value Proposition canvas- Customer needs and pains
Value Proposition canvas- Customer needs and painsValue Proposition canvas- Customer needs and pains
Value Proposition canvas- Customer needs and painsP&CO
 
Lucknow 💋 Escorts in Lucknow - 450+ Call Girl Cash Payment 8923113531 Neha Th...
Lucknow 💋 Escorts in Lucknow - 450+ Call Girl Cash Payment 8923113531 Neha Th...Lucknow 💋 Escorts in Lucknow - 450+ Call Girl Cash Payment 8923113531 Neha Th...
Lucknow 💋 Escorts in Lucknow - 450+ Call Girl Cash Payment 8923113531 Neha Th...anilsa9823
 
Pharma Works Profile of Karan Communications
Pharma Works Profile of Karan CommunicationsPharma Works Profile of Karan Communications
Pharma Works Profile of Karan Communicationskarancommunications
 

Último (20)

VIP Kolkata Call Girl Howrah 👉 8250192130 Available With Room
VIP Kolkata Call Girl Howrah 👉 8250192130  Available With RoomVIP Kolkata Call Girl Howrah 👉 8250192130  Available With Room
VIP Kolkata Call Girl Howrah 👉 8250192130 Available With Room
 
Sales & Marketing Alignment: How to Synergize for Success
Sales & Marketing Alignment: How to Synergize for SuccessSales & Marketing Alignment: How to Synergize for Success
Sales & Marketing Alignment: How to Synergize for Success
 
Cash Payment 9602870969 Escort Service in Udaipur Call Girls
Cash Payment 9602870969 Escort Service in Udaipur Call GirlsCash Payment 9602870969 Escort Service in Udaipur Call Girls
Cash Payment 9602870969 Escort Service in Udaipur Call Girls
 
Russian Faridabad Call Girls(Badarpur) : ☎ 8168257667, @4999
Russian Faridabad Call Girls(Badarpur) : ☎ 8168257667, @4999Russian Faridabad Call Girls(Badarpur) : ☎ 8168257667, @4999
Russian Faridabad Call Girls(Badarpur) : ☎ 8168257667, @4999
 
Progress Report - Oracle Database Analyst Summit
Progress  Report - Oracle Database Analyst SummitProgress  Report - Oracle Database Analyst Summit
Progress Report - Oracle Database Analyst Summit
 
VIP Call Girls In Saharaganj ( Lucknow ) 🔝 8923113531 🔝 Cash Payment (COD) 👒
VIP Call Girls In Saharaganj ( Lucknow  ) 🔝 8923113531 🔝  Cash Payment (COD) 👒VIP Call Girls In Saharaganj ( Lucknow  ) 🔝 8923113531 🔝  Cash Payment (COD) 👒
VIP Call Girls In Saharaganj ( Lucknow ) 🔝 8923113531 🔝 Cash Payment (COD) 👒
 
GD Birla and his contribution in management
GD Birla and his contribution in managementGD Birla and his contribution in management
GD Birla and his contribution in management
 
M.C Lodges -- Guest House in Jhang.
M.C Lodges --  Guest House in Jhang.M.C Lodges --  Guest House in Jhang.
M.C Lodges -- Guest House in Jhang.
 
Boost the utilization of your HCL environment by reevaluating use cases and f...
Boost the utilization of your HCL environment by reevaluating use cases and f...Boost the utilization of your HCL environment by reevaluating use cases and f...
Boost the utilization of your HCL environment by reevaluating use cases and f...
 
Regression analysis: Simple Linear Regression Multiple Linear Regression
Regression analysis:  Simple Linear Regression Multiple Linear RegressionRegression analysis:  Simple Linear Regression Multiple Linear Regression
Regression analysis: Simple Linear Regression Multiple Linear Regression
 
A DAY IN THE LIFE OF A SALESMAN / WOMAN
A DAY IN THE LIFE OF A  SALESMAN / WOMANA DAY IN THE LIFE OF A  SALESMAN / WOMAN
A DAY IN THE LIFE OF A SALESMAN / WOMAN
 
Forklift Operations: Safety through Cartoons
Forklift Operations: Safety through CartoonsForklift Operations: Safety through Cartoons
Forklift Operations: Safety through Cartoons
 
Event mailer assignment progress report .pdf
Event mailer assignment progress report .pdfEvent mailer assignment progress report .pdf
Event mailer assignment progress report .pdf
 
Mysore Call Girls 8617370543 WhatsApp Number 24x7 Best Services
Mysore Call Girls 8617370543 WhatsApp Number 24x7 Best ServicesMysore Call Girls 8617370543 WhatsApp Number 24x7 Best Services
Mysore Call Girls 8617370543 WhatsApp Number 24x7 Best Services
 
Monte Carlo simulation : Simulation using MCSM
Monte Carlo simulation : Simulation using MCSMMonte Carlo simulation : Simulation using MCSM
Monte Carlo simulation : Simulation using MCSM
 
Grateful 7 speech thanking everyone that has helped.pdf
Grateful 7 speech thanking everyone that has helped.pdfGrateful 7 speech thanking everyone that has helped.pdf
Grateful 7 speech thanking everyone that has helped.pdf
 
BEST ✨ Call Girls In Indirapuram Ghaziabad ✔️ 9871031762 ✔️ Escorts Service...
BEST ✨ Call Girls In  Indirapuram Ghaziabad  ✔️ 9871031762 ✔️ Escorts Service...BEST ✨ Call Girls In  Indirapuram Ghaziabad  ✔️ 9871031762 ✔️ Escorts Service...
BEST ✨ Call Girls In Indirapuram Ghaziabad ✔️ 9871031762 ✔️ Escorts Service...
 
Value Proposition canvas- Customer needs and pains
Value Proposition canvas- Customer needs and painsValue Proposition canvas- Customer needs and pains
Value Proposition canvas- Customer needs and pains
 
Lucknow 💋 Escorts in Lucknow - 450+ Call Girl Cash Payment 8923113531 Neha Th...
Lucknow 💋 Escorts in Lucknow - 450+ Call Girl Cash Payment 8923113531 Neha Th...Lucknow 💋 Escorts in Lucknow - 450+ Call Girl Cash Payment 8923113531 Neha Th...
Lucknow 💋 Escorts in Lucknow - 450+ Call Girl Cash Payment 8923113531 Neha Th...
 
Pharma Works Profile of Karan Communications
Pharma Works Profile of Karan CommunicationsPharma Works Profile of Karan Communications
Pharma Works Profile of Karan Communications
 

Data Warehouse

  • 1. Chapter 1: Data Warehousing 1.Basic Concepts of data warehousing 2.Data warehouse architectures 3.Some characteristics of data warehouse data 4.The reconciled data layer 5.Data transformation 6.The derived data layer 7. The user interface HCMC UT, 2008
  • 2.
  • 3.
  • 4.
  • 5.
  • 6.
  • 7.
  • 8.
  • 9.
  • 10.
  • 11. Figure 11-2: Generic two-level architecture E T L One, company-wide warehouse Periodic extraction  data is not completely current in warehouse
  • 12. Figure 11-3: Independent Data Mart Data marts: Mini-warehouses, limited in scope E T L Separate ETL for each independent data mart Data access complexity due to multiple data marts
  • 13.
  • 14. Figure 11-4: Dependent data mart with operational data store E T L Single ETL for enterprise data warehouse (EDW) Simpler data access ODS provides option for obtaining current data Dependent data marts loaded from EDW
  • 15.
  • 16. Figure 11-5: Logical data mart and @ctive data warehouse E T L Near real-time ETL for @active Data Warehouse ODS and data warehouse are one and the same Data marts are NOT separate databases, but logical views of the data warehouse  Easier to create new data marts
  • 17.
  • 18. Table 11-2: Data Warehouse vs. Data Mart Source : adapted from Strange (1997).
  • 19. Figure 11-6: Three-layer architecture
  • 20.
  • 21. Data Characteristics Status vs. Event Data Figure 11-7: Example of DBMS log entry Event = a database action (create/update/delete) that results from a transaction Status Status
  • 22. Data Characteristics Transient vs. Periodic Data Figure 11-8: Transient operational data Changes to existing records are written over previous records, thus destroying the previous data content
  • 23. Data Characteristics Transient vs. Periodic Data Figure 11-9: Periodic warehouse data Data are never physically altered or deleted once they have been added to the store
  • 24.
  • 25.
  • 26.
  • 27. Figure 11-10: Steps in data reconciliation Static extract = capturing a snapshot of the source data at a point in time Incremental extract = capturing changes that have occurred since the last static extract Capture = extract…obtaining a snapshot of a chosen subset of the source data for loading into the data warehouse
  • 28. Figure 11-10: Steps in data reconciliation (continued) Scrub = cleanse…uses pattern recognition and AI techniques to upgrade data quality Fixing errors: misspellings, erroneous dates, incorrect field usage, mismatched addresses, missing data, duplicate data, inconsistencies Also: decoding, reformatting, time stamping, conversion, key generation, merging, error detection/logging, locating missing data
  • 29. Figure 11-10: Steps in data reconciliation (continued) Transform = convert data from format of operational system to format of data warehouse Record-level: Selection – data partitioning Joining – data combining Aggregation – data summarization Field-level: single-field – from one field to one field multi-field – from many fields to one, or one field to many
  • 30. Figure 11-10: Steps in data reconciliation (continued) Load/Index= place transformed data into the warehouse and create indexes Refresh mode: bulk rewriting of target data at periodic intervals Update mode: only changes in source data are written to data warehouse
  • 31.
  • 32.
  • 33. Figure 11-11: Single-field transformation In general – some transformation function translates data from old form to new form Algorithmic transformation uses a formula or logical expression Table lookup – another approach
  • 34. Figure 11-12: Multifield transformation M:1 –from many source fields to one target field 1:M –from one source field to many target fields
  • 35.
  • 36.
  • 37. Figure 11-13: Components of a star schema Fact tables contain factual or quantitative data Dimension tables contain descriptions about the subjects of the business 1:N relationship between dimension tables and fact tables Excellent for ad-hoc queries, but bad for online transaction processing Dimension tables are denormalized to maximize performance
  • 38. Figure 11-14: Star schema example Fact table provides statistics for sales broken down by product, period and store dimensions
  • 39. Figure 11-15: Star schema with sample data
  • 40.
  • 41.
  • 42. Figure 11-16: Modeling dates Fact tables contain time-period data  Date dimensions are important
  • 43.
  • 44.
  • 45.
  • 46.
  • 47. Factless fact table showing occurrence of an event.
  • 48. Factless fact table showing coverage
  • 49.
  • 51.
  • 52. Example of snowflake schema Sales Fact Table time_key item_key branch_key location_key units_sold dollars_sold avg_sales Measures time_key day day_of_the_week month quarter year time location_key street city_key location item_key item_name brand type supplier_key item branch_key branch_name branch_type branch supplier_key supplier_type supplier city_key city province_or_street country city
  • 53.
  • 54.
  • 55.
  • 56.
  • 57.
  • 58.
  • 59. Figure 11-22: Slicing a data cube
  • 60. Figure 11-23: Example of drill-down Summary report Drill-down with color added
  • 61.
  • 62.
  • 63.