Data-Ed Engineering Solutions to Data Quality Challenges
DATAVERSITY
1.
Data Quality Engineering
This presentation provides guidance to organizations considering data quality initiatives or preparing for data quality initiatives. This talk will illustrate how organizations with chronic business challenges often can trace the root of the problem to poor data quality. Showing how data quality can be engineered provides a useful framework in which to develop an organizational approach. This in turn will allow organizations to more quickly identify data problems caused by structural issues versus practice-oriented defects. Participants will also learn the importance of practicing data quality engineering quantification.
Date: October 9, 2012
Time: 2:00 PM ET
Presented by: Dr. Peter Aiken
[Figure: data quality engineering life cycle]
• Metadata Creation: Define Data Architecture; Define Data Model Structures
• Metadata Structuring: Implement Data Model Views; Populate Data Model Views
• Metadata Refinement: Correct Structural Defects; Update Implementation
• Data Creation: Create Data; Verify Data Values
• Data Utilization: Inspect Data; Present Data
• Data Manipulation: Manipulate Data; Update Data
• Data Assessment: Assess Data Values; Assess Metadata
• Data Refinement: Correct Data Value Defects; Re-store Data Values
• Metadata & Data Storage
Entry points: Starting point for new system development; Starting point for existing systems
Flow labels: architecture refinements; data architecture; corrected data; data architecture and data models; facts & meanings; data performance metadata; shared data; updated data
PRODUCED BY CLASSIFICATION DATE SLIDE DATA BLUEPRINT 10124-C W. BROAD ST, GLEN ALLEN, VA 23060 EDUCATION 10/09/12 1 10/04/12 © Copyright this and previous years by Data Blueprint - all rights reserved!
2.
Get Social With Us!
• Live Twitter Feed: Join the conversation! Follow us: @datablueprint, @paiken. Ask questions and submit your comments: #dataed
• Like Us on Facebook: www.facebook.com/datablueprint. Post questions and comments. Find industry news, insightful content and event updates.
• Join the Group: Data Management & Business Intelligence. Ask questions, gain insights and collaborate with fellow data management professionals.
3.
Meet Your Presenter: Dr. Peter Aiken
• Internationally recognized thought-leader in the data management field
– 30 years of experience
– Recipient of multiple international awards
– Founder, Data Blueprint (http://datablueprint.com)
• 7 books and dozens of articles
• Experienced w/ 500+ data management practices in 20 countries
• Multi-year immersions with organizations as diverse as the US DoD, Deutsche Bank, Nokia, Wells Fargo, the Commonwealth of Virginia and Walmart
4.
Data Quality Engineering
5.
Outline
1. Data Management Introduction
2. Data Quality Definitions & Overview
3. DQM Cycle
4. DQ Awareness & Requirements
5. DQ Dimensions
6. Data Quality Tools
7. Guiding Principles
8. References and Q&A
Tweeting now: #dataed
6.
The DAMA Guide to the Data Management Body of Knowledge
Published by DAMA International
• The professional association for Data Managers (40 chapters worldwide)
DMBoK organized around
• Primary data management functions focused around data delivery to the organization
• Several environmental elements
[Figure: Data Management Functions]
7.
The DAMA Guide to the Data Management Body of Knowledge
• Amazon: http://www.amazon.com/DAMA-Guide-Management-Knowledge-DAMA-DMBOK/dp/0977140083
• Or enter the terms "dama dm bok" at the Amazon search engine
[Figure: Environmental Elements]
8.
What is the CDMP?
• Certified Data Management Professional
• DAMA International and ICCP
• Membership in a distinct group made up of your fellow professionals
• Recognition for your specialized knowledge in a choice of 17 specialty areas
• Series of 3 exams
• For more information, please visit:
– http://www.dama.org/i4a/pages/index.cfm?pageid=3399
– http://iccp.org/certification/designations/cdmp
9.
Data Management
10.
Data Management
• Data Program Coordination: Manage data coherently.
• Organizational Data Integration: Share data across boundaries.
• Data Stewardship: Assign responsibilities for data.
• Data Development: Engineer data delivery systems.
• Data Support Operations: Maintain data availability.
11.
Data Management
12.
Overview: Data Quality Engineering
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
13.
Overview: Data Quality Engineering
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
14.
Outline
1. Data Management Introduction
2. Data Quality Definitions & Overview
3. DQM Cycle
4. DQ Awareness & Requirements
5. DQ Dimensions
6. Data Quality Tools
7. Guiding Principles
8. References and Q&A
15.
Definitions
Data Quality Management
• Planning, implementation and control activities that apply quality management techniques to measure, assess, improve, and ensure the fitness of data for use
• Entails the establishment and deployment of roles and responsibilities concerning the acquisition, maintenance, dissemination, and disposition of data (http://www2.sas.com/proceedings/sugi29/098-29.pdf)
• Critical support process in organizational change management
• Continuous process for defining the parameters for specifying acceptable levels of data quality to meet business needs and for ensuring that data quality meets these levels
Data Quality
• Synonymous with information quality, since poor data quality results in inaccurate information and poor business performance
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
16.
Overview: DQM Concepts and Activities
1) Data Quality Management Approach
2) Develop and promote data quality awareness
3) Define data quality requirements
4) Profile, analyze and assess data quality
5) Define data quality metrics
6) Define data quality business rules
7) Test and validate data quality requirements
8) Set and evaluate data quality service levels
9) Measure and monitor data quality
10) Manage data quality issues
11) Clean and correct data quality defects
12) Design and implement operational DQM procedures
13) Monitor operational DQM procedures and performance
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
17.
Concepts and Activities
• Data quality expectations provide the inputs necessary to define the data quality framework:
– Requirements
– Inspection policies
– Measures and monitors that reflect changes in data quality and performance
• The data quality framework requirements reflect 3 aspects of business data expectations:
1) A manner to record the expectation in business rules
2) A way to measure the quality of data within that dimension
3) An acceptability threshold
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
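The three aspects listed on this slide (a recorded business rule, a way to measure data against it, and an acceptability threshold) could be sketched in Python roughly as follows. This is only an illustration and does not come from the presentation: the `Expectation` class, the `customer_email_present` rule, the sample records, and the 0.95 threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Mapping

Record = Mapping[str, object]


@dataclass(frozen=True)
class Expectation:
    """One business data expectation (hypothetical structure, not from the slides)."""
    name: str
    rule: Callable[[Record], bool]  # 1) the expectation recorded as a business rule
    threshold: float                # 3) minimum acceptable share of passing records

    def measure(self, records: Iterable[Record]) -> float:
        """2) Measure quality as the fraction of records that satisfy the rule."""
        rows = list(records)
        if not rows:
            return 1.0  # assumption: an empty data set violates nothing
        passed = sum(1 for row in rows if self.rule(row))
        return passed / len(rows)

    def is_acceptable(self, records: Iterable[Record]) -> bool:
        """Compare the measured score with the acceptability threshold."""
        return self.measure(records) >= self.threshold


# Hypothetical rule: every customer record carries a non-empty email.
email_present = Expectation(
    name="customer_email_present",
    rule=lambda row: bool(row.get("email")),
    threshold=0.95,
)

# Hypothetical sample data: two of the four records have a usable email.
customers = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": ""},
    {"id": 3, "email": "c@example.com"},
    {"id": 4, "email": None},
]

score = email_present.measure(customers)
print(f"{email_present.name}: score={score:.2f}, "
      f"acceptable={email_present.is_acceptable(customers)}")
```

Keeping the rule, the measure, and the threshold together in one object is a design choice for this sketch; a real framework might store them separately, for example in a metadata repository.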
18.
Outline
1. Data Management Introduction
2. Data Quality Definitions & Overview
3. DQM Cycle
4. DQ Awareness & Requirements
5. DQ Dimensions
6. Data Quality Tools
7. Guiding Principles
8. References and Q&A
19.
The DQM Cycle
The general approach to DQM is a version of the Deming cycle. Deming proposes a problem-solving model known as “plan-do-study-act” or “plan-do-check-act”.
The cycle begins by:
1) Identifying data issues that are critical to the achievement of business objectives
2) Defining business requirements for data quality
3) Identifying key data quality dimensions
4) Defining business rules critical to ensuring high quality data
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
20.
The DQM Cycle: (1) Plan
Plan for the assessment of the current state and identification of key metrics for measuring quality
• The data quality team assesses the scope of known issues
• This involves:
– Determining cost and impact
– Evaluating alternatives for addressing them
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
21.
The DQM Cycle: (2) Deploy
Deploy processes for measuring and improving the quality of data:
• Data profiling
• Institute inspections and monitors to identify data issues when they occur
• Fix flawed processes that are the root cause of data errors or correct errors downstream
• When it is not possible to correct errors at their source, correct them at their earliest point in the data flow
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
22.
The DQM Cycle: (3) Monitor
Monitor the quality of data as measured against the defined business rules
• If data quality meets defined thresholds for acceptability, the processes are in control and the level of data quality meets the business requirements
• If data quality falls below acceptability thresholds, notify data stewards so they can take action during the next stage
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
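The monitoring decision on this slide can be sketched as a comparison of measured scores against thresholds. The following Python is a minimal illustration under assumptions: the rule names, scores, thresholds, and the `find_breaches` and `monitoring_status` helpers are hypothetical, and a real notification would go through whatever channel the organization uses rather than returning a string.

```python
from typing import Dict, List


def find_breaches(scores: Dict[str, float], thresholds: Dict[str, float]) -> List[str]:
    """Return the rules whose measured score falls below its threshold."""
    return sorted(name for name, score in scores.items()
                  if score < thresholds[name])


def monitoring_status(breaches: List[str]) -> str:
    """Summarize the outcome of one monitoring pass."""
    if not breaches:
        return "in control"
    return "notify data stewards: " + ", ".join(breaches)


# Hypothetical measurements from one monitoring run.
scores = {"email_present": 0.75, "zip_valid": 0.99}
thresholds = {"email_present": 0.95, "zip_valid": 0.98}

breaches = find_breaches(scores, thresholds)
status = monitoring_status(breaches)
```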
23.
The DQM Cycle: (4) Act
Act to resolve any identified issues to improve data quality and better meet business expectations
• New cycles begin as new data sets come under investigation or as new data quality requirements are identified for existing data sets
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
25.
Develop and Promote DQ Awareness
• Promoting data quality awareness is essential to ensure buy-in of necessary stakeholders in the organization
• Ensure that the right people in the organization are aware of the existence of data quality issues
• Awareness increases the chance of success of any DQM program
• Awareness includes:
– Relating material impacts to data issues
– Ensuring systematic approaches to regulators
– Oversight of the quality of organizational data
– Socializing the concept that data quality problems cannot be solely addressed by technology solutions
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
26.
Polling Question #1
Which is not a step to promote data quality awareness?
a) Training on the core concepts of data quality
b) Establish data governance framework for data quality
c) Create a data architecture map
27.
Develop and Promote DQ Awareness: Steps
1) Training on the core concepts of data quality
2) Establish data governance framework for data quality
3) Create a data quality oversight board that has a reporting hierarchy associated with the different data governance roles
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
28.
Define DQ Requirements
• Data quality must be understood within the context of ‘fitness for use’
• Data quality requirements are often hidden within defined business policies
• Incremental detailed review and iterative refinement of business policies helps to identify those information requirements which become data quality rules
• Steps for incremental detailed review:
– Identify key data components associated with business policies
– Determine how identified data assertions affect the business
– Evaluate how data errors are categorized within a set of data quality dimensions
– Specify the business rules that measure the occurrence of data errors
– Provide a means for implementing measurement processes that assess conformance to those business rules
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
29.
Data Quality Dimensions
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
30.
Profile, Analyze and Assess DQ
Data assessment using 2 different approaches:
1) Bottom-up
2) Top-down
Bottom-up assessment:
• Inspection and evaluation of the data sets
• Highlight potential issues based on the results of automated processes
Top-down assessment:
• Engage business users to document their business processes and the corresponding critical data dependencies
• Understand how their processes consume data and which data elements are critical to the success of the business application
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
31.
Define DQ Metrics
• Metrics development occurs as part of the strategy/design/plan step
• Process for defining data quality metrics:
1) Select one of the identified critical business impacts
2) Evaluate the dependent data elements and the create and update processes associated with that business impact
3) List any associated data requirements
4) Specify the associated dimension of data quality and one or more business rules to use to determine conformance of the data to expectations
5) Describe the process for measuring conformance
6) Specify an acceptability threshold
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
32.
Test and Validate DQ Requirements
• Data profiling tools analyze data to find potential anomalies
• Use the same tools for rule validation
• Rules discovered or defined during the data quality assessment phase are referenced in measuring conformance as part of the operational process
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
33.
Set and Evaluate DQ Service Levels
• Data quality inspection and monitoring are used to measure and monitor compliance with defined data quality rules
• Data quality SLAs specify the organization’s expectations for response and remediation
• Operational data quality control defined in data quality SLAs includes:
– Data elements covered by the agreement
– Business impacts associated with data flaws
– Data quality dimensions associated with each data element
– Quality expectations for each data element for each of the identified dimensions in each application or system in the value chain
– Methods for measuring against those expectations
– (…)
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
34.
Measure and Monitor DQ
• DQM procedures depend on available data quality measuring and monitoring services
• 2 contexts for control/measurement of conformance to data quality business rules exist:
– In-stream: collect in-stream measurements while creating data
– In batch: perform batch activities on collections of data instances assembled in a data set
• Apply measurements at 3 levels of granularity:
– Data element value
– Data instance or record
– Data set
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
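The two contexts and three levels of granularity named on this slide can be sketched as follows. This is a hedged Python illustration: the field names (`id`, `name`, `age`), the 0 to 120 age range, and all function names are hypothetical choices made for the example, not rules from the slides.

```python
from typing import Dict, List

Record = Dict[str, object]


def check_value(age: object) -> bool:
    """Data element value level: a single field is within range."""
    return isinstance(age, int) and 0 <= age <= 120


def check_record(record: Record) -> bool:
    """Data instance (record) level: the record as a whole is usable."""
    return check_value(record.get("age")) and bool(record.get("name"))


def check_data_set(records: List[Record]) -> bool:
    """Data set level: a property of the whole collection (unique ids)."""
    ids = [r.get("id") for r in records]
    return len(ids) == len(set(ids))


def in_stream(record: Record, accepted: List[Record]) -> bool:
    """In-stream control: validate each record while it is being created."""
    if check_record(record):
        accepted.append(record)
        return True
    return False


def in_batch(records: List[Record]) -> Dict[str, object]:
    """Batch control: measure an already assembled data set."""
    valid = sum(1 for r in records if check_record(r))
    return {
        "valid_records": valid,
        "total": len(records),
        "unique_ids": check_data_set(records),
    }


# Hypothetical data: one empty name, one out-of-range age, one duplicate id.
batch = [
    {"id": 1, "name": "Ann", "age": 34},
    {"id": 2, "name": "", "age": 51},
    {"id": 2, "name": "Bob", "age": 130},
]
report = in_batch(batch)

accepted: List[Record] = []
first_ok = in_stream(batch[0], accepted)
second_ok = in_stream(batch[1], accepted)
```

The in-stream path rejects a bad record before it is stored, while the batch path only reports on data that already exists, which is why set-level checks such as uniqueness fit more naturally in the batch context.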
35.
Manage DQ Issues
• Supporting the enforcement of the data quality SLA requires a mechanism for reporting and tracking data quality incidents and activities for researching and resolving those incidents
• A data quality incident reporting system can provide this capability
• It can log the evaluation, initial diagnosis, and actions associated with data quality events
Clean & Correct DQ Defects
Perform data correction in 3 ways:
1) Automated correction
2) Manual directed correction
3) Manual correction
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
36.
Manage DQ Issues: Example
Data quality incident tracking focuses on training staff to recognize when data issues appear and how they are to be classified, logged and tracked according to the data quality SLA
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
37.
Design and Implement Operational DQM Procedures
1) Inspection and monitoring
2) Diagnosis and evaluation of remediation alternatives
3) Resolve issues
4) Reporting
Monitor Operational DQM Procedures and Performances
1) Accountability is critical to governance protocols overseeing data quality control
2) All issues must be assigned
3) The tracking process should specify and document the ultimate issue accountability
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
39.
Example: Data Quality Interview Session Summary
• During mid-February, the Data Governance Team and Data Blueprint conducted ten qualitative interview sessions with groups of individuals who interact with data on a regular basis
• A series of patterns emerged as participants shared stories about the impact of poor data quality on the client, its products, and its customers
• These patterns highlight gaps in best practices for ensuring data quality, i.e. the extent to which data is “fit for use”
• Our preliminary analysis evaluated these stories against attributes of four data quality dimensions
• At this early stage of the post-interview process, we are seeking confirmation of our assumptions and method
40.
Which Activities Support Quality Data?
• Data quality best practices depend on both
– Practice-oriented activities
– Structure-oriented activities
[Diagram: Quality Data]
• Practice-oriented activities focus on the capture and manipulation of data
• Structure-oriented activities focus on the data implementation
41.
Quality Dimensions
Practice-oriented causes
• Stem from a failure to apply rigor when capturing and manipulating data, such as:
– Edit masking
– Range checking of input data
– CRC-checking of transmitted data
Structure-oriented causes
• Occur because of data and metadata that has been arranged imperfectly. For example:
– When the data is in the system but we just can't access it;
– When a correct data value is provided as the wrong response to a query; or
– When data is not provided because it is unavailable or inaccessible to the customer
• Developer focus within system boundaries instead of within organization boundaries
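The three practice-oriented controls named on this slide (edit masking, range checking, and CRC checking) can each be sketched in a few lines. The Python below is an illustration under assumptions: the phone-number mask, the helper names, and the sample payloads are hypothetical; only `re.fullmatch`-style matching and `zlib.crc32` from the standard library are relied on.

```python
import re
import zlib
from typing import Tuple

# Hypothetical edit mask: a phone number entered as NNN-NNN-NNNN.
PHONE_MASK = re.compile(r"\d{3}-\d{3}-\d{4}")


def matches_mask(value: str) -> bool:
    """Edit masking: accept input only if it fits the expected pattern."""
    return PHONE_MASK.fullmatch(value) is not None


def in_range(value: float, low: float, high: float) -> bool:
    """Range checking: accept input only inside the allowed interval."""
    return low <= value <= high


def with_crc(payload: bytes) -> Tuple[bytes, int]:
    """Sender side: attach a CRC-32 checksum to the transmitted data."""
    return payload, zlib.crc32(payload)


def crc_ok(payload: bytes, crc: int) -> bool:
    """Receiver side: recompute the checksum and compare."""
    return zlib.crc32(payload) == crc


payload, crc = with_crc(b"customer,42")
```

A CRC detects accidental corruption in transit; it says nothing about whether the value was correct when it was captured, which is what the mask and range checks address.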
42.
Practice-Oriented Activities
• Affect the Data Value Quality and Data Representation Quality
• Examples of improper practice-oriented activities:
– Allowing imprecise or incorrect data to be collected when requirements specify otherwise
– Presenting data out of sequence
• Typically diagnosed in bottom-up manner: find and fix the resulting problem
• Addressed by imposing more rigorous data-handling governance
[Diagram: Practice-oriented activities: Quality of Data Values, Quality of Data Representation]
43.
Structure-Oriented Activities
• Affect the Data Model Quality and Data Architecture Quality
• Examples of improper structure-oriented activities:
– Providing a correct response but incomplete data to a query because the user did not comprehend the system data structure
– Costly maintenance of inconsistent data used by redundant systems
• Typically diagnosed in top-down manner: root cause fixes
• Addressed through fundamental data structure governance
[Diagram: Structure-oriented activities: Quality of Data Models, Quality of Data Architecture]
44.
4 Dimensions of Data Quality
An organization’s overall data quality is a function of four distinct components, each with its own attributes:
Practice-oriented
• Data Value: the quality of data as stored & maintained in the system
• Data Representation: the quality of representation for stored values; perfect data values stored in a system that are inappropriately represented can be harmful
Structure-oriented
• Data Model: the quality of data logically representing user requirements related to data entities, associated attributes, and their relationships; essential for effective communication among data suppliers and consumers
• Data Architecture: the coordination of data management activities in cross-functional system development and operations
45.
Effective Data Quality Engineering
• Data quality engineering has been focused on operational problem correction
– Directing attention to practice-oriented data imperfections
• Data quality engineering is more effective when also focused on structure-oriented causes
– Ensuring the quality of shared data across system boundaries
[Diagram, from closer to the user to closer to the architect]
• Data Representation Quality: as presented to the user
• Data Value Quality: as maintained in the system
• Data Model Quality: as understood by developers
• Data Architecture Quality: as an organizational asset
46.
Full Set of Data Quality Attributes
47.
Data Value Quality
48.
Data Representation Quality
49.
Data Model Quality
50.
Data Architecture Quality
51.
Extended data life cycle model with metadata sources and uses
[Diagram: life cycle phases around a central Metadata & Data Storage]
• Metadata Creation (starting point for new system development): Define Data Architecture; Define Data Model Structures
• Metadata Refinement: Correct Structural Defects; Update Implementation
• Metadata Structuring: Implement Data Model Views; Populate Data Model Views
• Data Creation: Create Data; Verify Data Values
• Data Utilization: Inspect Data; Present Data
• Data Manipulation: Manipulate Data; Update Data
• Data Assessment: Assess Data Values; Assess Metadata
• Data Refinement: Correct Data Value Defects; Re-store Data Values
Flow labels: architecture refinements; data architecture; data architecture and data models; facts & meanings; shared data; updated data; data performance metadata; corrected data
The diagram also marks a starting point for existing systems.
52.
Data Quality Engineering
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
53.
Goals and Principles
• To measurably improve the quality of data in relation to defined business expectations
• To define requirements and specifications for integrating data quality control into the system development life cycle
• To provide defined processes for measuring, monitoring, and reporting conformance to acceptable levels of data quality
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
54.
Activities
• Develop and Promote Data Quality Awareness
• Set and Evaluate Data Quality Service Levels
• Test and Validate Data Quality Requirements
• Profile, Analyze, and Assess Data Quality
• Continuously Measure and Monitor Data Quality
• Monitor Operational DQM Procedures and Performance
• Define Data Quality Business Rules
• Define Data Quality Metrics
• Manage Data Quality Issues
• Clean and Correct Data Quality Defects
• Define Data Quality Requirements
• Design and Implement Operational DQM Procedures
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
55.
Primary Deliverables
• Improved Quality Data
• Data Management Operational Analysis
• Data profiles
• Data Quality Certification Reports
• Data Quality Service Level Agreements
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
56.
Roles and Responsibilities
Suppliers:
• External Sources
• Regulatory Bodies
• Business Subject Matter Experts
• Information Consumers
• Data Producers
• Data Architects
• Data Modelers
• Data Stewards
Participants:
• Data Quality Analysts
• Data Analysts
• Database Administrators
• Data Stewards
• Other Data Professionals
• DRM Director
• Data Stewardship Council
Consumers:
• Data Stewards
• Data Professionals
• Other IT Professionals
• Knowledge Workers
• Managers and Executives
• Customers
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
57.
Polling Question #2
What is one guiding principle for data quality?
a. Business process owners will agree to and abide by data quality SLAs
b. Identify a blue record for all data elements
c. Upstream data consumers specify data quality expectations
59.
Technology
• Data Profiling Tools
• Statistical Analysis Tools
• Data Cleansing Tools
• Data Integration Tools
• Issue and Event Management Tools
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
60.
Overview: Data Quality Tools
4 categories of activities:
1) Analysis
2) Cleansing
3) Enhancement
4) Monitoring
Principal tools:
1) Data Profiling
2) Parsing and Standardization
3) Data Transformation
4) Identity Resolution and Matching
5) Enhancement
6) Reporting
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
61.
DQ Tool #1: Data Profiling
• Data profiling is the assessment of value distribution and clustering of values into domains
• Need to be able to distinguish between good and bad data before making any improvements
• Data profiling is a set of algorithms for 2 purposes:
– Statistical analysis and assessment of the data quality values within a data set
– Exploring relationships that exist between value collections within and across data sets
• At its most advanced, data profiling takes a series of prescribed rules from data quality engines. It then assesses the data, annotates and tracks violations to determine if they comprise new or inferred data quality rules
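The first purpose above, assessing the distribution of values in a column, can be sketched with a small frequency count. This Python is a hypothetical illustration only: the `profile_column` helper, the choice to treat `None` and the empty string as missing, and the sample state codes are all assumptions, and commercial profiling tools do far more than this.

```python
from collections import Counter
from typing import Dict, List, Optional


def profile_column(values: List[Optional[str]]) -> Dict[str, object]:
    """Summarize one column: size, missing values, and value distribution."""
    non_null = [v for v in values if v not in (None, "")]
    counts = Counter(non_null)
    return {
        "count": len(values),
        "null_count": len(values) - len(non_null),
        "distinct": len(counts),
        "most_common": counts.most_common(3),
    }


# Hypothetical column of state codes with missing and inconsistent entries.
states = ["VA", "VA", "MD", None, "va", "", "VA"]
profile = profile_column(states)
```

Even this small profile surfaces two candidate issues for review: missing values, and a lower-case "va" that probably belongs to the same domain value as "VA".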
62.
DQ Tool #1: Data Profiling, cont’d
• Data profiling vs. data quality: business context and semantic/logical layers
– Data quality is concerned with proscriptive rules
– Data profiling looks for patterns when rules are adhered to and when rules are violated; able to provide input into the business context layer
• It is incumbent on data profiling services to notify all concerned parties of whatever is discovered
• Profiling can be used to…
– …notify the help desk that valid changes in the data are about to cause an avalanche of “skeptical user” calls
– …notify business analysts of precisely where they should be working today in terms of shifts in the data
63.
DQ Tool #2: Parsing & Standardization
• Data parsing tools enable the definition of patterns that feed into a rules engine used to distinguish between valid and invalid data values
• Actions are triggered upon matching a specific pattern
• When an invalid pattern is recognized, the application may attempt to transform the invalid value into one that meets expectations
• Data standardization is the process of conforming to a set of business rules and formats that are set up by data stewards and administrators
• Data standardization example:
– Bringing all the different formats of “street” into a single format, e.g. “STR”, “ST.”, “STRT”, “STREET”, etc.
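The “street” example on this slide can be sketched as a token lookup. The Python below is a deliberately naive illustration: the variant list comes from the slide, but the target format `ST`, the upper-casing step, and the `standardize_street` helper are assumptions. It would also rewrite a leading “St.” that means “Saint”, which is the kind of case real parsing rules have to handle.

```python
# Variants taken from the slide; the single target format is an assumption.
STREET_VARIANTS = {"STR", "ST", "STRT", "STREET"}
STANDARD_STREET = "ST"


def standardize_street(address: str) -> str:
    """Replace any recognized spelling of 'street' with one standard token."""
    out = []
    for token in address.upper().split():
        bare = token.rstrip(".,")          # "ST." and "ST" are the same variant
        out.append(STANDARD_STREET if bare in STREET_VARIANTS else token)
    return " ".join(out)


first = standardize_street("10124 W. Broad Street")
second = standardize_street("12 Main Str.")
third = standardize_street("5 Oak STRT")
```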
64.
DQ Tool #3: Data Transformation
• Upon identification of data errors, trigger data rules to transform the flawed data
• Perform standardization and guide rule-based transformations by mapping data values in their original formats and patterns into a target representation
• Parsed components of a pattern are subjected to rearrangement, corrections, or any changes as directed by the rules in the knowledge base
65.
DQ Tool #4: Identity Resolution & Matching
• Data matching enables analysts to identify relationships between records for de-duplication or group-based processing
• Matching is central to maintaining data consistency and integrity throughout the enterprise
• The matching process should be used in the initial migration of data into a single repository
2 basic approaches to matching:
• Deterministic
– Relies on defined patterns/rules for assigning weights and scores to determine similarity
– Predictable
– Dependent on the rule developers’ anticipations
• Probabilistic
– Relies on statistical techniques for assessing the probability that any pair of records represents the same entity
– Not reliant on rules
– Probabilities can be refined based on experience -> matchers can improve precision as more data is analyzed
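The contrast between the two approaches can be sketched side by side. In the Python below, the deterministic scorer uses hand-assigned field weights, and the probabilistic scorer uses a simplified Fellegi-Sunter-style log-likelihood weight, which is one common statistical technique and is not named in the slides. The field names, the weights, and the m/u probabilities are all hypothetical; in practice the probabilities would be estimated from data and refined over time.

```python
import math
from typing import Dict

Record = Dict[str, object]

# Deterministic: fixed weights chosen by a rule developer (hypothetical values).
RULE_WEIGHTS = {"last_name": 50, "zip": 30, "birth_year": 20}


def deterministic_score(a: Record, b: Record) -> int:
    """Sum the weights of the fields on which the two records agree."""
    return sum(w for field, w in RULE_WEIGHTS.items()
               if a.get(field) == b.get(field))


# Probabilistic: per field, m = P(agree | same entity) and
# u = P(agree | different entities). Hypothetical values.
M_U = {
    "last_name": (0.95, 0.01),
    "zip": (0.90, 0.05),
    "birth_year": (0.98, 0.02),
}


def probabilistic_weight(a: Record, b: Record) -> float:
    """Sum log2 likelihood ratios; higher means more likely the same entity."""
    total = 0.0
    for field, (m, u) in M_U.items():
        if a.get(field) == b.get(field):
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


r1 = {"last_name": "Smith", "zip": "23060", "birth_year": 1970}
r2 = {"last_name": "Smith", "zip": "23059", "birth_year": 1970}  # zip differs
r3 = {"last_name": "Jones", "zip": "10001", "birth_year": 1985}  # all differ

rule_score = deterministic_score(r1, r2)
```

With these numbers, a disagreement on a field counts against a pair in the probabilistic score, whereas the deterministic score simply omits that field's weight.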
66.
DQ Tool #5: Enhancement
Definition:
• A method for adding value to information by accumulating additional information about a base set of entities and then merging all the sets of information to provide a focused view. Improves master data.
Benefits:
• Enables use of third party data sources
• Allows you to take advantage of the information and research carried out by external data vendors to make data more meaningful and useful
Examples of data enhancements:
• Time/date stamps
• Auditing information
• Contextual information
• Geographic information
• Demographic information
• Psychographic information
67.
DQ Tool #6: Reporting
• Good reporting supports:
– Inspection and monitoring of conformance to data quality expectations
– Monitoring performance of data stewards conforming to data quality SLAs
– Workflow processing for data quality incidents
– Manual oversight of data cleansing and correction
• Data quality tools provide dynamic reporting and monitoring capabilities
• Enables analysts and data stewards to support and drive the methodology for ongoing DQM and improvement with a single, easy-to-use solution
• Associate report results with:
– Data quality measurement
– Metrics
– Activity
69.
Guiding Principles
1) Manage data as a core organizational asset
2) Identify a gold record for all data elements
3) All data elements will have a standardized data definition, data type, and acceptable value domain
4) Leverage data governance for the control and performance of DQM
5) Use industry and international data standards whenever possible
6) Downstream data consumers specify data quality expectations
7) Define business rules to assert conformance to data quality expectations
8) Validate data instances and data sets against defined business rules
9) Business process owners will agree to and abide by data quality SLAs
10) Apply data corrections at the original source if possible
11) If it is not possible to correct data at the source, forward data corrections to the owner of the original source. Influence on data brokers to conform to local requirements may be limited
12) Report measured levels of data quality to appropriate data stewards, business process owners, and SLA managers
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
70.
Interdependencies - Tools alone cannot do the job!
• Education and Training (People)
• Data Cleansing and Prevention (Process)
• Data Quality Tools (Technology)
71.
Summary: Data Quality Engineering
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
73.
Recommended Reading