Tackling data quality problems requires more than a series of tactical, one-off improvement projects. By their nature, many data quality problems extend across and often beyond an organization. Addressing these issues requires a holistic architectural approach combining people, process and technology. Join Donna Burbank and Nigel Turner as they provide practical ways to control data quality issues in your organization.
1. Copyright Global Data Strategy, Ltd. 2021
Data Quality Best Practices
Donna Burbank and Nigel Turner
Global Data Strategy, Ltd.
August 26th, 2021
Follow on Twitter @donnaburbank, @nigelturner8
@GlobalDataStrat
Twitter Event hashtag: #DAStrategies
2. Global Data Strategy, Ltd. 2021
Donna Burbank
• Recognized industry expert in information
management with over 25 years of
experience in data strategy, information
management, data modeling, metadata
management, and enterprise architecture
• Managing Director at Global Data Strategy,
Ltd., an international information
management consulting company that
specializes in the alignment of business
drivers with data-centric technology
• Worked with dozens of Fortune 500
companies worldwide in the Americas,
Europe, Asia, and Africa and speaks
regularly at industry conferences
• Excellence in Data Management Award
from DAMA International
• Past President and Advisor to the DAMA
Rocky Mountain chapter
• Co-author of several books on data
management
• Regular contributor to industry
publications
• She can be reached at
donna.burbank@globaldatastrategy.com
Donna is based in Boulder, Colorado, US
Follow on Twitter @donnaburbank
@GlobalDataStrat
3. Global Data Strategy, Ltd. 2021
Nigel Turner
• Worked in Information Management
(IM) and related areas for over 25
years. His experience spans Data
Governance, Information Strategy,
Data Quality, Master Data
Management & Business
Intelligence.
• Spent much of his career in British
Telecommunications Group (BT)
where he led a series of enterprise-
wide IM & data governance initiatives.
• Has also been VP of Information
Management Strategy at Harte Hanks
Trillium Software, and Principal
Consultant at FromHereOn and IPL.
• Nigel is very active in professional Data
Management organizations and is an
elected Data Management Association
(DAMA) UK Committee member.
• He was the joint winner of DAMA
International’s 2015 Community Award
for the work he initiated and led in
setting up a mentoring scheme in the
UK where experienced DAMA
professionals coach and support newer
data management professionals.
• Nigel is based in Cardiff, Wales, UK.
Follow on Twitter @NigelTurner8
Today’s hashtag: #DAStrategies
4. Global Data Strategy, Ltd. 2021
DATAVERSITY Data Architecture Strategies
• January Emerging Trends in Data Architecture – What’s the Next Big Thing?
• February Building a Data Strategy - Practical Steps for Aligning with Business Goals
• March Data Modeling Case Study – Business Data Modeling at Kiewit
• April Master Data Management – Aligning Data, Process, and Governance
• May Data Architecture, Solution Architecture, Platform Architecture – What’s the Difference?
• June Enterprise Architecture vs. Data Architecture
• July Best Practices in Metadata Management
• August Data Quality Best Practices (with guest Nigel Turner)
• September Data Modeling Techniques
• October Data Governance: Aligning Technical & Business Approaches
• December Data Architecture for Digital Transformation
This Year’s Lineup
5. Global Data Strategy, Ltd. 2021
What We’ll Cover Today
• Tackling data quality problems requires more than a
series of tactical, one-off improvement projects.
• By their nature, many data quality problems extend
across and often beyond an organization.
• Addressing these issues requires a holistic architectural
approach combining people, process and technology.
6. Global Data Strategy, Ltd. 2021
Agenda
• Discuss how to deliver data quality improvements in the Baseline & Develop
phases of the A2E methodology
• Highlight the critical role of Business Rules in improving Data Quality
• Illustrate why getting Business Rules right is critical
• Outline how to use Business Rules to correct poor data quality and sustain
improved data quality
7. Global Data Strategy, Ltd. 2021
A Successful Data Strategy links Business Goals with Technology Solutions
“Top-Down” alignment with
business priorities
“Bottom-Up” management &
inventory of data sources
Managing the people, process,
policies & culture around data
Coordinating & integrating
disparate data sources
Leveraging & managing data for
strategic advantage
Data Quality is Part of a Wider Data Strategy
www.globaldatastrategy.com
8. Global Data Strategy, Ltd. 2021
Tackling Data Quality: the A2E approach
Assess
Baseline
Converge
Develop
Evaluate
Cycle of Continuous
Data Quality Improvement
Step: Purpose
• Assess Business Usage: Understand what data exists and how it is used within the organization
• Baseline Data Sources: Baseline the current quality of the data and assess how well it is meeting business needs
• Converge on Business Critical Areas: Focus priorities to optimise early business benefits and set ‘fit for purpose’ quality targets to guide improvement activities
• Develop Improvements: Design & deploy improvement initiatives (encompassing people, process, and technology) and measure the impact against targets
• Evaluate Benefits & ROI: Regularly measure the data and continue to improve it so that it continues to meet current and future business needs
9. Global Data Strategy, Ltd. 2021
Data Quality Improvement: The Importance of Business Rules
“A Business Rule is a criterion used to guide day-to-day business activity, shape operational business judgments, or make operational business decisions.”
Ronald Ross, quoted in architectureandgovernance.com
• In a data context, business rules are used to define and
enforce the standards that data must conform to
• Have a key role in assessing, baselining and improving data
quality
• Can be used to:
• Cleanse and enhance existing data
• Become standards which new data must conform to
• Guide data design in new developments
• Enforce data standards in existing applications and platforms
• Stop poor quality data being entered at source, e.g. via drop
down lists, screen entry validation etc.
10. Global Data Strategy, Ltd. 2021
How Do You Classify Business Rules?
• Many different ways to classify business rules – can be very complex
• A simple classification is:
FORMAT BUSINESS RULES
Specify the format standards data should comply with
Include:
• Field length (fixed, variable etc.)
• Character format (e.g. Alphabetic, Numeric, Alphanumeric etc.)
CONTENT BUSINESS RULES
Specify the allowable content of records or fields
Include:
• Allowable values
• Whether mandatory or optional
• Relationships with other fields or records
11. Global Data Strategy, Ltd. 2021
Example Data Related Business Rules
FORMAT RULES
• A UK National Insurance Number must be in the format: aa nn nn nn a
• An employee must have a unique Employee ID in the format: aa nnnn
• Date of birth should be in North American format of MM/DD/YYYY
• A full US zip code must be in the format nnnnn-nnnn
• Internet router identifier must be in the format Aaa_Nan_Naa
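Format rules like these lend themselves to pattern checks. The following is a minimal Python sketch, assuming that in the slide's notation "a" stands for one letter and "n" for one digit. The names FORMAT_RULES and conforms are illustrative, and the router identifier rule is left out because its notation is not defined on the slide. The patterns check shape only, not whether a value is genuinely issued or a real calendar date.

```python
import re

# Assumption: "a" in the slide's notation is one letter, "n" is one digit.
FORMAT_RULES = {
    # aa nn nn nn a
    "uk_national_insurance_number": re.compile(r"[A-Za-z]{2} \d{2} \d{2} \d{2} [A-Za-z]"),
    # aa nnnn
    "employee_id": re.compile(r"[A-Za-z]{2} \d{4}"),
    # MM/DD/YYYY, with month 01-12 and day 01-31 (shape only)
    "date_of_birth_us": re.compile(r"(0[1-9]|1[0-2])/(0[1-9]|[12]\d|3[01])/\d{4}"),
    # nnnnn-nnnn
    "us_zip_plus_4": re.compile(r"\d{5}-\d{4}"),
}


def conforms(rule_name: str, value: str) -> bool:
    """Return True when the whole value matches the named format rule."""
    return FORMAT_RULES[rule_name].fullmatch(value) is not None


if __name__ == "__main__":
    # Invented sample values, for illustration only.
    for rule, value in [
        ("uk_national_insurance_number", "QQ 12 34 56 C"),
        ("employee_id", "AB 1234"),
        ("date_of_birth_us", "30/12/1969"),
        ("us_zip_plus_4", "80301"),
    ]:
        print(rule, repr(value), conforms(rule, value))
```

Using fullmatch rather than search means that trailing or leading characters cause a failure, which is usually what a format rule intends.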
12. Global Data Strategy, Ltd. 2021
Example Data Related Business Rules
CONTENT RULES
• Every Sales Representative must be assigned to one and only one Sales Region
• A valid email address must be entered by a customer to enable a customer’s
order to be accepted
• Gender codes must have the valid value of Male, Female or Unknown
• A supplier must have at least one associated geographical address
• Product Price should be Product Unit Cost + 25%
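Content rules can be expressed as small predicate functions. The sketch below covers four of the rules above under stated assumptions: the function names are hypothetical, prices are rounded to two decimal places (the slide does not say how rounding should work), and a sales representative's regions arrive as a simple list.

```python
from decimal import Decimal

ALLOWED_GENDERS = {"Male", "Female", "Unknown"}


def gender_is_valid(code: str) -> bool:
    """Gender codes must be Male, Female or Unknown."""
    return code in ALLOWED_GENDERS


def price_is_valid(unit_cost: Decimal, price: Decimal) -> bool:
    """Product Price should be Product Unit Cost + 25%.

    Assumption: the expected price is rounded to two decimal places.
    """
    expected = (unit_cost * Decimal("1.25")).quantize(Decimal("0.01"))
    return price == expected


def reps_breaking_region_rule(rep_regions: dict[str, list[str]]) -> list[str]:
    """Return the reps not assigned to one and only one Sales Region."""
    return [rep for rep, regions in rep_regions.items() if len(set(regions)) != 1]


def supplier_has_address(addresses: list[str]) -> bool:
    """A supplier must have at least one non-blank geographical address."""
    return any(address.strip() for address in addresses)


if __name__ == "__main__":
    # Invented sample data, for illustration only.
    print(price_is_valid(Decimal("10.00"), Decimal("12.50")))
    print(reps_breaking_region_rule({"ann": ["North"], "bob": [], "cy": ["East", "West"]}))
```

Decimal is used instead of float so that the 25% mark-up comparison is not upset by binary rounding.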
13. Global Data Strategy, Ltd. 2021
How Do You Identify Business Rules?
• Business rules can be discovered or derived from:
• Data models (Business / Logical / Physical)
• Business documentation (e.g. Process Descriptions, User Instructions)
• IT Documentation (e.g. requirements specifications, system manuals)
• Source code (e.g. ‘If A Then B’ statements)
• Master and / or Reference Data Sources (e.g. currency codes, product
master data)
• Documented metadata (e.g. Business Glossaries, Data Dictionaries,
Metadata Repositories)
• Data profiling outputs
• Talking to key stakeholders:
• Data owners and data stewards (if in place)
• Data producers and consumers
• Other business and IT subject matter experts
VITAL IMPORTANCE OF STAKEHOLDER
ENGAGEMENT:
• Business rules are frequently implicit (i.e. locked
in people’s heads) and not formally documented
• Where business rules are documented,
documentation is often out of date and not
updated in line with system changes
14. Global Data Strategy, Ltd. 2021
Data Models Describe the Organization
• Relationships define the data-centric Business Rules of an organization
• You should be able to “read” a data model like a sentence
• The Entities / Concepts are the “nouns” – the boxes on a data model
• It’s often helpful to start by taking some text describing the organization (or transcripts
from stakeholder interviews) and draw boxes around the nouns to find the core entities
BUSINESS RULES:
• An employee can work for more than one department.
• A customer can have more than one account.
• A department can contain more than one employee.
[Diagram: data model with the entities Customer, Account, Employee and Department]
15. Global Data Strategy, Ltd. 2021
Deriving Business Rules: Business Data Model
• A business data model provides core definitions of key data objects.
• It also shows key relationships between data objects.
• Even a simple diagram such as the one on the right can tell a powerful “story” and uncover key business rules.
• Communication of core data concepts & their definitions
BUSINESS RULE: A COMPANY must contain 1 or more customers with an active account
BUSINESS RULE: An EMPLOYEE must be on the active payroll
BUSINESS RULE: A CUSTOMER is a current or former client who must have had an account active within the last 6 months
16. Global Data Strategy, Ltd. 2021
Real Life Data Quality Horror Stories 2021
When Business Rules Go Wrong or Go Missing
17. Global Data Strategy, Ltd. 2021
Why Do Business Rules Matter? DQ ‘Short’comings
• Liam Thorp made headline news in the UK in Feb 2021
• Received a priority invite for a Covid-19 vaccination because
he was medically classed as ‘morbidly obese’
• The reason – his local health board had recorded his height as
6.2 centimetres and not his real height of 6 feet 2 inches
• This made his Body Mass Index (BMI) 28,000; BMI is calculated as
weight in kilograms divided by the square of height in metres
• A BMI of 40 and above is classed as ‘morbidly obese’
• Now corrected, and he was put back in his rightful place in the
vaccine queue
Liam Thorp, 32 years old, Liverpool resident
“I can see the funny side of this story but also recognise there is an important issue for us to address”
Chair of the Liverpool Clinical Commissioning Group (leading the city’s vaccine roll out)
[Photo: Beatles statue, City of Liverpool]
KEY PROBLEM - ABSENCE
OF BUSINESS RULES TO
SPECIFY:
• Minimum Height
• Maximum BMI
(Content)
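The missing content rules can be sketched as range checks applied before a BMI is accepted. This is a minimal illustration, not the health board's actual logic: the thresholds MIN_HEIGHT_M, MAX_HEIGHT_M and MAX_BMI are assumed values chosen only for the example, and the 100 kg weight is hypothetical because the slide does not give the real figure.

```python
# Assumed plausibility limits, for illustration only.
MIN_HEIGHT_M = 0.5
MAX_HEIGHT_M = 2.8
MAX_BMI = 150.0


def bmi(weight_kg: float, height_m: float) -> float:
    """Body Mass Index: weight in kg divided by height in metres squared."""
    return weight_kg / height_m ** 2


def validated_bmi(weight_kg: float, height_m: float) -> float:
    """Compute BMI, rejecting values that break the content rules."""
    if not MIN_HEIGHT_M <= height_m <= MAX_HEIGHT_M:
        raise ValueError(f"height {height_m} m is outside the allowed range")
    value = bmi(weight_kg, height_m)
    if value > MAX_BMI:
        raise ValueError(f"BMI {value:.0f} exceeds the allowed maximum")
    return value


if __name__ == "__main__":
    weight_kg = 100.0  # hypothetical weight
    # 6.2 cm entered as the height produces a BMI in the tens of thousands.
    print(round(bmi(weight_kg, 0.062)))
    try:
        validated_bmi(weight_kg, 0.062)
    except ValueError as error:
        print("rejected:", error)
    # 6 ft 2 in is 74 inches, i.e. about 1.88 m.
    print(round(validated_bmi(weight_kg, 74 * 0.0254), 1))
```

Either rule alone would have caught the record: the height check fails first, and the BMI ceiling acts as a second line of defence.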
18. Global Data Strategy, Ltd. 2021
Why Do Business Rules Matter? ‘Miss’ing weight
• UK Air Accidents Investigation Branch (AAIB) report (April 2021)
declared a ‘Serious Incident’ at Birmingham airport, UK
• Report highlighted that 3 flights to Europe in July 2020 had taken off with
the weight of the plane load underestimated by an average 1,200kg
• This miscalculation could have caused a ‘serious incident’ on take-off, as load
weight determines take-off speed, thrust etc.
• Problem happened because all passengers with the title ‘Miss’ were
automatically assumed by outsourced IT suppliers to be children and not
adults
• A child’s standard estimated weight is 35kg; an adult 69kg
• The airline described it as ‘a simple flaw in its IT system’
• In reality, there was a serious problem with its business rules!
• The airline has now introduced manual validation of all passengers at
check in to ensure adults titled ‘Miss’ are changed to ‘Ms’ on the
passenger roster (?)
KEY PROBLEMS:
• Reliance on IT, and not the business,
to specify the business rules
• Making cultural assumptions that
were incorrect
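The flawed rule and a business-defined alternative can be contrasted in a few lines. This sketch uses the 35 kg and 69 kg standard weights quoted on the slide; the age cut-off CHILD_MAX_AGE and the passenger list are invented for illustration and are not taken from the AAIB report.

```python
CHILD_WEIGHT_KG = 35  # standard estimated child weight quoted on the slide
ADULT_WEIGHT_KG = 69  # standard estimated adult weight quoted on the slide
CHILD_MAX_AGE = 11    # hypothetical cut-off, for illustration only


def weight_from_title(title: str) -> int:
    """Flawed rule: every passenger titled 'Miss' is treated as a child."""
    return CHILD_WEIGHT_KG if title == "Miss" else ADULT_WEIGHT_KG


def weight_from_age(age: int) -> int:
    """Alternative rule: classify by age instead of inferring from the title."""
    return CHILD_WEIGHT_KG if age <= CHILD_MAX_AGE else ADULT_WEIGHT_KG


def load_estimates(passengers: list[tuple[str, int]]) -> tuple[int, int]:
    """Return (estimate by title, estimate by age) in kg."""
    by_title = sum(weight_from_title(title) for title, _ in passengers)
    by_age = sum(weight_from_age(age) for _, age in passengers)
    return by_title, by_age


if __name__ == "__main__":
    # Invented passengers as (title, age).
    passengers = [("Miss", 34), ("Miss", 8), ("Mr", 40)]
    by_title, by_age = load_estimates(passengers)
    print(by_title, by_age, by_age - by_title)
```

With these three invented passengers the title-based estimate is 34 kg lower, because the adult titled "Miss" is counted as a child.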
19. Global Data Strategy, Ltd. 2021
Four Step Process: Using Business Rules for Data Quality Improvement
STEP 1: Profile data sources
STEP 2: Agree priority DQ problems & design Business Rules
STEP 3: Deploy Business Rules
STEP 4: Monitor & report adherence to Business Rules
CYCLE OF CONTINUOUS DATA QUALITY IMPROVEMENT
20. Global Data Strategy, Ltd. 2021
Step 1: Quantifying Data Problems - The Value of Data Profiling
• The benefits of data profiling include:
• Checks conformance of the dataset with
business rules
• Enables fact-based discussion of the causes and
impacts of data problems
• Great starting point for Data Quality
improvement workshops
• Automatic generation of metadata
• Supports both data quality focus &
improvement and metadata capture
• Data profiling tools automate the process
of assessing and reporting on the quality
of data sources
• Data profiling can also be done via SQL,
without purchasing a tool
Example partial Data Profiling report
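As a rough illustration of profiling with plain SQL, the sketch below loads the deck's seven sample HR records into an in-memory SQLite table and runs a few profiling queries from Python. The table and column names are assumptions, as is the reading that the missing First Name and Employee No values in the sample are blanks.

```python
import sqlite3

# Sample HR records from this deck; missing values are assumed to be blanks.
ROWS = [
    ("802540", "Smith", "Brian", "Female", "31/01/56", "PM16"),
    ("YN4176B", "Gregg", "", "Male", "07/09/80", "9999"),
    ("811609", "Patel", "Priya", "XXXX", "25/12/78", "AL60"),
    ("22298", "Bothroyd", "Bridget", "Female", "28/08/09", "TBD"),
    ("802540", "Smith", "Bryan", "Male", "31/01/56", "PM10"),
    ("855265", "Hayes", "Leslie", "Female", "00/00/00", "AL76"),
    ("", "Taylor", "Kevin", "Unknown", "12/30/69", "US18"),
]


def profile(rows):
    """Run simple SQL profiling queries and return the findings as a dict."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employee (emp_no TEXT, surname TEXT, first_name TEXT, "
        "gender TEXT, dob TEXT, role_code TEXT)"
    )
    conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?, ?, ?)", rows)

    def scalar(sql):
        return conn.execute(sql).fetchone()[0]

    findings = {
        "row_count": scalar("SELECT COUNT(*) FROM employee"),
        "blank_emp_no": scalar("SELECT COUNT(*) FROM employee WHERE TRIM(emp_no) = ''"),
        "blank_first_name": scalar("SELECT COUNT(*) FROM employee WHERE TRIM(first_name) = ''"),
        # Frequency distribution of the values found in a column.
        "gender_values": dict(
            conn.execute("SELECT gender, COUNT(*) FROM employee GROUP BY gender").fetchall()
        ),
        # Employee numbers that occur more than once.
        "duplicate_emp_no": [
            row[0]
            for row in conn.execute(
                "SELECT emp_no FROM employee WHERE TRIM(emp_no) <> '' "
                "GROUP BY emp_no HAVING COUNT(*) > 1"
            )
        ],
    }
    conn.close()
    return findings


if __name__ == "__main__":
    for name, value in profile(ROWS).items():
        print(name, value)
```

The same GROUP BY and COUNT queries would run largely unchanged against most relational databases; SQLite is used here only because it ships with Python.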
21. Global Data Strategy, Ltd. 2021
Step 1: An Alternative Approach to Quantifying Data Problems
Source: Tadhg Nagle, Thomas C. Redman & David Sammon, “Only 3% of Companies’ Data Meets Basic Quality Standards”, Harvard Business Review, September 11, 2017
22. Global Data Strategy, Ltd. 2021
Step 1: Data Profiling & Potential Data Quality Problem Identification
EMPLOYEE NO | SURNAME  | FIRST NAME | GENDER  | DATE OF BIRTH | ROLE CODE
802540      | Smith    | Brian      | Female  | 31/01/56      | PM16
YN4176B     | Gregg    |            | Male    | 07/09/80      | 9999
811609      | Patel    | Priya      | XXXX    | 25/12/78      | AL60
22298       | Bothroyd | Bridget    | Female  | 28/08/09      | TBD
802540      | Smith    | Bryan      | Male    | 31/01/56      | PM10
855265      | Hayes    | Leslie     | Female  | 00/00/00      | AL76
            | Taylor   | Kevin      | Unknown | 12/30/69      | US18
Note: Records extracted and anonymized from an actual HR database
23. Global Data Strategy, Ltd. 2021
Step 1: Data Profiling & Potential DQ Problem Identification
EMPLOYEE NO | SURNAME  | FIRST NAME | GENDER  | DATE OF BIRTH | ROLE CODE
802540      | Smith    | Brian      | Female  | 31/01/56      | PM16
YN4176B     | Gregg    |            | Male    | 07/09/80      | 9999
811609      | Patel    | Priya      | XXXX    | 25/12/78      | AL60
22298       | Bothroyd | Bridget    | Female  | 28/08/09      | TBD
802540      | Smith    | Bryan      | Male    | 31/01/56      | PM10
855265      | Hayes    | Leslie     | Female  | 00/00/00      | AL76
            | Taylor   | Kevin      | Unknown | 12/30/69      | US18
ANSWER: Total number of potential Data Quality problems is 13 or 19, depending on
whether Smith is a duplicate record
Key: Potential Duplicate Record; Potential Data Quality Problem
24. Global Data Strategy, Ltd. 2021
Step 2: Business Review & Validation
• Data profiling findings should be reviewed by appropriate business & IT
stakeholders
• If formal Data Governance is in place, this should ideally be led by the Data Stewards
responsible for the specific data domains
• Aim to reach consensus on what the business impact is
• Ways of doing this:
• Workshops and / or meetings (virtual or F2F)
• By workflows, seeking views on the potential problem areas
• For priority areas, agree Business Rules which should be in place to drive and
enforce data quality improvement
• Create and deploy Business Rules
• Test rules first in case of unforeseen downstream impacts
• Embed in appropriate operational systems or Data Quality Rules Engine (see later)
25. Global Data Strategy, Ltd. 2021
Step 3: Using Business Rules to steer and enforce Data Quality standards
Example potential format business rules:
• Employee No. must be in format nnnnnn. Blank Employee Numbers are allowed if new starter awaiting Emp. No. allocation
• First Name must not be blank
• Role code must be in format AAnn
• Date of Birth must be in format nn/nn/nn
Example potential content business rules:
• Gender should align with First Name derived from Common Names Reference file
• Allowable Genders are FEMALE, MALE, SELF-DETERMINED or UNKNOWN
• Date of Birth must be expressed as DD/MM/YY and in the range 01/01/1940 to 12/12/2005
• Employee No. should be unique. Only one Emp. No. should be allocated to any individual employee
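To show how such rules might be applied, here is a hedged Python sketch that runs most of them against the deck's sample HR records. Several things are assumptions: the field names, the reading of missing sample values as blanks, the case-insensitive gender comparison, and the way two-digit years are expanded (40 to 99 as 19YY, otherwise 20YY). The rule aligning Gender with First Name is omitted because it needs the Common Names Reference file.

```python
import re
from collections import Counter
from datetime import date

FIELDS = ("emp_no", "surname", "first_name", "gender", "dob", "role_code")
RECORDS = [dict(zip(FIELDS, row)) for row in [
    ("802540", "Smith", "Brian", "Female", "31/01/56", "PM16"),
    ("YN4176B", "Gregg", "", "Male", "07/09/80", "9999"),
    ("811609", "Patel", "Priya", "XXXX", "25/12/78", "AL60"),
    ("22298", "Bothroyd", "Bridget", "Female", "28/08/09", "TBD"),
    ("802540", "Smith", "Bryan", "Male", "31/01/56", "PM10"),
    ("855265", "Hayes", "Leslie", "Female", "00/00/00", "AL76"),
    ("", "Taylor", "Kevin", "Unknown", "12/30/69", "US18"),
]]

ALLOWED_GENDERS = {"FEMALE", "MALE", "SELF-DETERMINED", "UNKNOWN"}
DOB_MIN, DOB_MAX = date(1940, 1, 1), date(2005, 12, 12)
DOB_FORMAT = re.compile(r"\d{2}/\d{2}/\d{2}")


def parse_dob(text):
    """Parse DD/MM/YY; return None when the text is not a real date."""
    if not DOB_FORMAT.fullmatch(text):
        return None
    day, month, yy = (int(part) for part in text.split("/"))
    # Assumption: YY of 40-99 means 19YY, anything lower means 20YY.
    year = 1900 + yy if yy >= 40 else 2000 + yy
    try:
        return date(year, month, day)
    except ValueError:
        return None


def check_records(records):
    """Return (row_number, rule_broken) pairs for every rule breach found."""
    issues = []
    emp_counts = Counter(r["emp_no"] for r in records if r["emp_no"])
    for n, r in enumerate(records, start=1):
        # Blank employee numbers are allowed (new starters).
        if r["emp_no"] and not re.fullmatch(r"\d{6}", r["emp_no"]):
            issues.append((n, "Employee No. format"))
        if r["emp_no"] and emp_counts[r["emp_no"]] > 1:
            issues.append((n, "Employee No. not unique"))
        if not r["first_name"].strip():
            issues.append((n, "First Name blank"))
        if not re.fullmatch(r"[A-Z]{2}\d{2}", r["role_code"]):
            issues.append((n, "Role code format"))
        if r["gender"].upper() not in ALLOWED_GENDERS:
            issues.append((n, "Gender not allowed"))
        if not DOB_FORMAT.fullmatch(r["dob"]):
            issues.append((n, "Date of Birth format"))
        else:
            dob = parse_dob(r["dob"])
            if dob is None or not DOB_MIN <= dob <= DOB_MAX:
                issues.append((n, "Date of Birth invalid or out of range"))
    return issues


if __name__ == "__main__":
    for row_number, rule in check_records(RECORDS):
        print(row_number, rule)
```

Because one rule is omitted and the sketch counts rule breaches rather than suspicious cells, its output will not necessarily match the problem totals quoted earlier in the deck.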
26. Global Data Strategy, Ltd. 2021
Step 3: Deploying Business Rules - Approaches
• Data Quality Tool: DQ Business Rules Engine
• Master & Reference Data Management
• Application Code (e.g. data input validation)
• Data Entry Guidelines, Business Glossary & Training
27. Global Data Strategy, Ltd. 2021
Step 3: Automating Data Quality Business Rules via a DQ Rules Engine
[Diagram: DATA INPUT, SOURCE SYSTEMS, STAGING / ETL LAYER, DATA WAREHOUSE, DATA MARTS and REPORTING LAYER, with a DATA QUALITY RULES ENGINE providing Real Time Data Validation and Batch Validation]
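The idea of one rule set serving both real-time and batch validation can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a description of any commercial DQ tool: the Rule and RulesEngine classes, their method names and the two example rules are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Callable

Record = dict[str, str]


@dataclass(frozen=True)
class Rule:
    """A named business rule with a predicate that returns True on success."""
    name: str
    passes: Callable[[Record], bool]


class RulesEngine:
    """Holds one shared rule set for real-time and batch validation."""

    def __init__(self, rules: list[Rule]) -> None:
        self.rules = list(rules)

    def validate(self, record: Record) -> list[str]:
        """Real-time use: return the names of the rules one record breaks."""
        return [rule.name for rule in self.rules if not rule.passes(record)]

    def validate_batch(self, records: list[Record]):
        """Batch use: split records into accepted and rejected-with-reasons."""
        accepted, rejected = [], []
        for record in records:
            failures = self.validate(record)
            if failures:
                rejected.append((record, failures))
            else:
                accepted.append(record)
        return accepted, rejected


# Two example rules borrowed from the Step 3 slide.
ENGINE = RulesEngine([
    Rule("First Name must not be blank", lambda r: bool(r.get("first_name", "").strip())),
    Rule("Role code must be in format AAnn",
         lambda r: re.fullmatch(r"[A-Z]{2}\d{2}", r.get("role_code", "")) is not None),
])

if __name__ == "__main__":
    # Real-time check at data input.
    print(ENGINE.validate({"first_name": "", "role_code": "9999"}))
    # Batch check, e.g. in a staging layer.
    good, bad = ENGINE.validate_batch([
        {"first_name": "Priya", "role_code": "AL60"},
        {"first_name": "Bridget", "role_code": "TBD"},
    ])
    print(len(good), len(bad))
```

Keeping the rules in one place means a rule amended by the business changes both entry-time and batch behaviour together.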
28. Global Data Strategy, Ltd. 2021
Step 4: Monitor & Report Adherence
• Once implemented, Business Rules can be used to:
• Check continued adherence of existing data
• Enforce the rules on new data to prevent new problems
• Best monitored via Data Quality Dashboards
• Provide regular reports on adherence of data to Business Rules
• Set KPIs to drive continuous data improvement
• Identify data quality trends
• Highlight areas where corrective action is required
• Indicate where / if Business Rules may need to be amended to
meet changing business needs
• When reporting always try to relate data quality to business
outcomes
• Address the ‘so what’ objection
• Puts a financial or other benefit on continued data quality
improvement
Data Quality Dashboard
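The figures behind such a dashboard can be as simple as a pass rate per rule compared with a KPI target. The sketch below is illustrative only: the function names, the example outcomes and the target percentages are invented, and rules without a stated target are assumed to require 100% adherence.

```python
def adherence(results: dict[str, list[bool]]) -> dict[str, float]:
    """Percentage of checked records that pass each rule."""
    return {
        rule: 100.0 * sum(outcomes) / len(outcomes)
        for rule, outcomes in results.items()
        if outcomes  # skip rules with no checked records
    }


def below_target(rates: dict[str, float], targets: dict[str, float]) -> list[str]:
    """Rules whose adherence is under the KPI target (default target 100%)."""
    return sorted(rule for rule, rate in rates.items() if rate < targets.get(rule, 100.0))


if __name__ == "__main__":
    # Invented pass/fail outcomes per rule, one entry per checked record.
    results = {
        "First Name must not be blank": [True, False, True, True],
        "Role code must be in format AAnn": [True, True, True, True],
    }
    # Invented KPI targets in percent.
    targets = {
        "First Name must not be blank": 95.0,
        "Role code must be in format AAnn": 98.0,
    }
    rates = adherence(results)
    print(rates)
    print(below_target(rates, targets))
```

Storing these rates per reporting period would give the trend line, and the rules returned by below_target are the candidates for corrective action.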
29. Global Data Strategy, Ltd. 2021
Summary
• Business Rules are key to uncovering data quality
problems and driving data quality improvement
• Business Rules can be explicit or implicit so have to be
discovered and created in a variety of ways
• Follow the simple 4 Step process outlined to ensure you
optimize the value of Business Rules in your data quality
initiatives
• Remember that Business Rules are not set in stone and
need to be monitored and amended in line with changing
organizational needs and requirements
• With data quality the business always ultimately rules, so
Business Rules provide the means to enable this
30. Global Data Strategy, Ltd. 2021
Who We Are: Business-Focused Data Strategy
Maximize the Organizational Value of Your Data Investment
In today’s business environment, showing rapid time to value for
any technical investment is critical.
But technology and data can be complex. At Global Data Strategy,
we help demystify technical complexity to help you:
• Demonstrate the ROI and business value of data to your
management
• Build a data strategy at your pace to match your unique culture
and organizational style.
• Create an actionable roadmap for “quick wins”, while building
towards a long-term scalable architecture.
Global Data Strategy shares experience from some of the largest
international organizations, scaled to the pace of your unique team.
www.globaldatastrategy.com
Global Data Strategy has worked with organizations globally in the
following industries:
Finance · Retail · Social Services · Health Care · Education · Manufacturing
· Government · Public Utilities · Construction · Media & Entertainment ·
Insurance ... and more