While technological innovation brings constant change to the data landscape, many organizations still struggle with the basics: ensuring they have reliable, high-quality data. In health care, the promise of insight from analytics depends on recording the interactions between providers and patients accurately and completely. While traditional health care data depends on person-to-person contact, new technologies are emerging that change how health care is delivered and how health care data is captured, stored, accessed, and used. Using health care as a lens through which to understand the emergence of big data, this presentation asks the audience to think about data in old and new ways in order to gain insight into how to improve the quality of data, regardless of size.
DAMA Webinar - Big and Little Data Quality
1. LITTLE DATA IN A BIG DATA WORLD
The case of health care
Dataversity / DAMA
February 2016
Laura Sebastian-Coleman,
Ph.D., IQCP
Offered by: Connecticut General Life Insurance Company or Cigna Health and Life Insurance Company.
2. About me
• Doing data quality in health care since 2003
• Have worked in banking, manufacturing, distribution,
commercial insurance, and academia.
• All have influenced my understanding of data, quality,
and measurement
• Developed the Data Quality Assessment Framework
(DQAF); published in Measuring Data Quality for
Ongoing Improvement (2013).
• IAIDQ Distinguished Member Award 2015.
• DAMA Publications Director, beginning summer 2015
• Influences on my thinking about data:
• The challenge of how to measure data quality
• The concept of measurement itself: A problem of
measurement is a microcosm of the general
challenge of data definition and collection.
• The demands of data warehousing, especially of data
integration.
9. Definition: Data
• Data’s Latin root is dare, “to give”; datum is its past participle. Data means “something given.” In
math and engineering, the terms data and givens are used interchangeably.
• The New Oxford American Dictionary (NOAD) defines data as “facts and statistics
collected together for reference or analysis.”
• ISO defines data as “re-interpretable representation of information in a formalized
manner suitable for communication, interpretation, or processing” (ISO 11179).
• Observations about the concept of data
– Data tries to tell the truth about the world (“facts”)
– Data is formal – it has a shape
– Data’s function is representational
– Data is often about quantities, measurements, and other numeric
representations (“facts”)
– Things are done with data: reference, analysis, interpretation, processing
• What the definitions leave out:
– Data is made by people. We choose what characteristics to represent. The creation
of data implies a set of expectations about data’s condition.
– People also use data. The uses of data imply a set of expectations about data’s
condition.
12. Data Quality
• Our ideas about data quality
come largely from science,
even though we create data
based on commerce.
• Today, we are using
organizational data in
scientific ways – to learn
about our business.
• We expect the data to be fit
for this purpose, but we
have not focused on
ensuring representational
effectiveness.
Figure labels: Fitness for purpose; Representational effectiveness
14. Butterfly effect in the clinical space
A) Decision features
a. Framing (e.g., gains vs. losses) (2 factors)
b. Order of choices (e.g., A → B vs. B → A in a simple two-choice decision) (2 factors)
c. Choice justification (e.g., effect of regret, guilt, etc. on dissonance reduction; yes vs. no) (2 factors)
B) Situational factors
a. Time pressure (e.g., yes vs. no) (2 factors)
b. Cognitive load (e.g., high vs. low) (2 factors)
c. Social context (e.g., important vs. not important) (2 factors)
C) Characteristics of decision-maker
a. Individual [e.g., age (old vs. young), gender (female vs. male)] (4 factors)
b. Group (e.g., small vs. large group) (2 factors)
c. Cultural factors (e.g., present vs. not present/important) (2 factors)
D) Individual differences
a. Decision styles (e.g., intuitive vs. analytic) (2 factors)
b. Cognitive ability (e.g., high vs. low) (2 factors)
c. Personality (e.g., openness, conscientiousness, extraversion, agreeableness, neuroticism) (“Big 5” factors)
Table 1. Minimum number of the factors affecting decision-making
From Effect of Initial Conditions on Reproducibility of Scientific
Research, by Benjamin Djulbegovic and Iztok Hozo
• Small changes in the initial
conditions of an experiment can
have significant effects on the
outcome of replication attempts.
• Researchers used Doctor/Patient
interactions to study the butterfly
effect and identified 12 factors
that influence clinical decision
making.
• The levels of those factors multiply
out to 20,480 combinations, each
a possible set of initial conditions
for the experiment.
• Yes, 20,480! Initial conditions can
influence clinical decision making
and the data that is recorded as
part of it.
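
The 20,480 figure can be checked by multiplying the number of levels listed for each row of Table 1. The sketch below is only an illustration of that arithmetic: the dictionary keys are shorthand labels invented here, and treating the “Big 5” personality row as five levels is an assumption that happens to reproduce the total quoted on the slide.

```python
from math import prod

# Levels per factor, read off Table 1. Key names are illustrative shorthand,
# not terms taken from the cited paper.
LEVELS = {
    "framing": 2,
    "order_of_choices": 2,
    "choice_justification": 2,
    "time_pressure": 2,
    "cognitive_load": 2,
    "social_context": 2,
    "individual_age_by_gender": 4,  # age (2) x gender (2)
    "group": 2,
    "cultural_factors": 2,
    "decision_styles": 2,
    "cognitive_ability": 2,
    "personality_big5": 5,  # assumption: one level per Big 5 trait
}


def count_initial_conditions(levels):
    """Return the number of distinct combinations of factor levels."""
    return prod(levels.values())


if __name__ == "__main__":
    print(len(LEVELS), "factors")
    print(count_initial_conditions(LEVELS), "combinations")
```

Ten two-level factors, one four-level factor, and one five-level factor give 2^10 × 4 × 5 = 20,480, which matches the slide.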