1. Use of Data for Quality and Program Improvement
Hugh Sturrock
Aimee Leidich
2. OUTLINE
• Introduction to data quality
• Basics of data visualization
• Introduction to pivot tables
• Example data exercise
• Real-world data exploration
4. Why good data is important
Facility level
• Serves as the basis for planning and developing interventions
• Allows providers to identify patients/clients in need of services and/or referrals
• Improves efficiency through administrative organization
• Inventories resources and determines which supplies and medicines are available, which need to be ordered, and when
• Monitors and evaluates quality of care
Region/district level
• Informs acquisition and distribution of resources
• Provides evidence for construction and/or expansion of
facilities
• Explains human resource capabilities and challenges
• Assists with more precise budgeting
• Assists council authorities in planning interventions and
monitoring those activities
• Demonstrates trends in calculated indicators used to
estimate future changes
National level
• Informs policy
• Assists in planning and assessing
various interventions to make
strategic decisions about the
improvement of those
interventions
• Works towards meeting the
overall national goal of reducing
the burden of poor health
• Provides evidence towards
meeting targets
• Provides the basis for M&E
6. Key Terms
• Data
• Indicator
• Quality Data
• Quality Control
• Data Quality Checks
• Data Quality Assessment
7. Quality Data
• Data that is reliable and accurately represents
the measure it was intended to present and is
valid for the use to which it is applied.
Decision makers have confidence in and rely
upon quality data.
8. Quality Control
• Process of controlling the usage of data with
known quality measurement for an
application or a process.
9. Data Quality Assessment
Procedure for determining whether or not a
data set is suitable for its intended purpose.
10. Data Quality Checks
• Procedures for verifying that forms, registers
and databases are completely and correctly
filled at each step of the reporting process.
– Examples:
• Spot-checks
• Cross-verifications
11. Spot-checks of actual service delivery tools
Perform spot checks to verify the complete and accurate
documentation of delivery of services or commodities.
Example register:
Test date    Unique ID No.  Patient clinic ID  Name (surname / given name)  Sex  Age  Result
07/01/2007   KS0031         1852               Michelle                     f    44   Pos
07/02/2007   KS0014         1824               Mary                         m    31   (blank)
(blank)      KS0088         1864               Andrew                       m    26   (blank)
14/07/2007   KS0013         1754               Charles                      m    71   Neg
Errors flagged on the slide: missing date, incorrect gender entry, missing data.
12. Cross-check with other data-sources
Cross-check the verified report totals with
other data-sources (e.g. inventory records,
laboratory reports, aggregated reports etc).
Example:
Quarterly Report: Facility 1 = 25, Facility 2 = 20, TOTAL = 45
Facility records: Facility 1: 25 cases; Facility 2: 20 cases
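The cross-check above can be sketched in a few lines of Python. This is only an illustration, assuming the quarterly report and the facility records are available as plain dictionaries; the function name `cross_check` and the data layout are hypothetical and not part of any particular reporting system.

```python
def cross_check(report, source_counts):
    """Compare reported facility totals with counts from another data source.

    Returns a list of human-readable discrepancies (empty if everything agrees).
    """
    problems = []
    for facility, reported in report["facilities"].items():
        counted = source_counts.get(facility)
        if counted is None:
            problems.append(f"{facility}: no source record")
        elif counted != reported:
            problems.append(f"{facility}: report says {reported}, source says {counted}")
    # The report's own TOTAL line must equal the sum of its facility lines.
    expected_total = sum(report["facilities"].values())
    if report["total"] != expected_total:
        problems.append(f"TOTAL: report says {report['total']}, facilities sum to {expected_total}")
    return problems


# Figures from the slide: both sources agree, so no discrepancies are found.
quarterly_report = {"facilities": {"Facility 1": 25, "Facility 2": 20}, "total": 45}
facility_records = {"Facility 1": 25, "Facility 2": 20}
print(cross_check(quarterly_report, facility_records))  # prints []
```

An empty list means the report matches the second source; any entry in the list points to a facility or total that needs follow-up.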
14. Accuracy
• Also known as validity. Accurate data are
considered correct when the data measure
what they are intended to measure. Accurate
data minimize error (e.g., recording or
interviewer bias, transcription error, sampling
error) to a point of being negligible.
15. Precision
• Data have sufficient detail meaning they have
all the parameters and details needed to
produce the required information.
16. Completeness
• All variables in either reporting or recording
tools must be filled. It represents the
complete list of eligible persons or units and
not just a fraction of the list.
17. Timeliness
• Data are up-to-date (current) and information
is available on time. This implies all the
reports produced are submitted to the next
level within the recommended timeframe.
Due May 7th
18. Reliability
• The data generated by a program’s
information system are based on protocols
and procedures that do not change according
to who is using them and when or how often
they are used. The data are reliable because
they are measured and collected consistently.
19. Integrity
• Data have integrity when the system used to
generate them is protected from deliberate
bias or manipulation for political or personal
reasons.
20. Confidentiality
• Clients are assured that their data will be
maintained according to national and/or
international standards for data. This means
that personal data are not disclosed
inappropriately and that data in hard copy and
electronic form are treated with appropriate
levels of security (e.g. kept in locked cabinets
and in password protected files).
21. Factors that contribute to poor data
quality
• Data entry errors
• Inconsistent reporting forms
• Missing data
• Delayed reporting
• Failure to report
22. Common Sources of Errors
• Transposition
• Copying
• Coding
• Routing
• Consistency
• Range
• Gaps
• Calculation
23. Transposition Error
When two digits are switched, usually because of a typing mistake (e.g. 12 is entered as 21).
Example:
Indicator                   Result
Number of Pregnant Women    21 (true value: 12)
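A transposition can be tested for when the source register is available. The sketch below is illustrative only: the function name `is_adjacent_transposition` is hypothetical, and it assumes both values are non-negative whole numbers. It also relies on a general arithmetic fact: swapping two digits always changes a number by a multiple of 9, which makes such errors easy to suspect when a total is off by 9, 18, 27, and so on.

```python
def is_adjacent_transposition(recorded, entered):
    """Return True if `entered` is `recorded` with two adjacent digits swapped."""
    a, b = str(recorded), str(entered)
    if len(a) != len(b) or a == b:
        return False
    # Positions where the two digit strings differ.
    diffs = [i for i in range(len(a)) if a[i] != b[i]]
    return (
        len(diffs) == 2
        and diffs[1] == diffs[0] + 1
        and a[diffs[0]] == b[diffs[1]]
        and a[diffs[1]] == b[diffs[0]]
    )


print(is_adjacent_transposition(12, 21))  # True: the example from the slide
print((21 - 12) % 9 == 0)                 # True: the difference is a multiple of 9
```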
24. Copying Error
When a number or letter is copied as the wrong number or letter (e.g. the number 0 entered as the letter O).
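A simple data-entry check can catch this kind of copying error in fields that should contain only digits. The sketch below is a minimal example; the `LOOKALIKES` mapping and the function name `check_numeric_field` are assumptions made for illustration, and any suggested repair should still be confirmed against the paper record.

```python
# Assumed mapping of letters that are commonly mistaken for digits.
LOOKALIKES = {"O": "0", "o": "0", "l": "1", "I": "1"}


def check_numeric_field(value):
    """Check a field that should contain only digits.

    Returns (is_valid, suggestion). `suggestion` is the cleaned value when
    the field is valid or can be repaired by replacing look-alike letters,
    otherwise None.
    """
    text = value.strip()
    if text.isascii() and text.isdigit():
        return True, text
    repaired = "".join(LOOKALIKES.get(ch, ch) for ch in text)
    if repaired.isascii() and repaired.isdigit():
        return False, repaired
    return False, None


print(check_numeric_field("1O4"))  # (False, '104'): letter O typed instead of 0
print(check_numeric_field("104"))  # (True, '104')
```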
25. Coding Error
When the wrong code is entered (e.g. the interview subject circled 1 = Yes, but the coder copied 2 (= No) during coding).

Questionnaire (translated from Swahili):
S/No.  Question                                      Response categories (code)
101    How old are you? (completed years)            Years _____________
102    Up to what grade have you studied?            No schooling = 0
                                                     Did not complete primary education = 1
                                                     Completed primary education = 2
                                                     Did not complete secondary education = 3
                                                     Completed secondary education = 4
                                                     Higher education (college, university, etc.) = 5
                                                     No response = 98

Dataset:
Study ID  SNo101  SNo102
1         54      3
2         30      1
3         22      2
4         43      3
5         33      2
6         30      2
11        37      3

Error: a response entered as 4 during the interview was coded as 3 in the dataset.
26. Routing
When a value is placed in the wrong field or in the wrong order (e.g. gender entered into the age category).

Registration and Personal Information
Unique CTC ID Number  Why eligible (Transfer in)  Name            Sex  Age/DOB (under-5)
211852                2                           Michelle Bamba  F    44
331824                2                           Mary Musa            F
121864                2                           Andrew Matua    M    26

Error: gender erroneously entered into the age category.
27. Consistency
When two or more responses on the same questionnaire are contradictory (e.g. birth date and age; name and gender).

Unique CTC ID Number  Why eligible (Transfer in)  Name            Sex  Age/DOB (under-5)
211852                2                           Michelle Bamba  F    44
331824                2                           Mary Musa       M    34
121864                2                           Andrew Matua    M    26

Error: Mary erroneously entered as a male.
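The "birth date and age" example of a consistency check can be automated. The following sketch is illustrative: the function names and the dates in the usage lines are hypothetical, and it assumes age is recorded in completed years on the visit date.

```python
from datetime import date


def age_on(birth_date, on_date):
    """Completed years of age on `on_date`."""
    years = on_date.year - birth_date.year
    # Subtract one year if the birthday has not yet occurred in `on_date`'s year.
    if (on_date.month, on_date.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


def is_consistent(birth_date, recorded_age, visit_date):
    """True when the recorded age matches the age implied by the birth date."""
    return age_on(birth_date, visit_date) == recorded_age


# Hypothetical record: born 15 March 1973, seen on 14 July 2007.
print(is_consistent(date(1973, 3, 15), 34, date(2007, 7, 14)))  # True
print(is_consistent(date(1973, 3, 15), 44, date(2007, 7, 14)))  # False: contradictory
```

Checks such as name against gender are harder to automate reliably and usually need a manual review of the source register.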
28. Range
When a number lies outside the range of probable or possible values (e.g. Age = 151 yrs).

Unique CTC ID Number  Why eligible (Transfer in)  Name            Sex  Age/DOB (under-5)  Weight (kg)
211852                2                           Michelle Bamba  F    44                 600
331824                2                           Mary Musa       M    34                 42
121864                2                           Andrew Matua    M    26                 41

Error: weight erroneously entered as 600 kg.
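Range checks are straightforward to apply at data entry. In the sketch below, the limits in `PLAUSIBLE_RANGES` and the field names are assumptions chosen for illustration; a program would set its own limits for each variable.

```python
# Assumed plausible limits; each program should define its own.
PLAUSIBLE_RANGES = {"age_years": (0, 120), "weight_kg": (1, 300)}


def out_of_range(record):
    """Return the names of fields whose values fall outside the plausible range."""
    flagged = []
    for field, (low, high) in PLAUSIBLE_RANGES.items():
        value = record.get(field)
        # Missing values are a "gap", not a range error, so they are skipped here.
        if value is not None and not (low <= value <= high):
            flagged.append(field)
    return flagged


print(out_of_range({"age_years": 44, "weight_kg": 600}))  # ['weight_kg']
print(out_of_range({"age_years": 151, "weight_kg": 42}))  # ['age_years']
```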
29. Gaps
When data are not filled in.

Registration and Personal Information
Unique CTC ID Number  Why eligible (Transfer in)  Name            Sex  Age/DOB (under-5)
(blank)               2                           Michelle Bamba  F    44
(blank)               2                           Mary Musa       M    34
(blank)               2                           Andrew Matua    M    26

Error: unique ID is missing.
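Gaps can be listed automatically before a register or database is submitted. This is a minimal sketch; the field names in `REQUIRED_FIELDS` are hypothetical stand-ins for the columns of the register shown above.

```python
# Hypothetical field names standing in for the register's columns.
REQUIRED_FIELDS = ("ctc_id", "name", "sex", "age")


def is_blank(value):
    """A value counts as a gap if it is None or an empty/whitespace-only string."""
    return value is None or (isinstance(value, str) and value.strip() == "")


def missing_fields(record):
    """Return the required fields that are not filled in for one record."""
    return [field for field in REQUIRED_FIELDS if is_blank(record.get(field))]


record = {"ctc_id": "", "name": "Michelle Bamba", "sex": "F", "age": 44}
print(missing_fields(record))  # ['ctc_id']
```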
30. Calculation
When data are not calculated correctly (e.g. 3 + 1 = 5).

Indicator 1.1: Cumulative number of persons ever enrolled in care at this facility at beginning of the reporting quarter
           Total  <1 year  1-4 years  5-14 years  ≥15 years
Males      110    3        2          8           97
Females    230    5        7          17          201
TOTAL (Males + Females) as reported: 350
Correct total: 340 = 110 + 230

Error: total males and females added erroneously.
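The arithmetic on this slide can be verified mechanically. The sketch below recomputes each subtotal and the grand total from the figures shown; the dictionary layout and the function name `check_totals` are assumptions for illustration only.

```python
def check_totals(row):
    """Recompute subtotals and the grand total; return a list of calculation errors."""
    errors = []
    for sex in ("males", "females"):
        expected = sum(row[sex]["by_age"])
        if row[sex]["total"] != expected:
            errors.append(f"{sex} total is {row[sex]['total']}, expected {expected}")
    expected_total = row["males"]["total"] + row["females"]["total"]
    if row["total"] != expected_total:
        errors.append(f"TOTAL is {row['total']}, expected {expected_total}")
    return errors


# Indicator 1.1 as reported on the slide (age bands: <1, 1-4, 5-14, >=15 years).
indicator_1_1 = {
    "total": 350,
    "males": {"total": 110, "by_age": [3, 2, 8, 97]},
    "females": {"total": 230, "by_age": [5, 7, 17, 201]},
}
print(check_totals(indicator_1_1))  # ['TOTAL is 350, expected 340']
```

Both sex subtotals agree with their age bands (3 + 2 + 8 + 97 = 110 and 5 + 7 + 17 + 201 = 230), so only the grand total is flagged.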
32. Why Do We Spend So Much Time and Energy Collecting All This Data?!
• Strengthen M&E programs
• Use evidence for decision making
• Strengthen capacity of staff
• Improve program planning and resource allocation
• Gain efficiency and effectiveness
• Improve data quality
33. Data Is At The Center of M&E
• Improve coverage, reach, intensity of services
• Improve quality of data
• Priority setting and resource allocation
• Accountability
But… only if we review, discuss, interpret, and use it regularly!
34. Use Data To Guide Resource Allocation
• A program needs adequate resources and staff in order to achieve its goals.
• Presenting high-quality program data can help program managers to advocate for additional resources.
– “Our malaria surveillance data suggest we need more vehicles!”
– “Our malaria surveillance data suggest we need more trained nurses!”
– “Our RDT data suggest we need faster allocation of RDTs to avoid stockouts.”
35. Data Use for Decision Making
• No one “gold standard” approach
• Hybrid of approaches depending on the
context
– Dissemination in all appropriate forums
– Motivate/incentivise efforts in data use
– Reduce institutional and behavioural barriers to
data use (e.g. accountability and performance
measurement; attitudes)
37. Key Definitions
• Results: Simple description/observations of your results
(who, what, where, when, magnitude, trend).
• Interpretation: Explanation of why your results may have
occurred.
• Conclusion: the key message of your results, implications
and the “action-plan” that you recommend based on your
results.
– The “Take Away”
Example:
Result: Nine elephants damaged storefronts on Market St in San Francisco in 2010; one elephant damaged a store in 2013.
Interpretation: The number of elephants on Market St in San Francisco has decreased since 2010 because a zookeeper has started laying a trail of peanuts to Ocean Beach.
Conclusion: Citizens should be sensitized to encourage elephants to play at the beach instead of on Market St.
39. Presenting Data In Tables
• Tables may be the only presentation format needed when the
data are few, relationships are straightforward and when
display of exact values is important.
Table X. PEPFAR annual progress reporting, PMTCT indicators, FY12-13, Namibia
Indicator                                                                          Estimate
Number of pregnant women who are tested or know their HIV status at ANC and L&D    62,142
Number of pregnant women with known positive status at entry to ANC or L&D         7,546
Number of pregnant women newly tested positive                                     4,251
Source: PEPFAR Annual Progress Report, Namibia 2013
40. Bar Charts Are Useful to Show Simple Comparisons, Esp. Differences in Quantity.
Fig. 7. Partner HIV testing among pregnant women attending ANC, Country X, 2009-10 to 2011-12.
Y-axis: # of women or partners (0 to 80,000). X-axis: Year.
Year       Pregnant women attending ANC    Partner tested for HIV
2009-10    55,097                          2,659 (4.8%)
2010-11    57,219                          2,490 (4.4%)
2011-12    70,025                          2,546 (3.6%)
41. Line Charts Are Good for Showing Change Over Time (Trend)
Fig. 8. Percentage of patients alive on ART at 12 months after initiation in Country X, by initiation cohort year.
Y-axis: % alive on ART (50% to 100%). X-axis: Initiation cohort year (2003 to 2012).
Data labels (2003 to 2012): 77%, 87%, 91%, 92%, 91%, 88%, 88%, 88%, 87%, 82%.
42. Bar and Line Charts Can Be Used Together to Show Trends Of Several Related Indicators
Fig. 9. Estimated MTCT rate at 6 weeks and MTCT rate at 6 weeks including breastfeeding, Country X, 2005-2012
Left Y-axis: # infants exposed (0 to 25,000). Right Y-axis: % infants infected (0 to 35). X-axis: Year (2005 to 2012).
Series: Number infants exposed; MTCT rate (excluding breastfeeding infants); MTCT rate including breastfeeding infants.
43. Maps show geographic relationships
Map legend: Est. no. HIV+ per sq km
44. Figure title
• Be sure to include:
– What (the indicator): HIV prevalence; % circumcised; % alive on ART
– Who: pregnant women age 15-49; adult males age 15-49; pediatric ART patients
– Where: in Namibia; in Ohangwena region; at Engela Hospital Clinic
– When: in 2012; from 2008 to 2012
45. Fig. 11. Distribution of ARV prophylaxes used for PMTCT among HIV positive pregnant women attending antenatal care in Namibia, 2009-10 to 2011-12.
Callouts on the figure title: What? Who? Where? When?
Y-axis: % distribution of ARV type (0% to 70%). X-axis: 2009-10, 2010-11, 2011-12.
Series: Single-dose NVP; Combination ARV; HAART.
Source: Namibia MOHSS (2012) Annual Implementation Progress Report for the National Strategic Framework (NSF) 2011/12.
46. Presenting Data Tips (2)
• All relevant information needed to interpret the table,
figure, or map should be included so that the reader can
understand without reference to text (i.e. in a report)
• Clearly label your X and Y axes, format consistently (font,
font size, style, position)
• Use data series legends/labels
• Make the scale appropriate for the findings you want to
convey.
• Reference the source of your data
47. Fig. 12. Distribution of ARV prophylaxes used for PMTCT among HIV positive pregnant women attending antenatal care in Namibia, 2009-10 to 2011-12.
Y-axis: % distribution of ARV type (0% to 100%). X-axis: Reporting period (2009-10, 2010-11, 2011-12).
Series: Single-dose NVP; Combination ARV; HAART.
Source: Namibia MOHSS (2012) Annual Implementation Progress Report for the National Strategic Framework (NSF) 2011/12.
Elements called out on the figure:
• Clear chart title
• X-axis label
• Y-axis label
• Series legend
• Data source reference
• Scale spans to 100% to display complete picture
48. Stratification of Data
• What is stratification?
– Dividing into subgroups
• What are common levels of data stratification?
– Year, age, sex, geographic region, facility
• Why do we stratify?
– Let’s look at stratification within the indicator:
• % of patients alive on ART 12 months after
initiation
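Stratification of this indicator can be sketched as a grouped calculation. The example below is illustrative only: the patient records are hypothetical, invented to show the mechanics, and the field names (`age_group`, `alive_12m`) and the function `percent_alive` are assumptions rather than part of any real patient monitoring system.

```python
from collections import defaultdict


def percent_alive(records, by=None):
    """Percentage of patients alive at 12 months, overall or per subgroup.

    `by` is the name of the field to stratify on (e.g. "age_group");
    when omitted, a single overall figure is returned under the key "all".
    """
    counts = defaultdict(lambda: [0, 0])  # subgroup -> [alive, total]
    for rec in records:
        key = rec[by] if by else "all"
        counts[key][0] += 1 if rec["alive_12m"] else 0
        counts[key][1] += 1
    return {k: round(100 * alive / total, 1) for k, (alive, total) in counts.items()}


# Hypothetical records, for illustration only.
records = [
    {"age_group": "adult", "alive_12m": True},
    {"age_group": "adult", "alive_12m": True},
    {"age_group": "adult", "alive_12m": True},
    {"age_group": "adult", "alive_12m": False},
    {"age_group": "child", "alive_12m": True},
    {"age_group": "child", "alive_12m": False},
]
print(percent_alive(records))                  # {'all': 66.7}
print(percent_alive(records, by="age_group"))  # {'adult': 75.0, 'child': 50.0}
```

In this made-up example the overall figure hides a gap between the two subgroups, which is the reason for stratifying.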
49. What Do You Think About This Figure?
Fig. 13. Percentage of patients alive on ART at 12 months after ART initiation.
Y-axis: 0% to 100%.
50. We Can Stratify By Time, e.g. Initiation Cohort…
Fig. 14. Percentage of patients alive on ART at 12 months after initiation in Country X, by initiation cohort year.
Y-axis: % alive on ART (50% to 100%). X-axis: Initiation cohort year (2003 to 2012).
Data labels (2003 to 2012): 77%, 87%, 91%, 92%, 91%, 88%, 88%, 88%, 87%, 82%.
51. We Can Stratify by Age Group
Fig. 15. Percentage of patients alive on ART at 12 months after initiation in Country X, by cohort year and adult vs. pediatric patients.
Y-axis: % alive on ART (50% to 100%). X-axis: Initiation cohort year (2003 to 2012).
Series: Adults; Children.
52. We Can Stratify By Geographic Area
Fig. 19. Percentage of adult patients alive on ART at 12 months after initiation by cohort year and selected districts in Country X.
Y-axis: % alive on ART (50% to 100%). X-axis: Initiation cohort year (2004 to 2012).
Series: District A; District B; District C.
53. We Can Stratify By Facilities Within Geographic Areas
Fig. 20. Percentage of adult patients alive on ART at 12 months after initiation by selected facilities within District Q in Country X.
Y-axis: % alive on ART (50% to 100%). X-axis: Initiation cohort year (2009 to 2012).
Series: Q: Health Centre 1; Q: Health Centre 2; Q: District Hospital; District Q overall.
Data labels shown: 0.87, 0.91, 0.89, 0.81.
54. We Can Stratify By Sex and Geography…
Three indicators for HIV testing by sex and province, Zambia, 2007
Map panels: Females; Males.
Source: DHS 2007
56. Magnitude and Trend (1)
• Magnitude:
– The amount of coverage
– The size of the difference between sub-groups or time points
• Trend:
– The direction of change over time (i.e. increasing, decreasing, or remaining stable)
57. Magnitude and Trend Statements (2)
“From 1992 to 2002, HIV prevalence among pregnant women increased (trend) from 4.2% to 22% (magnitude). After peaking at 22% in 2002 (magnitude), HIV prevalence has remained fairly stable from 2004-2012 (trend) at around 18-20% (magnitude).”
Fig. 23. HIV prevalence among pregnant women receiving antenatal care at public facilities in Country X, 1992-2012
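The trend and magnitude in the first sentence of the statement above can be derived from two data points. The sketch below is a simplified illustration: the function name `describe_change` is hypothetical, and the tolerance of 1 percentage point used to call a change "stable" is an assumption, not a standard threshold.

```python
def describe_change(start, end, tolerance=1.0):
    """Return (trend, magnitude) for two percentages.

    The magnitude is the absolute change in percentage points. Changes no
    larger than `tolerance` (an assumed cut-off) are reported as "stable".
    """
    diff = round(end - start, 1)
    if abs(diff) <= tolerance:
        trend = "stable"
    elif diff > 0:
        trend = "increasing"
    else:
        trend = "decreasing"
    return trend, abs(diff)


# HIV prevalence among pregnant women, 1992 (4.2%) to 2002 (22%).
print(describe_change(4.2, 22.0))  # ('increasing', 17.8)
```

Describing a longer series, such as the "fairly stable" period from 2004 to 2012, still calls for judgment about how much year-to-year variation to treat as noise.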
58. Interpretation of Results
• Descriptive results are what you see,
interpretation is how you see it.
• Why do you think your results are what they are?
What are 1-2 possible programmatic
explanations:
– Programmatic/guidelines changes? (e.g. CD4 ART eligibility,
Option B+)
– Increased/decreased access to services at facilities within
district/region?
– Staff reductions? Staff trained in new areas (e.g. IMAI)?
– Are data missing from some time points, facilities, sub-groups?
– Are there facilities or districts that are not reporting,
underreporting for this time period, or reporting data
differently?
59. Interpretation Statement (3)
“Retention in District A is declining much more rapidly than the national average. These declines may be related to the higher-than-average loss of ART doctors within this district, which may have affected access and quality of care. Alternatively, the observed trend in District A may be a result of incomplete data reported in the ePMS.”
Fig. 28. Percentage of adult patients alive on ART at 12 months after initiation by cohort year and selected districts in Country X.
Y-axis: % alive on ART (50% to 100%). X-axis: Initiation cohort year (2004 to 2012).
Series: District A; District B; District C.
61. Drawing Conclusions (1)
• Conclusions are the “take-away” message, i.e. what you want
your audience to remember and do after the presentation.
• Especially relating to programmatic implications of results.
• Conclusion can include the presenter’s recommendations for:
• Program improvement
• Additional data verification/quality checks
62. Conclusion Statement (2)
“Patient- and facility-level factors predictive of patient loss that are unique to District A should be identified and corrected. Best practices from higher performing districts should be shared. Failure to do so may result in increased AIDS mortality and drug resistance in this district. The completeness of data from this district should also be confirmed to validate our results.”
Fig. 31. Percentage of adult patients alive on ART at 12 months after initiation by cohort year and selected districts in Country X.
Y-axis: % alive on ART (50% to 100%). X-axis: Initiation cohort year (2004 to 2012).
Series: District A; District B; District C.