Quality issues…Evaluation…Metadata
Resolution-Levels of details
Antti Jakobsson
Key Users Meeting 15th Oct 2010, Brussels
Key successes of WP8
• Utilization of international and open standards
• Common understanding of what quality means with respect to the target specifications and user requirements, and of how to measure it
• Provision of these results in metadata
• Automation of the quality evaluation services
Benefits
Data providers
• Early data error detection;
• Faster product turnaround;
• Reduced maintenance costs;
• Consistent evaluation procedures.
Data consumers
• Better harmonisation;
• Improved spatial analysis;
• Confident decision making;
• Data that is trusted and usable.
ESDIN approach
ESDIN approach to quality
Quality Spreadsheets
GEOGRAPHICAL NAMES 0
Data quality elements and sub-elements (spreadsheet columns):
• Completeness: commission, omission
• Logical consistency: conceptual consistency, domain consistency, format consistency, topological consistency
• Positional accuracy: absolute accuracy, relative accuracy, gridded data position accuracy
• Temporal accuracy: accuracy of a time measurement, temporal consistency, temporal validity
• Thematic accuracy: classification correctness, non-quantitative attribute correctness, quantitative attribute accuracy
Feature type & attributes (spreadsheet rows):
• NamedPlace: DQ basic measure error rate (Id 7); DQ basic measure error count (Id 10)
• inspireId: DQ basic measure error count (Id 16)
• name (Geographical Name)
Sampling/Full Inspection
The cells of DQ basic measures are colour coded. The colours indicate the evaluation procedure:
• Attribute inspection by sampling according to the ISO 2859 series (yellow cell)
• Variable inspection by sampling according to ISO 3951-1 (green cell)
• Full inspection (orange cell)
FEATURES AND ATTRIBUTES
SAMPLING (ISO 2859): ISO 2859 states the principles of testing sufficient items of the whole population by sampling. When expressed as two integers, the error ratios of data subsets can be summed up to a data set error rate by dividing the total number of errors by the total c…
FULL INSPECTION (automatic): If errors exist (error count > 0), the subset should be rejected and corrective action by the producer is needed. It is assumed that the number of errors found is quite small. The customer may be tempted to make those few corrections themselves. This i…
SAMPLING (ISO 3951): ISO 3951 variable sampling gives reliable results on small sample sizes. CE95/LE95 is close enough to the upper limit (U) of the standard on AQL 4 level. ISO 3951 offers clear acceptance criteria based on the sample.
Cell legend: Mandatory, Voidable, Optional (according to INSPIRE Data Specifications v3)
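The subset arithmetic described on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the slide text is truncated, so the denominator is assumed to be the total number of inspected items, and the names SubsetResult, dataset_error_rate and full_inspection_accepts are hypothetical, not part of any ESDIN tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubsetResult:
    """Inspection result of one data subset, kept as two integers."""
    name: str
    errors: int  # number of errors found
    items: int   # number of items inspected (assumed denominator)


def dataset_error_rate(subsets):
    """Sum the integer pairs of all subsets into one (errors, items) pair."""
    total_errors = sum(s.errors for s in subsets)
    total_items = sum(s.items for s in subsets)
    if total_items == 0:
        raise ValueError("no items were inspected")
    return total_errors, total_items


def full_inspection_accepts(result):
    """Full inspection: any error (error count > 0) rejects the subset."""
    return result.errors == 0


# Hypothetical example values
subsets = [SubsetResult("north", 2, 200), SubsetResult("south", 0, 300)]
errors, items = dataset_error_rate(subsets)
rate = errors / items  # 2 errors in 500 items
```

Keeping the two integers separate until the last step is what allows the subset ratios to be combined; averaging the per-subset rates would weight small subsets too heavily.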
Relevant data quality measures
Relevant ISO/TS 19138 data quality measures
Fields: Name; Alias; Data quality element; Data quality subelement; Data quality basic measure
• Rate of excess items; -; completeness; commission; error rate
• Rate of missing items; -; completeness; omission; error rate
• Number of items not compliant with the rules of the conceptual schema; -; logical consistency; conceptual consistency; error count
• Number of invalid overlaps of surfaces; overlapping surfaces; logical consistency; conceptual consistency; error count
• Number of items not in conformance with their value domain; -; logical consistency; domain consistency; error count
• Physical structure conflicts; -; logical consistency; format consistency; error count
• Number of faulty point-curve connections; extraneous nodes; logical consistency; topological consistency; error count
• Number of missing connections due to undershoots; undershoots; logical consistency; topological consistency; error count
• Number of missing connections due to overshoots; overshoots; logical consistency; topological consistency; error count
• Number of invalid slivers; slivers; logical consistency; topological consistency; error count
• Number of invalid self-intersect errors; loops; logical consistency; topological consistency; error count
• Number of invalid self-overlap errors; kickbacks; logical consistency; topological consistency; error count
• Mean value of positional uncertainties (1D, 2D and 3D); -; positional accuracy; absolute or external accuracy; not applicable
• Linear map accuracy at 95 % significance level; LMAS 95 %; positional accuracy; absolute or external accuracy; LE95 or LE95I, depending on the evaluation procedure
• Circular error at 95 % significance level; navigation accuracy; positional accuracy; absolute or external accuracy; CE95
• Misclassification rate; -; thematic accuracy; classification correctness; error rate
• Rate of incorrect attribute values; -; thematic accuracy; non-quantitative attribute correctness; error rate
• Attribute value uncertainty at 95 % significance level; -; thematic accuracy; quantitative attribute accuracy; LE95 or LE95(r), depending on the evaluation procedure
Testing plans
How to utilize the quality model
• The quality model will be transformed into a rule set and conformance levels
• ELF specifications will include these for the NMCAs
• Automated tools will utilize the rule set and conformance levels
Quality requirements/Conformance
levels
• To set the requirements use the quality measures
• To consider the nature of reality
– Feature vagueness
– Change rates
– Themes
• Suggested guidance for positional accuracy
• Suggestion on setting the classification of conformance levels
Setting conformance levels (examples)
• Geometric accuracy is a critical and mostly well-defined characteristic of cadastral parcels, whereas a geographical name, such as the name of a lake, does not have just one correct location: any location within the area of the lake is acceptable.
• Completeness of a transportation network is important to know and can be explicitly evaluated. Wetlands may be important areas in hydrography, but their existence or delineation can be hard to evaluate during a dry season.
Example logical consistency
Example Thematic Accuracy
Positional accuracy
Quality evaluation Process
• Step 1: Apply the data quality measure to the data to be checked. The procedure for this is described in the ISO 19113/19114 standards
• Step 2: Report the score for each measure in a report form
• Step 3: Compare the result from step 2 to the defined conformance level
• In addition, two further steps can be done:
• Step 4: Summarise the conformance results into one result for each data quality element
• Step 5: Summarise the results from step 4 into one overall dataset result
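The five steps can be sketched as a small pipeline. This is a minimal illustration, not the ESDIN implementation: the names MeasureResult, summarise_by_element and summarise_dataset are hypothetical, a score is assumed to conform when it does not exceed its limit, and an element (or the dataset) is assumed to pass only when everything below it passes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MeasureResult:
    """Steps 1 and 2: the reported score of one data quality measure."""
    element: str   # data quality element, e.g. "completeness"
    measure: str   # data quality measure name
    score: float   # measured value (error rate, error count, ...)
    limit: float   # conformance level; lower scores are better (assumption)

    @property
    def conforms(self):
        """Step 3: compare the score to the defined conformance level."""
        return self.score <= self.limit


def summarise_by_element(results):
    """Step 4: one result per data quality element (all measures must pass)."""
    summary = {}
    for r in results:
        summary[r.element] = summary.get(r.element, True) and r.conforms
    return summary


def summarise_dataset(element_summary):
    """Step 5: one overall dataset result (all elements must pass)."""
    return all(element_summary.values())


# Hypothetical scores and limits
results = [
    MeasureResult("completeness", "rate of missing items", 0.01, 0.02),
    MeasureResult("logical consistency", "number of invalid slivers", 3, 0),
    MeasureResult("logical consistency", "value domain violations", 0, 0),
]
by_element = summarise_by_element(results)
overall = summarise_dataset(by_element)
```

The strict "all must pass" rule is the simplest choice; a graded summary is an alternative when a single pass/fail value hides too much.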
Aggregation of data quality conformance results
• Aggregation where the measurements are on different scales and have different units: transform all the quantitative data quality results into conformance results using a set of conformance levels/classes (see previous slides).
• Aggregation for inhomogeneous data: this can be done by just reporting the lowest quality, found in the most remote areas (see the nature of reality slide). Another way (the one recommended here) is to use different conformance classifications for the different kinds of area (urban, rural, remote), and then summarise based on a “conformance score”. To make this useful, a metadata description is needed to give the distribution between the kinds of area.
• Reporting details: the simplest way of reporting is to give just one value for the dataset. This can be a simple “passed” or “failed” with a reference to the product specification. But doing a lot of work in quality assessment and then reporting just one value can be considered an oversimplification. Giving quality statements as grades may be useful in step 4 and step 5 (see above).
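One possible reading of the recommended aggregation for inhomogeneous data is sketched below. Everything numeric here is an assumption made for illustration: the class limits in metres, the mapping of classes A/B/C to scores 3/2/1, and the use of an area-weighted mean as the "conformance score" are not taken from the slides.

```python
# Hypothetical class limits (metres) for a positional accuracy measure,
# one classification per kind of area.
CLASS_LIMITS = {
    "urban": {"A": 1.0, "B": 2.5},
    "rural": {"A": 2.5, "B": 5.0},
    "remote": {"A": 5.0, "B": 10.0},
}

# Hypothetical numeric score per conformance class.
CLASS_SCORE = {"A": 3, "B": 2, "C": 1}


def classify(area_kind, value):
    """Turn a quantitative result into a conformance class for that area kind."""
    limits = CLASS_LIMITS[area_kind]
    if value <= limits["A"]:
        return "A"
    if value <= limits["B"]:
        return "B"
    return "C"


def conformance_score(measured, area_share):
    """Area-weighted mean of class scores.

    measured:   quantitative result per area kind
    area_share: share of the dataset per area kind (from metadata), sums to 1
    """
    if abs(sum(area_share.values()) - 1.0) > 1e-9:
        raise ValueError("area shares must sum to 1")
    return sum(
        CLASS_SCORE[classify(kind, value)] * area_share[kind]
        for kind, value in measured.items()
    )


# Hypothetical measurements and area distribution
measured = {"urban": 0.8, "rural": 4.0, "remote": 12.0}
shares = {"urban": 0.2, "rural": 0.5, "remote": 0.3}
score = conformance_score(measured, shares)  # classes A, B, C give about 1.9
```

The area shares are the metadata description the slide asks for; without them the score cannot be interpreted.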
Grading data example
Grade: Data quality description
Excellent: Only class A for all quality measures
Very good: A majority of A’s, but also some B’s
Good: A majority of B’s, some A’s, no C’s
Adequate: Only a very few C’s, the others B’s and better
Marginal: A majority of C’s but also some B’s
Not good: No measure reached class B (i.e. all measures in class C)
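The grading table can be turned into a rule of thumb. The sketch below is one interpretation only: the table does not say how few "a very few" is, so few_c_share (default 10 % of the measures) is a hypothetical parameter, and cases the table does not cover (for example only B's, or more C's than "a very few" without a majority) are assigned to the nearest grade as noted in the comments.

```python
from collections import Counter


def grade(classes, few_c_share=0.1):
    """Map the conformance classes of all measures ('A', 'B', 'C') to a grade."""
    if not classes:
        raise ValueError("at least one measure is needed")
    counts = Counter(classes)
    n = len(classes)
    a, b, c = counts["A"], counts["B"], counts["C"]
    if a + b + c != n:
        raise ValueError("classes must be 'A', 'B' or 'C'")
    if c == n:
        return "Not good"    # no measure reached class B
    if c == 0:
        if a == n:
            return "Excellent"   # only class A
        if a > b:
            return "Very good"   # majority of A's, some B's
        return "Good"            # B's at least as many as A's (assumption)
    if c / n <= few_c_share:
        return "Adequate"        # only a very few C's (assumed threshold)
    return "Marginal"            # more than a very few C's (assumption)
```

For example, grade(["A", "A", "B"]) returns "Very good" and grade(["C", "C", "B"]) returns "Marginal".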
ESDIN approach to quality
Where can you utilize quality web services?
• If you are a data provider for an SDI
– For quality control during production (automated), called here conformance testing (this includes edge-matching and generalization)
– For quality evaluation after production (semi-automated)
• If you are the SDI co-ordinator or data custodian
– For quality audit for process accreditation or data certification, doing conformance testing and/or quality evaluation
• If you are a customer or data user
– To evaluate usability using metadata information
Diagram labels: Rulesets & Templates Database; Object Oriented Geospatial Rules Engine; Collaborative Web-based Rule Authoring; Web Services Interface; Data Quality Evaluation Service; Business Rules; Data for Evaluation; Quality Measures; Geospatial Data File; Web Feature Service; Quality Evaluation Service; SOAP; HTTP
• Rule Builder: intuitive user interface to author, agree and manage DQ measures.
• DQ Client Application: accessible, easy to use, automatic Data Quality Evaluation Service.
• DQ Rules Engine: W3C Web Services interface using open standards to describe & execute geospatial rule evaluation.
• Rule Repository: Data Quality Rules, derived from and guided by the Quality Model.
DQ Rule Builder Environment
DQ Evaluation Service Concept
DQ Evaluation Report
ESDIN approach to quality
Metadata approach
• Metadata needed for discovery of datasets through metadata
catalogues and registries
• Metadata needed for the evaluation of those datasets, as to whether
they are of sufficient quality to meet end users’ needs
• Metadata specific to the requirements of the ELF specifications
Are we INSPIRE compliant?
• Yes. We suggest that some of the measures be changed in a future revision of the INSPIRE data specification
• There are some mistakes in the current specification that should be corrected
• We also propose additional measures
ESDIN/INSPIRE difference Admin units
Measures suggested by INSPIRE Data Specification v3, Administrative Units, compared with the ESDIN quality model.
Fields: Section; Data Quality Element; Data Quality sub-element; ISO 19138 measure; Measure name / Basic quality measure; Scope; ESDIN quality model; Comment
• 7.1.1; Completeness; Commission; Id 3; Rate of excess items / error rate; dataset-level; The same as ESDIN
• 7.1.2; Completeness; Omission; Id 7; Rate of missing items / error rate; dataset-level; The same as ESDIN
• 7.2.1.1; Logical Consistency; Topological consistency; Id 21 *; Number of faulty point-curve connections / error count; dataset-level; The same as ESDIN
• 7.2.1.2; Logical Consistency; Topological consistency; Id 23; Number of missing connections due to undershoots / error count; dataset-level; The same as ESDIN
• 7.2.2; Logical Consistency; Conceptual consistency; Id 9; Conceptual schema compliance / correctness indicator; dataset-level; Number of items not compliant with the rules of the conceptual schema / error count; Used Id 10 instead. Id 9 is applicable just on single instance level
• 7.3.1; Positional Accuracy; Absolute External positional accuracy; Id 28; Mean value of positional uncertainties (1D, 2D and 3D) / not applicable; dataset-level; Linear map accuracy at 95 % significance level / LE95 or LE95I; Not used, used 36 instead
* Id 21 in ISO 19138, but has the incorrect id 9 in INSPIRE Data Specification AU
Additional quality measures
Additional ones from ESDIN WP8
Fields: Data Quality Element; Data Quality sub-element; ISO 19138 measure; Measure name / Basic quality measure; Scope; Comment
• Logical Consistency; Conceptual consistency; Id 11; Number of invalid overlaps of surfaces / error count; dataset-level; Topological consistency
• Logical Consistency; Domain consistency; Id 16; Number of items not in conformance with their value domain / error count; dataset-level
• Logical Consistency; Conceptual consistency; Id 10; Number of items not compliant with the rules of the conceptual schema / error count; dataset-level
• Logical Consistency; Format consistency; Id 19; Physical structure conflicts / error count; dataset-level
• Logical Consistency; Topological consistency; Id 25; Number of invalid slivers / error count; dataset-level
• Logical Consistency; Topological consistency; Id 26; Number of invalid self-intersect errors / error count; dataset-level
• Logical Consistency; Topological consistency; Id 27; Number of invalid self-overlap errors / error count; dataset-level
• Positional accuracy; Absolute or external positional accuracy; Id 36; Linear map accuracy at 95 % significance level / LE95 or LE95I; dataset-level
• Thematic accuracy; Classification correctness; Id 61; Misclassification rate / error rate; dataset-level
• Thematic accuracy; Non-quantitative attribute correctness; Id 67; Rate of incorrect attribute values / error rate; dataset-level
Resolution and Level of Details
Target level of detail
Scale: 1:2,500,000; 1:1,000,000; 1:500,000; 1:250,000; 1:100,000; 1:50,000; 1:25,000; 1:10,000; 1:5,000; 1:2,500
Diagram labels: Global, Regional, Master, Target level of detail
Level of details
Diagram labels: Urban, Rural, Mountainous, Target level of detail
Conclusions
• It is important that INSPIRE gives a platform for data quality information: minimum data quality conformance levels are set, with the ability to report other conformance levels related to user communities
• Quality evaluation metadata should be available for automated conformance testing
• Introducing a quality model which uses the same principles for all Annex I themes; we will suggest this as a guideline for INSPIRE implementation
• Introducing conformance levels that can be evaluated using semi-automated or automated procedures based on ISO standards
• Automation of quality evaluation and conformance testing can be done for all transformation-related workflows, including schema transformation, generalization and edge matching
• Significant saving potential in quality reporting and improvement of data

Quality key users

  • 1. Quality issues…Evaluation…Metadata Resolution-Levels of details Antti Jakobsson Key Users Meeting 15th Oct 2010, Brussels
  • 2. Key successes of WP8 • Utilization of international and open standards • Common understanding of what quality means with respect to the target specifications and user requirements, and • How to measure it! • Provision of these results in metadata • Automation of the quality evaluation services
  • 3. Benefits • Data providers: early data error detection; faster product turnaround; reduced maintenance costs; consistent evaluation procedures • Data consumers: better harmonisation; improved spatial analysis; confident decision making; data that is trusted and usable
  • 7. Quality Spreadsheets • Example sheet: GEOGRAPHICAL NAMES • Columns (data quality elements and sub-elements): COMPLETENESS (commission, omission); LOGICAL CONSISTENCY (conceptual consistency, domain consistency, format consistency, topological consistency); POSITIONAL ACCURACY (absolute accuracy, relative accuracy, gridded data position accuracy); TEMPORAL ACCURACY (accuracy of a time measurement, temporal consistency, temporal validity); THEMATIC ACCURACY (classification correctness, non-quantitative attribute correctness, quantitative attribute accuracy) • Rows (feature type and attributes): NamedPlace: DQ basic measure error rate, Id 7; DQ basic measure error count, Id 10 • inspireId: DQ basic measure error count, Id 16 • name (Geographical Name)
  • 8. Sampling/Full Inspection • The cells of the DQ basic measures are colour coded. The colours indicate the evaluation procedure: – Attribute inspection by sampling according to the ISO 2859 series (yellow cell) – Variable inspection by sampling according to ISO 3951-1 (green cell) – Full inspection (orange cell) • SAMPLING (ISO 2859): ISO 2859 states the principles of testing sufficient items of the whole population by sampling. When expressed as two integers, the error ratios of data subsets can be summed up to a data set error rate by dividing the total number of errors by the total c… • FULL INSPECTION (automatic): If errors exist (error count > 0), the subset should be rejected and corrective action by the producer is needed. It is assumed that the number of errors found is quite small. The customer may be tempted to make those few corrections themselves. This i… • SAMPLING (ISO 3951): ISO 3951 variable sampling gives reliable results on small sample sizes. CE95/LE95 is close enough to the upper limit (U) of the standard at AQL 4 level. ISO 3951 offers clear acceptance criteria based on the sample. • Legend for FEATURES AND ATTRIBUTES: Mandatory, Voidable, Optional, according to INSPIRE Data Specifications v3
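The pooling of subset error ratios and the full-inspection rejection rule on this slide can be sketched in a few lines of Python. This is only an illustration: the slide text is cut off after "the total c", so treating the denominator as the total number of inspected items is an assumption, and the function names `pooled_error_rate` and `full_inspection_accepts` are hypothetical.

```python
from fractions import Fraction


def pooled_error_rate(subsets):
    """Combine (error_count, item_count) pairs into one data set error rate.

    Assumption: the denominator is the total number of inspected items;
    the slide text is truncated at this point.
    """
    subsets = list(subsets)
    total_errors = sum(errors for errors, _ in subsets)
    total_items = sum(items for _, items in subsets)
    if total_items == 0:
        raise ValueError("no items were inspected")
    return Fraction(total_errors, total_items)


def full_inspection_accepts(error_count):
    """Full inspection: any error (error count > 0) rejects the subset."""
    return error_count == 0


# Two subsets reported as integer pairs: 2 errors in 100 items, 1 error in 300 items.
rate = pooled_error_rate([(2, 100), (1, 300)])
print(rate, float(rate))
```

Keeping each ratio as two integers, rather than as a rounded percentage, is what makes the subsets poolable without loss.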
  • 9. Relevant data quality measures • Relevant ISO/TS 19138 data quality measures, listed as name (alias, where given): data quality element / data quality subelement / data quality basic measure
      – Rate of excess items: completeness / commission / error rate
      – Rate of missing items: completeness / omission / error rate
      – Number of items not compliant with the rules of the conceptual schema: logical consistency / conceptual consistency / error count
      – Number of invalid overlaps of surfaces (overlapping surfaces): logical consistency / conceptual consistency / error count
      – Number of items not in conformance with their value domain: logical consistency / domain consistency / error count
      – Physical structure conflicts: logical consistency / format consistency / error count
      – Number of faulty point-curve connections (extraneous nodes): logical consistency / topological consistency / error count
      – Number of missing connections due to undershoots (undershoots): logical consistency / topological consistency / error count
      – Number of missing connections due to overshoots (overshoots): logical consistency / topological consistency / error count
      – Number of invalid slivers (slivers): logical consistency / topological consistency / error count
      – Number of invalid self-intersect errors (loops): logical consistency / topological consistency / error count
      – Number of invalid self-overlap errors (kickbacks): logical consistency / topological consistency / error count
      – Mean value of positional uncertainties (1D, 2D and 3D): positional accuracy / absolute or external accuracy / not applicable
      – Linear map accuracy at 95 % significance level (LMAS 95 %): positional accuracy / absolute or external accuracy / LE95 or LE95I, depending on the evaluation
      – Circular error at 95 % significance level (navigation accuracy): positional accuracy / absolute or external accuracy / CE95
      – Misclassification rate: thematic accuracy / classification correctness / error rate
      – Rate of incorrect attribute values: thematic accuracy / non-quantitative attribute correctness / error rate
      – Attribute value uncertainty at 95 % significance level: thematic accuracy / quantitative attribute accuracy / LE95 or LE95(r), depending on the evaluation procedure
  • 11. How to utilize the quality model • Quality model will be transformed to a rule set and conformance levels • ELF specifications will include these for the NMCAs • Automated tools utilizing the rule and conformance levels
  • 12. Quality requirements/Conformance levels • To set the requirements use the quality measures • To consider the nature of reality – Feature vagueness – Change rates – Themes • Suggested guidance for positional accuracy • Suggestion on setting the classification of conformance levels
  • 13. Setting conformance levels (examples) • Geometric accuracy is a critical and mostly well-defined characteristic of cadastral parcels, whereas a geographical name, such as the name of a lake, does not have just one correct location: any location within the area of the lake is acceptable. • Completeness of a transportation network is important to know and can be explicitly evaluated. Wetlands may be important areas in hydrography, but their existence or delineation can be hard to evaluate during a dry season.
  • 17. Quality evaluation process • Step 1: Applying the data quality measure to the data to be checked. The procedure for this is described in the ISO 19113/19114 standards • Step 2: Reporting the score for each measure in a report form • Step 3: Comparing the result from step 2 to the defined conformance level • In addition, two further steps can be done: • Step 4: Summarising the conformance results into one result for each data quality element • Step 5: Summarising the results from step 4 into one overall dataset result
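The five steps above can be sketched as follows. This is a minimal illustration, not the ESDIN implementation: the `MeasureResult` type, the threshold values, and the "an element passes only if all of its measures pass" summarising rule for steps 4 and 5 are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class MeasureResult:
    element: str        # data quality element, e.g. "completeness"
    measure: str        # data quality measure name
    score: float        # steps 1-2: measured and reported score (an error rate here)
    max_allowed: float  # conformance level (hypothetical threshold)

    @property
    def conforms(self) -> bool:
        # Step 3: compare the reported score to the conformance level.
        return self.score <= self.max_allowed


def error_rate(errors: int, items: int) -> float:
    """Step 1: apply an error-rate measure to the checked data."""
    return errors / items


def summarise_by_element(results):
    """Step 4 (assumed rule): an element conforms only if all its measures do."""
    summary = {}
    for r in results:
        summary[r.element] = summary.get(r.element, True) and r.conforms
    return summary


def summarise_dataset(per_element):
    """Step 5 (assumed rule): the dataset conforms only if every element does."""
    return all(per_element.values())


results = [
    MeasureResult("completeness", "Rate of missing items", error_rate(3, 1000), 0.01),
    MeasureResult("thematic accuracy", "Misclassification rate", error_rate(40, 1000), 0.02),
]
per_element = summarise_by_element(results)
print(per_element, summarise_dataset(per_element))
```

A strict all-must-pass rule is the simplest choice; the grading approach on the later slides is a softer alternative for steps 4 and 5.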
  • 18. Aggregation of data quality conformance results • Aggregation where the measurements are on different scales and have different units -> transform all the quantitative data quality results into conformance results using a set of conformance levels/classes (see previous slides). • Aggregation for inhomogeneous data. This can be done by just reporting the lowest quality found in the most remote areas (see the nature of reality slide). Another way (the one recommended here) is to use different conformance classifications for the different kinds of area (urban, rural, remote), and then summarise based on a “conformance score”. To make this useful, a metadata description is needed to give the distribution between the kinds of area. • Reporting details. The simplest way of reporting is to give just one value for the dataset. This can be a simple “passed” or “failed” with a reference to the product specification. But doing a lot of work in quality assessment and then reporting just one value can be considered an oversimplification. Giving quality statements as grades may be useful in step 4 and step 5 (see above).
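One possible reading of the recommended approach for inhomogeneous data is sketched below. Everything numeric here is hypothetical: the per-area class boundaries (in metres, for a positional accuracy measure), the mapping of classes A/B/C to scores 3/2/1, and the use of an area-share weighted mean as the "conformance score" are assumptions for illustration, not values from the slides.

```python
# Hypothetical class boundaries per kind of area: (max value for class A, max value for class B).
THRESHOLDS = {
    "urban": (1.0, 2.5),
    "rural": (2.5, 5.0),
    "remote": (5.0, 10.0),
}

# Hypothetical numeric score per conformance class.
CLASS_SCORE = {"A": 3, "B": 2, "C": 1}


def classify(value, area):
    """Map a measured value to class A, B or C using the area-specific boundaries."""
    a_max, b_max = THRESHOLDS[area]
    if value <= a_max:
        return "A"
    if value <= b_max:
        return "B"
    return "C"


def weighted_conformance_score(measured, area_share):
    """Weight each area's class score by its share of the dataset (from metadata)."""
    if abs(sum(area_share.values()) - 1.0) > 1e-9:
        raise ValueError("area shares must sum to 1")
    return sum(
        area_share[area] * CLASS_SCORE[classify(value, area)]
        for area, value in measured.items()
    )


measured = {"urban": 0.8, "rural": 4.0, "remote": 12.0}   # measured accuracy per area
area_share = {"urban": 0.2, "rural": 0.5, "remote": 0.3}  # distribution between kinds of area
print(weighted_conformance_score(measured, area_share))
```

The area shares are the "metadata description" the slide asks for: without them, a reader cannot tell how much of the dataset each class applies to.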
  • 19. Grading data example (grade: data quality description) • Excellent: Only class A for all quality measures • Very good: A majority of A’s, but also some B’s • Good: A majority of B’s, some A’s, no C’s • Adequate: Only a very few C’s, the others B’s and better • Marginal: A majority of C’s but also some B’s • Not good: No measure reached class B (i.e. all measures in class C)
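The grading table can be turned into a small function, but only with some interpretation, because terms like "a majority" and "a very few" are not quantified on the slide. In the sketch below, the `few_c_share` cut-off of 20 % for "a very few C's" is a hypothetical value, and the handling of combinations the table does not mention (for example all B's, or more than a few C's without a majority) is an assumption.

```python
from collections import Counter


def grade(classes, few_c_share=0.2):
    """Grade a dataset from the conformance classes (A, B, C) of its quality measures.

    few_c_share is a hypothetical cut-off for "only a very few C's".
    """
    counts = Counter(classes)
    n = len(classes)
    a, b, c = counts["A"], counts["B"], counts["C"]
    if n == 0 or a + b + c != n:
        raise ValueError("expected a non-empty list of classes A, B or C")
    if a == n:
        return "Excellent"   # only class A
    if c == n:
        return "Not good"    # no measure reached class B
    if c == 0:
        # Assumption: without a strict majority of A's the result is "Good",
        # which also covers the all-B case the table does not list.
        return "Very good" if a > b else "Good"
    if c / n <= few_c_share:
        return "Adequate"    # only a very few C's
    # Assumption: anything with more C's than "a very few" is "Marginal".
    return "Marginal"


print(grade(["A", "A", "B"]))
print(grade(["A", "B", "B", "B", "C"]))
```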
  • 20. ESDIN approach to quality
  • 21. Where do you use quality web services? • If you are a data provider for an SDI – For quality control during production (automated), called here conformance testing (this includes edge-matching and generalization) – For quality evaluation after production (semi-automated) • If you are the SDI co-ordinator or data custodian – For quality audit for process accreditation or data certification, doing conformance testing and/or quality evaluation • If you are a customer or data user – To evaluate usability using metadata information
  • 24. [Architecture diagram] • Rule Builder: Intuitive user interface to author, agree and manage DQ measures. • DQ Client Application: Accessible, easy to use, automatic Data Quality Evaluation Service. • DQ Rules Engine: W3C Web Services interface using open standards to describe & execute geospatial rule evaluation. • Rule Repository: Data Quality Rules, derived and guided by the Quality Model. • Other diagram labels: Rulesets & Templates Database; Object Oriented Geospatial Rules Engine; Collaborative Web-based Rule Authoring; Web Services Interface; Data Quality Evaluation Service; Business Rules; Data for Evaluation; Quality Measures; Geospatial Data File; Web Feature Service; Quality Evaluation Service; SOAP; HTTP
  • 25. DQ Rule Builder Environment
  • 26. DQ Evaluation Service Concept
  • 28. ESDIN approach to quality
  • 29. Metadata approach • Metadata needed for discovery of datasets through metadata catalogues and registries • Metadata needed for the evaluation of those datasets, as to whether they are of sufficient quality to meet end users’ needs • Metadata specific to the requirements of the ELF specifications
  • 30. Are we INSPIRE compliant? • Yes. We suggest that some of the measures be changed in future revisions of the INSPIRE data specification • There are some mistakes in the current specification that should be corrected • We also propose additional measures
  • 31. ESDIN/INSPIRE difference: Admin units • Measures suggested by INSPIRE Data Specification v3, Administrative Units, listed as section: data quality element / data quality sub-element / ISO 19138 measure, measure name / basic quality measure, scope; ESDIN quality model; comment
      – 7.1.1: Completeness / Commission / Id 3, Rate of excess items / error rate, dataset-level; the same as ESDIN
      – 7.1.2: Completeness / Omission / Id 7, Rate of missing items / error rate, dataset-level; the same as ESDIN
      – 7.2.1.1: Logical Consistency / Topological consistency / Id 21 *, Number of faulty point-curve connections / error count, dataset-level; the same as ESDIN
      – 7.2.1.2: Logical Consistency / Topological consistency / Id 23, Number of missing connections due to undershoots / error count, dataset-level; the same as ESDIN
      – 7.2.2: Logical Consistency / Conceptual consistency / Id 9, Conceptual schema compliance / correctness indicator, dataset-level; ESDIN: Number of items not compliant with the rules of the conceptual schema / error count; comment: Id 10 used instead, Id 9 is applicable only at single instance level
      – 7.3.1: Positional Accuracy / Absolute External positional accuracy / Id 28, Mean value of positional uncertainties (1D, 2D and 3D) / not applicable, dataset-level; ESDIN: Linear map accuracy at 95 % significance level / LE95 or LE95I; comment: not used, Id 36 used instead
      * Id 21 in ISO 19138, but has the incorrect Id 9 in the INSPIRE Data Specification AU
  • 32. Additional quality measures • Additional ones from ESDIN WP8, listed as data quality element / data quality sub-element / ISO 19138 measure, measure name / basic quality measure, scope
      – Logical Consistency / Conceptual consistency / Id 11, Number of invalid overlaps of surfaces / error count, dataset-level (Topological consistency)
      – Logical Consistency / Domain consistency / Id 16, Number of items not in conformance with their value domain / error count, dataset-level
      – Logical Consistency / Conceptual consistency / Id 10, Number of items not compliant with the rules of the conceptual schema / error count, dataset-level
      – Logical Consistency / Format consistency / Id 19, Physical structure conflicts / error count, dataset-level
      – Logical Consistency / Topological consistency / Id 25, Number of invalid slivers / error count, dataset-level
      – Logical Consistency / Topological consistency / Id 26, Number of invalid self-intersect errors / error count, dataset-level
      – Logical Consistency / Topological consistency / Id 27, Number of invalid self-overlap errors / error count, dataset-level
      – Positional accuracy / Absolute or external positional accuracy / Id 36, Linear map accuracy at 95 % significance level / LE95 or LE95I, dataset-level
      – Thematic accuracy / Classification correctness / Id 61, Misclassification rate / error rate, dataset-level
      – Thematic accuracy / Non-quantitative attribute correctness / Id 67, Rate of incorrect attribute values / error rate, dataset-level
  • 33. Resolution and Level of Details [Figure: target level of detail against scale. Scale axis: 1:2,500,000, 1:1,000,000, 1:500,000, 1:250,000, 1:100,000, 1:50,000, 1:25,000, 1:10,000, 1:5,000, 1:2,500. Target levels of detail: Global, Regional, Master. Levels of details: Urban, Rural, Mountainous]
  • 34. Conclusions • It is important that INSPIRE provides a platform for data quality information: minimum data quality conformance levels are set, with the ability to report other conformance levels related to user communities • Quality evaluation metadata should be available for automated conformance testing • Introducing a quality model which uses the same principles for all Annex I themes -> we will suggest this as a guideline for INSPIRE implementation • Introducing conformance levels that can be evaluated using semi-automated or automated methods based on ISO standards • Automation of quality evaluation and conformance testing can be done for all transformation-related workflows, including schema transformation, generalization and edge matching • Significant saving potential in quality reporting and improvement of data