Metadata Quality Assurance Framework
Péter Király <peter.kiraly@gwdg.de>
Gesellschaft für wissenschaftliche Datenverarbeitung mbH Göttingen, Germany
QQML2016
8th International Conference on Qualitative and Quantitative Methods in Libraries
2016-05-24, London
the problem
there are „good” and „bad” metadata records
Typical issues – non-informative field
• Title is not informative
non-informative: „photograph, framed”, „group photograph”, „photograph”
vs
informative: „Photograph of Sir Dugald Clerk”, „Photograph of "Puffing Billy"”
Typical issues – Field overuse
• What is the meaning of the field? (overuse)
TextGrid OAI-PMH response
Why is data quality important?
„Fitness for purpose” (QA principle)
no metadata → no access to data → no data usage
more explanation:
Data on the Web Best Practices
W3C Working Draft 19 May 2016
https://www.w3.org/TR/dwbp/
Europeana Data Quality Committee
• Online collaboration
• Use case documents
• Problem catalog
• Tickets
• Discussion forum
• #EuropeanaDataQuality
• Bi-weekly teleconference
• Bi-yearly face-to-face meeting
• Topics
• Usage scenarios
• Metadata profiles
• Schema modification
• Measuring
• Event model
• Proposals for data providers
What is it good for?
• improve the metadata
• improve services: good data → functions
• improve metadata schema & documentation
• propagate „good practice”
Domains:
• cultural heritage sector
• research data management and archiving
Research hypothesis
hypothesis
by measuring structural elements we can predict metadata record quality
Research hypothesis
proposed solution
an open source measuring and reporting tool
Metadata Quality Assurance Framework
What to measure?
Measurements
• Schema-independent structural features
existence, cardinality, uniqueness, length, dictionary entry, data type conformance
• Use case scenarios („fit for purpose”)
Requirements of the most important functions
• Problem catalog
Known metadata problems
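A minimal sketch of three of the schema-independent features named above (existence, cardinality, length) for a single field of a single record. The record layout (a plain dictionary), the field names and the function name `structural_features` are assumptions made for illustration; they are not taken from the tool itself.

```python
from typing import Any


def structural_features(record: dict[str, Any], field: str) -> dict[str, Any]:
    """Measure existence, cardinality and value lengths of one field."""
    raw = record.get(field)
    # Normalize to a list so single and repeated values are handled alike.
    if raw is None:
        values = []
    elif isinstance(raw, list):
        values = raw
    else:
        values = [raw]
    # A blank string is not counted as a populated value.
    values = [v for v in values if str(v).strip()]
    return {
        "existence": len(values) > 0,
        "cardinality": len(values),
        "lengths": [len(str(v)) for v in values],
    }


# Hypothetical record used only for illustration.
record = {
    "dc:title": ["Photograph of Sir Dugald Clerk"],
    "dc:subject": ["photograph", "portrait"],
}
print(structural_features(record, "dc:title"))
print(structural_features(record, "dcterms:alternative"))
```

Uniqueness, dictionary entry and data type conformance need information beyond the single record (other records, a vocabulary, a schema) and are left out of this sketch.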
Discovery scenarios and their metadata requirements
Europeana’s most important functions
1. Basic retrieval with high precision and recall
2. Cross-language recall
3. Entity-based facets
4. Date-based facets
5. Improved language facets
6. Browse by subjects and resource types
7. Browse by agents
8. Browse/Search by Event
9. Entity-based knowledge cards and pages
10. Categorised similar items
11. Spatial search, browse, and map display
12. Entity-based autocompletion
13. Diversification of results
14. Hierarchical search and facets
Credit: the document was initiated by Timothy Hill, Europeana’s search engineer
Discovery scenarios and their metadata requirements – Entity-based facets
Scenario
As a user I want to be able to filter by whether a person is the subject of a book, or its author, engraver, printer etc.
Metadata analysis
In each case the underlying requirement is that the relevant EDM fields for objects be populated by identifying URIs rather than free text. These URIs need to be related, at a minimum, to a label for each of the supported languages.
Measurement rules
• The relevant field values should be resolvable URIs
• Each URI should have labels in multiple languages
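The two measurement rules above could be sketched roughly as follows. This is an illustration under stated assumptions: the check is purely syntactic (it does not dereference the URI, so it does not prove resolvability), and the `labels_by_uri` lookup table, the example URI and the `min_languages` threshold are hypothetical.

```python
from urllib.parse import urlparse


def looks_like_uri(value: str) -> bool:
    """Syntactic test only: an http(s) scheme and a host are present."""
    parsed = urlparse(value)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def check_entity_field(values, labels_by_uri, min_languages=2):
    """Report, per field value, whether it is a URI with multilingual labels."""
    report = []
    for value in values:
        is_uri = looks_like_uri(value)
        # Labels are looked up in a local table keyed by URI (an assumption).
        languages = set(labels_by_uri.get(value, {})) if is_uri else set()
        report.append({
            "value": value,
            "is_uri": is_uri,
            "language_count": len(languages),
            "multilingual": len(languages) >= min_languages,
        })
    return report


# Hypothetical data: one entity URI with labels, one free-text creator.
labels_by_uri = {
    "http://example.org/agent/1": {"en": "Dugald Clerk", "de": "Dugald Clerk"},
}
report = check_entity_field(
    ["http://example.org/agent/1", "Clerk, Dugald"], labels_by_uri
)
for row in report:
    print(row)
```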
Problem catalog
Catalog of known metadata problems in Europeana
• Title contents same as description contents
• Systematic use of the same title
• Bad string: "empty" (and variants)
• Shelfmarks and other identifiers in fields
• Creator not an agent name
• Absurd geographical location
• Subject field used as description field
• Unicode U+FFFD (�)
• Very short description field
• ...
Credit: the document was initiated by Timothy Hill, Europeana’s search engineer
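Four of the catalog entries above lend themselves to simple record-level checks. The sketch below is illustrative only: the flat `title` / `description` record layout, the way variants of "empty" are normalized, and the length threshold of 10 characters are all assumptions, not values from the catalog.

```python
REPLACEMENT_CHAR = "\ufffd"  # Unicode U+FFFD, left behind by failed decoding


def normalize(value: str) -> str:
    """Strip surrounding brackets and punctuation, then lowercase (assumed rule)."""
    return value.strip(" []().-").lower()


def find_problems(record: dict, min_description_length: int = 10) -> list:
    """Return the names of the catalog problems detected in one record."""
    title = (record.get("title") or "").strip()
    description = (record.get("description") or "").strip()
    problems = []
    if title and title == description:
        problems.append("title same as description")
    if "empty" in (normalize(title), normalize(description)):
        problems.append('bad string "empty"')
    if REPLACEMENT_CHAR in title or REPLACEMENT_CHAR in description:
        problems.append("unicode U+FFFD")
    if 0 < len(description) < min_description_length:
        problems.append("very short description")
    return problems


print(find_problems({"title": "group photograph", "description": "group photograph"}))
print(find_problems({"title": "[Empty]", "description": "caf\ufffd"}))
```

The remaining entries (shelfmarks, agent names, geographical plausibility) need reference data or pattern lists and are not covered here.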
How to define measurements?
Problem catalog – proposed basis of implementation
Shapes Constraint Language (SHACL)
https://www.w3.org/TR/shacl/
A language for describing and constraining the contents of RDF graphs. It provides a high-level vocabulary to identify predicates and their associated cardinalities, datatypes and other constraints.
• sh:equals, sh:notEquals
• sh:hasValue
• sh:in
• sh:lessThan, sh:lessThanOrEquals
• sh:minCount, sh:maxCount
• sh:minLength, sh:maxLength
• sh:pattern
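To give a feel for what such constraints express, here is a plain-Python analogue of a few of them (count, length and pattern). It is not SHACL and does not operate on RDF graphs; the function name `validate` and its keyword parameters are hypothetical and only mirror the constraint names listed above.

```python
import re


def validate(values, min_count=0, max_count=None,
             min_length=None, max_length=None, pattern=None):
    """Return (constraint, offending value) pairs for a list of string values."""
    violations = []
    # Cardinality constraints apply to the whole value list.
    if len(values) < min_count:
        violations.append(("minCount", None))
    if max_count is not None and len(values) > max_count:
        violations.append(("maxCount", None))
    # Length and pattern constraints apply to each value separately.
    for value in values:
        if min_length is not None and len(value) < min_length:
            violations.append(("minLength", value))
        if max_length is not None and len(value) > max_length:
            violations.append(("maxLength", value))
        if pattern is not None and re.search(pattern, value) is None:
            violations.append(("pattern", value))
    return violations


# A title must occur exactly once and be at least 15 characters long.
print(validate(["photograph"], min_count=1, max_count=1, min_length=15))
print(validate([], min_count=1))
```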
early measurement results
and their visualization
Views: overall view, collection view, record view
Completeness – 40 measurements
Field cardinality – 27 measurements
Uniqueness – 6 measurements
Language specification – 20 measurements
Problem catalog – 3 measurements
etc.
Screenshot labels: links, measurements, aggregated numbers
completeness
What is the ratio of populated fields in records?
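The ratio asked about above can be sketched as the share of schema fields that carry at least one non-blank value. The field list and the record are hypothetical examples, and the rule that whitespace-only strings count as unpopulated is an assumption.

```python
def is_populated(value) -> bool:
    """True if the value carries any content (blank strings do not count)."""
    if value is None:
        return False
    if isinstance(value, str):
        return bool(value.strip())
    if isinstance(value, (list, tuple)):
        return any(is_populated(v) for v in value)
    return True


def completeness(record: dict, schema_fields: list) -> float:
    """Ratio of populated fields to all fields defined in the schema."""
    if not schema_fields:
        raise ValueError("schema_fields must not be empty")
    populated = sum(1 for f in schema_fields if is_populated(record.get(f)))
    return populated / len(schema_fields)


# Hypothetical schema subset and record.
fields = ["dc:title", "dc:description", "dcterms:alternative", "dc:creator"]
record = {"dc:title": ["Photograph"], "dc:description": "  ", "dc:creator": "Unknown"}
print(completeness(record, fields))
```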
Field frequency / main
Alternative title is a rare field
Field frequency per collections / all
no record has alternative title
every record has alternative title
multilinguality
Do we know the language of a field value?
Multilinguality
@resource is a URI
@ = language notation in RDF
no language specification
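The three cases annotated on this slide (a resource URI, a value with a language tag, a value without one) can be counted per field as sketched below. The `@resource` key appears on the slide; the `@lang` and `#value` keys and the sample values are assumptions about how a field instance might be serialized.

```python
from collections import Counter


def classify_instance(instance) -> str:
    """Classify one field instance by how (or whether) its language is known."""
    if isinstance(instance, dict):
        if "@resource" in instance:
            return "resource"
        if instance.get("@lang"):
            return "lang:" + instance["@lang"]
    # Plain strings and untagged objects carry no language specification.
    return "no language"


def language_frequency(instances) -> Counter:
    """Count the classifications over all instances of a field."""
    return Counter(classify_instance(i) for i in instances)


# Hypothetical instances of a title field.
titles = [
    {"@lang": "en", "#value": "Photograph of Sir Dugald Clerk"},
    {"@lang": "de", "#value": "Fotografie"},
    "photograph",
    {"@resource": "http://example.org/concept/1"},
]
print(language_frequency(titles))
```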
Language frequency / barchart
Language frequency / barchart
same language, different encodings
Language frequency / Treemap with resources
has no language specification
has language specification
is a URI
uniqueness (entropy)
How unique are the terms in a field?
Entropy – term uniqueness / main
1 means a unique term
0.0000x means a very frequent term
These are cumulative numbers
entropy_cumulative = term_1 + ... + term_n
Entropy – term uniqueness / collection
max is exceptional (=1425 * mean)
unique records
non-unique or less unique records
Entropy – term uniqueness / refining the picture
bulk of records are close to zero
although 25% are between 0.05 and 1.25
Entropy – term uniqueness / terms
explanation of uniqueness score
TF-IDF values come from Apache Solr
term frequency: 1
document freq.: 2
uniqueness score: 0.5
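The example on this slide (term frequency 1, document frequency 2, score 0.5) is consistent with a score of term frequency divided by document frequency, and the cumulative value with a sum over the terms of a field. The sketch below assumes exactly that reading. It computes the frequencies in plain Python over a toy corpus rather than reading them from Apache Solr, and all names in it are hypothetical.

```python
from collections import Counter


def document_frequencies(documents) -> Counter:
    """Count in how many documents each term occurs."""
    df = Counter()
    for terms in documents:
        df.update(set(terms))
    return df


def uniqueness_scores(terms, df) -> dict:
    """Score each term as term frequency / document frequency (assumed formula)."""
    tf = Counter(terms)
    return {term: tf[term] / df[term] for term in tf}


def cumulative_score(terms, df) -> float:
    """Sum of the per-term scores of one field value."""
    return sum(uniqueness_scores(terms, df).values())


# Toy corpus: tokenized titles of two records.
documents = [["photograph", "framed"], ["photograph"]]
df = document_frequencies(documents)
print(uniqueness_scores(documents[1], df))
print(cumulative_score(documents[0], df))
```

Under this reading a term that occurs once in a single record scores 1, while a term shared by many records approaches 0.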
problem catalog
Does the record have any specific issues?
Problem catalog – same title and description
there is one title and description which is the same
... and we have 9 such records
Problem catalog – same title and description – example
completeness sub-dimensions
Are the sub-dimensions (field groups supporting specific functionalities) complete?
Record view – functionality matrix
existing
missing
functionalities
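A functionality matrix of this kind could be derived as sketched below: each functionality is backed by a group of fields, each cell records whether a field exists in the record, and the share of existing fields gives the sub-dimension completeness. The functionality names and their field groups are invented for illustration and do not reproduce the groups used in the tool.

```python
# Hypothetical mapping from functionality to the fields that support it.
FUNCTIONALITY_FIELDS = {
    "searchability": ["dc:title", "dc:description", "dc:subject"],
    "multilinguality": ["dc:language"],
}


def is_populated(value) -> bool:
    """Treat None and empty containers or strings as missing."""
    return value not in (None, "", [], {})


def functionality_matrix(record: dict, groups: dict) -> dict:
    """For each functionality, mark every supporting field as existing or missing."""
    return {
        name: {field: is_populated(record.get(field)) for field in fields}
        for name, fields in groups.items()
    }


def sub_dimension_completeness(record: dict, groups: dict) -> dict:
    """Share of existing fields within each functionality's field group."""
    matrix = functionality_matrix(record, groups)
    return {name: sum(cells.values()) / len(cells) for name, cells in matrix.items()}


record = {"dc:title": "Photograph", "dc:subject": []}
print(functionality_matrix(record, FUNCTIONALITY_FIELDS))
print(sub_dimension_completeness(record, FUNCTIONALITY_FIELDS))
```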
miscellaneous
Further steps
• Incorporating into Europeana’s ingestion tool
• Process usage statistics (logs, Google Analytics)
• Human evaluation of metadata quality
• Measuring timeliness (changes of scores over time)
• Machine learning based classification & clustering
• Incorporating into research data management tool
• Cooperation with other projects
Architectural overview
Components: OAI-PMH client (PHP); Apache Spark (Java); Analysis with Spark (Scala); Analysis with R; Web interface (PHP, d3.js)
Storage and indexing: Hadoop File System; Apache Solr; Apache Cassandra
Exchanged files: JSON files; CSV files; image files
Diagram legend: recent workflow, planned workflow
Follow me
• Europeana Data Quality Committee
http://pro.europeana.eu/europeana-tech/data-quality-committee
• research plan and blog http://pkiraly.github.io
• site http://144.76.218.178/europeana-qa/
• source code
• https://github.com/pkiraly/europeana-qa-spark
• https://github.com/pkiraly/europeana-qa-r
• @kiru, https://www.linkedin.com/in/peterkiraly

Más contenido relacionado

La actualidad más candente

A Framework for Linked Data Quality based on Data Profiling and RDF Shape Ind...
A Framework for Linked Data Quality based on Data Profiling and RDF Shape Ind...A Framework for Linked Data Quality based on Data Profiling and RDF Shape Ind...
A Framework for Linked Data Quality based on Data Profiling and RDF Shape Ind...Nandana Mihindukulasooriya
 
OAI Metadata: Why and How
OAI Metadata: Why and HowOAI Metadata: Why and How
OAI Metadata: Why and HowJenn Riley
 
Tutorial - Introduction to Rule Technologies and Systems
Tutorial - Introduction to Rule Technologies and SystemsTutorial - Introduction to Rule Technologies and Systems
Tutorial - Introduction to Rule Technologies and SystemsAdrian Paschke
 
Identifying Relevant Sources for Data Linking using a Semantic Web Index
Identifying Relevant Sources for Data Linking using a Semantic Web IndexIdentifying Relevant Sources for Data Linking using a Semantic Web Index
Identifying Relevant Sources for Data Linking using a Semantic Web IndexAndriy Nikolov
 
Establishing the Connection: Creating a Linked Data Version of the BNB
Establishing the Connection: Creating a Linked Data Version of the BNBEstablishing the Connection: Creating a Linked Data Version of the BNB
Establishing the Connection: Creating a Linked Data Version of the BNBnw13
 
Semantic Technologies in ST&DL
Semantic Technologies in ST&DLSemantic Technologies in ST&DL
Semantic Technologies in ST&DLAndrea Nuzzolese
 
Metadata mapping
Metadata mappingMetadata mapping
Metadata mappingVlad Vega
 
FAIRness through a novel combination of Web technologies
FAIRness through a novel combination of Web technologiesFAIRness through a novel combination of Web technologies
FAIRness through a novel combination of Web technologiesResearch Data Alliance
 
FAIR History and the Future
FAIR History and the FutureFAIR History and the Future
FAIR History and the FutureCarole Goble
 
Linked data as a library data platform
Linked data as a library data platformLinked data as a library data platform
Linked data as a library data platformJindřich Mynarz
 
Question Answering over Linked Data (Reasoning Web Summer School)
Question Answering over Linked Data (Reasoning Web Summer School)Question Answering over Linked Data (Reasoning Web Summer School)
Question Answering over Linked Data (Reasoning Web Summer School)Andre Freitas
 
OSFair2017 Training | FAIR metrics - Starring your data sets
OSFair2017 Training | FAIR metrics - Starring your data setsOSFair2017 Training | FAIR metrics - Starring your data sets
OSFair2017 Training | FAIR metrics - Starring your data setsOpen Science Fair
 
2010 09 opm_tutorial_01-jun-usecase-datagovuk
2010 09 opm_tutorial_01-jun-usecase-datagovuk2010 09 opm_tutorial_01-jun-usecase-datagovuk
2010 09 opm_tutorial_01-jun-usecase-datagovukJun Zhao
 
Exploring Statistical Language Models for Recommender Systems [RecSys '15 DS ...
Exploring Statistical Language Models for Recommender Systems [RecSys '15 DS ...Exploring Statistical Language Models for Recommender Systems [RecSys '15 DS ...
Exploring Statistical Language Models for Recommender Systems [RecSys '15 DS ...Daniel Valcarce
 
NLP2RDF Wortschatz and Linguistic LOD draft
NLP2RDF Wortschatz and Linguistic LOD draftNLP2RDF Wortschatz and Linguistic LOD draft
NLP2RDF Wortschatz and Linguistic LOD draftSebastian Hellmann
 
Data analysis in dataverse & visualization of datasets on historical maps
Data analysis in dataverse & visualization of datasets on historical mapsData analysis in dataverse & visualization of datasets on historical maps
Data analysis in dataverse & visualization of datasets on historical mapsvty
 
Nicoletta Fornara and Fabio Marfia | Modeling and Enforcing Access Control Ob...
Nicoletta Fornara and Fabio Marfia | Modeling and Enforcing Access Control Ob...Nicoletta Fornara and Fabio Marfia | Modeling and Enforcing Access Control Ob...
Nicoletta Fornara and Fabio Marfia | Modeling and Enforcing Access Control Ob...semanticsconference
 

La actualidad más candente (20)

A Framework for Linked Data Quality based on Data Profiling and RDF Shape Ind...
A Framework for Linked Data Quality based on Data Profiling and RDF Shape Ind...A Framework for Linked Data Quality based on Data Profiling and RDF Shape Ind...
A Framework for Linked Data Quality based on Data Profiling and RDF Shape Ind...
 
OAI Metadata: Why and How
OAI Metadata: Why and HowOAI Metadata: Why and How
OAI Metadata: Why and How
 
Tutorial - Introduction to Rule Technologies and Systems
Tutorial - Introduction to Rule Technologies and SystemsTutorial - Introduction to Rule Technologies and Systems
Tutorial - Introduction to Rule Technologies and Systems
 
Identifying Relevant Sources for Data Linking using a Semantic Web Index
Identifying Relevant Sources for Data Linking using a Semantic Web IndexIdentifying Relevant Sources for Data Linking using a Semantic Web Index
Identifying Relevant Sources for Data Linking using a Semantic Web Index
 
Establishing the Connection: Creating a Linked Data Version of the BNB
Establishing the Connection: Creating a Linked Data Version of the BNBEstablishing the Connection: Creating a Linked Data Version of the BNB
Establishing the Connection: Creating a Linked Data Version of the BNB
 
Semantic Technologies in ST&DL
Semantic Technologies in ST&DLSemantic Technologies in ST&DL
Semantic Technologies in ST&DL
 
Metadata mapping
Metadata mappingMetadata mapping
Metadata mapping
 
Metadata crosswalks
Metadata crosswalksMetadata crosswalks
Metadata crosswalks
 
FAIRness through a novel combination of Web technologies
FAIRness through a novel combination of Web technologiesFAIRness through a novel combination of Web technologies
FAIRness through a novel combination of Web technologies
 
FAIR History and the Future
FAIR History and the FutureFAIR History and the Future
FAIR History and the Future
 
Linked data as a library data platform
Linked data as a library data platformLinked data as a library data platform
Linked data as a library data platform
 
Question Answering over Linked Data (Reasoning Web Summer School)
Question Answering over Linked Data (Reasoning Web Summer School)Question Answering over Linked Data (Reasoning Web Summer School)
Question Answering over Linked Data (Reasoning Web Summer School)
 
OSFair2017 Training | FAIR metrics - Starring your data sets
OSFair2017 Training | FAIR metrics - Starring your data setsOSFair2017 Training | FAIR metrics - Starring your data sets
OSFair2017 Training | FAIR metrics - Starring your data sets
 
Scaling the (evolving) web data –at low cost-
Scaling the (evolving) web data –at low cost-Scaling the (evolving) web data –at low cost-
Scaling the (evolving) web data –at low cost-
 
2010 09 opm_tutorial_01-jun-usecase-datagovuk
2010 09 opm_tutorial_01-jun-usecase-datagovuk2010 09 opm_tutorial_01-jun-usecase-datagovuk
2010 09 opm_tutorial_01-jun-usecase-datagovuk
 
Exploring Statistical Language Models for Recommender Systems [RecSys '15 DS ...
Exploring Statistical Language Models for Recommender Systems [RecSys '15 DS ...Exploring Statistical Language Models for Recommender Systems [RecSys '15 DS ...
Exploring Statistical Language Models for Recommender Systems [RecSys '15 DS ...
 
NLP2RDF Wortschatz and Linguistic LOD draft
NLP2RDF Wortschatz and Linguistic LOD draftNLP2RDF Wortschatz and Linguistic LOD draft
NLP2RDF Wortschatz and Linguistic LOD draft
 
Data analysis in dataverse & visualization of datasets on historical maps
Data analysis in dataverse & visualization of datasets on historical mapsData analysis in dataverse & visualization of datasets on historical maps
Data analysis in dataverse & visualization of datasets on historical maps
 
Chachra, "Improving Discovery Systems Through Post Processing of Harvested Data"
Chachra, "Improving Discovery Systems Through Post Processing of Harvested Data"Chachra, "Improving Discovery Systems Through Post Processing of Harvested Data"
Chachra, "Improving Discovery Systems Through Post Processing of Harvested Data"
 
Nicoletta Fornara and Fabio Marfia | Modeling and Enforcing Access Control Ob...
Nicoletta Fornara and Fabio Marfia | Modeling and Enforcing Access Control Ob...Nicoletta Fornara and Fabio Marfia | Modeling and Enforcing Access Control Ob...
Nicoletta Fornara and Fabio Marfia | Modeling and Enforcing Access Control Ob...
 

Destacado

BalticMiles We Love to Give You More
BalticMiles We Love to Give You More BalticMiles We Love to Give You More
BalticMiles We Love to Give You More NORD DDB RIGA
 
Čempionu Brokastis #23 / Miks Koljērs / "Funny ha-ha or funny peculiar?"
Čempionu Brokastis #23 / Miks Koljērs / "Funny ha-ha or funny peculiar?"Čempionu Brokastis #23 / Miks Koljērs / "Funny ha-ha or funny peculiar?"
Čempionu Brokastis #23 / Miks Koljērs / "Funny ha-ha or funny peculiar?"NORD DDB RIGA
 
Čempionu Brokastis #22 / Daunis Auers / "Karš, karavīri un kvass"
Čempionu Brokastis #22 / Daunis Auers / "Karš, karavīri un kvass"Čempionu Brokastis #22 / Daunis Auers / "Karš, karavīri un kvass"
Čempionu Brokastis #22 / Daunis Auers / "Karš, karavīri un kvass"NORD DDB RIGA
 
Gu 2016 programma conegliano 2016 co1
Gu 2016 programma conegliano 2016 co1Gu 2016 programma conegliano 2016 co1
Gu 2016 programma conegliano 2016 co1Montagnin Mariano
 
Serbia in the (Lo)Clouds
Serbia in the (Lo)CloudsSerbia in the (Lo)Clouds
Serbia in the (Lo)Cloudslocloud
 
Vinkovci najstariji grad Europe // Marija Vrljić, Mladen Mustapić, Nikolina K...
Vinkovci najstariji grad Europe // Marija Vrljić, Mladen Mustapić, Nikolina K...Vinkovci najstariji grad Europe // Marija Vrljić, Mladen Mustapić, Nikolina K...
Vinkovci najstariji grad Europe // Marija Vrljić, Mladen Mustapić, Nikolina K...Faculty of Economics in Osijek
 
Transform customer experience through PHYGITAL
Transform customer experience through PHYGITALTransform customer experience through PHYGITAL
Transform customer experience through PHYGITALJaslynn joan
 
A jók és a rosszak - metaadatok minőségellenőrzése
A jók és a rosszak - metaadatok minőségellenőrzéseA jók és a rosszak - metaadatok minőségellenőrzése
A jók és a rosszak - metaadatok minőségellenőrzésePéter Király
 
europeana agm 2015, 4/11, bp 2015 to 2016 - strategic positioning &amp; e280 ...
europeana agm 2015, 4/11, bp 2015 to 2016 - strategic positioning &amp; e280 ...europeana agm 2015, 4/11, bp 2015 to 2016 - strategic positioning &amp; e280 ...
europeana agm 2015, 4/11, bp 2015 to 2016 - strategic positioning &amp; e280 ...Europeana
 
Čempionu Brokastis #21 /Renārs Liepiņš & Jānis Lazda-Lazdiņš / "No Kannu zāle...
Čempionu Brokastis #21 /Renārs Liepiņš & Jānis Lazda-Lazdiņš / "No Kannu zāle...Čempionu Brokastis #21 /Renārs Liepiņš & Jānis Lazda-Lazdiņš / "No Kannu zāle...
Čempionu Brokastis #21 /Renārs Liepiņš & Jānis Lazda-Lazdiņš / "No Kannu zāle...NORD DDB RIGA
 
Kā radīt pašpietiekamu ziņu - reklāmu, ko cilvēki paši padarīs populāru?
Kā radīt pašpietiekamu ziņu - reklāmu, ko cilvēki paši padarīs populāru?Kā radīt pašpietiekamu ziņu - reklāmu, ko cilvēki paši padarīs populāru?
Kā radīt pašpietiekamu ziņu - reklāmu, ko cilvēki paši padarīs populāru?NORD DDB RIGA
 
Čempionu Brokastis #23 / Edgars Lapiņš / "Autentisks mārketings kritiski domā...
Čempionu Brokastis #23 / Edgars Lapiņš / "Autentisks mārketings kritiski domā...Čempionu Brokastis #23 / Edgars Lapiņš / "Autentisks mārketings kritiski domā...
Čempionu Brokastis #23 / Edgars Lapiņš / "Autentisks mārketings kritiski domā...NORD DDB RIGA
 
Középiskolai könyvtárhasználati óra
Középiskolai könyvtárhasználati óraKözépiskolai könyvtárhasználati óra
Középiskolai könyvtárhasználati óraSZEkonyvtar
 
Rakstveida saziņa. Vēstule
Rakstveida saziņa. VēstuleRakstveida saziņa. Vēstule
Rakstveida saziņa. VēstuleUzdevumi.lv
 
Könyvtári rendszer
Könyvtári rendszer Könyvtári rendszer
Könyvtári rendszer tudaskozpont
 
A Wikipédia; Hivatkozás elektronikus dokumentumokra
A Wikipédia; Hivatkozás elektronikus dokumentumokraA Wikipédia; Hivatkozás elektronikus dokumentumokra
A Wikipédia; Hivatkozás elektronikus dokumentumokratudaskozpont
 
The Future of Historic Sounds – a prelude
The Future of Historic Sounds – a preludeThe Future of Historic Sounds – a prelude
The Future of Historic Sounds – a preludeEuropeana_Sounds
 
Contoh rpp-kimia-kls-x-pertemuan 2-kurklm-2013
Contoh rpp-kimia-kls-x-pertemuan 2-kurklm-2013Contoh rpp-kimia-kls-x-pertemuan 2-kurklm-2013
Contoh rpp-kimia-kls-x-pertemuan 2-kurklm-2013rina fitri
 

Destacado (20)

BalticMiles We Love to Give You More
BalticMiles We Love to Give You More BalticMiles We Love to Give You More
BalticMiles We Love to Give You More
 
Metadata quality criteria
Metadata quality criteriaMetadata quality criteria
Metadata quality criteria
 
Čempionu Brokastis #23 / Miks Koljērs / "Funny ha-ha or funny peculiar?"
Čempionu Brokastis #23 / Miks Koljērs / "Funny ha-ha or funny peculiar?"Čempionu Brokastis #23 / Miks Koljērs / "Funny ha-ha or funny peculiar?"
Čempionu Brokastis #23 / Miks Koljērs / "Funny ha-ha or funny peculiar?"
 
Čempionu Brokastis #22 / Daunis Auers / "Karš, karavīri un kvass"
Čempionu Brokastis #22 / Daunis Auers / "Karš, karavīri un kvass"Čempionu Brokastis #22 / Daunis Auers / "Karš, karavīri un kvass"
Čempionu Brokastis #22 / Daunis Auers / "Karš, karavīri un kvass"
 
Gu 2016 programma conegliano 2016 co1
Gu 2016 programma conegliano 2016 co1Gu 2016 programma conegliano 2016 co1
Gu 2016 programma conegliano 2016 co1
 
Serbia in the (Lo)Clouds
Serbia in the (Lo)CloudsSerbia in the (Lo)Clouds
Serbia in the (Lo)Clouds
 
Vinkovci najstariji grad Europe // Marija Vrljić, Mladen Mustapić, Nikolina K...
Vinkovci najstariji grad Europe // Marija Vrljić, Mladen Mustapić, Nikolina K...Vinkovci najstariji grad Europe // Marija Vrljić, Mladen Mustapić, Nikolina K...
Vinkovci najstariji grad Europe // Marija Vrljić, Mladen Mustapić, Nikolina K...
 
Transform customer experience through PHYGITAL
Transform customer experience through PHYGITALTransform customer experience through PHYGITAL
Transform customer experience through PHYGITAL
 
A jók és a rosszak - metaadatok minőségellenőrzése
A jók és a rosszak - metaadatok minőségellenőrzéseA jók és a rosszak - metaadatok minőségellenőrzése
A jók és a rosszak - metaadatok minőségellenőrzése
 
europeana agm 2015, 4/11, bp 2015 to 2016 - strategic positioning &amp; e280 ...
europeana agm 2015, 4/11, bp 2015 to 2016 - strategic positioning &amp; e280 ...europeana agm 2015, 4/11, bp 2015 to 2016 - strategic positioning &amp; e280 ...
europeana agm 2015, 4/11, bp 2015 to 2016 - strategic positioning &amp; e280 ...
 
Čempionu Brokastis #21 /Renārs Liepiņš & Jānis Lazda-Lazdiņš / "No Kannu zāle...
Čempionu Brokastis #21 /Renārs Liepiņš & Jānis Lazda-Lazdiņš / "No Kannu zāle...Čempionu Brokastis #21 /Renārs Liepiņš & Jānis Lazda-Lazdiņš / "No Kannu zāle...
Čempionu Brokastis #21 /Renārs Liepiņš & Jānis Lazda-Lazdiņš / "No Kannu zāle...
 
Kā radīt pašpietiekamu ziņu - reklāmu, ko cilvēki paši padarīs populāru?
Kā radīt pašpietiekamu ziņu - reklāmu, ko cilvēki paši padarīs populāru?Kā radīt pašpietiekamu ziņu - reklāmu, ko cilvēki paši padarīs populāru?
Kā radīt pašpietiekamu ziņu - reklāmu, ko cilvēki paši padarīs populāru?
 
Čempionu Brokastis #23 / Edgars Lapiņš / "Autentisks mārketings kritiski domā...
Čempionu Brokastis #23 / Edgars Lapiņš / "Autentisks mārketings kritiski domā...Čempionu Brokastis #23 / Edgars Lapiņš / "Autentisks mārketings kritiski domā...
Čempionu Brokastis #23 / Edgars Lapiņš / "Autentisks mārketings kritiski domā...
 
Középiskolai könyvtárhasználati óra
Középiskolai könyvtárhasználati óraKözépiskolai könyvtárhasználati óra
Középiskolai könyvtárhasználati óra
 
Rakstveida saziņa. Vēstule
Rakstveida saziņa. VēstuleRakstveida saziņa. Vēstule
Rakstveida saziņa. Vēstule
 
Könyvtári rendszer
Könyvtári rendszer Könyvtári rendszer
Könyvtári rendszer
 
A Wikipédia; Hivatkozás elektronikus dokumentumokra
A Wikipédia; Hivatkozás elektronikus dokumentumokraA Wikipédia; Hivatkozás elektronikus dokumentumokra
A Wikipédia; Hivatkozás elektronikus dokumentumokra
 
The Future of Historic Sounds – a prelude
The Future of Historic Sounds – a preludeThe Future of Historic Sounds – a prelude
The Future of Historic Sounds – a prelude
 
Contoh rpp-kimia-kls-x-pertemuan 2-kurklm-2013
Contoh rpp-kimia-kls-x-pertemuan 2-kurklm-2013Contoh rpp-kimia-kls-x-pertemuan 2-kurklm-2013
Contoh rpp-kimia-kls-x-pertemuan 2-kurklm-2013
 
Generation Z
Generation ZGeneration Z
Generation Z
 

Similar a Metadata quality Assurance Framework at QQML2016 - short

How to clean data less through Linked (Open Data) approach?
How to clean data less through Linked (Open Data) approach?How to clean data less through Linked (Open Data) approach?
How to clean data less through Linked (Open Data) approach?andrea huang
 
A metadata standard for Knowledge Graphs
A metadata standard for Knowledge GraphsA metadata standard for Knowledge Graphs
A metadata standard for Knowledge GraphsMichel Dumontier
 
Measuring Metadata Quality (ELAG, 2018)
Measuring Metadata Quality (ELAG, 2018)Measuring Metadata Quality (ELAG, 2018)
Measuring Metadata Quality (ELAG, 2018)Péter Király
 
Metadata Quality assessment tool for Open Access
Metadata Quality assessment tool for Open AccessMetadata Quality assessment tool for Open Access
Metadata Quality assessment tool for Open AccessPaolo Nesi
 
Metadata Quality assessment tool for Open Access Cultural Heritage institutio...
Metadata Quality assessment tool for Open Access Cultural Heritage institutio...Metadata Quality assessment tool for Open Access Cultural Heritage institutio...
Metadata Quality assessment tool for Open Access Cultural Heritage institutio...Paolo Nesi
 
Dublin Core In Practice
Dublin Core In PracticeDublin Core In Practice
Dublin Core In PracticeMarcia Zeng
 
Towards Automatic Evaluation of Learning Object Metadata Quality
Towards Automatic Evaluation of Learning Object Metadata QualityTowards Automatic Evaluation of Learning Object Metadata Quality
Towards Automatic Evaluation of Learning Object Metadata QualityXavier Ochoa
 
Nothing is created, nothing is lost, everything changes (ELAG, 2017)
Nothing is created, nothing is lost, everything changes (ELAG, 2017)Nothing is created, nothing is lost, everything changes (ELAG, 2017)
Nothing is created, nothing is lost, everything changes (ELAG, 2017)Péter Király
 
Crowdsourcing Linked Data Quality Assessment
Crowdsourcing Linked Data Quality AssessmentCrowdsourcing Linked Data Quality Assessment
Crowdsourcing Linked Data Quality AssessmentAmrapali Zaveri, PhD
 
Semantics-enhanced Cyberinfrastructure for ICMSE : Interoperability, Analyti...
Semantics-enhanced Cyberinfrastructure for ICMSE :  Interoperability, Analyti...Semantics-enhanced Cyberinfrastructure for ICMSE :  Interoperability, Analyti...
Semantics-enhanced Cyberinfrastructure for ICMSE : Interoperability, Analyti...Artificial Intelligence Institute at UofSC
 
DataGraft: Data-as-a-Service for Open Data
DataGraft: Data-as-a-Service for Open DataDataGraft: Data-as-a-Service for Open Data
DataGraft: Data-as-a-Service for Open Datadapaasproject
 
LinkedUp - Linked Data & Education
LinkedUp - Linked Data & EducationLinkedUp - Linked Data & Education
LinkedUp - Linked Data & EducationStefan Dietze
 
Application of recently developed FAIR metrics to the ELIXIR Core Data Resources
Application of recently developed FAIR metrics to the ELIXIR Core Data ResourcesApplication of recently developed FAIR metrics to the ELIXIR Core Data Resources
Application of recently developed FAIR metrics to the ELIXIR Core Data ResourcesPistoia Alliance
 
Data Quality
Data QualityData Quality
Data Qualityjerdeb
 
Data Quality - Standards and Application to Open Data
Data Quality - Standards and Application to Open DataData Quality - Standards and Application to Open Data
Data Quality - Standards and Application to Open DataMarco Torchiano
 
Data Mining Xuequn Shang NorthWestern Polytechnical University
Data Mining Xuequn Shang NorthWestern Polytechnical UniversityData Mining Xuequn Shang NorthWestern Polytechnical University
Data Mining Xuequn Shang NorthWestern Polytechnical Universitybutest
 
Let’s go on a FAIR safari!
Let’s go on a FAIR safari!Let’s go on a FAIR safari!
Let’s go on a FAIR safari!Carole Goble
 
Revolutionizing Laboratory Instrument Data for the Pharmaceutical Industry:...
Revolutionizing Laboratory  Instrument Data for the  Pharmaceutical Industry:...Revolutionizing Laboratory  Instrument Data for the  Pharmaceutical Industry:...
Revolutionizing Laboratory Instrument Data for the Pharmaceutical Industry:...OSTHUS
 

Similar a Metadata quality Assurance Framework at QQML2016 - short (20)

How to clean data less through Linked (Open Data) approach?
How to clean data less through Linked (Open Data) approach?How to clean data less through Linked (Open Data) approach?
How to clean data less through Linked (Open Data) approach?
 
A metadata standard for Knowledge Graphs
A metadata standard for Knowledge GraphsA metadata standard for Knowledge Graphs
A metadata standard for Knowledge Graphs
 
Measuring Metadata Quality (ELAG, 2018)
Measuring Metadata Quality (ELAG, 2018)Measuring Metadata Quality (ELAG, 2018)
Measuring Metadata Quality (ELAG, 2018)
 
Metadata Quality assessment tool for Open Access
Metadata Quality assessment tool for Open AccessMetadata Quality assessment tool for Open Access
Metadata Quality assessment tool for Open Access
 
Metadata quality Assurance Framework at QQML2016 - short

  • 1. Metadata Quality Assurance Framework Péter Király <peter.kiraly@gwdg.de> Gesellschaft für wissenschaftliche Datenverarbeitung mbH Göttingen, Germany QQML2016 8th International Conference on Qualitative and Quantitative Methods in Libraries 2016-05-24, London
  • 2. The problem: there are "good" and "bad" metadata records.
  • 3. Typical issues – non-informative field. The title is not informative. Non-informative: "photograph, framed", "group photograph", "photograph". Informative: "Photograph of Sir Dugald Clerk", "Photograph of 'Puffing Billy'".
  • 4. Typical issues – field overuse. What is the meaning of the field? (overuse) Example: TextGrid OAI-PMH response.
  • 5. Why is data quality important? "Fitness for purpose" (QA principle): no metadata → no access to data → no data usage. More explanation: Data on the Web Best Practices, W3C Working Draft, 19 May 2016, https://www.w3.org/TR/dwbp/
  • 6. Europeana Data Quality Committee: Online collaboration; Use case documents; Problem catalog; Tickets; Discussion forum; #EuropeanaDataQuality; Bi-weekly teleconference; Bi-yearly face-to-face meeting; Topics; Usage scenarios; Metadata profiles; Schema modification; Measuring; Event model; Proposals for data providers.
  • 7. What is it good for? Improve the metadata; improve services (good data → functions); improve metadata schema & documentation; propagate "good practice". Domains: cultural heritage sector; research data management and archiving.
  • 8. Research hypothesis: by measuring structural elements we can predict metadata record quality.
  • 9. Research hypothesis, proposed solution: an open source measuring and reporting tool, the Metadata Quality Assurance Framework.
  • 10. What to measure?
  • 11. Measurements. Schema-independent structural features: existence, cardinality, uniqueness, length, dictionary entry, data type conformance. Use case scenarios ("fit for purpose"): requirements of the most important functions. Problem catalog: known metadata problems.
  • 12. Discovery scenarios and their metadata requirements. Europeana’s most important functions: 1. Basic retrieval with high precision and recall 2. Cross-language recall 3. Entity-based facets 4. Date-based facets 5. Improved language facets 6. Browse by subjects and resource types 7. Browse by agents 8. Browse/Search by Event 9. Entity-based knowledge cards and pages 10. Categorised similar items 11. Spatial search, browse, and map display 12. Entity-based autocompletion 13. Diversification of results 14. Hierarchical search and facets. Credit: the document was initiated by Timothy Hill, Europeana’s search engineer.
  • 13. Discovery scenarios and their metadata requirements – Entity-based facets. Scenario: As a user I want to be able to filter by whether a person is the subject of a book, or its author, engraver, printer etc. Metadata analysis: In each case the underlying requirement is that the relevant EDM fields for objects be populated by identifying URIs rather than free text. These URIs need to be related, at a minimum, to a label for each of the supported languages. Measurement rules: the relevant field values should be resolvable URIs; each URI should have labels in multiple languages.
  • 14. Problem catalog: catalog of known metadata problems in Europeana. Title contents same as description contents; systematic use of the same title; bad string: "empty" (and variants); shelfmarks and other identifiers in fields; creator not an agent name; absurd geographical location; subject field used as description field; Unicode U+FFFD (�); very short description field; ... Credit: the document was initiated by Timothy Hill, Europeana’s search engineer.
  • 15. How to define measurements?
  • 16. Problem catalog – proposed basis of implementation: Shapes Constraint Language (SHACL), https://www.w3.org/TR/shacl/. A language for describing and constraining the contents of RDF graphs. It provides a high-level vocabulary to identify predicates and their associated cardinalities, datatypes and other constraints: sh:equals, sh:notEquals; sh:hasValue; sh:in; sh:lessThan, sh:lessThanOrEquals; sh:minCount, sh:maxCount; sh:minLength, sh:maxLength; sh:pattern.
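To make the constraint vocabulary on slide 16 concrete, here is a minimal Python sketch of checks that are analogous to `sh:minCount`, `sh:maxCount`, `sh:minLength`, `sh:maxLength` and `sh:pattern`. It is not SHACL and does not operate on RDF graphs: the record is assumed to be a plain dictionary of field names to lists of string values, and the `CONSTRAINTS` table, the field names and the `validate` function are hypothetical names introduced only for illustration.

```python
import re

# Hypothetical constraint table, loosely mirroring SHACL property constraints.
CONSTRAINTS = {
    "title": {"minCount": 1, "maxCount": 1, "minLength": 5},
    "identifier": {"minCount": 1, "pattern": r"^https?://"},
}


def validate(record, constraints=CONSTRAINTS):
    """Return a list of (field, constraint) pairs that the record violates."""
    violations = []
    for field, rules in constraints.items():
        values = record.get(field, [])
        # Cardinality checks (analogous to sh:minCount / sh:maxCount).
        if len(values) < rules.get("minCount", 0):
            violations.append((field, "minCount"))
        if "maxCount" in rules and len(values) > rules["maxCount"]:
            violations.append((field, "maxCount"))
        for value in values:
            # Length checks (analogous to sh:minLength / sh:maxLength).
            if len(value) < rules.get("minLength", 0):
                violations.append((field, "minLength"))
            if "maxLength" in rules and len(value) > rules["maxLength"]:
                violations.append((field, "maxLength"))
            # Regular expression check (analogous to sh:pattern).
            if "pattern" in rules and not re.search(rules["pattern"], value):
                violations.append((field, "pattern"))
    return violations


if __name__ == "__main__":
    print(validate({"title": ["abc"], "identifier": ["urn:x"]}))
```

Returning the violated constraint names rather than a single pass/fail flag keeps the result usable for the kind of per-problem counting the problem catalog calls for.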
  • 17. Early measurement results and their visualization
  • 18. Overall view, collection view, record view. Completeness – 40 measurements; Field cardinality – 27 measurements; Uniqueness – 6 measurements; Language specification – 20 measurements; Problem catalog – 3 measurements; etc. Labels: links, measurements, aggregated numbers.
  • 19. Completeness: what is the ratio of populated fields in records?
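The completeness question on slide 19 can be sketched as a simple ratio. The snippet below assumes a record is a dictionary of field names to lists of values and that the list of schema fields is supplied by the caller; the function name `completeness` and the sample field names are hypothetical, and the framework's own completeness measurements may be defined differently.

```python
def completeness(record, schema_fields):
    """Ratio of schema fields that have at least one non-blank value."""
    populated = sum(
        1
        for field in schema_fields
        if any(str(value).strip() for value in record.get(field, []))
    )
    return populated / len(schema_fields)


if __name__ == "__main__":
    fields = ["title", "description", "creator", "date"]
    record = {"title": ["Photograph"], "description": [""], "creator": ["Unknown"]}
    print(completeness(record, fields))
```

A blank string counts as unpopulated here, which is a design choice: a field that exists but carries no content does not help any of the discovery scenarios.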
  • 20. Field frequency / main: alternative title is a rare field.
  • 21. Field frequency per collections / all. Chart annotations: no record has alternative title; every record has alternative title.
  • 22. Multilinguality: do we know the language of a field value?
  • 23. Multilinguality. Annotations: @resource is a URI; @ = language notation in RDF; no language specification.
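The three cases named on slide 23 (a URI resource, a value with a language tag, a value with no language specification) can be counted per field as in the sketch below. The value representation is an assumption: dictionaries with hypothetical `@resource`, `@lang` and `@value` keys, or bare strings for untagged literals. The function name `language_profile` is likewise illustrative.

```python
from collections import Counter


def language_profile(values):
    """Count the values of one field by language tag, resource, or no language."""
    counts = Counter()
    for value in values:
        if isinstance(value, dict) and "@resource" in value:
            # The value is a URI, so no language applies.
            counts["resource"] += 1
        elif isinstance(value, dict) and value.get("@lang"):
            # The value carries an explicit language tag.
            counts[value["@lang"]] += 1
        else:
            # Plain literal without any language specification.
            counts["no language"] += 1
    return counts


if __name__ == "__main__":
    values = [
        {"@lang": "en", "@value": "Photograph"},
        {"@resource": "http://example.org/concept/1"},
        "Foto",
    ]
    print(language_profile(values))
```

Because the tags are counted verbatim, different encodings of the same language (for example `en` and `eng`) show up as separate keys, which is exactly the situation the language frequency charts make visible.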
  • 24. Language frequency / barchart
  • 25. Language frequency / barchart: same language, different encodings.
  • 26. Language frequency / Treemap with resources. Legend: has no language specification; has language specification; is a URI.
  • 27. Uniqueness (entropy): how unique are the terms in a field?
  • 28. Entropy – term uniqueness / main. 1 means a unique term; 0.0000x means a very frequent term. These are cumulative numbers: entropy_cumulative = term_1 + ... + term_n
  • 29. Entropy – term uniqueness / collection. Annotations: max is exceptional (= 1425 × mean); unique records; non-unique or less unique records.
  • 30. Entropy – term uniqueness / refining the picture: the bulk of records are close to zero, although 25% are between 0.05 and 1.25.
  • 31. Entropy – term uniqueness / terms: explanation of the uniqueness score. TF-IDF values come from Apache Solr. Example: term frequency: 1; document frequency: 2; uniqueness score: 0.5.
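The single worked example on slide 31 (term frequency 1, document frequency 2, score 0.5) is consistent with a per-term score of term frequency divided by document frequency, summed over the terms of a field to get the cumulative number from slide 28. That reading is an assumption; the formula actually applied to the Solr TF-IDF values may differ. The sketch below computes it over a tiny in-memory corpus of tokenized field values, with hypothetical function names, and expects the scored field to be part of the corpus.

```python
from collections import Counter


def document_frequencies(documents):
    """Number of documents in which each term occurs at least once."""
    df = Counter()
    for tokens in documents:
        df.update(set(tokens))
    return df


def uniqueness(tokens, df):
    """Per-term scores (assumed: tf / df) and their cumulative sum for one field."""
    tf = Counter(tokens)
    scores = {term: tf[term] / df[term] for term in tf}
    return scores, sum(scores.values())


if __name__ == "__main__":
    corpus = [
        ["photograph", "framed"],
        ["photograph", "of", "puffing", "billy"],
    ]
    df = document_frequencies(corpus)
    print(uniqueness(corpus[0], df))
```

In this toy corpus "photograph" occurs once in the first title and in two documents overall, which reproduces the 0.5 from the slide, while a term found in only one document scores 1.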
  • 32. Problem catalog: does the record have any specific issues?
  • 33. Problem catalog – same title and description: there is one title and one description, and they are the same ... and we have 9 such records.
  • 34. Problem catalog – same title and description – example
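The "same title and description" problem from the catalog lends itself to a very small detector. The sketch below assumes records are dictionaries with hypothetical `title` and `description` fields holding lists of strings, and it compares values after trimming whitespace and lowercasing; that normalization is an illustrative choice, not something the slides specify.

```python
def same_title_and_description(record):
    """True if any title value equals any description value after normalization."""
    titles = {t.strip().lower() for t in record.get("title", []) if t.strip()}
    descriptions = {d.strip().lower() for d in record.get("description", []) if d.strip()}
    return bool(titles & descriptions)


def count_problem_records(records):
    """Number of records in a collection that show the problem."""
    return sum(1 for record in records if same_title_and_description(record))


if __name__ == "__main__":
    records = [
        {"title": ["Photograph"], "description": ["photograph "]},
        {"title": ["Photograph of Sir Dugald Clerk"], "description": ["Portrait, framed"]},
    ]
    print(count_problem_records(records))
```

Counting per collection gives the kind of aggregated number shown in the collection view, while the boolean per record feeds the record view.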
  • 35. Completeness sub-dimensions: are the sub-dimensions (field groups supporting specific functionalities) complete?
  • 36. Record view – functionality matrix. Labels: existing, missing, functionalities.
  • 37. Miscellaneous
  • 38. Further steps: incorporating into Europeana’s ingestion tool; processing usage statistics (logs, Google Analytics); human evaluation of metadata quality; measuring timeliness (changes of scores over time); machine learning based classification & clustering; incorporating into research data management tool; cooperation with other projects.
  • 39. Architectural overview. Components: Apache Spark (Java), OAI-PMH client (PHP), Analysis with Spark (Scala), Analysis with R, Web interface (PHP, d3.js), Hadoop File System, Apache Solr, Apache Cassandra. Data passed between components: JSON files, CSV files, image files. Legend: recent workflow, planned workflow.
  • 40. Follow me. Europeana Data Quality Committee: http://pro.europeana.eu/europeana-tech/data-quality-committee; research plan and blog: http://pkiraly.github.io; site: http://144.76.218.178/europeana-qa/; source code: https://github.com/pkiraly/europeana-qa-spark, https://github.com/pkiraly/europeana-qa-r; @kiru, https://www.linkedin.com/in/peterkiraly