Wikidata Introductory Workshop
1. Wikidata Introductory Workshop
Friends of OpenGLAM
Beat Estermann, Bern, 13 May 2019
Unless otherwise noted, the content of this presentation is made available under the CC BY 4.0 license.
2. • Short introduction to Wikidata
  • What is its purpose?
  • How does it work?
• Wikidata + GLAM
  • Aim & vision
  • Where do we stand?
  • Zooming in on data related to heritage institutions
  • Where would you / your institution fit in?
• Let’s Practice!
  • Querying Wikidata
  • Editing Wikidata
On the Programme Today
If you want to build a ship, don’t drum up people together to collect wood and don’t
assign them tasks and work, but rather teach them to long for the endless immensity
of the sea. – Antoine de Saint-Exupéry
Course page on Wikidata:
https://tinyurl.com/WD-intro-2019
6. Purpose of Wikidata
• Centralized Interwiki-Links [Example: Bern]
• Centralized Data Management for Infoboxes [Example: Ferdinand de Saussure]
• Centralized Data Management for Lists [Example: Lista de pinturas de A. Norfini]
• Possibility of Querying the Data in a Standardized Format
[Example Queries / External Applications]
«The Sum of All Human Knowledge» as Linked Open Data
Multilingual
With Sourced Statements
Freely usable by anyone (CC Zero)
7. Structure of Wikidata – RDF Triples
Subject | Predicate | Object
Bern (instance) | is the capital of (property) | Switzerland (instance)
Switzerland (instance) | capital (property) | Bern (instance)
Switzerland (instance) | is a (property) | Country (class)
Switzerland (instance) | GDP (property) | 518 billion $ (value)
Qualifier on the GDP statement: point in time = 2015 (value)
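The triple structure above can be sketched in a few lines of Python. This is only an illustration of the subject / predicate / object idea, not how Wikidata stores its data; the names `Triple` and `objects_of` are made up for this example.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# The example statements from the slide, written as plain strings.
triples = [
    Triple("Bern", "is the capital of", "Switzerland"),
    Triple("Switzerland", "capital", "Bern"),
    Triple("Switzerland", "is a", "Country"),
]


def objects_of(subject: str, predicate: str, data: list[Triple]) -> list[str]:
    """Return every object stated for the given subject and predicate."""
    return [t.obj for t in data if t.subject == subject and t.predicate == predicate]


print(objects_of("Switzerland", "capital", triples))
```

Asking for the "capital" of "Switzerland" walks the list and returns the matching objects; a real triple store answers the same kind of question with an index instead of a linear scan.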
8. Structure of Wikidata – Linked Data
Subject | Predicate | Object
Bern (Q70) | is a (P31 - instance of) | municipality of Switzerland (Q70208)
Bern (Q70) | is the capital of (P1376 - is the capital of) | Switzerland (Q39)
Berlin (Q64) | is a (P31 - instance of) | municipality of Germany (Q262166)
Berlin (Q64) | is the capital of (P1376 - is the capital of) | Germany (Q183)
Switzerland (Q39) | is a (P31 - instance of) | country (Q6256)
Germany (Q183) | is a (P31 - instance of) | country (Q6256)
municipality of Switzerland (Q70208) | is a subclass of (P279 - subclass of) | municipality (Q15284)
municipality of Germany (Q262166) | is a subclass of (P279 - subclass of) | municipality (Q15284)
Each part of a triple is identified by a URI:
Bern (URI) | is the capital of (URI) | Switzerland (URI)
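A minimal sketch of how the identifiers above behave as linked data, assuming the usual Wikidata URI prefixes for entities and direct properties. The helper names (`entity_uri`, `property_uri`, `classes_of`) are hypothetical, and the statement list simply restates the slide's table.

```python
ENTITY_PREFIX = "http://www.wikidata.org/entity/"
DIRECT_PROP_PREFIX = "http://www.wikidata.org/prop/direct/"


def entity_uri(qid: str) -> str:
    """Build the URI that identifies an item such as Q70 (Bern)."""
    return ENTITY_PREFIX + qid


def property_uri(pid: str) -> str:
    """Build the URI that identifies a direct property such as P31."""
    return DIRECT_PROP_PREFIX + pid


# The statements from the slide as (subject, predicate, object) ID triples.
statements = [
    ("Q70", "P31", "Q70208"),      # Bern: instance of municipality of Switzerland
    ("Q70", "P1376", "Q39"),       # Bern: capital of Switzerland
    ("Q64", "P31", "Q262166"),     # Berlin: instance of municipality of Germany
    ("Q64", "P1376", "Q183"),      # Berlin: capital of Germany
    ("Q39", "P31", "Q6256"),       # Switzerland: instance of country
    ("Q183", "P31", "Q6256"),      # Germany: instance of country
    ("Q70208", "P279", "Q15284"),  # municipality of Switzerland: subclass of municipality
    ("Q262166", "P279", "Q15284"), # municipality of Germany: subclass of municipality
]


def classes_of(item: str, data: list[tuple[str, str, str]]) -> set[str]:
    """Direct P31 classes of an item plus all their P279 superclasses."""
    found: set[str] = set()
    pending = [o for s, p, o in data if s == item and p == "P31"]
    while pending:
        cls = pending.pop()
        if cls in found:
            continue
        found.add(cls)
        pending.extend(o for s, p, o in data if s == cls and p == "P279")
    return found


print(entity_uri("Q70"), property_uri("P1376"), entity_uri("Q39"))
print(sorted(classes_of("Q70", statements)))
```

Following "subclass of" links is what lets a query for all municipalities find both Bern and Berlin, although neither is stated to be a municipality directly.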
9. Structure of a Wikidata Entry
Subject | Predicate | Object
Douglas Adams | spouse | Jane Belson
Qualifier: start / end time 25 Nov. 1991 – 11 May
Ref.
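The parts of an entry shown above (statement, qualifiers, references) could be modelled roughly as follows. This is a simplified sketch, not Wikidata's actual data model; the class `Statement`, its fields, and the reference placeholder are assumptions made for illustration. The end date is left exactly as it appears on the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Statement:
    """A triple plus optional qualifiers and references."""
    subject: str
    predicate: str
    obj: str
    qualifiers: dict[str, str] = field(default_factory=dict)
    references: list[str] = field(default_factory=list)

    def is_sourced(self) -> bool:
        """True if at least one reference backs the statement."""
        return len(self.references) > 0


marriage = Statement(
    subject="Douglas Adams",
    predicate="spouse",
    obj="Jane Belson",
    # Qualifiers refine the statement; the end date is incomplete on the slide.
    qualifiers={"start time": "25 Nov. 1991", "end time": "11 May"},
    # Hypothetical placeholder; the slide only indicates that a reference exists.
    references=["<reference placeholder>"],
)

unsourced = Statement("Douglas Adams", "spouse", "Jane Belson")

print(marriage.is_sourced(), unsourced.is_sourced())
```

Keeping qualifiers and references attached to the statement, rather than to the item, is what allows each individual claim to carry its own validity period and source.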
10. It’s Your Turn!
• Does the Wikidata item of your place of residence / provenance have a statement for the mayor? Is it up to date?
• Does the Wikipedia page contain this information?
• How about the Catalan Wikipedia?
[Image: US Department of Commerce, Bureau of the Census, Public Information Office, around 1940. NARA. Public Domain.]
Find the information on the Internet and add it to Wikidata!
Add it also to Wikipedia (in your language and in Catalan)!
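One way to check the mayor statement without opening the item page is a query against the Wikidata Query Service. The sketch below only builds the SPARQL text; it assumes that the mayor is recorded with P6 ("head of government"), and the function name `head_of_government_query` is made up for this example.

```python
def head_of_government_query(qid: str) -> str:
    """Build a SPARQL query for the current head of government of a place."""
    if not (qid.startswith("Q") and qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata item ID: {qid!r}")
    return (
        "SELECT ?head ?headLabel WHERE {\n"
        # wdt: returns only the best-ranked statements for the property.
        f"  wd:{qid} wdt:P6 ?head .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}"
    )


# Q70 is Bern; replace it with the item ID of your own place of residence.
print(head_of_government_query("Q70"))
```

The printed text can be pasted into the query service. An empty result means the statement is missing; a result naming a former mayor suggests that the statement needs updating.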
11. Wikidata + GLAM
• Aims & vision
• Where do we stand?
• Zooming in on data related to heritage institutions
12. • The aim of this project is to coordinate, facilitate and promote
the ingestion of cultural heritage related data into
Wikidata, to facilitate the cleansing and enhancement of this
data and to promote its use across Wikipedia, its sister
projects and beyond.
• It is our vision to establish Wikidata as a central hub for data
integration, data enhancement, and data management in
the heritage domain.
Aim and Vision (WikiProject Cultural Heritage)
13. • Establish Wikidata as a database that covers the entire world’s
cultural heritage.
• Establish Wikidata as a central hub that interlinks GLAM collections
around the world and provides links to bibliographic, genealogical,
scientific and other collections of information; create the ultimate
authority file.
• Foster truly multilingual and global collaboration among people
from various backgrounds.
• Leverage synergies between institutions, reduce duplicate work.
• Encourage debate in the community by highlighting and
interrogating differences in perspective.
• Provide a single source of data for some of the most popular web
sites and apps, including Wikipedia infoboxes and lists.
Vision (Blog posts: Stinson et al. 2016; Thornton / Cochrane 2016; Poulter 2017)
15. Current Trends in the Heritage Sector (1/2)
[Chart. Source: OpenGLAM Benchmark Survey, N = 1560. Chart label: Wikidata]
16. Current Trends in the Heritage Sector (2/2)
[Diagram labels: Knowledge Graph; Entity Extraction; Inter-linking; Machine Learning; Human in the Loop; Services for Metadata Extraction & Enhancement]
Source: Bern University of Applied Sciences
17. Core Aspects of Linked Data Publication
Source: eCH-0205 – Linked Open Data
18. • http://make.opendata.ch/wiki/data:glam_ch
• Personnalités Vaudoises (BCUL)
• Swiss Photography Metadata (Büro für Fotografiegeschichte)
• Artist data from the SIKART Lexicon on art in Switzerland (SIK-ISEA)
• Metadata of the Historical Dictionary of Switzerland (HLS)
• PCP Inventory (Federal Office for Civil Protection)
• Inventory of Historical Monuments (Canton of Zurich)
• Inventory of Historical Monuments (City of Zurich)
• Inventory of classified Gardens and Parks (City of Zurich)
• Art in the Urban Space (City of Zurich)
• Swiss GLAM Inventory (OpenGLAM)
• Inventory of Research Libraries in Switzerland (Swissbib)
• ISplus Swiss (G)LAM Inventory (Swiss National Library)
• Schauspielhaus Zürich Repertoire of Theatre and other Productions, 1938–1968
• Swiss Theatre Metadata (Swiss Theatre Collection)
• Plazi TreatmentBank (repository of the world's species) (Plazi.org)
• Historical Statistics of Switzerland (University of Zurich)
Data Provision – Which Datasets are Useful?
22. • Coping with the Bazaar:
• Sometimes changes to property definitions are too easily made by
volunteers.
• There is a rigorous process for creating new properties, but not for
changing definitions of properties or creating new classes.
• No master language; how to keep translations of definitions in sync?
• Sometimes, different approaches are used to model the same thing.
• What are good design principles?
• Aim for re-usability of properties across various domains.
• Select high-priority areas first, do not try to solve everything overnight for
the entire cultural heritage domain.
• Use existing databases in a given domain as a starting point to drive
ontology development.
• …
• Finding a balance between:
• The expressive power of an ontology
• Its practicability when it comes to large scale use by many people
• Its queryability (usability from the perspective of data users)
Challenges Related to Ontology Development (2/2)
23. • Mapping Between Data Models
• Getting an overview of appropriate properties and classes can be a
time-consuming exercise.
• Creating new properties requires community agreement and may involve
lengthy discussions and compromises.
• There is still a lot of work to be done in the area of typologies and
thesauri (= controlled vocabularies) [Example]
• Matching Items / Disambiguation
• There are tools like Mix’n’Match and OpenRefine to support this, but it
remains a major challenge, esp. with datasets which haven’t resolved this
issue internally.
• Incorrect / Incoherent Data on Wikidata
• Many data ingestion projects require cleaning up existing data.
• Repeated Ingestion / Updates
• How to approach the historicization of data?
• How to set up processes to regularly update data?
Challenges Related to Data Ingestion
N.B.: We are not filling a void or starting from scratch, but contributing to an
existing ecosystem of data, data models, and community members!
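To make the matching / disambiguation challenge concrete, here is a deliberately naive sketch that matches local record names against Wikidata labels after normalizing case, accents, and whitespace. It is not how Mix’n’Match or OpenRefine work; the function names and the item IDs in the sample data are hypothetical.

```python
import unicodedata


def normalize(label: str) -> str:
    """Lower-case, strip accents, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", label)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(stripped.casefold().split())


def match_records(local_names: list[str], wikidata_labels: dict[str, str]) -> dict[str, list[str]]:
    """Map each local name to the item IDs whose label matches after normalization."""
    index: dict[str, list[str]] = {}
    for item_id, label in wikidata_labels.items():
        index.setdefault(normalize(label), []).append(item_id)
    return {name: index.get(normalize(name), []) for name in local_names}


# Hypothetical sample data; the IDs are placeholders, not real Wikidata items.
wikidata_labels = {
    "Q-EXAMPLE-1": "Schauspielhaus Zürich",
    "Q-EXAMPLE-2": "Kunsthaus",
    "Q-EXAMPLE-3": "Kunsthaus",
}
local_names = ["schauspielhaus  zurich", "Kunsthaus", "Stadtarchiv"]

for name, candidates in match_records(local_names, wikidata_labels).items():
    print(name, "->", candidates)
```

One candidate is a probable match, none suggests a new item, and several candidates are the ambiguous cases that still need human review or additional criteria such as location or identifiers.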
26. • Establishing and Documenting Data Quality
• Getting rid of duplicates
• Dealing with incorrect and inconsistent data
• How to monitor data quality and data completeness?
• Building a Network of Trust
• Linking all statements to a reliable source
• In the future: “Signed Statements”
• Data Exchange Between Wikidata and Primary Databases
• Data synchronization: How to keep data mutually up to date?
• How to make it easier for GLAM employees to follow
changes/improvements to their data on Wikidata?
Challenges Related to Data Maintenance
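Monitoring data completeness can start very simply: count how many items in a set carry a given property. The sketch below assumes the items and their properties have already been retrieved; the function name `property_coverage` and the sample institutions are hypothetical.

```python
def property_coverage(items: dict[str, set[str]], properties: list[str]) -> dict[str, float]:
    """Share of items (0.0 to 1.0) that have at least one statement per property."""
    total = len(items)
    if total == 0:
        return {pid: 0.0 for pid in properties}
    return {
        pid: sum(1 for present in items.values() if pid in present) / total
        for pid in properties
    }


# Hypothetical sample: each institution mapped to the property IDs it uses.
sample = {
    "institution A": {"P31", "P17", "P1037"},
    "institution B": {"P31", "P17"},
    "institution C": {"P31"},
    "institution D": {"P31", "P17"},
}

coverage = property_coverage(sample, ["P17", "P1037"])
print(coverage)
```

Tracking such shares over time gives a rough indicator of whether completeness is improving; it says nothing about whether the values themselves are correct.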
27. • Chicken-or-Egg Problem:
• Data usage drives data quality & completeness
• Data quality & completeness are prerequisites of data use
See also: How Wikidata is Solving its Chicken-or-Egg Problem in the Field of Cultural Heritage
Challenges Related to Data Use
28. How about data related to heritage
institutions?
Some Ideas…
29. • Create an international database of all heritage
institutions on the basis of Wikidata.
• Use this database to populate infoboxes and lists on
Wikipedia.
• Call upon heritage institutions to complement their
Wikidata entries and to keep them up-to-date.
• Call upon Wikipedians to create/enhance Wikipedia
articles about the institutions and their holdings.
• Get heritage institutions to enhance the inter-linking
between their own databases and Wikidata.
• Get heritage institutions to make their content available
through Wikidata & Wikimedia Commons.
• Encourage the development of third-party applications
making use of the worldwide inventory of heritage
institutions.
Vision
30. • WikiProject “Heritage Institutions”:
• Aim and scope (may require updating)
• Data structure
• Typology (controlled vocabularies; may need expanding)
• Overview of data sources to import data from
• Use cases (requires updating)
• Sample queries / maintenance queries
• So far, information about 40,000 museums, 25,000 libraries, and
4,000 archives has been ingested into Wikidata. The quality and
completeness of the entries vary a lot.
• For some countries, there is virtually full coverage of all existing
institutions (e.g. Switzerland, Ukraine, Brazil). For some countries,
national inventories exist, but have not been ingested yet (e.g.
Portugal, Russia, Spain). And in many further countries, there are no
or only very fragmentary inventories. At the moment, there is no
systematic overview of the current coverage per country.
Status Quo – Wikidata
There are currently two projects aiming at the implementation of a worldwide database of
heritage institutions on Wikidata:
- FindingGLAMs (project run by Wikimedia Sweden, UNESCO, and the Wikimedia Foundation)
- Sum of All GLAMs (project run by the Wiki Movement Brazil)
31. • A few Wikipedias already use Wikidata-driven infobox templates for
museums, libraries, and/or archives
(a central overview is lacking)
• The Wikipedia in Portuguese uses Wikidata-driven lists in their main
namespace.
• In some Wikipedias, the use of Wikidata-driven infobox templates is
still a disputed practice; this is even more true for Wikidata-driven
lists.
• A few Wikipedias have a Mbabel template that can be used to create
draft articles on museums based on data from Wikidata.
• There are many missing articles about heritage institutions in
Wikipedia... ;-)
Status Quo – Wikipedia
32. • Wikimedia CH campaign on the occasion of International
Museum Day 2018 (creation and improvement of Wikipedia articles)
• Swiss Open Cultural Data Hackathon (data ingests and creation of
applications)
• Wiki Loves Monuments (international photo contest)
Status Quo – Campaigns & Community (Switzerland)
35. Querying & Editing Wikidata
Querying & Editing Wikidata
• Schauspielhaus productions without a “based on” statement
• Swiss heritage institutions without a “director” statement
• Swiss heritage institutions without a German/French label
• Items about Swiss museums with statements that are not properly sourced
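Maintenance queries of this kind usually follow one pattern: select a set of items, then filter out those that already have the statement or label in question. The sketch below builds two such SPARQL queries as text. It is not the course's official solution: it narrows "heritage institutions" to museums (Q33506) for brevity, and it assumes P17 for "country" and P1037 for "director / manager". The function names are made up for this example.

```python
MUSEUM_QID = "Q33506"  # "museum"; libraries and archives would need their own classes


def museums_missing_property_query(country_qid: str, property_pid: str, limit: int = 100) -> str:
    """Museums in a country that have no statement for the given property."""
    return (
        "SELECT ?item ?itemLabel WHERE {\n"
        f"  ?item wdt:P31/wdt:P279* wd:{MUSEUM_QID} ;\n"
        f"        wdt:P17 wd:{country_qid} .\n"
        f"  FILTER NOT EXISTS {{ ?item wdt:{property_pid} ?value . }}\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}\n"
        f"LIMIT {limit}"
    )


def museums_missing_label_query(country_qid: str, language: str, limit: int = 100) -> str:
    """Museums in a country that have no label in the given language."""
    return (
        "SELECT ?item WHERE {\n"
        f"  ?item wdt:P31/wdt:P279* wd:{MUSEUM_QID} ;\n"
        f"        wdt:P17 wd:{country_qid} .\n"
        "  FILTER NOT EXISTS {\n"
        "    ?item rdfs:label ?label .\n"
        f'    FILTER(LANG(?label) = "{language}")\n'
        "  }\n"
        "}\n"
        f"LIMIT {limit}"
    )


# Q39 is Switzerland.
print(museums_missing_property_query("Q39", "P1037"))
print(museums_missing_label_query("Q39", "de"))
```

Both queries can be pasted into the Wikidata Query Service. The `LIMIT` keeps the result list short enough to work through by hand during the exercise.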
Querying Wikidata & Editing Wikipedia
• The performers with the most appearances in plays at Schauspielhaus Zürich but
without a Wikipedia article in German
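The logic behind this task (count appearances, then keep only performers without an article) can be sketched on plain data. In practice both inputs would come from a Wikidata query; here the function name, productions, and performer names are all hypothetical placeholders.

```python
from collections import Counter


def top_performers_without_article(
    appearances: list[tuple[str, str]],
    has_article: set[str],
    top_n: int = 10,
) -> list[tuple[str, int]]:
    """Rank performers by number of appearances, skipping those with an article."""
    counts = Counter(performer for _production, performer in appearances)
    missing = [(p, n) for p, n in counts.most_common() if p not in has_article]
    return missing[:top_n]


# Hypothetical (production, performer) pairs.
appearances = [
    ("Production 1", "Performer A"),
    ("Production 2", "Performer A"),
    ("Production 3", "Performer A"),
    ("Production 1", "Performer B"),
    ("Production 2", "Performer B"),
    ("Production 3", "Performer C"),
]
# Performers who already have an article in the German Wikipedia.
has_article = {"Performer A"}

print(top_performers_without_article(appearances, has_article))
```

The resulting list is a prioritized to-do list for article writers: the performers at the top are the ones whose missing articles would be referenced most often.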
Exploring Ontologies & Editing Wikidata
• Typology of heritage institutions
• Typology of concerts, recordings, etc.
The task descriptions can be found on the course page:
https://tinyurl.com/WD-intro-2019
36. Thank You for Your Attention!
I Hope You Will Enjoy Wikidata… ;-)
Contact
Beat Estermann
Bern University of Applied Sciences
beat.estermann@bfh.ch
+41 31 848 34 38