Data-Hacking with
Wikimedia Projects:
Learn by Example, Including
Wikipedia, Wikidata, and beyond!
@notconfusing (Max Klein)
@wrought (Matt Senate)
Who is in the room?
●
Data hackers?
●
Programmers?
●
Artists/designers?
●
Open Access folks?
●
Academics?
●
Wikipedians?
●
Wikimedians?
●
Wiki-people?
Wikimedia Movement
●
wikipedia.org
●
commons.wikimedia.org
●
wikisource.org
●
wikidata.org
Wiki Context
●
Wikipedia: by far the largest in
size and user base.
●
Projects often
organized by language.
●
Each language-project
has an independent
user community.
Wiki Context
●
See Wikimedia
projects as a form of:
“curated database”
●
Web's Least Common
Denominator for data.
●
Wiki Paradox:
– Low-barrier-to-entry
– High-barrier-to-entry
Buneman, et al. Curated Databases.
https://peerlibrary.org/p/rxQ6WBd89XviMF4Tk
How do Wikimedia projects
work?
●
Community
●
Opt-in
●
Reputation
●
Cultural Protocol
●
Bureaucracy
●
Adhocracy
●
Coordination
– WikiProjects
History of Wiki Data-Hacking
●
Rambot
– First “bot”
– 2002
●
US Census Data
●
Created 200,000
articles
●
More than doubled
Wikipedia's size at
the time
●
No permissions
“Ignore All Rules” -->
Calcification
●
To get things done:
– Need to issue a
“Request for
Comment” or RFC
●
Everybody, regardless of
expertise, has
trouble with this.
– Not everyone always acts in
good faith, but try to
assume it; it will help.
Häskell und Grepl
●
Hänsel und Gretel
●
Wikidata
https://commons.wikimedia.org/wiki/File:Hansel_and_Gretel.jpg
●
Rural Hunger
Problems
●
The Wikimedia
datascape has
cross-language data-sharing
problems.
●
Häskell and Grepl
are sent to the
Forest alone.
●
Data is sent to live in
Templates, living
alone.
●
Häskell and Grepl
first invent a
successful
breadcrumb system
●
Wikimedia Commons
allows images to be
shared across wikis.
●
Lucky if their
breadcrumbs are not
eaten / or whatever
put.
●
Lucky if we know on
which pages the
data is stored.
http://commons.wikimedia.org/wiki/File:Flickr_-_Per_Ola_Wiberg_~_mostly_away_-_-_%22No_bread,_just_a_camera...huh,_quack_%22.jpg
●
A magical
gingerbread house
●
A magical data store
– Called “Wikidata”
– Interwiki data
sharing
– Plus extra
sweeties
http://commons.wikimedia.org/wiki/Gingerbread_house#mediaviewer/File:Pepparkakshus.JPG
●
And the house
includes many
different sweeties.
●
Semantic Triples
●
Qualifiers
●
Ranks
http://commons.wikimedia.org/wiki/Category:Liquorice_candy#mediaviewer/File:Flickr_-_cyclonebill_-_Slik_%281%29.jpg
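As a rough illustration of the three "sweeties" above, the sketch below models a Wikidata-style statement as a subject-property-value triple that also carries qualifiers and a rank. The names `Statement` and `best_statements` are hypothetical, and the population figures are made up for illustration. The selection rule (prefer "preferred", fall back to "normal", never return "deprecated") follows the way Wikidata describes best-rank statements, but this is not Wikidata's own code.

```python
from dataclasses import dataclass, field


@dataclass
class Statement:
    """One claim: (subject, property, value) plus qualifiers and a rank."""
    subject: str
    prop: str
    value: object
    qualifiers: dict = field(default_factory=dict)
    rank: str = "normal"  # one of "preferred", "normal", "deprecated"


def best_statements(statements, subject, prop):
    """Return preferred-rank statements if any exist, else normal-rank ones."""
    candidates = [
        s for s in statements
        if s.subject == subject and s.prop == prop and s.rank != "deprecated"
    ]
    preferred = [s for s in candidates if s.rank == "preferred"]
    return preferred or candidates


# Q64 (Berlin), P1082 (population), P585 (point in time).
# The values below are illustrative only, not real figures.
statements = [
    Statement("Q64", "P1082", 3_400_000, {"P585": "2010"}),
    Statement("Q64", "P1082", 3_600_000, {"P585": "2020"}, rank="preferred"),
    Statement("Q64", "P1082", 9_999_999, {"P585": "2020"}, rank="deprecated"),
]

best = best_statements(statements, "Q64", "P1082")
print([(s.value, s.qualifiers) for s in best])
```

Qualifiers let several values for the same property coexist (here, one per point in time), while ranks tell a consumer which of them to show by default.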
●
Häskell and Grepl
eat the roof
hungrily.
●
In this story, the
users start adding
to Wikidata.
– Importing Wikipedia
– Foreign Database
– Manual adds.
●
The evil witch trap
●
The evil data
witches normally
keep information
in silos.
●
Grepl's cunning
defeat of the witch
●
Identifiers
– Think of this as a foreign
database key (Brian
Jacobs)
– Max imported 400,000
biographical identifiers.
– This started an identifier
craze on Wikidata
●
Tennis Player
●
Swiss Parliament
●
Danish Companies
●
Grepl's cunning
defeat of the witch
●
All Wikis can
transclude arbitrary
data (eventually).
●
And Citations can be
represented as
semantic properties.
See how sources are handled with “FRBR” format:
http://www.wikidata.org/wiki/Help:Sources#Scientific.2C_newspaper_or_magazine_article
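One way to picture "citations as semantic properties" is to flatten a citation record into subject-property-value triples. This is only a sketch: the function `citation_to_triples`, the property labels, and the sample citation are hypothetical placeholders, not actual Wikidata property IDs or a real source.

```python
def citation_to_triples(source_id, citation):
    """Flatten a citation record into (subject, property, value) triples.

    Fields that are None or empty strings are skipped.
    """
    return [
        (source_id, prop, value)
        for prop, value in citation.items()
        if value not in (None, "")
    ]


# Hypothetical citation; every value is a placeholder.
citation = {
    "title": "An Example Article",
    "author": "A. Author",
    "published in": "Example Journal",
    "DOI": "10.0000/example",
    "license": "CC-BY",
    "volume": None,
}

triples = citation_to_triples("source-1", citation)
for triple in triples:
    print(triple)
```

Stored this way, a source becomes an item in its own right, so many articles can point at the same citation instead of each keeping its own copy in a template.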
Signalling Open Access
●
WikiProject Open
Access
– On English Wikipedia
●
(Data) Problem:
Signalling “Open
Access” is hard!
●
Solution: Use clear
signals directly to
the relevant data.
– Copyright license
– Source content
– Metadata
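The "clear signal" idea can be sketched as a small check on a source's license metadata. The set below holds only the four licenses the deck names (public domain, CC0, CC-BY, CC-BY-SA); the function names, the `license` field, and the three return labels are assumptions made for illustration and are not taken from the OA-signalling code.

```python
# Licenses the deck names as open; a real list would be longer.
OPEN_LICENSES = {"public domain", "cc0", "cc-by", "cc-by-sa"}


def normalize_license(raw):
    """Lowercase and trim a license string so simple variants compare equal."""
    return raw.strip().lower()


def open_access_signal(metadata):
    """Classify a source as 'open', 'not signalled', or 'unknown'.

    'unknown' means no license was recorded at all, which is different
    from a recorded license that is not on the open list.
    """
    raw = metadata.get("license")
    if not raw:
        return "unknown"
    if normalize_license(raw) in OPEN_LICENSES:
        return "open"
    return "not signalled"


print(open_access_signal({"title": "Example A", "license": "CC-BY"}))
print(open_access_signal({"title": "Example B", "license": "All rights reserved"}))
print(open_access_signal({"title": "Example C"}))
```

Keeping "unknown" separate from "not signalled" leaves room for the human judgment the deck calls for: a missing license is a prompt to look it up, not a verdict.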
What?
How?
●
Text → Wikisource
●
Media → Commons
●
Metadata → Wikidata
●
Signals → Wikipedia
– Including license!
●
Public domain, CC0, CC-BY, CC-BY-SA, etc.
●
RFC → RecitationBot
In the Wikimedia Universe
●
Where does
“Signalling Open
Access” fit in the
Wikimedia narrative?
●
(Data) Problem:
Managing citations &
references is hard!
●
Possible Solutions:
– Templates (Many)
– Categories (Many)
– Namespace (FR)
– WikiScholar (Dead)
– VisualEditor (Zotero?)
“Signalling OA”
Opportunities
●
Take a pass at citation
management
– Quality & experience
●
Integrate metadata
with Wikidata
– Where it belongs
– Sensitivity and rigor
●
Forge deep
knowledge resources
on Wikipedia
– Snapshots of sources,
deeper linking
●
Automate, but use
human judgment
– Save time and energy,
improve accuracy.
Paths for Data Hackers
●
Reputation
– Create a user account
– Contribute in good faith
to Wikimedia projects
●
Cultural protocol
– Identify the scope,
concerns, and nature of
a given project
– Learn to navigate
●
History and Context
– Form a narrative for
your hack based on
past endeavors
– Seek consent and
build consensus
●
Community
– Reach out on IRC
and mailing lists!
Of course we're Open Source
github.com/wpoa/OA-signalling
github.com/wpoa/recitation-bot
Thanks!
@notconfusing (Max)
@wrought (Matt)

Häskell und Grepl: Data Hacking Wikimedia Projects Exampled With Open Access Signalling Project
