Metadata: The Three-Legged Dance
Tim Spalding
ALA NISO-BISG Forum | June 23, 2017
tim@librarything.com
@LibraryThingTim
Who am I?
Book-lover, ex-scholar, programmer
LibraryThing (2005)
LibraryThing for Libraries (2007)
TinyCat (2016)
Syndetics Unbound (2016)
At the Intersection Of…
Readers
Collectors
Libraries (Academic, Public, School, "Tiny")
Online booksellers
Bookstores
Publishers
Authors
Also: archives, scholars, famous dead people with books, music and movie lovers
Data is Good
Everyone their data
Every data its glorious purpose
Every data its data that makes it better
My Approach to Data Is…
Loving
Respectful
Flexible
Statistical
Optimistic as to what librarians can do…
The Three-Legged Stool
Professional data
User data
Content data
(a very, very simplified framework)
Professional Data
Library cataloging (MARC, BIBFRAME)
Publisher/bookseller (ONIX, Amazon, Bowker)
Classification (DDC, LCC, BIC, BISAC, LCSH)
Professional reviews
Bibliographies and guides (LibGuides, bibliographic monographs)
Reading levels (Lexile, AR, F&P)
User Data
Intentional
User reviews
Ratings
Tags
Annotations
Lists
Discussions
User book recommendations
Implicit
Purchase patterns
Ownership patterns
Checkout patterns
Reading patterns
Popularity
Content Data
Text of book
Samples, quotes, etc.
Tables of contents
Indexes
Word and phrase statistics
In-text references and footnotes
One-Legged Stools: "Recommendations," "Similar Books," etc.
Recommendations by bibliographic data alone
Recommendations by subject alone
Recommendations by statistics alone
Recommendations by content alone
Recommendations too much by statistics?
Boring
Repetitive
Keep people in their bubble
No serendipity, surprise
No taste!
Solution: Add a leg or two…
Let users act like professionals
Use statistics on classification
"Everyone a Librarian"
Improved
Author disambiguation (1,741,282)
Edition/work control (5,544,233)
Canonical book titles
Series
Author name variants
Created
Work relationships (contained in, commentary on, parody of, etc.)
Awards
Places, characters, events
Author picture
Author information (education, family, occupation, nationality, etc.)
The Dewmoji!
174.3 = 💭 🚎 🙈 ⚖
1 💭 Philosophy and Psychology
7 🚎 Ethics
4 🙈 Professional Ethics
.3 ⚖ Lawyers
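The breakdown above works because DDC notation is hierarchical: each extra digit narrows the class named by the digits before it. A minimal sketch of that decomposition follows. The `DEWMOJI` table and both function names are hypothetical, and the table holds only the four entries shown on this slide; it is not LibraryThing's actual data or code.

```python
# Hypothetical lookup table: only the four entries shown on the slide.
DEWMOJI = {
    "1": ("💭", "Philosophy and Psychology"),
    "17": ("🚎", "Ethics"),
    "174": ("🙈", "Professional Ethics"),
    "174.3": ("⚖", "Lawyers"),
}


def ddc_prefixes(number: str) -> list[str]:
    """Return the hierarchical prefixes of a DDC number, broadest first."""
    digits = number.replace(".", "")
    prefixes = []
    for i in range(1, len(digits) + 1):
        prefix = digits[:i]
        # DDC places the decimal point after the third digit.
        if i > 3:
            prefix = prefix[:3] + "." + prefix[3:]
        prefixes.append(prefix)
    return prefixes


def dewmoji(number: str) -> str:
    """Join the emoji of every prefix that has an entry in the table."""
    return " ".join(DEWMOJI[p][0] for p in ddc_prefixes(number) if p in DEWMOJI)


print(ddc_prefixes("174.3"))  # ['1', '17', '174', '174.3']
print(dewmoji("174.3"))
```

Prefixes with no table entry are skipped, so a partial table still yields a partial emoji string.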
"Everyone's a librarian?"
Ha. Add ANOTHER leg.
Librarians at LibraryThing vet USER DATA:
Tag approval (LibraryThing has 135m tags; 75% belong to 30,000 unique tags)
Series approval
Award approval
Picture approval
Review approval
Solution: Add a leg or two…
Let users act like professionals.
Use user statistics on professional data
Does that classification map to user/usage data?
DDC against "people who have X have Y"
Clusters well — high "salience"
618.4 — Birthing books
668.1 — Soapmaking
638.1 — Beekeeping
Clusters terribly — low "salience"
All literature in DDC
796.1 — Miscellaneous games
225.6 — New Testament > Hermeneutics, Exegesis
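The slide does not say how "salience" is calculated. One way to picture it, as a toy sketch under stated assumptions: for each class, count the co-ownership links that touch a book in that class and ask what share of them lead to another book in the same class. The function name, the input shapes, and the sample data below are all hypothetical, and a real measure would also need to correct for class size.

```python
from collections import Counter
from itertools import combinations


def class_salience(libraries, book_class):
    """For each class, the share of co-ownership links that stay inside it.

    libraries: dict of user -> set of book ids (assumed input shape).
    book_class: dict of book id -> classification string.
    """
    inside = Counter()  # links where both books share the class
    total = Counter()   # all links touching a book of the class
    for books in libraries.values():
        for a, b in combinations(sorted(books), 2):
            ca, cb = book_class[a], book_class[b]
            if ca == cb:
                inside[ca] += 1
                total[ca] += 1
            else:
                total[ca] += 1
                total[cb] += 1
    return {c: inside[c] / total[c] for c in total}


# Made-up sample data, for illustration only.
libraries = {
    "ann": {"bees1", "bees2"},
    "bob": {"bees1", "bees2", "games1"},
    "cat": {"games1", "novel1"},
    "dan": {"games2", "novel1"},
}
book_class = {
    "bees1": "638.1", "bees2": "638.1",
    "games1": "796.1", "games2": "796.1",
    "novel1": "813",
}
salience = class_salience(libraries, book_class)
print(salience["638.1"])  # 0.5
print(salience["796.1"])  # 0.0
```

In this toy data the beekeeping class holds together while the two games books are never owned together, which mirrors the high and low cases listed above.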
How we do Recommendations
Basic Factors
"People who have X have Y" statistics
Three different statistical approaches
Shared tags
Reorder and Drop
Ratings
Reviews
User recommendations
User up and down votes
LT Popularity curves
Library popularity curves
Tag "salience"
Tag approval
Tag-to-author
Classification systems
Classification salience
Series
Series order
Series-order importance
Author clustering
In-house algorithmic genre system
Crosswalks from genre to tag, etc.
Final factor: TASTE!
Mix of authors, popularities, genres, etc.
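Two of the factors above lend themselves to a short illustration: the "people who have X have Y" count and the final "taste" pass that keeps the mix of authors varied. The sketch below is not LibraryThing's algorithm, which the slide says combines many more factors. The function names, the `per_author` cap, and the sample data are assumptions made for illustration.

```python
from collections import Counter


def also_have(libraries, seed):
    """Count how often each other book shares a library with `seed`."""
    counts = Counter()
    for books in libraries.values():
        if seed in books:
            counts.update(b for b in books if b != seed)
    return counts


def recommend(libraries, authors, seed, limit=3, per_author=1):
    """Rank by co-ownership, then cap picks per author to vary the mix."""
    counts = also_have(libraries, seed)
    # Highest count first; ties broken by id so the order is stable.
    ranked = sorted(counts, key=lambda b: (-counts[b], b))
    picks, used = [], Counter()
    for book in ranked:
        author = authors[book]
        if used[author] >= per_author:
            continue  # this author already has enough slots
        picks.append(book)
        used[author] += 1
        if len(picks) == limit:
            break
    return picks


# Made-up sample data, for illustration only.
libraries = {
    "u1": {"x", "a1", "a2", "b1"},
    "u2": {"x", "a1", "a2"},
    "u3": {"x", "b1", "c1"},
}
authors = {"x": "X", "a1": "A", "a2": "A", "b1": "B", "c1": "C"}

print(recommend(libraries, authors, "x"))                 # ['a1', 'b1', 'c1']
print(recommend(libraries, authors, "x", per_author=10))  # ['a1', 'a2', 'b1']
```

With the cap in place, the second book by author A gives way to a less frequent book by author C, which is the kind of trade the "taste" factor describes.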
Steal Someone's Leg
Users do stuff to Professional data
Users add and improve bibliographic information
Professionals do stuff to user data
Professional curation of tags, reviews
Professionals pretend to be users
Publishers suggest similar books
Random Hortatory Slogans
Use all the data you can
Free your data
Use data by others, even distant others
Be flexible
Use statistics
Don't be afraid of users
But don't let them run rampant either…
Cede ground …
… Take ground
Add professional value to non-professional data
Thank you!
tim@librarything.com
@LibraryThingTim
Idea:
What's the best shelf-order system?
Lay out an entire "typical" library in one long line by classification
Take data on non-library clustering (e.g., people who have X have Y)
Calculate the average distance you'd have to travel
Calculate the average distance you'd have to travel
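The three steps above can be sketched directly. The example assumes each classification is reduced to a flat list of book ids in shelf order and that distance is counted in shelf positions rather than metres. The function name and the sample data are hypothetical.

```python
def average_shelf_distance(shelf_order, co_owned_pairs):
    """Mean number of shelf positions between books that users hold together.

    shelf_order: list of book ids in the order a classification shelves them.
    co_owned_pairs: iterable of (book_a, book_b) pairs taken from user data.
    """
    position = {book: i for i, book in enumerate(shelf_order)}
    distances = [
        abs(position[a] - position[b])
        for a, b in co_owned_pairs
        if a in position and b in position  # ignore books not on the shelf
    ]
    if not distances:
        raise ValueError("no co-owned pair is on the shelf")
    return sum(distances) / len(distances)


# Made-up sample data, for illustration only.
pairs = [("bees1", "bees2"), ("bees1", "honey"), ("soap1", "soap2")]
order_a = ["bees1", "bees2", "honey", "soap1", "soap2"]
order_b = ["bees1", "soap1", "honey", "soap2", "bees2"]

print(average_shelf_distance(order_a, pairs))
print(average_shelf_distance(order_b, pairs))
```

A lower average means the shelf order keeps books that people own together closer to each other, so the two orderings can be compared on the same set of pairs.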
