The Secret Life of a DataPhile
Evan Stein
Mark Wainwright
Georgiana Bogdan
Decibel Music Systems
PART 1
A Life in Search
Evan Stein
Decibel Music Systems
The Talk
 Evan Stein: Introduction – Reasons for Decibel as a product
 Mark Wainwright: Technical issues with music metadata
 Georgiana Bogdan: Metadata collection and processing
What is Decibel?
 Fact-based metadata system
 Social / buying-based recommendations (e.g., Amazon)
 Sound / mood-based recommendations (e.g., EchoNest)
 Fact-based navigation (e.g., MusicBrainz, Gracenote)
 Data and search provided through an API
 White-label services for customers’ products
 Navigate collections through linked information
 Furnish information, sleeve-note equivalent and file tagging
 Repertoire, artist and recording normalisation
 Insane level of detail
Decibel at work
A bit of history
 Musician
 Library of Congress
 Studies in Musicology
 Switch to computers, thanks to Fernando Pessoa
 Manhattan DA
 Standard & Poor’s
 Decibel
Library work
 Classification
 Retrieval (by classification)
 LC / Dewey Decimal, metadata
 Knowledge of the domain is a key to good work
 There are human databases walking about
Musicology
 Catalogues and classifications
 Works
 Instruments
 Eras, genres, styles
 Biography
 Ways of thinking about music
 Repertoire, theory, performance practice
 Sociology, anthropology, psychology, linguistics
 Correlation with other art forms
 Performers
Law enforcement
 Data for hypothesis-formation
 Unknown start and end
 Non-linear search
 Multiple languages, phonetics, semantics
 Linkage
Finance
 Normalisation
 Language
 Workflow
 Currency
 Formulas
Why the British Library?
 Fact-based systems are good for research
 You don’t know what the user wants to know until they want to know it
 Data-based thinking allows you to follow your train of thought
 Good for navigating collections
 Improvements in bandwidth and storage
 Personal collections are getting larger
 Stores and services are also collections
 Library collections are being digitised, and are becoming physically smaller
Digitisation
 Information extracted from artefact (record, book, video, etc.)
 Cons
 Possible lack of context and background
 Ignores the artefact
 Pros
 “Good enough” for most uses
 Can be consumed anywhere
PART 2
Asking a Lot
Mark Wainwright
Decibel Music Systems
Relational Database
Graph Database
Graph Database Features
 Polymorphism: more detail without affecting performance
 Recursive relationships: results are more complete
 Associative structure: more interesting questions
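The "recursive relationships" point above can be illustrated with a small sketch. It is not Decibel's implementation: the `member_of` and credit data, and the names `MEMBER_OF`, `CREDITED_ON`, `all_groups` and `all_tracks`, are hypothetical. The idea shown is that following a relationship to any depth (a singer in a quartet that is itself part of a choir) returns results that a single-level lookup would miss.

```python
from collections import deque

# Hypothetical "member_of" edges: each key belongs to every group in its list.
MEMBER_OF = {
    "Singer A": ["Quartet B"],
    "Quartet B": ["Choir C"],
    "Choir C": [],
}

# Hypothetical credits: which artist (person or group) is credited on which track.
CREDITED_ON = {
    "Singer A": ["Solo Track"],
    "Quartet B": ["Quartet Track"],
    "Choir C": ["Choir Track"],
}


def all_groups(artist):
    """Follow member_of edges breadth-first and return every group reached."""
    seen = []
    queue = deque(MEMBER_OF.get(artist, []))
    while queue:
        group = queue.popleft()
        if group not in seen:
            seen.append(group)
            queue.extend(MEMBER_OF.get(group, []))
    return seen


def all_tracks(artist):
    """Tracks credited to the artist directly or via any group, at any depth."""
    tracks = list(CREDITED_ON.get(artist, []))
    for group in all_groups(artist):
        tracks.extend(CREDITED_ON.get(group, []))
    return tracks


print(all_groups("Singer A"))
print(all_tracks("Singer A"))
```

A lookup limited to direct credits would return only "Solo Track" for the singer; the recursive walk also picks up the quartet and choir recordings.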
Album Artist Track
Album has one disc
Album is disc
Track Sung By a man Married To a woman Whose Song Is Performed
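The chained question on this slide can be read as a path through a graph: from a track to its singer, to the singer's spouse, to a song she wrote, to a track on which that song is performed. The sketch below assumes that reading. The edge list, the relation names (`sung_by`, `married_to`, `wrote`, `performed_on`) and the helper functions are hypothetical, and node properties such as gender are left out for brevity.

```python
# Hypothetical edges as (source, relation, target) triples.
EDGES = [
    ("Track 1", "sung_by", "Man M"),
    ("Man M", "married_to", "Woman W"),
    ("Woman W", "wrote", "Song S"),
    ("Song S", "performed_on", "Track 2"),
    ("Track 3", "sung_by", "Man N"),
]


def follow(nodes, relation):
    """Return targets reached from any node in `nodes` via `relation`, without duplicates."""
    reached = []
    for source, rel, target in EDGES:
        if rel == relation and source in nodes and target not in reached:
            reached.append(target)
    return reached


def walk(start, relations):
    """Apply each relation in turn, starting from a single node."""
    frontier = [start]
    for relation in relations:
        frontier = follow(frontier, relation)
    return frontier


CHAIN = ["sung_by", "married_to", "wrote", "performed_on"]
print(walk("Track 1", CHAIN))  # tracks performing a song by the singer's wife
print(walk("Track 3", CHAIN))  # empty: Man N has no married_to edge
```

Each step of the question is one hop along an association, so adding a further clause to the question only means appending another relation to `CHAIN`.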
Relational Database: Organised by Type
Graph Database: Organised by Association
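One way to picture the contrast between "organised by type" and "organised by association" is to hold the same facts both ways. In this sketch the tables, IDs and relation names are hypothetical, and a plain dictionary stands in for a graph store; it is an illustration of the idea, not a description of any particular database product.

```python
from collections import defaultdict

# "Organised by type": one hypothetical table per entity type, linked by foreign keys.
ARTISTS = [{"id": "a1", "name": "Artist A"}]
ALBUMS = [{"id": "al1", "title": "Album X", "artist_id": "a1"}]
TRACKS = [
    {"id": "t1", "title": "Track One", "album_id": "al1"},
    {"id": "t2", "title": "Track Two", "album_id": "al1"},
]


def build_graph():
    """Re-index the same facts by association: node -> list of (relation, neighbour)."""
    graph = defaultdict(list)
    for album in ALBUMS:
        graph[album["id"]].append(("by_artist", album["artist_id"]))
        graph[album["artist_id"]].append(("made_album", album["id"]))
    for track in TRACKS:
        graph[track["id"]].append(("on_album", track["album_id"]))
        graph[track["album_id"]].append(("has_track", track["id"]))
    return dict(graph)


graph = build_graph()
# Everything associated with the album is one lookup away, whatever its type.
print(graph["al1"])
```

In the table form, finding everything connected to an album means querying each table that might reference it; in the association form, the album's neighbours are stored together regardless of type.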
Graph Database
 Polymorphism: more detail without affecting performance
 Recursive relationships: results are more complete
 Associative structure: more interesting questions
PART 3
Metadata Collection and Processing:
A Data Detective’s Investigation
Georgiana Bogdan
Decibel Music Systems
Why do we collect metadata?
KEEP CALM AND JOIN THE DIGITAL MUSIC REVOLUTION!
Source: IFPI Digital Music Report 2014
Why do we collect metadata?
Because few things matter more.
It is crucial for:
 Artists
 Music Listeners
 Music Providers
 Copyright Holders
 Music Libraries & Archives
Who do we collect metadata for?
 Music streaming services
 Copyright Collection Society
 App Developers
 Online radio services
 Record Labels
 Other music industry players
 Digital Music Stores
 Music distributors
What metadata do we collect?
 Comprehensive data model
 Graph database for representing and storing data; API for delivering it
 Rich data fields; mix of internet sources, research and editorial content
What metadata do we collect?
Artist
 Place and dates of birth (and death)
 Artist biography
 Nationality
 Relationships
Album
 Release label
 Track count
 Duration
 Album contributions
 Release date and region
 Genre
 Release © and ℗
 Cover art
Track
 Genre
 Mixing venue and date
 Publisher
 Writer
 Track ℗ details
 Participants/Artists
 Mastering venue and date
 Recording venue and date
 Number
 Performance type
 Duration
 ISRC
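A few of the fields listed on this slide can be sketched as simple record types. This is only an illustration: the class layout, the attribute names, the `title` fields and the `normalise_isrc` helper are assumptions, not Decibel's data model. The ISRC check assumes the commonly documented 12-character layout (two letters, three alphanumeric characters, seven digits), and the sample code below is made up.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Assumed ISRC shape: 2 letters + 3 alphanumerics + 7 digits (hyphens removed).
ISRC_PATTERN = re.compile(r"[A-Z]{2}[A-Z0-9]{3}[0-9]{7}")


def normalise_isrc(raw):
    """Strip hyphens and spaces, upper-case, and return the code or None if malformed."""
    code = raw.replace("-", "").replace(" ", "").upper()
    return code if ISRC_PATTERN.fullmatch(code) else None


@dataclass
class Artist:
    name: str
    birth_place: Optional[str] = None
    birth_date: Optional[str] = None
    death_date: Optional[str] = None
    nationality: Optional[str] = None
    biography: Optional[str] = None


@dataclass
class Track:
    number: int
    title: str
    duration_seconds: Optional[int] = None
    isrc: Optional[str] = None
    writers: List[str] = field(default_factory=list)
    participants: List[Artist] = field(default_factory=list)


@dataclass
class Album:
    title: str
    label: Optional[str] = None
    release_date: Optional[str] = None
    release_region: Optional[str] = None
    genre: Optional[str] = None
    tracks: List[Track] = field(default_factory=list)

    @property
    def track_count(self):
        # Derived from the track list rather than stored separately.
        return len(self.tracks)


singer = Artist(name="Artist A", nationality="British")
track = Track(number=1, title="Track One",
              isrc=normalise_isrc("gb-xxx-14-00001"), participants=[singer])
album = Album(title="Album X", label="Label L", tracks=[track])
print(album.track_count, track.isrc)
```

Most fields are optional because, for real catalogues, details such as venues or dates are often unknown at the time a record is first created.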
How do we collect metadata?
 Online legal sources (and the magic of computer programming!)
 Research Team + Editorial Team. The Right People!
 Data Partnerships
How do we keep our metadata evolving?
 Keeping an eye on emerging music markets – data & content in local languages
 Being aware of the music ecosystem; connecting with the industry players
 Directly engaging with music industry professionals. Being social and sociable!
How to contact us?
http://www.decibel.net
@decibelnet
Evan Stein: evan.stein@decibel.net
Mark Wainwright: mark.wainwright@decibel.net
Georgiana Bogdan: georgiana.bogdan@decibel.net

The secret life of a dataphile - Keeping Tracks
