Audio fingerprinting and metadata correction with Python

           Alastair Porter


         November 21, 2011
Me

     Background in Computer Science
     Master's, McGill Music Tech
     Online
         http://github.com/alastair (20/28 music; 11 in Python)
         http://twitter.com/alastairporter
Python as a go-to language

     Quick for prototyping
     Use the same code in a production release
     Very handy for API access (thin wrapper around urllib2)
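
A minimal sketch of what a thin API wrapper can look like. The slide mentions urllib2, which is the Python 2 module; this sketch uses its Python 3 counterpart, urllib.request. The class name ApiClient, its methods, and the example.org URL are hypothetical and are not taken from albumidentify.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


class ApiClient:
    """Hypothetical thin wrapper around a JSON-over-HTTP web service."""

    def __init__(self, base_url):
        self.base_url = base_url.rstrip("/")

    def build_url(self, path, **params):
        # Sort the parameters so the same query always yields the same URL.
        query = urlencode(sorted(params.items()))
        url = "%s/%s" % (self.base_url, path.lstrip("/"))
        return url + "?" + query if query else url

    def get(self, path, **params):
        # Performs the network request and decodes the JSON response body.
        with urlopen(self.build_url(path, **params)) as response:
            return json.loads(response.read().decode("utf-8"))


# Building a URL needs no network access (example.org is a placeholder).
client = ApiClient("http://example.org/ws/")
print(client.build_url("lookup", artist="Some Artist", album="Some Album"))
```

Keeping URL construction separate from the request makes the wrapper easy to test without touching the network.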
Music and Metadata

  The problem:
      People are really bad at naming music
      Inconsistent over releases


  The solution:
      Crowdsourcing
      Get info from as many trusted sources as possible
      Make renaming take no effort
MusicBrainz
Amazon
Amazon (Coverart)
Last.fm
Last.fm (Genre tags)
MusicBrainz
albumidentify




  http://github.com/albumidentify/albumidentify
MP3, FLAC, Ogg, CDs
Identification strategy

      If there’s a CD TOC, use that (MusicBrainz lookup)
      If no match, use audio fingerprinting
      If no match, do a text lookup (artist/album)
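
The fallback order above can be sketched as a list of lookups tried in turn. This is an illustration only: the function names and the placeholder identifiers are hypothetical, and the stand-in lookups return fixed values where real ones would query MusicBrainz or a fingerprint service.

```python
def identify(album_dir, strategies):
    """Return (strategy_name, match) from the first strategy that finds a match."""
    for name, lookup in strategies:
        match = lookup(album_dir)
        if match is not None:
            return name, match
    return None, None


# Stand-in lookups with hard-coded results, for illustration only.
def toc_lookup(album_dir):
    return None  # pretend there is no CD TOC for this directory


def fingerprint_lookup(album_dir):
    return "release-123"  # placeholder identifier


def text_lookup(album_dir):
    return "release-456"  # placeholder identifier


STRATEGIES = [
    ("toc", toc_lookup),
    ("fingerprint", fingerprint_lookup),
    ("text", text_lookup),
]

print(identify("some/album/dir", STRATEGIES))
```

Because the strategies are plain data, changing their order or adding a new source means editing the list, not the control flow.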
Fingerprinting

     Converts an audio signal to a short sequence of numbers
     Smaller to compare than an entire file
     Perceptual features rather than byte comparison (works
     with different encodings)
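
A toy illustration of the idea, not the algorithm used by albumidentify or any real fingerprinter: each bit records whether the energy rose from one frame to the next. A quieter copy of the signal stands in for a different encoding; its sample values differ, but the fingerprint is the same. The function names and the frame size are assumptions made for this sketch, and real fingerprinters use much richer features.

```python
import math


def toy_fingerprint(samples, frame_size=256):
    """Return one bit per pair of adjacent frames: 1 if energy rose, else 0."""
    energies = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        energies.append(sum(s * s for s in frame))
    return [1 if b > a else 0 for a, b in zip(energies, energies[1:])]


def hamming_distance(fp_a, fp_b):
    """Count the positions at which two equal-length fingerprints differ."""
    return sum(x != y for x, y in zip(fp_a, fp_b))


# A tone whose loudness varies slowly, and a copy at half the volume.
signal = [
    (1.0 + 0.5 * math.sin(2 * math.pi * n / 1024)) * math.sin(2 * math.pi * n / 16)
    for n in range(4096)
]
quiet = [0.5 * s for s in signal]

print(signal == quiet)  # the raw samples differ
print(hamming_distance(toy_fingerprint(signal), toy_fingerprint(quiet)))
```

Comparing two short bit lists is far cheaper than comparing the files themselves, and a small Hamming distance can be treated as a probable match.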
Identification strategy

      Fingerprinting gives us a set of candidate tracks
      A track could be on many albums (original release, best of,
      mix album)
      Keep a list of what tracks we have for each album
      Once we fill all the slots for an album, success!
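
The slot-filling step can be sketched as follows, assuming each file's fingerprint lookup has already produced a list of (album, track number) candidates. The data structures, the function name, and the sample data are hypothetical, not albumidentify's own.

```python
from collections import defaultdict


def find_complete_albums(candidates, album_sizes):
    """Return {album_id: {track_number: filename}} for fully matched albums.

    candidates:  {filename: [(album_id, track_number), ...]}
    album_sizes: {album_id: number_of_tracks}
    """
    slots = defaultdict(dict)
    for filename, matches in sorted(candidates.items()):
        for album_id, track_number in matches:
            # Keep the first file seen for each slot.
            slots[album_id].setdefault(track_number, filename)
    return {
        album_id: tracks
        for album_id, tracks in slots.items()
        if len(tracks) == album_sizes[album_id]
    }


# a.mp3 appears on both the original release and a best-of compilation.
candidates = {
    "a.mp3": [("original", 1), ("best-of", 7)],
    "b.mp3": [("original", 2)],
}
album_sizes = {"original": 2, "best-of": 12}

print(find_complete_albums(candidates, album_sizes))
```

Only the original release has every slot filled, so the best-of compilation is discarded even though one of its tracks matched.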
Metadata strategy

     Text information from MusicBrainz
     Genre from Last.fm
     Image from Amazon (or folder.jpg)
     MusicBrainz tells us where these are (don’t need to search)
     Save in every file (Text is cheap)
Writing it all out

      Custom MP3/ID3 writer
      Ogg meta tags
      FLAC meta tags
      Name files
          Artist/Artist - Year - Album/01 - Artist - Track
      ReplayGain!
      Be a good citizen: Submit fingerprints to MusicBrainz
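
The naming pattern from this slide can be sketched as a small helper. The function name, the file extension argument, and the choice to replace slashes with underscores are assumptions made for this illustration; the slide specifies only the pattern itself.

```python
def target_path(artist, year, album, track_number, title, extension):
    """Build Artist/Artist - Year - Album/NN - Artist - Track.ext"""

    def clean(text):
        # A slash inside a name would otherwise create an extra directory.
        return text.replace("/", "_")

    artist, album, title = clean(artist), clean(album), clean(title)
    folder = "%s - %s - %s" % (artist, year, album)
    filename = "%02d - %s - %s.%s" % (track_number, artist, title, extension)
    return "/".join([artist, folder, filename])


print(target_path("Some Artist", 2011, "Some Album", 1, "First Track", "mp3"))
```

Zero-padding the track number keeps files in playback order when a directory listing is sorted by name.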
What’s next

     New version of MusicBrainz
     New fingerprinter
     More metadata
     More metadata
Thanks

  More information:
      MusicBrainz: http://musicbrainz.org
      albumidentify:
      http://github.com/albumidentify/albumidentify
      More fingerprinting: http://acoustid.org,
      http://echoprint.me
      Last.fm

Mp25: Audio Fingerprinting and metadata correction with Python
