SlideShare a Scribd company logo
1 of 18
PARKEET - VOICE
FINGERPRINTING &
RECOGNITION
Agenda
• About us!?
• Motivation
• Project approach ...
• Project outcome…
• Q&A
About us
• We are data-centric company with many years of
experience in different industries
• Telco
• Banking
• Finance
• Retail
• Manufacturing
• Distributions
• Transportation
• We are here to solve any data-related problems
and extract value from your data.
Organizational Structure
• As a company, we are
strategically focused
on knowledge.
• We have established
team-based matrix
organization in order
to provide flexibility
of teams according to
competencies needed
for particular
projects.
• It helps us to
accomplish optimal
results in quality,
within defined budgets
and deadlines.
Consulting Development
Research
* Note: Competencies of all team members are developed in a way that they can
actively be engaged also in other team that is not their primary competence
R&D
Consulting
Evolution of ASR, TTS and multispeaker ...
Glossary
• ASR - Automatic Speech Recognition
• TTS - Text-To-Speech
• WER - word error rate
• cpWER – concatenated minimum-permutation word error
rate
• DER - diarization error rate
• VAD - voice activity detection
Motivation
• Automatic speech recognition is a long-researched
problem, the goal of which is human turn speech into
words.
• In the technological sphere, this problem is better
known as „Speech-to-text„ and some of the applications
include conversational agents (Siri) and conversation
transcriptions.
• End-to-end models have become a popular approach as an
alternative to traditional hybrid models in automatic
speech recognition (ASR).
• The multi-speaker speech separation and recognition
task is a central task in cocktail party problem
Ground work ...
• ASR (automatic speech recognition) with 1
speaker and with clean data is a very well
solved problem (superhuman performance)
• In real use cases, we might have multiple
speakers, different accents, noise, etc.
• Diarization - “who spoke when” —> very hard
problem, unknown number of speakers in advance
Ground work ...
• Historically, separating speakers was done
modular - a separate network for automatic
speech recognition (speech-to-text) and for
speaker diarization
• Recently, end-to-end (one network for all
tasks) have started making an impact
• Big companies (Microsoft, Google) have models
that perform well, but they are not open-source
• One network for VAD (voice activity detection),
speaker diarization, ASR
Our approach
• (Relevant projects) Audio separation for music
instruments with custom built CV framework
merging SOTA architectures
• MobileNet
• ResUNet
Single microphone recording
issue
Version 0.1
• Wave2Vec 2.0 - Unsupervised pretraining, uses
unlabeled data, fine tuned for downstream tasks
• Open source!!!
Version 0.2 – enhanced
Wawe2Vec
• “Injecting” layers in different parts of
architecture - output VAD/diarization/ASR
• Initial layer(s) – VAD (almost binary task)
• Mid layer(s) – diarization (complex calculation)
• Final layer(s) - ASR/speech-to-text (high complexity)
• This is our baseline for all other work!
Whisper by OpenAI
• Transformer based model, uses log-mel
spectrograms
• Trained on 680000 hours of multilingual data
• Robust to accents and noise – Important
• Open source!!!
Version 0.6 – enhanced Whisper
• “Injecting” layers
• Initial layer(s) – VAD (almost binary task)
• Mid layer(s) – diarization (complex calculation)
• Final layer(s) - ASR/speech-to-text (high complexity)
• Since it is robust to noise, it achieves a good
model for real-world applications of multi-
speaker ASR (meetings have noise, people have
different accents)
• A big problem occurs when speakers overlap
Short DEMO
Model ASR and diarization output:
That’s all for today. Okay we have to fill in
all this stuff stuff m stuff. Meeting adjourned,
meeting edjourned, yeah, I think I’ve learned
not to bring play-dough to meetings. Yeah, I
think it would be a good idea, I like it.
Speaker 1 Speaker 2 Speaker 3 Speaker 4
Final conculsion…
www.atmc.ai
info@atmc.ai

More Related Content

Similar to [DSC Europe 22] Parakeet - a parrot that can repeat words very well - Tomislav Krizan

Baobab suite overview january 2013 v3 (slideshare)
Baobab suite overview january 2013 v3 (slideshare)Baobab suite overview january 2013 v3 (slideshare)
Baobab suite overview january 2013 v3 (slideshare)afrozaar
 
Design Systems at Scale
Design Systems at ScaleDesign Systems at Scale
Design Systems at ScaleSarah Federman
 
Jamaza Services
Jamaza ServicesJamaza Services
Jamaza ServicesJamaza
 
KazooCon 2014 - Deploying Kazoo Globally
KazooCon 2014 - Deploying Kazoo GloballyKazooCon 2014 - Deploying Kazoo Globally
KazooCon 2014 - Deploying Kazoo Globally2600Hz
 
Manatee to Dolphin: Transitioning to a Startup Mentality
Manatee to Dolphin: Transitioning to a Startup MentalityManatee to Dolphin: Transitioning to a Startup Mentality
Manatee to Dolphin: Transitioning to a Startup MentalityTodd Kaplinger
 
Solr Under the Hood at S&P Global- Sumit Vadhera, S&P Global
Solr Under the Hood at S&P Global- Sumit Vadhera, S&P Global Solr Under the Hood at S&P Global- Sumit Vadhera, S&P Global
Solr Under the Hood at S&P Global- Sumit Vadhera, S&P Global Lucidworks
 
BIL Corporate
BIL CorporateBIL Corporate
BIL Corporatebschandru
 
Choosing the right Technologies for your next unicorn.
Choosing the right Technologies for your next unicorn.Choosing the right Technologies for your next unicorn.
Choosing the right Technologies for your next unicorn.Gladson DSouza
 
WSO2 Intro Webinar - Simplifying Enterprise Integration with Configurable WS...
WSO2 Intro Webinar -  Simplifying Enterprise Integration with Configurable WS...WSO2 Intro Webinar -  Simplifying Enterprise Integration with Configurable WS...
WSO2 Intro Webinar - Simplifying Enterprise Integration with Configurable WS...WSO2
 
Software Architecture as Systems Dissolve
Software Architecture as Systems DissolveSoftware Architecture as Systems Dissolve
Software Architecture as Systems DissolveEoin Woods
 
Coding Secure Infrastructure in the Cloud using the PIE framework
Coding Secure Infrastructure in the Cloud using the PIE frameworkCoding Secure Infrastructure in the Cloud using the PIE framework
Coding Secure Infrastructure in the Cloud using the PIE frameworkJames Wickett
 
Embedded services by Faststream Technologies
Embedded services by Faststream TechnologiesEmbedded services by Faststream Technologies
Embedded services by Faststream TechnologiesHari Narayana
 

Similar to [DSC Europe 22] Parakeet - a parrot that can repeat words very well - Tomislav Krizan (20)

Baobab suite overview january 2013 v3 (slideshare)
Baobab suite overview january 2013 v3 (slideshare)Baobab suite overview january 2013 v3 (slideshare)
Baobab suite overview january 2013 v3 (slideshare)
 
resume
resumeresume
resume
 
resume
resumeresume
resume
 
Design Systems at Scale
Design Systems at ScaleDesign Systems at Scale
Design Systems at Scale
 
SharePoint Solutions
SharePoint SolutionsSharePoint Solutions
SharePoint Solutions
 
Jamaza Services
Jamaza ServicesJamaza Services
Jamaza Services
 
Euro IT Group
Euro IT GroupEuro IT Group
Euro IT Group
 
Malik M. Ashfaque - CV
Malik M. Ashfaque - CVMalik M. Ashfaque - CV
Malik M. Ashfaque - CV
 
Mycv Tb
Mycv TbMycv Tb
Mycv Tb
 
KazooCon 2014 - Deploying Kazoo Globally
KazooCon 2014 - Deploying Kazoo GloballyKazooCon 2014 - Deploying Kazoo Globally
KazooCon 2014 - Deploying Kazoo Globally
 
Manatee to Dolphin: Transitioning to a Startup Mentality
Manatee to Dolphin: Transitioning to a Startup MentalityManatee to Dolphin: Transitioning to a Startup Mentality
Manatee to Dolphin: Transitioning to a Startup Mentality
 
Solr Under the Hood at S&P Global- Sumit Vadhera, S&P Global
Solr Under the Hood at S&P Global- Sumit Vadhera, S&P Global Solr Under the Hood at S&P Global- Sumit Vadhera, S&P Global
Solr Under the Hood at S&P Global- Sumit Vadhera, S&P Global
 
BIL Corporate
BIL CorporateBIL Corporate
BIL Corporate
 
Choosing the right Technologies for your next unicorn.
Choosing the right Technologies for your next unicorn.Choosing the right Technologies for your next unicorn.
Choosing the right Technologies for your next unicorn.
 
WSO2 Intro Webinar - Simplifying Enterprise Integration with Configurable WS...
WSO2 Intro Webinar -  Simplifying Enterprise Integration with Configurable WS...WSO2 Intro Webinar -  Simplifying Enterprise Integration with Configurable WS...
WSO2 Intro Webinar - Simplifying Enterprise Integration with Configurable WS...
 
Software Architecture as Systems Dissolve
Software Architecture as Systems DissolveSoftware Architecture as Systems Dissolve
Software Architecture as Systems Dissolve
 
Coding Secure Infrastructure in the Cloud using the PIE framework
Coding Secure Infrastructure in the Cloud using the PIE frameworkCoding Secure Infrastructure in the Cloud using the PIE framework
Coding Secure Infrastructure in the Cloud using the PIE framework
 
Embedded services by Faststream Technologies
Embedded services by Faststream TechnologiesEmbedded services by Faststream Technologies
Embedded services by Faststream Technologies
 
Rhytha Brochure
Rhytha BrochureRhytha Brochure
Rhytha Brochure
 
Resume_Current
Resume_CurrentResume_Current
Resume_Current
 

More from DataScienceConferenc1

[DSC Europe 23] Luciano Catani - AI in Diplomacy.PDF
[DSC Europe 23] Luciano Catani - AI in Diplomacy.PDF[DSC Europe 23] Luciano Catani - AI in Diplomacy.PDF
[DSC Europe 23] Luciano Catani - AI in Diplomacy.PDFDataScienceConferenc1
 
[DSC Europe 23] Rania Wazir - Mathematician jokes, cute cat photos, offensiv...
[DSC Europe 23] Rania Wazir -  Mathematician jokes, cute cat photos, offensiv...[DSC Europe 23] Rania Wazir -  Mathematician jokes, cute cat photos, offensiv...
[DSC Europe 23] Rania Wazir - Mathematician jokes, cute cat photos, offensiv...DataScienceConferenc1
 
[DSC Europe 23] Irena Cerovic - AI in International Development.pdf
[DSC Europe 23] Irena Cerovic - AI in International Development.pdf[DSC Europe 23] Irena Cerovic - AI in International Development.pdf
[DSC Europe 23] Irena Cerovic - AI in International Development.pdfDataScienceConferenc1
 
[DSC Europe 23] Ilija Duni - How Foursquare Builds Meaningful Bridges Between...
[DSC Europe 23] Ilija Duni - How Foursquare Builds Meaningful Bridges Between...[DSC Europe 23] Ilija Duni - How Foursquare Builds Meaningful Bridges Between...
[DSC Europe 23] Ilija Duni - How Foursquare Builds Meaningful Bridges Between...DataScienceConferenc1
 
[DSC Europe 23] Branka Panic - Peace in the age of artificial intelligence.pptx
[DSC Europe 23] Branka Panic - Peace in the age of artificial intelligence.pptx[DSC Europe 23] Branka Panic - Peace in the age of artificial intelligence.pptx
[DSC Europe 23] Branka Panic - Peace in the age of artificial intelligence.pptxDataScienceConferenc1
 
[DSC Europe 23][DigiHealth] Goran Dumic - Data-Driven Approach In Treatments
[DSC Europe 23][DigiHealth]  Goran Dumic -  Data-Driven Approach In Treatments[DSC Europe 23][DigiHealth]  Goran Dumic -  Data-Driven Approach In Treatments
[DSC Europe 23][DigiHealth] Goran Dumic - Data-Driven Approach In TreatmentsDataScienceConferenc1
 
[DSC Europe 23][DigiHealth] Milos Todorovic - Bridging the Gap-Innovating Ag...
[DSC Europe 23][DigiHealth]  Milos Todorovic - Bridging the Gap-Innovating Ag...[DSC Europe 23][DigiHealth]  Milos Todorovic - Bridging the Gap-Innovating Ag...
[DSC Europe 23][DigiHealth] Milos Todorovic - Bridging the Gap-Innovating Ag...DataScienceConferenc1
 
[DSC Europe 23][DigiHealth] Urosh VIlimanovich Clinical Data Management and C...
[DSC Europe 23][DigiHealth] Urosh VIlimanovich Clinical Data Management and C...[DSC Europe 23][DigiHealth] Urosh VIlimanovich Clinical Data Management and C...
[DSC Europe 23][DigiHealth] Urosh VIlimanovich Clinical Data Management and C...DataScienceConferenc1
 
[DSC Europe 23][DigiHealth] Vladimir Brusic - SMART HEALTH HOME: Technology,...
[DSC Europe 23][DigiHealth]  Vladimir Brusic - SMART HEALTH HOME: Technology,...[DSC Europe 23][DigiHealth]  Vladimir Brusic - SMART HEALTH HOME: Technology,...
[DSC Europe 23][DigiHealth] Vladimir Brusic - SMART HEALTH HOME: Technology,...DataScienceConferenc1
 
[DSC Europe 23][DigiHealth] Dimitar Penkov Grid Search Optimization of Novel...
[DSC Europe 23][DigiHealth]  Dimitar Penkov Grid Search Optimization of Novel...[DSC Europe 23][DigiHealth]  Dimitar Penkov Grid Search Optimization of Novel...
[DSC Europe 23][DigiHealth] Dimitar Penkov Grid Search Optimization of Novel...DataScienceConferenc1
 
[DSC Europe 23][DigiHealth] Tomislav Krizan - AIMED
[DSC Europe 23][DigiHealth] Tomislav Krizan - AIMED[DSC Europe 23][DigiHealth] Tomislav Krizan - AIMED
[DSC Europe 23][DigiHealth] Tomislav Krizan - AIMEDDataScienceConferenc1
 
[DSC Europe 23][DigiHealth] Katarina Vucicevic - Navigating theKinetics of Dr...
[DSC Europe 23][DigiHealth] Katarina Vucicevic - Navigating theKinetics of Dr...[DSC Europe 23][DigiHealth] Katarina Vucicevic - Navigating theKinetics of Dr...
[DSC Europe 23][DigiHealth] Katarina Vucicevic - Navigating theKinetics of Dr...DataScienceConferenc1
 
[DSC Europe 23][DigiHealth] Anja Baresic 0- Croatian digital Healthcare ecosy...
[DSC Europe 23][DigiHealth] Anja Baresic 0- Croatian digital Healthcare ecosy...[DSC Europe 23][DigiHealth] Anja Baresic 0- Croatian digital Healthcare ecosy...
[DSC Europe 23][DigiHealth] Anja Baresic 0- Croatian digital Healthcare ecosy...DataScienceConferenc1
 
[DSC Europe 23][AI:CSI] Dragan Pleskonjic - AI Impact on Cybersecurity and P...
[DSC Europe 23][AI:CSI]  Dragan Pleskonjic - AI Impact on Cybersecurity and P...[DSC Europe 23][AI:CSI]  Dragan Pleskonjic - AI Impact on Cybersecurity and P...
[DSC Europe 23][AI:CSI] Dragan Pleskonjic - AI Impact on Cybersecurity and P...DataScienceConferenc1
 
[DSC Europe 23][AI:CSI] Uros Arsenijevic Unlocking Cybersecurity with Seif
[DSC Europe 23][AI:CSI] Uros Arsenijevic Unlocking Cybersecurity with Seif[DSC Europe 23][AI:CSI] Uros Arsenijevic Unlocking Cybersecurity with Seif
[DSC Europe 23][AI:CSI] Uros Arsenijevic Unlocking Cybersecurity with SeifDataScienceConferenc1
 
[DSC Europe 23][AI:CSI] Goran Gvozden Improving Cybersecurity Posture with an...
[DSC Europe 23][AI:CSI] Goran Gvozden Improving Cybersecurity Posture with an...[DSC Europe 23][AI:CSI] Goran Gvozden Improving Cybersecurity Posture with an...
[DSC Europe 23][AI:CSI] Goran Gvozden Improving Cybersecurity Posture with an...DataScienceConferenc1
 
[DSC Europe 23][AI:CSI] Aleksa Stojanovic - Applying AI for Threat Detection ...
[DSC Europe 23][AI:CSI] Aleksa Stojanovic - Applying AI for Threat Detection ...[DSC Europe 23][AI:CSI] Aleksa Stojanovic - Applying AI for Threat Detection ...
[DSC Europe 23][AI:CSI] Aleksa Stojanovic - Applying AI for Threat Detection ...DataScienceConferenc1
 
[DSC Europe 23][DigiHealth] Muthu Ramachandran AI and Blockchain Framework fo...
[DSC Europe 23][DigiHealth] Muthu Ramachandran AI and Blockchain Framework fo...[DSC Europe 23][DigiHealth] Muthu Ramachandran AI and Blockchain Framework fo...
[DSC Europe 23][DigiHealth] Muthu Ramachandran AI and Blockchain Framework fo...DataScienceConferenc1
 
[DSC Europe 23][DigiHealth] Ligia Kornowska-How_may AI help you
[DSC Europe 23][DigiHealth] Ligia Kornowska-How_may AI help you[DSC Europe 23][DigiHealth] Ligia Kornowska-How_may AI help you
[DSC Europe 23][DigiHealth] Ligia Kornowska-How_may AI help youDataScienceConferenc1
 
[DSC Europe 23][DigiHealth] Ilya Zakharov - NETWORK NEUROSCIENCE WHERE THE BR...
[DSC Europe 23][DigiHealth] Ilya Zakharov - NETWORK NEUROSCIENCE WHERE THE BR...[DSC Europe 23][DigiHealth] Ilya Zakharov - NETWORK NEUROSCIENCE WHERE THE BR...
[DSC Europe 23][DigiHealth] Ilya Zakharov - NETWORK NEUROSCIENCE WHERE THE BR...DataScienceConferenc1
 

More from DataScienceConferenc1 (20)

[DSC Europe 23] Luciano Catani - AI in Diplomacy.PDF
[DSC Europe 23] Luciano Catani - AI in Diplomacy.PDF[DSC Europe 23] Luciano Catani - AI in Diplomacy.PDF
[DSC Europe 23] Luciano Catani - AI in Diplomacy.PDF
 
[DSC Europe 23] Rania Wazir - Mathematician jokes, cute cat photos, offensiv...
[DSC Europe 23] Rania Wazir -  Mathematician jokes, cute cat photos, offensiv...[DSC Europe 23] Rania Wazir -  Mathematician jokes, cute cat photos, offensiv...
[DSC Europe 23] Rania Wazir - Mathematician jokes, cute cat photos, offensiv...
 
[DSC Europe 23] Irena Cerovic - AI in International Development.pdf
[DSC Europe 23] Irena Cerovic - AI in International Development.pdf[DSC Europe 23] Irena Cerovic - AI in International Development.pdf
[DSC Europe 23] Irena Cerovic - AI in International Development.pdf
 
[DSC Europe 23] Ilija Duni - How Foursquare Builds Meaningful Bridges Between...
[DSC Europe 23] Ilija Duni - How Foursquare Builds Meaningful Bridges Between...[DSC Europe 23] Ilija Duni - How Foursquare Builds Meaningful Bridges Between...
[DSC Europe 23] Ilija Duni - How Foursquare Builds Meaningful Bridges Between...
 
[DSC Europe 23] Branka Panic - Peace in the age of artificial intelligence.pptx
[DSC Europe 23] Branka Panic - Peace in the age of artificial intelligence.pptx[DSC Europe 23] Branka Panic - Peace in the age of artificial intelligence.pptx
[DSC Europe 23] Branka Panic - Peace in the age of artificial intelligence.pptx
 
[DSC Europe 23][DigiHealth] Goran Dumic - Data-Driven Approach In Treatments
[DSC Europe 23][DigiHealth]  Goran Dumic -  Data-Driven Approach In Treatments[DSC Europe 23][DigiHealth]  Goran Dumic -  Data-Driven Approach In Treatments
[DSC Europe 23][DigiHealth] Goran Dumic - Data-Driven Approach In Treatments
 
[DSC Europe 23][DigiHealth] Milos Todorovic - Bridging the Gap-Innovating Ag...
[DSC Europe 23][DigiHealth]  Milos Todorovic - Bridging the Gap-Innovating Ag...[DSC Europe 23][DigiHealth]  Milos Todorovic - Bridging the Gap-Innovating Ag...
[DSC Europe 23][DigiHealth] Milos Todorovic - Bridging the Gap-Innovating Ag...
 
[DSC Europe 23][DigiHealth] Urosh VIlimanovich Clinical Data Management and C...
[DSC Europe 23][DigiHealth] Urosh VIlimanovich Clinical Data Management and C...[DSC Europe 23][DigiHealth] Urosh VIlimanovich Clinical Data Management and C...
[DSC Europe 23][DigiHealth] Urosh VIlimanovich Clinical Data Management and C...
 
[DSC Europe 23][DigiHealth] Vladimir Brusic - SMART HEALTH HOME: Technology,...
[DSC Europe 23][DigiHealth]  Vladimir Brusic - SMART HEALTH HOME: Technology,...[DSC Europe 23][DigiHealth]  Vladimir Brusic - SMART HEALTH HOME: Technology,...
[DSC Europe 23][DigiHealth] Vladimir Brusic - SMART HEALTH HOME: Technology,...
 
[DSC Europe 23][DigiHealth] Dimitar Penkov Grid Search Optimization of Novel...
[DSC Europe 23][DigiHealth]  Dimitar Penkov Grid Search Optimization of Novel...[DSC Europe 23][DigiHealth]  Dimitar Penkov Grid Search Optimization of Novel...
[DSC Europe 23][DigiHealth] Dimitar Penkov Grid Search Optimization of Novel...
 
[DSC Europe 23][DigiHealth] Tomislav Krizan - AIMED
[DSC Europe 23][DigiHealth] Tomislav Krizan - AIMED[DSC Europe 23][DigiHealth] Tomislav Krizan - AIMED
[DSC Europe 23][DigiHealth] Tomislav Krizan - AIMED
 
[DSC Europe 23][DigiHealth] Katarina Vucicevic - Navigating theKinetics of Dr...
[DSC Europe 23][DigiHealth] Katarina Vucicevic - Navigating theKinetics of Dr...[DSC Europe 23][DigiHealth] Katarina Vucicevic - Navigating theKinetics of Dr...
[DSC Europe 23][DigiHealth] Katarina Vucicevic - Navigating theKinetics of Dr...
 
[DSC Europe 23][DigiHealth] Anja Baresic 0- Croatian digital Healthcare ecosy...
[DSC Europe 23][DigiHealth] Anja Baresic 0- Croatian digital Healthcare ecosy...[DSC Europe 23][DigiHealth] Anja Baresic 0- Croatian digital Healthcare ecosy...
[DSC Europe 23][DigiHealth] Anja Baresic 0- Croatian digital Healthcare ecosy...
 
[DSC Europe 23][AI:CSI] Dragan Pleskonjic - AI Impact on Cybersecurity and P...
[DSC Europe 23][AI:CSI]  Dragan Pleskonjic - AI Impact on Cybersecurity and P...[DSC Europe 23][AI:CSI]  Dragan Pleskonjic - AI Impact on Cybersecurity and P...
[DSC Europe 23][AI:CSI] Dragan Pleskonjic - AI Impact on Cybersecurity and P...
 
[DSC Europe 23][AI:CSI] Uros Arsenijevic Unlocking Cybersecurity with Seif
[DSC Europe 23][AI:CSI] Uros Arsenijevic Unlocking Cybersecurity with Seif[DSC Europe 23][AI:CSI] Uros Arsenijevic Unlocking Cybersecurity with Seif
[DSC Europe 23][AI:CSI] Uros Arsenijevic Unlocking Cybersecurity with Seif
 
[DSC Europe 23][AI:CSI] Goran Gvozden Improving Cybersecurity Posture with an...
[DSC Europe 23][AI:CSI] Goran Gvozden Improving Cybersecurity Posture with an...[DSC Europe 23][AI:CSI] Goran Gvozden Improving Cybersecurity Posture with an...
[DSC Europe 23][AI:CSI] Goran Gvozden Improving Cybersecurity Posture with an...
 
[DSC Europe 23][AI:CSI] Aleksa Stojanovic - Applying AI for Threat Detection ...
[DSC Europe 23][AI:CSI] Aleksa Stojanovic - Applying AI for Threat Detection ...[DSC Europe 23][AI:CSI] Aleksa Stojanovic - Applying AI for Threat Detection ...
[DSC Europe 23][AI:CSI] Aleksa Stojanovic - Applying AI for Threat Detection ...
 
[DSC Europe 23][DigiHealth] Muthu Ramachandran AI and Blockchain Framework fo...
[DSC Europe 23][DigiHealth] Muthu Ramachandran AI and Blockchain Framework fo...[DSC Europe 23][DigiHealth] Muthu Ramachandran AI and Blockchain Framework fo...
[DSC Europe 23][DigiHealth] Muthu Ramachandran AI and Blockchain Framework fo...
 
[DSC Europe 23][DigiHealth] Ligia Kornowska-How_may AI help you
[DSC Europe 23][DigiHealth] Ligia Kornowska-How_may AI help you[DSC Europe 23][DigiHealth] Ligia Kornowska-How_may AI help you
[DSC Europe 23][DigiHealth] Ligia Kornowska-How_may AI help you
 
[DSC Europe 23][DigiHealth] Ilya Zakharov - NETWORK NEUROSCIENCE WHERE THE BR...
[DSC Europe 23][DigiHealth] Ilya Zakharov - NETWORK NEUROSCIENCE WHERE THE BR...[DSC Europe 23][DigiHealth] Ilya Zakharov - NETWORK NEUROSCIENCE WHERE THE BR...
[DSC Europe 23][DigiHealth] Ilya Zakharov - NETWORK NEUROSCIENCE WHERE THE BR...
 

Recently uploaded

原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档208367051
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxMike Bennett
 
Machine learning classification ppt.ppt
Machine learning classification  ppt.pptMachine learning classification  ppt.ppt
Machine learning classification ppt.pptamreenkhanum0307
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queensdataanalyticsqueen03
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degreeyuu sss
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...Florian Roscheck
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryJeremy Anderson
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort servicejennyeacort
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Seán Kennedy
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...dajasot375
 
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSINGmarianagonzalez07
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 217djon017
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhijennyeacort
 
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...ssuserf63bd7
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理e4aez8ss
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改yuu sss
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFAAndrei Kaleshka
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]📊 Markus Baersch
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Seán Kennedy
 

Recently uploaded (20)

原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptx
 
Machine learning classification ppt.ppt
Machine learning classification  ppt.pptMachine learning classification  ppt.ppt
Machine learning classification ppt.ppt
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queens
 
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data Story
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
 
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
 
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...
 

[DSC Europe 22] Parakeet - a parrot that can repeat words very well - Tomislav Krizan

  • 2. Agenda • About us!? • Motivation • Project approach ... • Project outcome… • Q&A
  • 3. About us • We are data-centric company with many years of experience in different industries • Telco • Banking • Finance • Retail • Manufacturing • Distributions • Transportation • We are here to solve any data-related problems and extract value from your data.
  • 4. Organizational Structure • As a company, we are strategically focused on knowledge. • We have established team-based matrix organization in order to provide flexibility of teams according to competencies needed for particular projects. • It helps us to accomplish optimal results in quality, within defined budgets and deadlines. Consulting Development Research * Note: Competencies of all team members are developed in a way that they can actively be engaged also in other team that is not their primary competence R&D Consulting
  • 5. Evolution of ASR, TTS and multispeaker ...
  • 6. Glossary • ASR - Automatic Speech Recognition • TTS - Text-To-Speech • WER - word error rate • cpWER – concatenated minimum-permutation word error rate • DER - diarization error rate • VAD - voice activity detection
  • 7. Motivation • Automatic speech recognition is a long-researched problem, the goal of which is human turn speech into words. • In the technological sphere, this problem is better known as „Speech-to-text„ and some of the applications include conversational agents (Siri) and conversation transcriptions. • End-to-end models have become a popular approach as an alternative to traditional hybrid models in automatic speech recognition (ASR). • The multi-speaker speech separation and recognition task is a central task in cocktail party problem
  • 8. Ground work ... • ASR (automatic speech recognition) with 1 speaker and with clean data is a very well solved problem (superhuman performance) • In real use cases, we might have multiple speakers, different accents, noise, etc. • Diarization - “who spoke when” —> very hard problem, unknown number of speakers in advance
  • 9. Ground work ... • Historically, separating speakers was done modular - a separate network for automatic speech recognition (speech-to-text) and for speaker diarization • Recently, end-to-end (one network for all tasks) have started making an impact • Big companies (Microsoft, Google) have models that perform well, but they are not open-source • One network for VAD (voice activity detection), speaker diarization, ASR
  • 10. Our approach • (Relevant projects) Audio separation for music instruments with custom built CV framework merging SOTA architectures • MobileNet • ResUNet
  • 12. Version 0.1 • Wave2Vec 2.0 - Unsupervised pretraining, uses unlabeled data, fine tuned for downstream tasks • Open source!!!
  • 13. Version 0.2 – enhanced Wawe2Vec • “Injecting” layers in different parts of architecture - output VAD/diarization/ASR • Initial layer(s) – VAD (almost binary task) • Mid layer(s) – diarization (complex calculation) • Final layer(s) - ASR/speech-to-text (high complexity) • This is our baseline for all other work!
  • 14. Whisper by OpenAI • Transformer based model, uses log-mel spectrograms • Trained on 680000 hours of multilingual data • Robust to accents and noise – Important • Open source!!!
  • 15. Version 0.6 – enhanced Whisper • “Injecting” layers • Initial layer(s) – VAD (almost binary task) • Mid layer(s) – diarization (complex calculation) • Final layer(s) - ASR/speech-to-text (high complexity) • Since it is robust to noise, it achieves a good model for real-world applications of multi- speaker ASR (meetings have noise, people have different accents) • A big problem occurs when speakers overlap
  • 16. Short DEMO Model ASR and diarization output: That’s all for today. Okay we have to fill in all this stuff stuff m stuff. Meeting adjourned, meeting edjourned, yeah, I think I’ve learned not to bring play-dough to meetings. Yeah, I think it would be a good idea, I like it. Speaker 1 Speaker 2 Speaker 3 Speaker 4