SPEECH RECOGNITION 
Made by: Fathi Tarek
Email: ftarek@fcih1.com
History of speech recognition:
• 1950s and 1960s: Baby Talk
• The first speech recognition systems could understand only digits.
(Given the complexity of human language, it makes sense that
inventors and engineers first focused on numbers.) In 1952, Bell
Laboratories designed the "Audrey" system, which recognized digits
spoken by a single voice. Ten years later, at the 1962 World's Fair,
IBM demonstrated its "Shoebox" machine, which could understand 16
words spoken in English.
• Labs in the United States, Japan, England, and the Soviet Union
developed other hardware dedicated to recognizing spoken
sounds, expanding speech recognition technology to support four
vowels and nine consonants.
• These may not sound like much, but these first efforts were an
impressive start, especially when you consider how primitive
computers themselves were at the time.
1970S: SPEECH RECOGNITION 
TAKES OFF 
•Speech recognition technology made major strides in the 1970s, thanks to 
interest and funding from the U.S. Department of Defense. The DoD's DARPA 
Speech Understanding Research (SUR) program, from 1971 to 1976, was one of 
the largest of its kind in the history of speech recognition, and among other 
things it was responsible for Carnegie Mellon's "Harpy" speech-understanding 
system. 
•Harpy could understand 1011 words, approximately the vocabulary of an
average three-year-old. Harpy was significant because it introduced a more
efficient search approach, called beam search, to "prove the finite-state network
of possible sentences," according to Readings in Speech Recognition by Alex
Waibel and Kai-Fu Lee. (The story of speech recognition is very much tied to
advances in search methodology and technology, as Google's entrance into
speech recognition on mobile devices proved just a few years ago.)
•The '70s also marked a few other important milestones in speech recognition 
technology, including the founding of the first commercial speech recognition 
company, Threshold Technology, as well as Bell Laboratories' introduction of a 
system that could interpret multiple people's voices.
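As a rough illustration of the beam search idea mentioned above, the sketch below walks a tiny finite-state network of possible sentences and keeps only the best few partial paths at each step. The network, its probabilities, and the names `NETWORK`, `beam_search`, and `beam_width` are hypothetical, chosen only for illustration; this is not Harpy's actual implementation.

```python
import math

# Hypothetical finite-state network: state -> list of (word, next_state, probability).
NETWORK = {
    "start": [("call", "verb", 0.6), ("tall", "verb", 0.4)],
    "verb": [("home", "end", 0.7), ("hone", "end", 0.3)],
    "end": [],
}


def beam_search(network, start, final, beam_width=2):
    """Return the best (words, log_prob) path from start to final.

    Only the `beam_width` highest-scoring partial paths survive each step,
    so the search never expands the full set of possible sentences.
    Assumes the network has no cycles.
    """
    beam = [([], start, 0.0)]  # (words so far, current state, log-probability)
    finished = []
    while beam:
        candidates = []
        for words, state, score in beam:
            if state == final:
                finished.append((words, score))
                continue
            for word, nxt, prob in network[state]:
                candidates.append((words + [word], nxt, score + math.log(prob)))
        # Prune: keep only the most probable partial paths.
        candidates.sort(key=lambda c: c[2], reverse=True)
        beam = candidates[:beam_width]
    return max(finished, key=lambda f: f[1]) if finished else None


best_words, best_score = beam_search(NETWORK, "start", "end", beam_width=2)
print(" ".join(best_words), round(math.exp(best_score), 2))
```

A smaller `beam_width` makes the search faster but risks pruning the path that would have turned out best; a larger one approaches an exhaustive search.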
1980S: SPEECH RECOGNITION 
TURNS TOWARD PREDICTION 
Over the next decade, thanks to new approaches to understanding what
people say, speech recognition vocabulary jumped from a few hundred
words to several thousand words, and had the potential to recognize an
unlimited number of words. One major reason was a new statistical method
known as the hidden Markov model (HMM).
Rather than simply using templates for words and looking for sound patterns, 
HMM considered the probability of unknown sounds' being words. This 
foundation would be in place for the next two decades (see Automatic Speech 
Recognition—A Brief History of the Technology Development by B.H. Juang 
and Lawrence R. Rabiner). 
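To make the probabilistic idea concrete, here is a minimal sketch of the Viterbi algorithm, a standard way to find the most likely hidden sequence in an HMM. The two phone states, the "hiss" and "buzz" acoustic labels, and every probability are made-up values for illustration, not figures from any real recognizer.

```python
# Toy HMM: hidden states are phones, observations are coarse acoustic labels.
# All probabilities below are made-up values for illustration only.
STATES = ["s", "z"]
START_P = {"s": 0.5, "z": 0.5}
TRANS_P = {"s": {"s": 0.7, "z": 0.3}, "z": {"s": 0.4, "z": 0.6}}
EMIT_P = {
    "s": {"hiss": 0.8, "buzz": 0.2},
    "z": {"hiss": 0.3, "buzz": 0.7},
}


def viterbi(observations):
    """Return (probability, path) of the most likely hidden-state sequence."""
    # best[state] = (probability of the best path ending in state, that path)
    best = {s: (START_P[s] * EMIT_P[s][observations[0]], [s]) for s in STATES}
    for obs in observations[1:]:
        step = {}
        for cur in STATES:
            # Pick the predecessor that gives the most probable path into `cur`.
            prob, path = max(
                (best[prev][0] * TRANS_P[prev][cur], best[prev][1])
                for prev in STATES
            )
            step[cur] = (prob * EMIT_P[cur][obs], path + [cur])
        best = step
    return max(best.values())


probability, path = viterbi(["hiss", "hiss", "buzz"])
print(path, round(probability, 4))
```

Unlike template matching, nothing here requires an exact match: every candidate sequence gets a probability, and the recognizer simply reports the most probable one.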
Equipped with this expanded vocabulary, speech recognition started to work
its way into commercial applications for business and specialized industry (for
instance, medical use). It even entered the home, in the form of Worlds of
Wonder's Julie doll (1987), which children could train to respond to their voice.
("Finally, the doll that understands you.")
In 1990, Dragon launched the first consumer speech recognition
product, Dragon Dictate, for an incredible price of $9000. Seven years
later, the much-improved Dragon NaturallySpeaking arrived. The
application recognized continuous speech, so you could speak, well,
naturally, at about 100 words per minute. However, you had to train the
program for 45 minutes, and it was still expensive at $695.
The advent of the first voice portal, VAL from BellSouth, was in
1996; VAL was a dial-in interactive voice recognition system that
was supposed to give you information based on what you said on the
phone. VAL paved the way for all the inaccurate voice-activated menus
that would plague callers for the next 15 years and beyond.
2000s: Speech Recognition Plateaus, Until Google Comes Along
By 2001, computer speech recognition had topped out at 80 percent accuracy,
and, near the end of the decade, the technology’s progress seemed to
be stalled. Recognition systems did well when the language universe
was limited, but they were still “guessing,” with the assistance of
statistical models, among similar-sounding words, and the known
language universe continued to grow as the Internet grew.
Did you know speech recognition and voice commands were built
into Windows Vista and Mac OS X? Many computer users weren’t aware
that those features existed. Windows Speech Recognition and OS X’s
voice commands were interesting, but not as accurate or as easy to use
as a plain old keyboard and mouse.
In 2010, Google added “personalized recognition” to Voice Search
on Android phones, so that the software could record users’ voice searches
and produce a more accurate speech model. The company also added
Voice Search to its Chrome browser in mid-2011. Remember how we
started with 10 to 100 words, and then graduated to a few thousand?
Google’s English Voice Search system now incorporates 230 billion words
from actual user queries.
And now along comes Siri. Like Google’s Voice Search, Siri relies on cloud-based
processing. It draws on what it knows about you to generate
a contextual reply, and it responds to your voice input with personality. (As
my PCWorld colleague David Daw points out: “It’s not just fun but funny.
When you ask Siri the meaning of life, it tells you ‘42’ or ‘All evidence to
date points to chocolate.’ If you tell it you want to hide a body, it helpfully
volunteers nearby dumps and metal foundries.”)
Speech recognition has gone from utility to entertainment. The
child seems all grown up.
THE FUTURE 
Accurate, Ubiquitous Speech 
The explosion of voice recognition apps indicates that
speech recognition’s time has come, and that you can expect plenty
more apps in the future. These apps will not only let you control your PC
by voice or convert voice to text; they’ll also support multiple languages,
offer assorted speaker voices for you to choose from, and integrate into
every part of your mobile devices (that is, they’ll overcome Siri’s
shortcomings).
The quality of speech recognition apps will improve, too. For
instance, Sensory’s TrulyHandsfree Voice Control can hear and
understand you, even in noisy environments.
WHAT IS SPEECH RECOGNITION?
Speech recognition is the ability of a machine or program to identify 
words and phrases in spoken language and convert them to a machine-readable 
format. 
Another definition 
Speech recognition is an alternative to typing on a keyboard. Put
simply, you talk to the computer or mobile device and your words appear on the
screen. The software has been developed to provide a fast method of
writing on a computer and can help people with a variety of disabilities.
It is useful for people with physical disabilities who often find typing 
difficult, painful or impossible. Voice-recognition software can also help 
those with spelling difficulties, including users with dyslexia, because 
recognized words are almost always correctly spelled.
However, speech is more than a sequence of phones that form words and
sentences. Other aspects of speech also carry information: for example,
the prosody of speech indicates grammatical structure, and the stress
on a word signals its importance or topicality. This information is
sometimes called the paralinguistic content of speech.
• Advantages
  • Speech is a very natural way to interact, and it is
  not necessary to sit at a keyboard or work with a
  remote control.
  • No training required for users!
• Disadvantages
  • Even the best speech recognition systems
  sometimes make errors. If there is noise or some
  other sound in the room (e.g. the television or a
  kettle boiling), the number of errors will increase.
  • Speech recognition works best if the
  microphone is close to the user (e.g. in a phone,
  or if the user is wearing a microphone). More
  distant microphones (e.g. on a table or wall) will
  tend to increase the number of errors.
Voice recognition software 
• Voice-recognition software programs work by analyzing
sounds and converting them to text. They also use
knowledge of how English is usually spoken to decide
what the speaker most probably said. Once correctly set
up, the systems should recognize around 95% of what is
said if you speak clearly.
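A figure such as 95% is a word-level accuracy. One common way to measure it is the word error rate (WER): the word-level edit distance between what was actually said and what the recognizer produced, divided by the number of words actually said, with accuracy often quoted as roughly 1 minus WER. The sketch below is a minimal illustration; the function name `word_error_rate` and the example sentences are hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = cur[j - 1] + 1
            cur.append(min(substitution, deletion, insertion))
        prev = cur
    return prev[-1] / len(ref)


wer = word_error_rate(
    "please recognize speech with this program",
    "please wreck a nice beach with this program",
)
print(f"WER: {wer:.2f}")
```

Because inserted words count as errors, WER can exceed 1.0 on very poor output, which is one reason it is usually reported alongside the test conditions (speaker, microphone, background noise).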
Voice recognition in 
operating systems 
• Mobile Devices / Smartphones
Many cell phone handsets have basic dial-by-voice
features built in. Smartphones such as the
iPhone or BlackBerry also support this. A
number of third-party apps have implemented
natural language speech recognition support.
• Smartphones and mobile devices are in the middle
of major innovations in technology to provide
hands-free access to features and navigation, often
called voice commands, voice-enabled features, voice
actions or speech recognition. This technology has
major implications for people who have
disabilities, who can use it as assistive technology. As long as a user
has a strong, clear voice, these devices become
easier to use and give increased access to the
Internet, to mobile devices and to accessible
communication.
• Windows 7 built-in speech recognition
  • Windows Speech Recognition by Microsoft is the speech recognition
  system that comes built into Windows Vista and Windows 7. Windows
  Vista and Windows 7 include version 8.0 of the Microsoft speech recognition
  engine. Speech Recognition is available only in English, French, Spanish,
  German, Japanese, Simplified Chinese, and Traditional Chinese and only in
  the corresponding version of Windows. That means that you cannot use the
  French speech recognition engine if you use an English version of Windows.
• Windows XP or 2000 only
  • e-Speaking – software for Windows XP that facilitates use of
  the Microsoft Speech API by adding the ability to create commands to perform
  custom actions.
  • Microsoft Speech API – Speech recognition functionality included as part of
  Microsoft Office and on Tablet PCs running Microsoft Windows XP Tablet PC
  Edition. It can also be downloaded as part of the Speech SDK 5.1 for
  Windows applications, but since that is aimed at developers building speech
  applications, the pure SDK form lacks any user interface, and thus is
  unsuitable for end users.
  • Vestec Inc. – Specializing in Natural Language Understanding and Speech
  Recognition solutions. ASR, NLU and TTS engines support 17 languages in
  server, embedded (on low-cost chip) or cloud-based environments.
Macintosh
Types of speech recognition 
1. Text-To-Speech: 
As it sounds, Text-To-Speech (or TTS) will
convert a string of text into an audio clip.
It is useful for blind people, helping them use
computers, but can also be used simply to
improve the computing experience. There are
several programs available that perform TTS,
some of which are command-line based
(ideal for scripting) and others which provide
a handy GUI.
2. Simple Voice Control/Commands: 
This is the most basic form of Speech-To-Text
application. These are designed to recognize
a small number of specific, typically one-word
commands and then perform an action. This
is often used as an alternative to an
application launcher, allowing the user, for
instance, to say the word “firefox” and have
the OS open a new browser window.
3. Full dictation/recognition:
Full dictation/recognition software allows the
user to speak full sentences or paragraphs and
translates that speech into text on the fly. This
could be used, for instance, to dictate an
entire letter into the window of an email
client. In some cases, these types of
applications need to be trained to your voice
and can improve in accuracy the more they
are used.
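The simple voice control described in item 2 can be sketched as a lookup from recognized text to an action. The sketch assumes a recognizer has already turned the audio into a text transcript; the `COMMANDS` table and the `dispatch` function are hypothetical names, and each action only returns a description rather than launching anything.

```python
# Hypothetical command table: one-word command -> action to perform.
# Each action just returns a description so the sketch is safe to run;
# a real launcher might start a program here instead.
COMMANDS = {
    "firefox": lambda: "open a new browser window",
    "mail": lambda: "open the email client",
}


def dispatch(transcript):
    """Run the action for the first known command word in the transcript."""
    for word in transcript.lower().split():
        action = COMMANDS.get(word.strip(".,!?"))
        if action is not None:
            return action()
    return None  # no known command was heard


print(dispatch("Firefox"))
print(dispatch("hello there"))
```

Keeping the vocabulary this small is what makes command recognition easier than full dictation: the recognizer only has to tell a handful of words apart.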
Thank you

Más contenido relacionado

La actualidad más candente

Speech Recognition Technology
Speech Recognition TechnologySpeech Recognition Technology
Speech Recognition TechnologySrijanKumar18
 
Speech recognition
Speech recognitionSpeech recognition
Speech recognitionCharu Joshi
 
Speech Recognition
Speech RecognitionSpeech Recognition
Speech RecognitionHugo Moreno
 
Speech recognition system seminar
Speech recognition system seminarSpeech recognition system seminar
Speech recognition system seminarDiptimaya Sarangi
 
Speech recognition final presentation
Speech recognition final presentationSpeech recognition final presentation
Speech recognition final presentationhimanshubhatti
 
Speech Recognition
Speech RecognitionSpeech Recognition
Speech RecognitionAhmed Moawad
 
Speech Recognition Technology
Speech Recognition TechnologySpeech Recognition Technology
Speech Recognition TechnologyAamir-sheriff
 
Speech to text conversion
Speech to text conversionSpeech to text conversion
Speech to text conversionankit_saluja
 
Speech Recognition
Speech Recognition Speech Recognition
Speech Recognition Goa App
 
Automatic speech recognition
Automatic speech recognitionAutomatic speech recognition
Automatic speech recognitionRichie
 
Speech Recognition System By Matlab
Speech Recognition System By MatlabSpeech Recognition System By Matlab
Speech Recognition System By MatlabAnkit Gujrati
 
A seminar report on speech recognition technology
A seminar report on speech recognition technologyA seminar report on speech recognition technology
A seminar report on speech recognition technologySrijanKumar18
 
Voice To Text Presentation
Voice To Text PresentationVoice To Text Presentation
Voice To Text Presentationshahinmehr
 
Speech recognition techniques
Speech recognition techniquesSpeech recognition techniques
Speech recognition techniquessonukumar142
 
Artificial Intelligence for Speech Recognition
Artificial Intelligence for Speech RecognitionArtificial Intelligence for Speech Recognition
Artificial Intelligence for Speech RecognitionRHIMRJ Journal
 
speech processing and recognition basic in data mining
speech processing and recognition basic in  data miningspeech processing and recognition basic in  data mining
speech processing and recognition basic in data miningJimit Rupani
 
Abstract of speech recognition
Abstract of speech recognitionAbstract of speech recognition
Abstract of speech recognitionVinay Jaisriram
 

La actualidad más candente (20)

Speech Recognition Technology
Speech Recognition TechnologySpeech Recognition Technology
Speech Recognition Technology
 
Automatic Speech Recognition
Automatic Speech RecognitionAutomatic Speech Recognition
Automatic Speech Recognition
 
Speech recognition
Speech recognitionSpeech recognition
Speech recognition
 
Speech Recognition
Speech RecognitionSpeech Recognition
Speech Recognition
 
Speech recognition system seminar
Speech recognition system seminarSpeech recognition system seminar
Speech recognition system seminar
 
Speech recognition final presentation
Speech recognition final presentationSpeech recognition final presentation
Speech recognition final presentation
 
Speech Recognition
Speech RecognitionSpeech Recognition
Speech Recognition
 
Speech Recognition System
Speech Recognition SystemSpeech Recognition System
Speech Recognition System
 
Speech Recognition Technology
Speech Recognition TechnologySpeech Recognition Technology
Speech Recognition Technology
 
Speech to text conversion
Speech to text conversionSpeech to text conversion
Speech to text conversion
 
Speech Recognition
Speech Recognition Speech Recognition
Speech Recognition
 
Voice recognition
Voice recognitionVoice recognition
Voice recognition
 
Automatic speech recognition
Automatic speech recognitionAutomatic speech recognition
Automatic speech recognition
 
Speech Recognition System By Matlab
Speech Recognition System By MatlabSpeech Recognition System By Matlab
Speech Recognition System By Matlab
 
A seminar report on speech recognition technology
A seminar report on speech recognition technologyA seminar report on speech recognition technology
A seminar report on speech recognition technology
 
Voice To Text Presentation
Voice To Text PresentationVoice To Text Presentation
Voice To Text Presentation
 
Speech recognition techniques
Speech recognition techniquesSpeech recognition techniques
Speech recognition techniques
 
Artificial Intelligence for Speech Recognition
Artificial Intelligence for Speech RecognitionArtificial Intelligence for Speech Recognition
Artificial Intelligence for Speech Recognition
 
speech processing and recognition basic in data mining
speech processing and recognition basic in  data miningspeech processing and recognition basic in  data mining
speech processing and recognition basic in data mining
 
Abstract of speech recognition
Abstract of speech recognitionAbstract of speech recognition
Abstract of speech recognition
 

Destacado

Impasse in a detention unit
Impasse in a detention unitImpasse in a detention unit
Impasse in a detention unitjoannakato
 
Узагальнення і систематизація вивченого про прикметник
Узагальнення і систематизація вивченого про прикметникУзагальнення і систематизація вивченого про прикметник
Узагальнення і систематизація вивченого про прикметникIrinaochakov
 
Huong dan thiet ke mo phong
Huong dan thiet ke mo phongHuong dan thiet ke mo phong
Huong dan thiet ke mo phongMai Thanh
 
High impact leadership for emerging leaders
High impact leadership for emerging leadersHigh impact leadership for emerging leaders
High impact leadership for emerging leadersCraig Bihari
 
喜多福蠔油
喜多福蠔油喜多福蠔油
喜多福蠔油彥欣 李
 
Joints and cartilages
Joints and cartilagesJoints and cartilages
Joints and cartilagesalshahbaa
 
Design Portfolio
Design PortfolioDesign Portfolio
Design PortfolioCindy Van
 
FINAL SHEETS SEMINAR 21 MAART CHINA OUDEREN
FINAL SHEETS SEMINAR 21 MAART CHINA OUDERENFINAL SHEETS SEMINAR 21 MAART CHINA OUDEREN
FINAL SHEETS SEMINAR 21 MAART CHINA OUDERENArnaud Veere
 
Инна Иванова "О проектных людях и коллаборации в онлайн среде"
Инна Иванова "О проектных людях и коллаборации в онлайн среде"Инна Иванова "О проектных людях и коллаборации в онлайн среде"
Инна Иванова "О проектных людях и коллаборации в онлайн среде"Варвара Разумовская
 
Steven Glick Resume
Steven Glick ResumeSteven Glick Resume
Steven Glick ResumeSteve Glick
 
Session iv(master pages)
Session iv(master pages)Session iv(master pages)
Session iv(master pages)Shrijan Tiwari
 
Morales SBA deck 2-10-15
Morales SBA deck 2-10-15Morales SBA deck 2-10-15
Morales SBA deck 2-10-15Mark Morales
 
Moving Special Collections Out of the Basement - Promotion & Outreach at Univ...
Moving Special Collections Out of the Basement - Promotion & Outreach at Univ...Moving Special Collections Out of the Basement - Promotion & Outreach at Univ...
Moving Special Collections Out of the Basement - Promotion & Outreach at Univ...Glucksman Library, University of Limerick
 

Destacado (20)

вирус бронзавости парадајза
вирус бронзавости парадајзавирус бронзавости парадајза
вирус бронзавости парадајза
 
Impasse in a detention unit
Impasse in a detention unitImpasse in a detention unit
Impasse in a detention unit
 
Annisaa noviyanti
Annisaa noviyantiAnnisaa noviyanti
Annisaa noviyanti
 
Узагальнення і систематизація вивченого про прикметник
Узагальнення і систематизація вивченого про прикметникУзагальнення і систематизація вивченого про прикметник
Узагальнення і систематизація вивченого про прикметник
 
Huong dan thiet ke mo phong
Huong dan thiet ke mo phongHuong dan thiet ke mo phong
Huong dan thiet ke mo phong
 
High impact leadership for emerging leaders
High impact leadership for emerging leadersHigh impact leadership for emerging leaders
High impact leadership for emerging leaders
 
喜多福蠔油
喜多福蠔油喜多福蠔油
喜多福蠔油
 
looseleaf_Portfolio
looseleaf_Portfoliolooseleaf_Portfolio
looseleaf_Portfolio
 
Joints and cartilages
Joints and cartilagesJoints and cartilages
Joints and cartilages
 
Calendário de eventos e treinamentos 2016 segundo SEMESTRE
Calendário de eventos e treinamentos 2016 segundo SEMESTRECalendário de eventos e treinamentos 2016 segundo SEMESTRE
Calendário de eventos e treinamentos 2016 segundo SEMESTRE
 
Präsentationsmodus
PräsentationsmodusPräsentationsmodus
Präsentationsmodus
 
Design Portfolio
Design PortfolioDesign Portfolio
Design Portfolio
 
FINAL SHEETS SEMINAR 21 MAART CHINA OUDEREN
FINAL SHEETS SEMINAR 21 MAART CHINA OUDERENFINAL SHEETS SEMINAR 21 MAART CHINA OUDEREN
FINAL SHEETS SEMINAR 21 MAART CHINA OUDEREN
 
Инна Иванова "О проектных людях и коллаборации в онлайн среде"
Инна Иванова "О проектных людях и коллаборации в онлайн среде"Инна Иванова "О проектных людях и коллаборации в онлайн среде"
Инна Иванова "О проектных людях и коллаборации в онлайн среде"
 
My app
My appMy app
My app
 
IGF Norway 2014 12-09
IGF Norway 2014 12-09IGF Norway 2014 12-09
IGF Norway 2014 12-09
 
Steven Glick Resume
Steven Glick ResumeSteven Glick Resume
Steven Glick Resume
 
Session iv(master pages)
Session iv(master pages)Session iv(master pages)
Session iv(master pages)
 
Morales SBA deck 2-10-15
Morales SBA deck 2-10-15Morales SBA deck 2-10-15
Morales SBA deck 2-10-15
 
Moving Special Collections Out of the Basement - Promotion & Outreach at Univ...
Moving Special Collections Out of the Basement - Promotion & Outreach at Univ...Moving Special Collections Out of the Basement - Promotion & Outreach at Univ...
Moving Special Collections Out of the Basement - Promotion & Outreach at Univ...
 

Similar a Speech Recognition

Speech Recognition
Speech RecognitionSpeech Recognition
Speech Recognitionshanle03
 
Procedia Computer Science 94 ( 2016 ) 295 – 301 Avail.docx
 Procedia Computer Science   94  ( 2016 )  295 – 301 Avail.docx Procedia Computer Science   94  ( 2016 )  295 – 301 Avail.docx
Procedia Computer Science 94 ( 2016 ) 295 – 301 Avail.docxaryan532920
 
virtual-assistant-160214154006.pdf
virtual-assistant-160214154006.pdfvirtual-assistant-160214154006.pdf
virtual-assistant-160214154006.pdfHarshKumar534677
 
Virtual personal assistant
Virtual personal assistantVirtual personal assistant
Virtual personal assistantShubham Bhalekar
 
Wake-up-word speech recognition using GPS on smart phone
Wake-up-word speech recognition using GPS on smart phoneWake-up-word speech recognition using GPS on smart phone
Wake-up-word speech recognition using GPS on smart phoneIJERA Editor
 
Presentation 204 lisa bruening aac in times of change
Presentation 204  lisa bruening aac in times of changePresentation 204  lisa bruening aac in times of change
Presentation 204 lisa bruening aac in times of changeThe ALS Association
 
The concept of Voice Recognition.
The concept of Voice Recognition.The concept of Voice Recognition.
The concept of Voice Recognition.NithishKumar366585
 
Speech recognition - how does it work?
Speech recognition - how does it work?Speech recognition - how does it work?
Speech recognition - how does it work?CarterRodriguez6
 
Speech Recognition Application for the Speech Impaired using the Android-base...
Speech Recognition Application for the Speech Impaired using the Android-base...Speech Recognition Application for the Speech Impaired using the Android-base...
Speech Recognition Application for the Speech Impaired using the Android-base...TELKOMNIKA JOURNAL
 
The Affordances Of Mobile Technologies
The Affordances Of Mobile TechnologiesThe Affordances Of Mobile Technologies
The Affordances Of Mobile TechnologiesNeil Milliken
 
A Little More Conversation: Branding with Voice UI
A Little More Conversation: Branding with Voice UIA Little More Conversation: Branding with Voice UI
A Little More Conversation: Branding with Voice UILHBS
 
Speak easy global edition
Speak easy global editionSpeak easy global edition
Speak easy global editionWEB制作仲間
 
Advances in Automatic Speech Recognition: From Audio-Only To Audio-Visual Sp...
Advances in Automatic Speech Recognition: From Audio-Only  To Audio-Visual Sp...Advances in Automatic Speech Recognition: From Audio-Only  To Audio-Visual Sp...
Advances in Automatic Speech Recognition: From Audio-Only To Audio-Visual Sp...IOSR Journals
 
Voice input and speech recognition system in tourism/social media
Voice input and speech recognition system in tourism/social mediaVoice input and speech recognition system in tourism/social media
Voice input and speech recognition system in tourism/social mediacidroypaes
 

Similar a Speech Recognition (20)

Speech Recognition
Speech RecognitionSpeech Recognition
Speech Recognition
 
Procedia Computer Science 94 ( 2016 ) 295 – 301 Avail.docx
 Procedia Computer Science   94  ( 2016 )  295 – 301 Avail.docx Procedia Computer Science   94  ( 2016 )  295 – 301 Avail.docx
Procedia Computer Science 94 ( 2016 ) 295 – 301 Avail.docx
 
virtual-assistant-160214154006.pdf
virtual-assistant-160214154006.pdfvirtual-assistant-160214154006.pdf
virtual-assistant-160214154006.pdf
 
Virtual personal assistant
Virtual personal assistantVirtual personal assistant
Virtual personal assistant
 
Wake-up-word speech recognition using GPS on smart phone
Wake-up-word speech recognition using GPS on smart phoneWake-up-word speech recognition using GPS on smart phone
Wake-up-word speech recognition using GPS on smart phone
 
Presentation 204 lisa bruening aac in times of change
Presentation 204  lisa bruening aac in times of changePresentation 204  lisa bruening aac in times of change
Presentation 204 lisa bruening aac in times of change
 
Seminar
SeminarSeminar
Seminar
 
The concept of Voice Recognition.
The concept of Voice Recognition.The concept of Voice Recognition.
The concept of Voice Recognition.
 
Speech recognition - how does it work?
Speech recognition - how does it work?Speech recognition - how does it work?
Speech recognition - how does it work?
 
Speakeasy 04 2017
Speakeasy 04 2017Speakeasy 04 2017
Speakeasy 04 2017
 
Speech Recognition Application for the Speech Impaired using the Android-base...
Speech Recognition Application for the Speech Impaired using the Android-base...Speech Recognition Application for the Speech Impaired using the Android-base...
Speech Recognition Application for the Speech Impaired using the Android-base...
 
The Affordances Of Mobile Technologies
The Affordances Of Mobile TechnologiesThe Affordances Of Mobile Technologies
The Affordances Of Mobile Technologies
 
A Little More Conversation: Branding with Voice UI
A Little More Conversation: Branding with Voice UIA Little More Conversation: Branding with Voice UI
A Little More Conversation: Branding with Voice UI
 
Voice recognition
Voice recognitionVoice recognition
Voice recognition
 
Voice Tech TO #1
Voice Tech TO #1Voice Tech TO #1
Voice Tech TO #1
 
Amadou
AmadouAmadou
Amadou
 
Speak easy global edition
Speak easy global editionSpeak easy global edition
Speak easy global edition
 
Advances in Automatic Speech Recognition: From Audio-Only To Audio-Visual Sp...
Advances in Automatic Speech Recognition: From Audio-Only  To Audio-Visual Sp...Advances in Automatic Speech Recognition: From Audio-Only  To Audio-Visual Sp...
Advances in Automatic Speech Recognition: From Audio-Only To Audio-Visual Sp...
 
Voice input and speech recognition system in tourism/social media
Voice input and speech recognition system in tourism/social mediaVoice input and speech recognition system in tourism/social media
Voice input and speech recognition system in tourism/social media
 
Assign
AssignAssign
Assign
 

Último

User Guide: Capricorn FLX™ Weather Station
User Guide: Capricorn FLX™ Weather StationUser Guide: Capricorn FLX™ Weather Station
User Guide: Capricorn FLX™ Weather StationColumbia Weather Systems
 
The dark energy paradox leads to a new structure of spacetime.pptx
The dark energy paradox leads to a new structure of spacetime.pptxThe dark energy paradox leads to a new structure of spacetime.pptx
The dark energy paradox leads to a new structure of spacetime.pptxEran Akiva Sinbar
 
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)riyaescorts54
 
OECD bibliometric indicators: Selected highlights, April 2024
OECD bibliometric indicators: Selected highlights, April 2024OECD bibliometric indicators: Selected highlights, April 2024
OECD bibliometric indicators: Selected highlights, April 2024innovationoecd
 
User Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather StationUser Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather StationColumbia Weather Systems
 
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...lizamodels9
 
《Queensland毕业文凭-昆士兰大学毕业证成绩单》
《Queensland毕业文凭-昆士兰大学毕业证成绩单》《Queensland毕业文凭-昆士兰大学毕业证成绩单》
《Queensland毕业文凭-昆士兰大学毕业证成绩单》rnrncn29
 
Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024AyushiRastogi48
 
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptxLIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptxmalonesandreagweneth
 
Speech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxSpeech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxpriyankatabhane
 
Pests of Bengal gram_Identification_Dr.UPR.pdf
Pests of Bengal gram_Identification_Dr.UPR.pdfPests of Bengal gram_Identification_Dr.UPR.pdf
Pests of Bengal gram_Identification_Dr.UPR.pdfPirithiRaju
 
Speech Recognition

  • 1. SPEECH RECOGNITION Made by: Fathi Tarek Email: ftarek@fcih1.com
  • 2. History of speech recognition. 1950s and 1960s: Baby Talk. The first speech recognition systems could understand only digits. (Given the complexity of human language, it makes sense that inventors and engineers first focused on numbers.) In 1952 Bell Laboratories designed the "Audrey" system, which recognized digits spoken by a single voice. Ten years later, IBM demonstrated its "Shoebox" machine at the 1962 World's Fair; it could understand 16 words spoken in English. Labs in the United States, Japan, England, and the Soviet Union developed other hardware dedicated to recognizing spoken sounds, expanding speech recognition technology to support four vowels and nine consonants. These may not sound like much, but the first efforts were an impressive start, especially when you consider how primitive computers themselves were at the time.
  • 3. 1970s: SPEECH RECOGNITION TAKES OFF. Speech recognition technology made major strides in the 1970s, thanks to interest and funding from the U.S. Department of Defense. The DoD's DARPA Speech Understanding Research (SUR) program, which ran from 1971 to 1976, was one of the largest of its kind in the history of speech recognition, and among other things it was responsible for Carnegie Mellon's "Harpy" speech-understanding system. Harpy could understand 1011 words, approximately the vocabulary of an average three-year-old. Harpy was significant because it introduced a more efficient search approach, called beam search, to "prove the finite-state network of possible sentences," according to Readings in Speech Recognition by Alex Waibel and Kai-Fu Lee. (The story of speech recognition is very much tied to advances in search methodology and technology, as Google's entrance into speech recognition on mobile devices proved just a few years ago.) The '70s also marked a few other important milestones in speech recognition technology, including the founding of the first commercial speech recognition company, Threshold Technology, as well as Bell Laboratories' introduction of a system that could interpret multiple people's voices.
  • 4. 1980s: SPEECH RECOGNITION TURNS TOWARD PREDICTION. Over the next decade, thanks to new approaches to understanding what people say, speech recognition vocabulary jumped from a few hundred words to several thousand words, and had the potential to recognize an unlimited number of words. One major reason was a new statistical method known as the hidden Markov model (HMM). Rather than simply using templates for words and looking for sound patterns, the HMM considered the probability that unknown sounds were words. This foundation would stay in place for the next two decades (see Automatic Speech Recognition: A Brief History of the Technology Development by B.H. Juang and Lawrence R. Rabiner). Equipped with this expanded vocabulary, speech recognition started to work its way into commercial applications for business and specialized industry (for instance, medical use). It even entered the home, in the form of Worlds of Wonder's Julie doll (1987), which children could train to respond to their voice. ("Finally, the doll that understands you.")
  • 5. In 1990, Dragon launched the first consumer speech recognition product, Dragon Dictate, for an incredible price of $9000. Seven years later, the much-improved Dragon NaturallySpeaking arrived. The application recognized continuous speech, so you could speak, well, naturally, at about 100 words per minute. However, you had to train the program for 45 minutes, and it was still expensive at $695. The advent of the first voice portal, VAL from BellSouth, was in 1996; VAL was a dial-in interactive voice recognition system that was supposed to give you information based on what you said on the phone. VAL paved the way for all the inaccurate voice-activated menus that would plague callers for the next 15 years and beyond.
  • 6. 2000s: Speech Recognition Plateaus, Until Google Comes Along. By 2001, computer speech recognition had topped out at 80 percent accuracy, and, near the end of the decade, the technology’s progress seemed to be stalled. Recognition systems did well when the language universe was limited, but they were still “guessing,” with the assistance of statistical models, among similar-sounding words, and the known language universe continued to grow as the Internet grew. Did you know speech recognition and voice commands were built into Windows Vista and Mac OS X? Many computer users weren’t aware that those features existed. Windows Speech Recognition and OS X’s voice commands were interesting, but not as accurate or as easy to use as a plain old keyboard and mouse.
  • 7. In 2010, Google added “personalized recognition” to Voice Search on Android phones, so that the software could record users’ voice searches and produce a more accurate speech model. The company also added Voice Search to its Chrome browser in mid-2011. Remember how we started with 10 to 100 words, and then graduated to a few thousand? Google’s English Voice Search system now incorporates 230 billion words from actual user queries. And now along comes Siri. Like Google’s Voice Search, Siri relies on cloud-based processing. It draws on what it knows about you to generate a contextual reply, and it responds to your voice input with personality. (As my PCWorld colleague David Daw points out: “It’s not just fun but funny. When you ask Siri the meaning of life, it tells you ‘42’ or ‘All evidence to date points to chocolate.’ If you tell it you want to hide a body, it helpfully volunteers nearby dumps and metal foundries.”) Speech recognition has gone from utility to entertainment. The child seems all grown up.
  • 8. THE FUTURE: Accurate, Ubiquitous Speech. The explosion of voice recognition apps indicates that speech recognition’s time has come, and that you can expect plenty more apps in the future. These apps will not only let you control your PC by voice or convert voice to text; they’ll also support multiple languages, offer assorted speaker voices for you to choose from, and integrate into every part of your mobile devices (that is, they’ll overcome Siri’s shortcomings). The quality of speech recognition apps will improve, too. For instance, Sensory’s TrulyHandsfree Voice Control can hear and understand you, even in noisy environments.
  • 9. WHAT IS SPEECH RECOGNITION? Speech recognition is the ability of a machine or program to identify words and phrases in spoken language and convert them to a machine-readable format. Another definition: speech recognition is an alternative to typing on a keyboard. Put simply, you talk to the computer or mobile device and your words appear on the screen. The software has been developed to provide a fast method of writing on a computer and can help people with a variety of disabilities. It is useful for people with physical disabilities who often find typing difficult, painful or impossible. Voice-recognition software can also help those with spelling difficulties, including users with dyslexia, because recognized words are almost always correctly spelled.
  • 10. However, speech is more than sequences of phones that form words and sentences. Other aspects of speech also carry information: for example, the prosody of the speech indicates grammatical structure, and the stress on a word signals its importance or topicality. This information is sometimes called the paralinguistic content of speech.
  • 11. Advantages: Speech is a very natural way to interact, and it is not necessary to sit at a keyboard or work with a remote control. No training required for users! Disadvantages: Even the best speech recognition systems sometimes make errors. If there is noise or some other sound in the room (e.g. the television or a kettle boiling), the number of errors will increase. Speech recognition works best if the microphone is close to the user (e.g. in a phone, or if the user is wearing a microphone). More distant microphones (e.g. on a table or wall) will tend to increase the number of errors.
  • 12. Voice recognition software: Voice-recognition software programs work by analyzing sounds and converting them to text. They also use knowledge of how English is usually spoken to decide what the speaker most probably said. Once correctly set up, the systems should recognize around 95% of what is said if you speak clearly.
  • 13. Voice recognition in operating systems. Mobile Devices / Smartphones: Many cell phone handsets have basic dial-by-voice features built in. Smartphones such as the iPhone or BlackBerry also support this. A number of third-party apps have implemented natural language speech recognition support, including:
  • 14.
  • 15. Smartphones and mobile devices are in the middle of major innovations in technology to provide hands-free access to features and navigation, often called voice commands, voice-enabled, voice actions or speech recognition. This technology has major implications as assistive technology for people who have disabilities. As long as a user has a strong, clear voice, these devices become easier to use and give increased access to the Internet, mobile devices and communication.
  • 16. Windows 7 built-in speech recognition: Windows Speech Recognition by Microsoft is the speech recognition system that comes built into Windows Vista and Windows 7. Windows Vista and Windows 7 include version 8.0 of the Microsoft speech recognition engine. Speech Recognition is available only in English, French, Spanish, German, Japanese, Simplified Chinese, and Traditional Chinese, and only in the corresponding version of Windows. That means that you cannot use the French speech recognition engine if you use an English version of Windows. Windows XP or 2000 only: e-Speaking – software for Windows XP that facilitates use of the Microsoft Speech API by adding the ability to create commands to perform custom actions. Microsoft Speech API – speech recognition functionality included as part of Microsoft Office and on Tablet PCs running Microsoft Windows XP Tablet PC Edition. It can also be downloaded as part of the Speech SDK 5.1 for Windows applications, but since that is aimed at developers building speech applications, the pure SDK form lacks any user interface, and thus is unsuitable for end users. Vestec Inc. – specializing in Natural Language Understanding and Speech Recognition solutions. Its ASR, NLU and TTS engines support 17 languages in server, embedded (on a low-cost chip) or cloud-based environments.
  • 18. Types of speech recognition. 1. Text-To-Speech: As the name suggests, Text-To-Speech (or TTS) converts a string of text into an audio clip. It helps blind people use computers, but it can also be used simply to improve the computer experience. There are several programs available that perform TTS, some of which are command-line based (ideal for scripting) and others which provide a handy GUI.
  • 19. 2. Simple Voice Control/Commands: This is the most basic form of Speech-To-Text application. These applications are designed to recognize a small number of specific, typically one-word commands and then perform an action. This is often used as an alternative to an application launcher, allowing the user, for instance, to say the word “firefox” and have the OS open a new browser window.
  • 20. 3. Full dictation/recognition: Full dictation/recognition software allows the user to speak full sentences or paragraphs and converts that speech into text on the fly. This could be used, for instance, to dictate an entire letter into the window of an email client. In some cases, these types of applications need to be trained to your voice and can improve in accuracy the more they are used.
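
The simple voice control type described in slide 19 can be illustrated with a rough Python sketch. It assumes a separate recognizer has already turned the audio into a single word; the command table, the action names, and the `match_command` helper are all hypothetical and are not taken from any of the products named in these slides. Fuzzy matching with the standard library's `difflib` is one possible way to tolerate small recognition errors, not necessarily how real launchers do it.

```python
import difflib

# Hypothetical command table: spoken word -> name of the action to perform.
COMMANDS = {
    "firefox": "open_browser",
    "mail": "open_email_client",
    "music": "open_music_player",
}


def match_command(recognized_word, cutoff=0.75):
    """Return the action for the closest known command, or None if nothing is close."""
    word = recognized_word.strip().lower()
    # get_close_matches returns the best candidates whose similarity is >= cutoff.
    candidates = difflib.get_close_matches(word, list(COMMANDS), n=1, cutoff=cutoff)
    if not candidates:
        return None
    return COMMANDS[candidates[0]]


if __name__ == "__main__":
    for heard in ["firefox", "Firefax", "banana"]:
        print(heard, "->", match_command(heard))
```

Keeping the vocabulary this small is what makes command recognition far easier than full dictation: the program only has to pick among a handful of known words instead of an open-ended language.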