SlideShare una empresa de Scribd logo
1 de 27
Spock
An advanced Human Computer Interaction Engine and API
A BTP synopsis by Sonal Raj & Shashi Kant
Objective
What goals will Spock achieve ?
Objectives
▪ Obtain a viable knowledge of the working of existing algorithms and
methods of interaction, and build up a concept system to improve these
techniques.
▪ Code a Human Computer Interaction Engine capable to techniques like
motion and gesture recognition, speech synthesis (biometric authentications,
iris control and many other features could be included in future versions and
are beyond the scope of this project.).
▪ Implement an Application Programming Interface (API) to extend the
functionality of this engine to multitude of applications running on the
system to implement it in their own way.
Spock Sonal Raj & Shashi Kant 3
Motivation
Why on earth did we want to do this?
Motivations
▪ Definitely not marks or Compulsion ! 
▪ Inspired by brilliant pieces of work like “The Sixth Sense” by Pranav Mistry
▪ Vision to overcome the limitations of existing systems like Microsoft Kinect
and Nintendo.
▪ Sci-fi movies like Star Trek or the Jarvis interactive computer systems in the
Iron Man movie series
Spock Sonal Raj (487/10) & Shashi Kant (462/10) 5
Motivations
Spock Sonal Raj (487/10) & Shashi Kant (462/10) 6
Pranav Mistry’s work at MIT on Sixth Sense can be checked out at:
http://www.pranavmistry.com/projects/sixthsense/
Iron Man’s J.A.R.V.I.S computer system can be referenced at : http://marvel-
movies.wikia.com/wiki/J.A.R.V.I.S.
Mission Statement
“This Project’s mission is to develop a technological product which will change
the way you interacted with your computers using life-like sensing methods.”
Evolution
Spock Sonal Raj (487/10) & Shashi Kant (462/10) 7
Natural User Interface (NUI)
Existing Work Survey
What all led to this?
Projects
“Sixth Sense”
- Pranav Mistry
- MIT Labs
hello
Projects
“Sixth Sense”
- Pranav Mistry
- MIT Labs
hello
Projects
“Sixth Sense”
- Pranav Mistry
- MIT Labs
hello
“Sphinx”
Carnegie Mellon
University
“Microsoft Kinect”
- MS R&D
hello
“SIRI for IOS”
Apple Inc.
CMUSphinx toolkit is a leading speech recognition toolkit with
various tools used to build speech applications.
Kinect is a line of motion sensing input devices by Microsoft
for Xbox 360 and Xbox One video game consoles and
Windows PCs.
SIRI is a voice assistant for iOS. Uses natural speech
synthesis tools.
Papers and
Publications
hello
1. Real-time hand gesture recognition using range
cameras, Hervé Lahamy and Derek Litchi, Department of
Geomatics Engineering, University of Calgary, NW, Calgary,
Alberta
2. Real-Time Human Pose Recognition in Parts from
Single Depth Images, Jamie Shotton Andrew Fitzgibbon,
Microsoft Research Cambridge & Xbox Incubation.
3. Minimum variance modulation filter for robust
speech recognition. Yu-Hsiang Bosco Chiu and Richard M Stern,
Carnegie Mellon University, Pittsburgh, USA.
4. Some recent research work at LIUM based on the
use of CMU Sphinx, Yannick Estève ET. Al. LIUM,
University of Le Mans, France
Proposed Work
What we are going to do?
Tech to be Used
Spock Sonal Raj (487/10) & Shashi Kant (462/10) 15
• Languages
1. C/C++
2. Python
3. Bash
• Frameworks
• OpenCV
• CMUSphinx
Impact Areas
Spock Sonal Raj (487/10) & Shashi Kant (462/10) 16
• Lively Computing experience to “muggle”
computer users.
• Robust and Capable devices.
• Advanced Search
• Security feed analysis
• And many more . . .
Future Work and Applications
▪ Boon to physically challenged disabled users. The blind can now command a
computer with voice and gesture with accurate dictation tools within Spock
to control a system, rather than typing on Braille keyboards.
▪ Used in Embedded devices for robot and appliance control
▪ take search to the next level. You can not only search for your keywords
within a video or audio file, but with powerful system architecture, you can
do it in near real-time
▪ In combination of suitable machine Learning and AI Techniques, it can be
used in Defense and Military operations for monitoring cross-border
conversations or surveillance data in an automated and much more efficient
way compared to manned analysis.
Spock Sonal Raj (487/10) & Shashi Kant (462/10) 18
Progress so far.
Where have we reached?
Progress
▪ Technologies learnt and configured
▪ Studied documentations of the OpenCV Framework and the Sphinx
Speech toolkit from CMU
▪ Fiddled around with configurations quite a bit. Thereby, their
configurations in Linux as well as Windows was carried out and
learnt successfully.
▪ Interface implementations of the above frameworks.
Spock Sonal Raj (487/10) & Shashi Kant (462/10) 20
Progress
▪ Coding Work: Noise Reduction and image synthesis: [Algorithm] This
was done by
 subtracting the RGB value of the pixels of the previous frame from the RGB
values of the pixels of the current frame.
 Then this image is converted to octachrome (8 colors only - red, blue, green,
cyan, magenta, yellow, white, black). This makes most pixels neutral or grey.
 This is followed by the greying of those pixels not surrounded by 20 non-
grey pixels, in the function crosshair (IplImage*img1, IplImage* img2). The
non-grey pixels that remain represent proper motion and noise is eliminated.
A database is provided with the code which contains a set of points for each
gesture.
Spock Sonal Raj (487/10) & Shashi Kant (462/10) 21
Progress
▪ Coding Work: Speech synthesis:
▪ Split the waveform by on utterances by silences
▪ match all possible combination of words with the audio models used in
speech recognition acoustic model phonetic dictionary Language Model.
▪ Other concepts used in speech recognition are : Lattice, N-best lists, word
confusion networks, speech database, text databases are currently being
implemented.
Spock Sonal Raj (487/10) & Shashi Kant (462/10) 22
Project Timeline
What and by when?
References
Who helped?
References
hello
Projects
1. Sixth Sense, Pranav Mistry, MIT Labs
2. Microsoft Kinect, Microsoft Research and XBox
3. CMUSphinx, Carnegie Mellon university
4. SIRI and Google Voice Search, Apple and Google, Inc.
Papers and Publications.
1. Real-time hand gesture recognition using range cameras, Hervé Lahamy and Derek Litchi, Department of
Geomatics Engineering, University of Calgary, NW, Calgary, Alberta
2. Real-Time Human Pose Recognition in Parts from Single Depth Images, Jamie Shotton Andrew
Fitzgibbon, Microsoft Research Cambridge & Xbox Incubation.
3. Minimum variance modulation filter for robust speech recognition. Yu-Hsiang Bosco Chiu and Richard M
Stern, Carnegie Mellon University, Pittsburgh, USA.
4. Some recent research work at LIUM based on the use of CMU Sphinx, Yannick Estève ET. Al. LIUM,
University of Le Mans, France
Thank You
For your patience 
Monitor the Progress of Spock and view the code at:
https://github.com/sonal-raj/Spock

Más contenido relacionado

Similar a Advanced Human Computer Interaction Engine

Aquila: An Open-Source GPU-Accelerated Toolkit for Cognitive and Neuro-Roboti...
Aquila: An Open-Source GPU-Accelerated Toolkit for Cognitive and Neuro-Roboti...Aquila: An Open-Source GPU-Accelerated Toolkit for Cognitive and Neuro-Roboti...
Aquila: An Open-Source GPU-Accelerated Toolkit for Cognitive and Neuro-Roboti...Martin Peniak
 
HAND GESTURE RECOGNITION.ppt (1).pptx
HAND GESTURE RECOGNITION.ppt (1).pptxHAND GESTURE RECOGNITION.ppt (1).pptx
HAND GESTURE RECOGNITION.ppt (1).pptxDeepakkumaragrahari1
 
Intelligent Embedded Systems (Robotics)
Intelligent Embedded Systems (Robotics)Intelligent Embedded Systems (Robotics)
Intelligent Embedded Systems (Robotics)Adeyemi Fowe
 
Artificial Intelligence Research In Japan
Artificial Intelligence Research In JapanArtificial Intelligence Research In Japan
Artificial Intelligence Research In JapanNat Rice
 
Design and Evaluation Case Study: Evaluating The Kinect Device In The Task of...
Design and Evaluation Case Study: Evaluating The Kinect Device In The Task of...Design and Evaluation Case Study: Evaluating The Kinect Device In The Task of...
Design and Evaluation Case Study: Evaluating The Kinect Device In The Task of...Waqas Tariq
 
Design and Evaluation Case Study: Evaluating The Kinect Device In The Task of...
Design and Evaluation Case Study: Evaluating The Kinect Device In The Task of...Design and Evaluation Case Study: Evaluating The Kinect Device In The Task of...
Design and Evaluation Case Study: Evaluating The Kinect Device In The Task of...Waqas Tariq
 
Design and Evaluation Case Study: Evaluating The Kinect Device In The Task of...
Design and Evaluation Case Study: Evaluating The Kinect Device In The Task of...Design and Evaluation Case Study: Evaluating The Kinect Device In The Task of...
Design and Evaluation Case Study: Evaluating The Kinect Device In The Task of...Waqas Tariq
 
Resume-Aditya Parkhi NCSU_MSCS
Resume-Aditya Parkhi NCSU_MSCSResume-Aditya Parkhi NCSU_MSCS
Resume-Aditya Parkhi NCSU_MSCSaditya140
 
The Concurrent Constraint Programming Research Programmes -- Redux
The Concurrent Constraint Programming Research Programmes -- ReduxThe Concurrent Constraint Programming Research Programmes -- Redux
The Concurrent Constraint Programming Research Programmes -- ReduxPierre Schaus
 
2018/03/28 Sony's deep learning software "Neural Network Libraries/Console“ a...
2018/03/28 Sony's deep learning software "Neural Network Libraries/Console“ a...2018/03/28 Sony's deep learning software "Neural Network Libraries/Console“ a...
2018/03/28 Sony's deep learning software "Neural Network Libraries/Console“ a...Sony Network Communications Inc.
 
An introduction to AI (artificial intelligence)
An introduction to AI (artificial intelligence)An introduction to AI (artificial intelligence)
An introduction to AI (artificial intelligence)Bellaj Badr
 
Fossasia ai-ml technologies and application for product development-chetan kh...
Fossasia ai-ml technologies and application for product development-chetan kh...Fossasia ai-ml technologies and application for product development-chetan kh...
Fossasia ai-ml technologies and application for product development-chetan kh...Chetan Khatri
 
Art of artificial intelligence and automation
Art of artificial intelligence and automationArt of artificial intelligence and automation
Art of artificial intelligence and automationLiew Wei Da Andrew
 
Iirdem design and implementation of finger writing in air by using open cv (c...
Iirdem design and implementation of finger writing in air by using open cv (c...Iirdem design and implementation of finger writing in air by using open cv (c...
Iirdem design and implementation of finger writing in air by using open cv (c...Iaetsd Iaetsd
 
Harshal-Govind3.0
Harshal-Govind3.0Harshal-Govind3.0
Harshal-Govind3.0harshgovind
 
Tools for the Future of the Digital Infrastructure Lifecycle - #COMIT2016
Tools for the Future of the Digital Infrastructure Lifecycle - #COMIT2016Tools for the Future of the Digital Infrastructure Lifecycle - #COMIT2016
Tools for the Future of the Digital Infrastructure Lifecycle - #COMIT2016Comit Projects Ltd
 
New Artifitial Intelligence that can predicts Human Actions
New Artifitial Intelligence that can predicts Human ActionsNew Artifitial Intelligence that can predicts Human Actions
New Artifitial Intelligence that can predicts Human ActionsShreya Shetty
 
IBM Watson & Cognitive Computing - Tech In Asia 2016
IBM Watson & Cognitive Computing - Tech In Asia 2016IBM Watson & Cognitive Computing - Tech In Asia 2016
IBM Watson & Cognitive Computing - Tech In Asia 2016Nugroho Gito
 

Similar a Advanced Human Computer Interaction Engine (20)

AI & ML
AI & MLAI & ML
AI & ML
 
Aquila: An Open-Source GPU-Accelerated Toolkit for Cognitive and Neuro-Roboti...
Aquila: An Open-Source GPU-Accelerated Toolkit for Cognitive and Neuro-Roboti...Aquila: An Open-Source GPU-Accelerated Toolkit for Cognitive and Neuro-Roboti...
Aquila: An Open-Source GPU-Accelerated Toolkit for Cognitive and Neuro-Roboti...
 
HAND GESTURE RECOGNITION.ppt (1).pptx
HAND GESTURE RECOGNITION.ppt (1).pptxHAND GESTURE RECOGNITION.ppt (1).pptx
HAND GESTURE RECOGNITION.ppt (1).pptx
 
Intelligent Embedded Systems (Robotics)
Intelligent Embedded Systems (Robotics)Intelligent Embedded Systems (Robotics)
Intelligent Embedded Systems (Robotics)
 
Artificial Intelligence Research In Japan
Artificial Intelligence Research In JapanArtificial Intelligence Research In Japan
Artificial Intelligence Research In Japan
 
Design and Evaluation Case Study: Evaluating The Kinect Device In The Task of...
Design and Evaluation Case Study: Evaluating The Kinect Device In The Task of...Design and Evaluation Case Study: Evaluating The Kinect Device In The Task of...
Design and Evaluation Case Study: Evaluating The Kinect Device In The Task of...
 
Design and Evaluation Case Study: Evaluating The Kinect Device In The Task of...
Design and Evaluation Case Study: Evaluating The Kinect Device In The Task of...Design and Evaluation Case Study: Evaluating The Kinect Device In The Task of...
Design and Evaluation Case Study: Evaluating The Kinect Device In The Task of...
 
Design and Evaluation Case Study: Evaluating The Kinect Device In The Task of...
Design and Evaluation Case Study: Evaluating The Kinect Device In The Task of...Design and Evaluation Case Study: Evaluating The Kinect Device In The Task of...
Design and Evaluation Case Study: Evaluating The Kinect Device In The Task of...
 
Resume-Aditya Parkhi NCSU_MSCS
Resume-Aditya Parkhi NCSU_MSCSResume-Aditya Parkhi NCSU_MSCS
Resume-Aditya Parkhi NCSU_MSCS
 
The Concurrent Constraint Programming Research Programmes -- Redux
The Concurrent Constraint Programming Research Programmes -- ReduxThe Concurrent Constraint Programming Research Programmes -- Redux
The Concurrent Constraint Programming Research Programmes -- Redux
 
Ai Project ppt.pptx
Ai Project ppt.pptxAi Project ppt.pptx
Ai Project ppt.pptx
 
2018/03/28 Sony's deep learning software "Neural Network Libraries/Console“ a...
2018/03/28 Sony's deep learning software "Neural Network Libraries/Console“ a...2018/03/28 Sony's deep learning software "Neural Network Libraries/Console“ a...
2018/03/28 Sony's deep learning software "Neural Network Libraries/Console“ a...
 
An introduction to AI (artificial intelligence)
An introduction to AI (artificial intelligence)An introduction to AI (artificial intelligence)
An introduction to AI (artificial intelligence)
 
Fossasia ai-ml technologies and application for product development-chetan kh...
Fossasia ai-ml technologies and application for product development-chetan kh...Fossasia ai-ml technologies and application for product development-chetan kh...
Fossasia ai-ml technologies and application for product development-chetan kh...
 
Art of artificial intelligence and automation
Art of artificial intelligence and automationArt of artificial intelligence and automation
Art of artificial intelligence and automation
 
Iirdem design and implementation of finger writing in air by using open cv (c...
Iirdem design and implementation of finger writing in air by using open cv (c...Iirdem design and implementation of finger writing in air by using open cv (c...
Iirdem design and implementation of finger writing in air by using open cv (c...
 
Harshal-Govind3.0
Harshal-Govind3.0Harshal-Govind3.0
Harshal-Govind3.0
 
Tools for the Future of the Digital Infrastructure Lifecycle - #COMIT2016
Tools for the Future of the Digital Infrastructure Lifecycle - #COMIT2016Tools for the Future of the Digital Infrastructure Lifecycle - #COMIT2016
Tools for the Future of the Digital Infrastructure Lifecycle - #COMIT2016
 
New Artifitial Intelligence that can predicts Human Actions
New Artifitial Intelligence that can predicts Human ActionsNew Artifitial Intelligence that can predicts Human Actions
New Artifitial Intelligence that can predicts Human Actions
 
IBM Watson & Cognitive Computing - Tech In Asia 2016
IBM Watson & Cognitive Computing - Tech In Asia 2016IBM Watson & Cognitive Computing - Tech In Asia 2016
IBM Watson & Cognitive Computing - Tech In Asia 2016
 

Más de Sonal Raj

Internet of Things with Python & Serverless - PyCon MY 2019 - Kuala Lumpur, M...
Internet of Things with Python & Serverless - PyCon MY 2019 - Kuala Lumpur, M...Internet of Things with Python & Serverless - PyCon MY 2019 - Kuala Lumpur, M...
Internet of Things with Python & Serverless - PyCon MY 2019 - Kuala Lumpur, M...Sonal Raj
 
IOT and Home Automation with Serverless Computing | Serverless Days 2019 | So...
IOT and Home Automation with Serverless Computing | Serverless Days 2019 | So...IOT and Home Automation with Serverless Computing | Serverless Days 2019 | So...
IOT and Home Automation with Serverless Computing | Serverless Days 2019 | So...Sonal Raj
 
Internet of Python - IOT with Python and Serverless | Sonal Raj | HydPy Feb 2019
Internet of Python - IOT with Python and Serverless | Sonal Raj | HydPy Feb 2019Internet of Python - IOT with Python and Serverless | Sonal Raj | HydPy Feb 2019
Internet of Python - IOT with Python and Serverless | Sonal Raj | HydPy Feb 2019Sonal Raj
 
Progressive Javascript: Why React when you can Vue?
Progressive Javascript: Why React when you can Vue?Progressive Javascript: Why React when you can Vue?
Progressive Javascript: Why React when you can Vue?Sonal Raj
 
Alexa enabled smart home programming in Python - PyCon India 2018
Alexa enabled smart home programming in Python - PyCon India 2018Alexa enabled smart home programming in Python - PyCon India 2018
Alexa enabled smart home programming in Python - PyCon India 2018Sonal Raj
 
Startup Diagnostics: Reasons why startups can fail.
Startup Diagnostics: Reasons why startups can fail.Startup Diagnostics: Reasons why startups can fail.
Startup Diagnostics: Reasons why startups can fail.Sonal Raj
 
IT Quiz Mains
IT Quiz MainsIT Quiz Mains
IT Quiz MainsSonal Raj
 
IT Quiz Prelims
IT Quiz PrelimsIT Quiz Prelims
IT Quiz PrelimsSonal Raj
 
Real Time Graph Computations in Storm, Neo4J, Python - PyCon India 2013
Real Time Graph Computations in Storm, Neo4J, Python - PyCon India 2013Real Time Graph Computations in Storm, Neo4J, Python - PyCon India 2013
Real Time Graph Computations in Storm, Neo4J, Python - PyCon India 2013Sonal Raj
 
Storm Real Time Computation
Storm Real Time ComputationStorm Real Time Computation
Storm Real Time ComputationSonal Raj
 

Más de Sonal Raj (10)

Internet of Things with Python & Serverless - PyCon MY 2019 - Kuala Lumpur, M...
Internet of Things with Python & Serverless - PyCon MY 2019 - Kuala Lumpur, M...Internet of Things with Python & Serverless - PyCon MY 2019 - Kuala Lumpur, M...
Internet of Things with Python & Serverless - PyCon MY 2019 - Kuala Lumpur, M...
 
IOT and Home Automation with Serverless Computing | Serverless Days 2019 | So...
IOT and Home Automation with Serverless Computing | Serverless Days 2019 | So...IOT and Home Automation with Serverless Computing | Serverless Days 2019 | So...
IOT and Home Automation with Serverless Computing | Serverless Days 2019 | So...
 
Internet of Python - IOT with Python and Serverless | Sonal Raj | HydPy Feb 2019
Internet of Python - IOT with Python and Serverless | Sonal Raj | HydPy Feb 2019Internet of Python - IOT with Python and Serverless | Sonal Raj | HydPy Feb 2019
Internet of Python - IOT with Python and Serverless | Sonal Raj | HydPy Feb 2019
 
Progressive Javascript: Why React when you can Vue?
Progressive Javascript: Why React when you can Vue?Progressive Javascript: Why React when you can Vue?
Progressive Javascript: Why React when you can Vue?
 
Alexa enabled smart home programming in Python - PyCon India 2018
Alexa enabled smart home programming in Python - PyCon India 2018Alexa enabled smart home programming in Python - PyCon India 2018
Alexa enabled smart home programming in Python - PyCon India 2018
 
Startup Diagnostics: Reasons why startups can fail.
Startup Diagnostics: Reasons why startups can fail.Startup Diagnostics: Reasons why startups can fail.
Startup Diagnostics: Reasons why startups can fail.
 
IT Quiz Mains
IT Quiz MainsIT Quiz Mains
IT Quiz Mains
 
IT Quiz Prelims
IT Quiz PrelimsIT Quiz Prelims
IT Quiz Prelims
 
Real Time Graph Computations in Storm, Neo4J, Python - PyCon India 2013
Real Time Graph Computations in Storm, Neo4J, Python - PyCon India 2013Real Time Graph Computations in Storm, Neo4J, Python - PyCon India 2013
Real Time Graph Computations in Storm, Neo4J, Python - PyCon India 2013
 
Storm Real Time Computation
Storm Real Time ComputationStorm Real Time Computation
Storm Real Time Computation
 

Advanced Human Computer Interaction Engine

  • 1. Spock An advanced Human Computer Interaction Engine and API A BTP synopsis by Sonal Raj & Shashi Kant
  • 2. Objective What goals will Spock achieve ?
  • 3. Objectives ▪ Obtain a viable knowledge of the working of existing algorithms and methods of interaction, and build up a concept system to improve these techniques. ▪ Code a Human Computer Interaction Engine capable to techniques like motion and gesture recognition, speech synthesis (biometric authentications, iris control and many other features could be included in future versions and are beyond the scope of this project.). ▪ Implement an Application Programming Interface (API) to extend the functionality of this engine to multitude of applications running on the system to implement it in their own way. Spock Sonal Raj & Shashi Kant 3
  • 4. Motivation Why on earth did we want to do this?
  • 5. Motivations ▪ Definitely not marks or Compulsion !  ▪ Inspired by brilliant pieces of work like “The Sixth Sense” by Pranav Mistry ▪ Vision to overcome the limitations of existing systems like Microsoft Kinect and Nintendo. ▪ Sci-fi movies like Star Trek or the Jarvis interactive computer systems in the Iron Man movie series Spock Sonal Raj (487/10) & Shashi Kant (462/10) 5
  • 6. Motivations Spock Sonal Raj (487/10) & Shashi Kant (462/10) 6 Pranav Mistry’s work at MIT on Sixth Sense can be checked out at: http://www.pranavmistry.com/projects/sixthsense/ Iron Man’s J.A.R.V.I.S computer system can be referenced at : http://marvel- movies.wikia.com/wiki/J.A.R.V.I.S. Mission Statement “This Project’s mission is to develop a technological product which will change the way you interacted with your computers using life-like sensing methods.”
  • 7. Evolution Spock Sonal Raj (487/10) & Shashi Kant (462/10) 7 Natural User Interface (NUI)
  • 8. Existing Work Survey What all led to this?
  • 9. Projects “Sixth Sense” - Pranav Mistry - MIT Labs hello
  • 10. Projects “Sixth Sense” - Pranav Mistry - MIT Labs hello
  • 11. Projects “Sixth Sense” - Pranav Mistry - MIT Labs hello
  • 12. “Sphinx” Carnegie Mellon University “Microsoft Kinect” - MS R&D hello “SIRI for IOS” Apple Inc. CMUSphinx toolkit is a leading speech recognition toolkit with various tools used to build speech applications. Kinect is a line of motion sensing input devices by Microsoft for Xbox 360 and Xbox One video game consoles and Windows PCs. SIRI is a voice assistant for iOS. Uses natural speech synthesis tools.
  • 13. Papers and Publications hello 1. Real-time hand gesture recognition using range cameras, Hervé Lahamy and Derek Litchi, Department of Geomatics Engineering, University of Calgary, NW, Calgary, Alberta 2. Real-Time Human Pose Recognition in Parts from Single Depth Images, Jamie Shotton Andrew Fitzgibbon, Microsoft Research Cambridge & Xbox Incubation. 3. Minimum variance modulation filter for robust speech recognition. Yu-Hsiang Bosco Chiu and Richard M Stern, Carnegie Mellon University, Pittsburgh, USA. 4. Some recent research work at LIUM based on the use of CMU Sphinx, Yannick Estève ET. Al. LIUM, University of Le Mans, France
  • 14. Proposed Work What we are going to do?
  • 15. Tech to be Used Spock Sonal Raj (487/10) & Shashi Kant (462/10) 15 • Languages 1. C/C++ 2. Python 3. Bash • Frameworks • OpenCV • CMUSphinx
  • 16. Impact Areas Spock Sonal Raj (487/10) & Shashi Kant (462/10) 16 • Lively Computing experience to “muggle” computer users. • Robust and Capable devices. • Advanced Search • Security feed analysis • And many more . . .
  • 17.
  • 18. Future Work and Applications ▪ Boon to physically challenged disabled users. The blind can now command a computer with voice and gesture with accurate dictation tools within Spock to control a system, rather than typing on Braille keyboards. ▪ Used in Embedded devices for robot and appliance control ▪ take search to the next level. You can not only search for your keywords within a video or audio file, but with powerful system architecture, you can do it in near real-time ▪ In combination of suitable machine Learning and AI Techniques, it can be used in Defense and Military operations for monitoring cross-border conversations or surveillance data in an automated and much more efficient way compared to manned analysis. Spock Sonal Raj (487/10) & Shashi Kant (462/10) 18
  • 19. Progress so far. Where have we reached?
  • 20. Progress ▪ Technologies learnt and configured ▪ Studied documentations of the OpenCV Framework and the Sphinx Speech toolkit from CMU ▪ Fiddled around with configurations quite a bit. Thereby, their configurations in Linux as well as Windows was carried out and learnt successfully. ▪ Interface implementations of the above frameworks. Spock Sonal Raj (487/10) & Shashi Kant (462/10) 20
  • 21. Progress ▪ Coding Work: Noise Reduction and image synthesis: [Algorithm] This was done by  subtracting the RGB value of the pixels of the previous frame from the RGB values of the pixels of the current frame.  Then this image is converted to octachrome (8 colors only - red, blue, green, cyan, magenta, yellow, white, black). This makes most pixels neutral or grey.  This is followed by the greying of those pixels not surrounded by 20 non- grey pixels, in the function crosshair (IplImage*img1, IplImage* img2). The non-grey pixels that remain represent proper motion and noise is eliminated. A database is provided with the code which contains a set of points for each gesture. Spock Sonal Raj (487/10) & Shashi Kant (462/10) 21
  • 22. Progress ▪ Coding Work: Speech synthesis: ▪ Split the waveform by on utterances by silences ▪ match all possible combination of words with the audio models used in speech recognition acoustic model phonetic dictionary Language Model. ▪ Other concepts used in speech recognition are : Lattice, N-best lists, word confusion networks, speech database, text databases are currently being implemented. Spock Sonal Raj (487/10) & Shashi Kant (462/10) 22
  • 24.
  • 26. References hello Projects 1. Sixth Sense, Pranav Mistry, MIT Labs 2. Microsoft Kinect, Microsoft Research and XBox 3. CMUSphinx, Carnegie Mellon university 4. SIRI and Google Voice Search, Apple and Google, Inc. Papers and Publications. 1. Real-time hand gesture recognition using range cameras, Hervé Lahamy and Derek Litchi, Department of Geomatics Engineering, University of Calgary, NW, Calgary, Alberta 2. Real-Time Human Pose Recognition in Parts from Single Depth Images, Jamie Shotton Andrew Fitzgibbon, Microsoft Research Cambridge & Xbox Incubation. 3. Minimum variance modulation filter for robust speech recognition. Yu-Hsiang Bosco Chiu and Richard M Stern, Carnegie Mellon University, Pittsburgh, USA. 4. Some recent research work at LIUM based on the use of CMU Sphinx, Yannick Estève ET. Al. LIUM, University of Le Mans, France
  • 27. Thank You For your patience  Monitor the Progress of Spock and view the code at: https://github.com/sonal-raj/Spock