Automatic Subtitle Generation for Sound in Videos
https://irjet.net/archives/V3/i2/IRJET-V3I2160.pdf
International Research Journal
of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 03 Issue: 02 | Feb-2016 www.irjet.net p-ISSN: 2395-0072
© 2016, IRJET | Impact Factor value: 4.45 | ISO 9001:2008 Certified Journal | Page 914

Automatic Subtitle Generation for Sound in Videos

Kanika Mishra1, Prerak Bhagat2, Prof. Atiya Kazi3

Kanika Mishra, Department of Information Technology, University of Mumbai, Finolex Academy of Management & Technology, Ratnagiri, Maharashtra.
Prerak Bhagat, Department of Information Technology, University of Mumbai, Finolex Academy of Management & Technology, Ratnagiri, Maharashtra.
Prof. Atiya Kazi, Department of Information Technology, University of Mumbai, Finolex Academy of Management & Technology, Ratnagiri, Maharashtra.

Abstract – The last two decades have seen vast growth in the amount of video content generated across various platforms and spanning several genres. Apart from the ready availability of cheap portable media devices, the emergence of dedicated websites for storing and sharing videos has increased the value of audio-visual content in the eyes of the general populace. However, as with any technology, the emergence of audio-visual content as the preferred method of information sharing has brought certain hurdles with it. Prominent among these is the existence of numerous accents and dialects of any particular language. Another is the sheer number of languages in which content is generated. In addition, hearing-impaired viewers cannot receive the audio part of the content and are therefore, in most cases, unable to comprehend the presented information completely or accurately. This situation can be rectified with the aid of audio transcripts and the necessary translations.
The software currently available requires, for the most part, active user participation and is notably lacking in accuracy. The sheer quantity of audio-video information therefore makes it impossible to provide transcripts for all relevant data encapsulated in such formats. Hence, the need for an automated approach is keenly felt. This report analyses a specific path to automatically generating subtitles for videos using Google's free APIs. Due to a lack of the necessary resources, the translation component has not been included in this paper. We propose a three-step model to realize our process. The first stage extracts the audio to be transcribed from the video file and, if need be, converts it into a format compatible with the second stage. The second stage uses a speech recognition engine to convert the information from audio format to text format. The third and final stage encodes the generated text into a subtitle format, which includes adding time frames and the necessary pauses and punctuation marks. The results have been encouraging, though there is considerable room for improvement and adjustment.

Key Words: Audio Extraction, Speech Recognition, Subtitle Generation.

1. INTRODUCTION
Video is one of the most popular multimedia formats used on PCs. It is therefore essential to make the information carried by the sound in videos comprehensible to people with hearing impairments. The most natural way to do so is through subtitles. However, manual subtitle creation is tedious and requires the active participation of the user. Hence, automatic subtitle generation is a valid subject of research.

2. LITERATURE SURVEY
This section gives an overview of the concepts we drew from earlier papers on automatic subtitle generation for sound in videos. Many programming languages can be used in the creation of AutoSubGen. A brief study of online sources suggests
a focus on C/C++ and Java. We now consider the strengths of the two languages, based on the research of J.P. Lewis and Ulrich Neumann [8]. On the one hand, C++ offers remarkable advantages in speed, cross-system performance, and well-tested packages. On the other hand, Java offers an intuitive syntax, portability across multiple operating systems, and trusted libraries. Both languages are used in the Sphinx speech recognition engine. Java provides the Java Media Framework (JMF) API, which allows developers to deal with media tracks (audio and video) efficiently. According to Lewis and Neumann, Java performance is quite similar to that of C/C++. Moreover, in theory, Java performance should be higher: the absence of pointers, the use of an efficient garbage collector, and run-time compilation play a key role in Java, and their further development should soon permit it to go beyond C++ performance.

The features of the JMF API facilitate audio extraction. We present an overview of the JMF API and look more closely at the functions relevant to audio extraction. There is a useful tutorial [4] for getting started with JMF, and it is used in the following parts. The official documentation is also a pertinent resource [4]. Furthermore, a complete book describing the JMF API [3] was published in 1998. JMF gives applications the ability to output multimedia content, capture audio and video through a microphone and camera, stream media in real time over the Internet, process media, and store media in a file.

3. METHODOLOGY

3.1 Audio Extraction
This module aims to output an audio file from a media file. An activity diagram of the module is presented in Figure 1. It takes as input a file URL and, optionally, the desired audio format of the output.
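The module's contract (a media file in, a single audio file out) can be illustrated without JMF. The following is a minimal sketch only, not the authors' implementation: it assumes the external ffmpeg tool is available in place of JMF or VLC, and the function name `build_extract_command`, the 16 kHz sample rate, and the mono setting are illustrative assumptions that do not come from the paper. It only builds the command line; it does not run it.

```python
from pathlib import Path


def build_extract_command(media_path, wav_path=None, sample_rate=16000):
    """Return an ffmpeg argument list that writes the first audio track as WAV.

    Hypothetical helper for illustration; the paper itself uses JMF.
    """
    media = Path(media_path)
    # Default output: same name as the input, with a .wav extension.
    out = Path(wav_path) if wav_path else media.with_suffix(".wav")
    return [
        "ffmpeg",
        "-i", str(media),         # input media file
        "-map", "0:a:0",          # keep only the first audio stream
        "-vn",                    # make sure no video stream is written
        "-acodec", "pcm_s16le",   # 16-bit little-endian PCM (plain WAV)
        "-ar", str(sample_rate),  # sample rate: an assumption, not from the paper
        "-ac", "1",               # mono: an assumption, not from the paper
        str(out),
    ]
```

On a machine where ffmpeg is installed, the returned list could be passed to `subprocess.run(build_extract_command("lecture.mp4"), check=True)` to produce `lecture.wav`.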
The module then checks the file content and creates a list of the individual tracks composing the initial media file. An exception is thrown if the file content is irrelevant. Once the tracks are separated, the processor analyzes each track, selects the first audio track found, and discards the rest. Finally, the audio track is written to a file in the default or requested format.

At the moment, Sphinx principally supports WAV or RAW files. Obtaining the preferred format is therefore complicated and often requires various treatments and conversions. The audio extraction module provides a way to extract audio from a media file but does not offer conversion tools. Where conversion is needed, it is advisable to use the facilities of VLC to obtain suitable audio material. The module uses the Java Media Framework (JMF) for audio extraction because JMF provides a platform-neutral framework for handling multimedia [4].

Java Media Framework (JMF): JMF is used for handling streaming media in Java programs. JMF enables Java programs to:
• Present multimedia content,
• Capture audio through a microphone and video through a camera,
• Stream media instantly over the Internet,
• Process media (such as changing the media format or adding special effects),
• Store media in a file.

Figure 1. Activity Diagram for Audio Extraction.

3.2 Speech Recognition
Some speech recognition (SR) systems use "speaker-independent speech recognition", while others use "training", in which a single speaker reads sections of text into the SR system. These systems analyze the person's specific voice and use it to fine-tune the recognition of that individual's speech, which results in more accurate transcripts. Systems that do not use training are called "speaker-independent" systems. Systems that
make use of such training are called "speaker-dependent" systems.

In our project we have used the Line playback device, which feeds the audio to the Google transcriptor in the Chrome web app, and the text is generated from it. The Google transcriptor receives the voice from the audio file provided by the user through the Line playback device, converts that audio into text, and displays it on the front end. We can create a .txt or .rtf file from that text.

The various Sphinx systems [6] have been developed for nearly two decades and have proved their reliability within the several ASR systems into which they were integrated. It was therefore natural to opt for the Sphinx-4 decoder. Sphinx-4 is written entirely in Java [7], which makes it fully portable, and it provides a modular architecture that allows configurations to be modified with ease. The architecture overview was shown in the background section. We aim to take advantage of Sphinx-4 features in order to satisfy the needs of the speech recognition module of Automatic Subtitle Generation [5]. The module is expected to select the most suitable models considering the audio and the parameters (category, length, etc.) given as input, and thereby generate an accurate transcript of the audio speech. Figure 2 summarizes the organization of the speech recognition module. Speech recognition uses Hidden Markov Models (HMMs) as its underlying algorithm.

Figure 2. Activity Diagram for Speech Recognition.

3.3 Subtitle Generation
This module is expected to get a list of words and their respective speech time-frames from the speech recognition module and then produce a .srt subtitle file.
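The input and output of this module can be pictured with a small sketch. It is written under stated assumptions and is not the authors' code: the recognizer output is taken to be a list of (token, start, end) tuples with times in seconds, a silence is assumed to appear as a token named `SIL` that closes the current subtitle, and the function names are hypothetical. The output follows the common SubRip (.srt) layout of a running index, a `start --> end` line, the text, and a blank line.

```python
def format_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def words_to_srt(words, silence="SIL"):
    """Group (token, start, end) tuples into subtitles, splitting at silences."""
    cues, current = [], []
    for token, start, end in words:
        if token == silence:
            # A silence ends the subtitle collected so far (if any).
            if current:
                cues.append(current)
                current = []
        else:
            current.append((token, start, end))
    if current:
        cues.append(current)

    blocks = []
    for index, cue in enumerate(cues, start=1):
        text = " ".join(token for token, _, _ in cue)
        # A subtitle spans from its first word's start to its last word's end.
        start, end = cue[0][1], cue[-1][2]
        blocks.append(
            f"{index}\n{format_timestamp(start)} --> {format_timestamp(end)}\n{text}\n"
        )
    return "\n".join(blocks)
```

For example, `words_to_srt([("hello", 0.5, 0.9), ("world", 1.0, 1.4), ("SIL", 1.4, 2.0), ("again", 2.0, 2.5)])` yields two subtitles, "hello world" and "again", and the returned string can be written directly to a `.srt` file.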
To produce the subtitle file, the module scans the list of words and uses silence (SIL) utterances as boundaries between two consecutive sentences. Figure 3 offers a visual representation of the steps followed in generating subtitles for sound in videos. However, we face a limitation: our system cannot determine punctuation, since doing so involves much more speech analysis and a deeper design.

Figure 3. Activity Diagram for Subtitle Generation.

4. REQUIREMENTS OF PROPOSED SYSTEMS
The requirements of our proposed system are given in Table I.
TABLE I. REQUIREMENT SPECIFICATIONS
No | Technology used | Description
1 | 1 GB RAM | Storing
2 | Java Speech API | It allows the incorporation of speech technology into user interfaces for Java-based applets and applications. This API defines a cross-platform interface to support command-and-control recognizers and speech synthesizers. Sun [12] has also developed the JSGF (Java Speech Grammar Format) to supply a cross-platform grammar for speech recognizers.
3 | JDK | For writing Java applets and applications and running them.
4 | Apache Tomcat | To manage sessions as well as applications across the network.
5 | MySQL Database | Storing audio- and video-related text files.
6 | NetBeans | The IDE used to run the whole project.

5. CONCLUSIONS
A complete system including the required modules could not be realized, since the audio conversion needs more resources. VLC gave a suitable solution, but a custom-coded Java component is expected in future projects so that porting and installing the system become uncomplicated. Nonetheless, the expected output for each phase has been reached. The audio extraction module supplies a suitable audio format that can be used by the speech recognition module. Speech recognition produces a list of recognized words and their corresponding time-frames in the audio content, although accuracy cannot be guaranteed. This list is then used by the subtitle generation module to create a standard subtitle file that is readable by the most common media players. At present, subtitles are generated only for a specific audio file format, but in future this restriction will be removed. English subtitles for regional languages will be generated, and the accuracy of the project will be improved.
In future, the currently offline system will be web-based, so that the user can access it and generate subtitles for a video conveniently. Overall, it will be accurate, less time-consuming, free of language barriers, and web-based.

REFERENCES
[1] Boris Guenebaut, "Automatic Subtitle Generation for Sound in Videos", Tapro 02, University West, pp. 35, 2009.
[2] Zoubin Ghahramani, "An introduction to hidden Markov models and Bayesian networks", World Scientific Publishing Co., Inc., River Edge, NJ, USA, pp. 30, 2001.
[3] Robert Gordon, Stephen Talley, and Rob Gordon, Essential JMF: Java Media Framework. Prentice Hall PTR, pp. 17, 1998.
[4] Java(TM) Media Framework API Guide, 1999. URL: http://java.sun.com/javase/technolog/desktp/mdia/jmf/2.1.1/guide/index.html
[5] Frederick Jelinek, Statistical Methods for Speech Recognition. MIT Press, pp. 7, 1999.
[6] L. Besacier, C. Bergamini, D. Vaufreydaz, and E. Castelli, "The effect of speech and audio compression on speech recognition performance", in IEEE Multimedia Signal Processing Workshop, Cannes, France, 2001. URL: http://hal.archives-ouvertes.fr/docs/00/32/61/65/PDF/Besacier01b.pdf
[7] Sphinx-4: a speech recognizer written entirely in the Java(TM) programming language, pp. 5, 1999-2008. URL: http://cmusphinx.sourceforge.net/sphinx4/
[8] J.P. Lewis and Ulrich Neumann, Performance of Java versus C++. URL: http://www.idiom.com/~zilla/Computer/javaCbenchmark.html