2. Introduction
2.1 Definition
As the name suggests, our application aims to integrate voice
recognition into the operating system. The application lets the user
control a laptop or PC using voice commands. If the device is
connected to a network, the commands can also be sent to it from an
Android device.
2.2 Objective
The objective of the application is to give the user full access to the
device without touching it. The application takes voice input from the
user and uses open source libraries or services to obtain the most
accurate string for what the user said. It then compares that string
with the command set and performs the action associated with the
matching voice command.
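The matching step described above can be sketched in Java as a simple
lookup table. This is only an illustration: the class name
CommandMatcher, the normalization rules and the sample action
“powerpnt.exe” are assumptions, not taken from the project code.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: maps a recognized phrase to the action that should run.
final class CommandMatcher {
    private final Map<String, String> commands = new LinkedHashMap<>();

    // Register a spoken phrase together with the action it stands for.
    void register(String phrase, String action) {
        commands.put(normalize(phrase), action);
    }

    // Return the action for the recognized text, or empty if nothing matches.
    Optional<String> match(String recognizedText) {
        return Optional.ofNullable(commands.get(normalize(recognizedText)));
    }

    // Trim, lower-case and collapse whitespace so that small differences
    // in the recognizer output do not prevent a match.
    private static String normalize(String text) {
        return text.trim().toLowerCase(Locale.ROOT).replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        CommandMatcher matcher = new CommandMatcher();
        matcher.register("open powerpoint", "powerpnt.exe"); // assumed sample action
        System.out.println(matcher.match("  Open   PowerPoint ").orElse("no match"));
    }
}
```

An exact lookup keeps the sketch short; a real recognizer may need a
fuzzier comparison, since the recognized string is not always an exact
match for a registered command.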
2.3 Scope
1) Access to the OS for differently abled persons:
Providing operating system access through voice recognition can help
blind users and people with limb impairments. The system needs only
voice input, so these users can operate it as well as anyone else.
2) Remote access to the OS without any physical touch:
Voice control lets the user operate the system without any physical
contact. It makes the work easier, more interesting and more
interactive.
3) Access to the system for a lay person:
You don’t need to learn any technical details to use the system.
You interact with it using simple phrases that are part of
day-to-day life. It is as simple as a conversation with another human
being.
4) More interesting interaction with the system:
Working remotely without any physical touch and commanding
your system by voice is more interactive than pressing keys and moving
the mouse.
5) Faster access:
You do not have to walk up to the system or go through every
intermediate step to reach the final output, which saves time.
E.g. compare these two scenarios:
- Walk to the system,
Open Microsoft Office PowerPoint,
File->Open,
Browse to your presentation,
Open it.
- Say “Open PowerPoint”,
Say “Open minor Project Presentation”.
2.4 Project Profile
Project Title: Voice control in OS
Objective: This application allows the user to control the OS using
voice commands. Voice commands can be given to the PC or an Android
application.
Organization: Bugs Computer & Systems
Operating System: Windows
User Interface: No UI for PC application. Android application uses UI.
Internal Guide: Prof. Jigna Patel, Dr. Kamal Mehta, Prof. Prajakta Rathod
External Guide: Mr. Pradeep Wadekar
Submitted By: Saurabh Gaur (10BIT033), Sarang Singhal (10BIT022),
Fahim Sheikh (09BIT155)
Submitted To: Department of Computer Engineering, Institute of
Technology, Nirma University, Ahmedabad.
2.5 Project Team
Saurabh Gaur (10BIT033)
Sarang Singhal (10BIT022)
Fahim Sheikh (09BIT155)
2.6 Project Scheduling
3. System Analysis
3.1 Feasibility Study
A feasibility study is a short, focused study that aims to answer a
number of questions:
1) How efficient is the voice recognition? Does it actually give the
user faster remote access?
2) Is the voice recognition facility a burden to process?
3) Is the process really user friendly, or does the user end up
memorizing the commands? Can we add user personalization?
3.1.1 Operational feasibility
Operational feasibility measures how well the solution will work in the
market and how the end user will feel about the system.
The proposed application helps the user control the PC without touching
it. Using the PC’s microphone as the command receiver does not help much
in many cases. The PC’s microphone generally captures accurate strings
only within a range of about 4 meters, so it supports close-range
control but not the idea of controlling the device flexibly. The user
might end up repeating a command 4-5 times before the application
catches it correctly.
To overcome this range issue, an Android application was built that
communicates with the device over the network. The user has to enter
only the IP address of the machine, and the application transfers the
string from the phone to the PC being controlled.
This extension brings the following advantages:
1) The microphone in the phone is of much better quality than the one
in the PC/laptop, so the captured strings are more accurate.
2) The PC/laptop microphone has a 4-5 meter range. Connecting an
Android device increases the range: commands can be sent from
anywhere within range of the Wi-Fi network.
3) Adding a new device to the setup leads to a new idea: an Android
application that controls both the PC and the phone itself.
3.1.2 Technical feasibility
The main functionality needed to implement the application is the voice
recognition module. We tested open source libraries and services
to find those that return accurate strings.
Building a whole new voice recognition engine could be a separate
project in itself, so it is better to use the available open source
libraries.
3.1.3 Financial and Economic feasibility
Economic feasibility looks at the financial aspects of the project and
is concerned with the returns on the investment in it. It determines
whether it is worthwhile to invest money in the proposed system; it is
not worthwhile to spend a lot of money on a project that yields no
returns. To carry out an economic feasibility study for a system, it is
necessary to place an actual monetary value against every activity
needed to implement the project.
3.2 Requirement Specification
Java application on PC/laptop:
For some tasks, administrative privileges are needed.
Android application:
It uses the Wi-Fi services of the phone and the Google voice recognition
services. Android versions 4+ have the voice recognition engine
installed by default.
4. System Design
4.1 Component of the application
4.1.1 PC application
This application uses no UI. Its logical components are given below.
The application works in the background. It captures the user’s voice
commands and generates a string output. The string is then
compared with the available commands and, once a match is found, the
command lines for the corresponding task are executed.
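The last step, running a command line once a command has been matched,
could look like the following minimal Java sketch. The class name
CommandRunner and the use of “cmd /c start” to launch a program on
Windows are assumptions; the report does not show how the project
builds or runs its command lines. The sketch also assumes program names
without spaces.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: turns a matched program name into a Windows
// command line and starts it.
final class CommandRunner {

    // Build the command line; launching through "cmd /c start" is an
    // assumed approach, not the project's confirmed method.
    static List<String> buildCommandLine(String program) {
        return Arrays.asList("cmd", "/c", "start", program);
    }

    // Start the process and return immediately without waiting for it.
    static Process run(List<String> commandLine) throws IOException {
        return new ProcessBuilder(commandLine).start();
    }

    public static void main(String[] args) throws IOException {
        // Only meaningful on Windows, where cmd.exe is available.
        run(buildCommandLine("notepad.exe"));
    }
}
```

Keeping the command-line construction separate from the process launch
makes it possible to check the generated command without actually
starting a program.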
4.1.2 Android Application
Application Logo
Application trigger: By pressing this button, the user can initiate
command recognition.
Choice: Radio buttons are provided to select the device being
controlled.
IP address of the server machine: The text box becomes visible when the
PC is being controlled by the mobile over the network. The user has to
enter the IP address of the device.
Google Voice services:
In the PC version of the application we use the Sphinx
open source libraries. On mobile we found the Google
voice recognition services more efficient.
Benefits:
- Better dictation.
- Noise avoidance.
- Multiple suggestions.
- Flexible data structure.
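The “multiple suggestions” benefit can be used as in the Java sketch
below: given several candidate strings for one utterance, take the
first that is a known command. On Android such a list would typically
come from RecognizerIntent.EXTRA_RESULTS, but the report does not say
which API the project uses, so that is an assumption, and the class
name SuggestionPicker is hypothetical.

```java
import java.util.List;
import java.util.Locale;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch: picks the first recognizer suggestion that is a
// known command.
final class SuggestionPicker {

    // Suggestions are assumed to be ordered from most to least likely.
    // knownCommands is assumed to hold lower-case command phrases.
    static Optional<String> firstKnown(List<String> suggestions, Set<String> knownCommands) {
        for (String suggestion : suggestions) {
            String candidate = suggestion.trim().toLowerCase(Locale.ROOT);
            if (knownCommands.contains(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }
}
```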
Java Server Application for your Windows PC/Laptop:
The second file is for the PC, which works as the server in the Mantra
system. The file name is “Mantra.bat”.
Step 1:
It is a batch file. Double-click the file, and a window will pop up at
the top-right corner of your desktop.
(You may see a command prompt window flash briefly. You don’t have to
worry about that; it is caused by the “bat” file.)
Figure 1
Here is a sample of what your desktop will look like:
As you can see, after running the “Mantra.bat” file you have this
interface on your PC screen:
At this point the server is not running, so the server’s IP is shown as
0.0.0.0, and no client is connected, so the client field is blank.
Step 2:
There is a button named “Start Server” on the Mantra window.
Click it to start the server.
After clicking “Start Server” you will see the following interface:
When you get this screen, it means your server is running.
You will get your server’s IP, which you can use in the Android
application.
Figure 2
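The report does not show how the server window obtains the IP address
it displays. One possible way, given here only as a sketch with the
hypothetical class name LocalAddresses, is to list the IPv4 addresses
of the active non-loopback network interfaces using the standard
java.net.NetworkInterface class.

```java
import java.net.Inet4Address;
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

// Hypothetical sketch: collects the IPv4 addresses a client on the
// same network could use to reach this machine.
final class LocalAddresses {

    static List<String> ipv4Addresses() throws SocketException {
        List<String> result = new ArrayList<>();
        Enumeration<NetworkInterface> interfaces = NetworkInterface.getNetworkInterfaces();
        if (interfaces == null) {
            return result; // no network interfaces available
        }
        for (NetworkInterface nic : Collections.list(interfaces)) {
            // Skip interfaces that are down and the loopback interface.
            if (!nic.isUp() || nic.isLoopback()) {
                continue;
            }
            for (InetAddress address : Collections.list(nic.getInetAddresses())) {
                if (address instanceof Inet4Address) {
                    result.add(address.getHostAddress());
                }
            }
        }
        return result;
    }

    public static void main(String[] args) throws SocketException {
        System.out.println(ipv4Addresses());
    }
}
```

A machine with several network adapters will report several addresses;
the one on the same Wi-Fi network as the phone is the one to enter in
the Android application.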
Step 3:
As you send commands from your Android device to the PC, a
client-server connection is established, and you will then see the IP
of the client connected to the server.
Finally, while it is running, your desktop Java Mantra Server
application will look like this:
Extras:
You can minimize the window.
Clicking the red cross stops the server.
If you only want to stop the server and not the application, just
click “Stop Server”.
Implementation:
The following are the methods and logic behind this functionality,
including the client-server connection, receiving commands from Android
devices, and executing those commands on the operating system.
Connection between Android Device and Windows System:
Socket programming is the technology used to connect the Android
device and the Java application.
The endpoint in an inter-process communication is called a socket,
or a network socket for disambiguation. Since most communication
between computers is based on the Internet Protocol, an almost
equivalent term is Internet socket. The data transmission between two
sockets is organized by communications protocols, usually implemented
in the operating system of the participating computers. Application
programs write to and read from these sockets. Therefore, network
programming is essentially socket programming. (Source: Wikipedia)
Here the Android device works as the client and the desktop Java
application works as the server.
Server Side Socket programming:
This is the code behind the server-side socket programming.
The important points about the code are:
Pass only the port number to create the server socket.
Using the server socket, accept the client’s socket when a request
arrives with a valid IP and port address.
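The two points above can be illustrated with the following Java sketch.
It is not the project’s actual code: the class and method names are
hypothetical, and port 8080 is taken from the client-side description
in this report.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical sketch of the server-side socket setup.
final class MantraServerSketch {
    static final int PORT = 8080; // port named in the client-side description

    // Only the port is needed to create the listening socket.
    static ServerSocket open(int port) throws IOException {
        return new ServerSocket(port);
    }

    // Blocks until a client connects, then returns the client's socket.
    static Socket acceptClient(ServerSocket server) throws IOException {
        return server.accept();
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket server = open(PORT);
             Socket client = acceptClient(server)) {
            System.out.println("Client connected: "
                    + client.getInetAddress().getHostAddress());
        }
    }
}
```

The address returned by getInetAddress() on the accepted socket is the
client IP that the server window can display once a connection is
established.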
Client Side Socket Programming:
This is the code behind the client-side socket programming.
The points to note about the code are:
An InetAddress instance defines the server’s IP.
The port number is currently fixed (it is 8080).
Both the address and the port are passed to the Socket constructor to
establish the connection.
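A minimal Java sketch of these three points follows. The class and
method names are hypothetical and the code is an illustration under
those assumptions, not the project’s client code. On Android, network
calls like this must run off the main thread.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.Socket;

// Hypothetical sketch of the client-side connection.
final class MantraClientSketch {
    static final int PORT = 8080; // fixed port, as stated in the report

    // Resolve the server IP entered by the user and connect to it.
    static Socket connect(String serverIp, int port) throws IOException {
        InetAddress serverAddress = InetAddress.getByName(serverIp);
        return new Socket(serverAddress, port);
    }

    public static void main(String[] args) throws IOException {
        // The IP is expected as the first argument, e.g. the address
        // shown in the server window.
        try (Socket socket = connect(args[0], PORT)) {
            System.out.println("Connected to " + socket.getRemoteSocketAddress());
        }
    }
}
```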
Sending and Receiving commands:
For communication between the Android and Windows devices, we use the
sockets described above to send and receive messages.
Messaging at Client Side:
This is the code behind sending messages from the client to the
server.
The points to note about the code are:
To send a message to the server, get the output stream from the
client’s socket.
Using a buffered writer and a print writer, for easy writing, we send
the message to the server side.
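The sending step can be sketched in Java as follows. The class name
CommandSender, the one-command-per-line framing and the UTF-8 encoding
are assumptions made for the illustration. The method accepts any
output stream, so in the application it would be given
socket.getOutputStream().

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: writes one command per line to an output stream.
final class CommandSender {

    static void send(OutputStream out, String command) throws IOException {
        // PrintWriter over a BufferedWriter, with auto-flush on println.
        PrintWriter writer = new PrintWriter(
                new BufferedWriter(new OutputStreamWriter(out, StandardCharsets.UTF_8)),
                true);
        writer.println(command);
        // PrintWriter swallows I/O errors, so check for them explicitly.
        if (writer.checkError()) {
            throw new IOException("failed to send command: " + command);
        }
    }
}
```

Ending each command with a line break lets the receiving side read
whole commands with a line-based reader.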
Messaging at Server Side:
Here is the code behind receiving messages from the client.
The points to note about the code are:
Using the client socket, first get the input stream (standard IO).
To read data from the input stream as required, we use a
buffered reader.
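The receiving step can be sketched in Java in the same way. The class
name CommandReceiver and the assumption that each command arrives as
one UTF-8 line are illustrative only. In the application the input
stream would be the one returned by the accepted client socket.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: reads line-based commands from an input stream.
final class CommandReceiver {

    // Reads commands until the sender closes the stream.
    static List<String> readCommands(InputStream in) throws IOException {
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8));
        List<String> commands = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            commands.add(line);
        }
        return commands;
    }
}
```

A long-running server would more likely handle each line as it
arrives instead of collecting them; the list is used here only to keep
the sketch easy to check.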
Testing
Introduction:
Software testing has a dual function: it is used to establish the
presence of defects in a program, and it is used to help judge whether
or not the program is usable in practice. Thus software testing is used
for validation and verification, which ensure that the software
conforms to its specification and meets the needs of the software
customer.
The developers resorted to alpha testing, which usually comes in after
the basic design of the program has been completed. The project
scientist looks over the program and gives suggestions and ideas to
improve or correct the design. They also report any major problems and
give ideas for getting around them. There are bound to be a number of
bugs after a program has been created.
Software inspections analyze and check system representations such as
the requirements document, design diagrams and the program source code.
They may be applied at all stages of the process.
Testing Plan:
The analysis and design department performs all analysis for the
system and forwards the test cases, the flow of the system and the
scope of the system to the development department.
The development department implements all the forms and sends them to
the QA department for testing.
The QA department checks the forms against the test cases and also
performs integration testing. If any error or bug is found, the form is
returned to the development department; otherwise it is sent to the
analysis department.
The development department receives bug reports and, after
completing the modules currently in progress, resolves them.
The analysis department receives the error-free forms and stores them
permanently.
After form-level testing is complete, system integration testing
starts.
Thus the system is tested in each cycle and then developed further.
Testing Strategy:
The development process repeats this testing sub-process a number
of times for the following phases.
Unit Testing.
Integration Testing.
System Testing.
Acceptance Testing.
Test Cases:
The following are a few simple test cases from the set prepared for
testing the system. Using these cases we can check that the system is
free of the bugs and errors they cover. The tests were carried out with
sample data.
Case No. 1
Name: Voice to text in Desktop Java Application - Pass
Description: The test will check the code converting the voice commands
into text.
Test Data Used: Laptop’s microphone and voice commands.
Expected Output: String of the voice command.
Actual Output: String of the voice command.
Pass/Fail: PASS
Case No. 2
Name: Voice to text in Desktop Java Application - Fail
Description: The test will check the code converting the voice commands
into text.
Test Data Used: Laptop’s microphone and voice commands.
Expected Output: Give the best match case and continue listening.
Actual Output: Text somewhat similar to the voice command.
Pass/Fail: PASS
Case No. 3
Name: Voice to text in Android Application - Pass
Description: The test will check the code converting the voice commands
into text.
Test Data Used: Mobile’s microphone and voice commands.
Expected Output: String of the voice command.
Actual Output: String of the voice command.
Pass/Fail: PASS
Case No. 4
Name: Voice to text in Android Application - Fail
Description: The test will check the code converting the voice commands
into text.
Test Data Used: Mobile’s microphone and voice commands.
Expected Output: Give the best match case and continue listening.
Actual Output: Text somewhat similar to the voice command.
Pass/Fail: PASS
Case No. 5
Name: Connection between Mobile and PC
Description: The test will check whether the PC and the mobile are
getting connected with each other or not.
Test Data Used: IP address of the server.
Expected Output: “Hi” text at server side.
Actual Output: “Hi” received at server.
Pass/Fail: PASS
Case No. 6
Name: Final testing, the complete working.
Description: The test will check the complete working of the project,
i.e. receiving voice commands and converting them into text at the
client (Android mobile), sending the text to the PC and executing the
command.
Test Data Used: Voice commands.
Expected Output: Execution of the command at server side.
Actual Output: Command executed at server side.
Pass/Fail: PASS
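A check like Case No. 5 can also be automated on a single machine. The
Java sketch below is an assumption about how such a check might be
written, not the test the team actually ran: it opens a server socket
on the loopback interface, connects a client to it, sends one line and
returns what the server side reads. The class name LoopbackCheck is
hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: sends one message over a loopback connection
// and returns what the server side received.
final class LoopbackCheck {

    static String roundTrip(String message) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the OS for any free port, so the check cannot
        // collide with a running server.
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, server.getLocalPort());
             Socket accepted = server.accept()) {
            PrintWriter out = new PrintWriter(
                    new OutputStreamWriter(client.getOutputStream(), StandardCharsets.UTF_8),
                    true);
            out.println(message);
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(accepted.getInputStream(), StandardCharsets.UTF_8));
            return in.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip("Hi"));
    }
}
```

This covers only the socket path. The voice recognition steps in the
other cases still need a microphone and a human speaker.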
APPENDIX
Tools Used:
Eclipse.
NetBeans.
Sphinx Voice to Text Libraries.
Google Voice to Text Libraries.