In compact form, this presentation shows how a machine learning approach based on classification techniques can support effective and efficient interaction.
Question Answering System using machine learning approach
1. A WEB BASED BILINGUAL QUESTION ANSWERING SYSTEM USING MACHINE LEARNING APPROACH
BY: GARIMA NANDA
2. Agenda
Introduction
Applications
Motivation
Problem Statement
How QAS is different from SE
Architecture
Implementation
Experimental Results
Conclusion and Future Work
References
3. Introduction
Natural Language Processing (NLP) is a field of computer science and computational linguistics concerned with the interaction between computer systems and human beings.
A Question Answering System is essentially an information retrieval (IR) system in which a query is stated to the system, and it retrieves the correct or closest results for the specific query asked in natural language.
It is an outgrowth of the Natural Language Interface to Database (NLIDB).
The main aim of QA is to present the user with a short answer to a question rather than a list of possibly relevant documents.
5. Applications
Applications of Natural Language Processing:
Speech recognition.
Artificial intelligence and expert systems.
Natural language interface to database.
Fields in which QAS is employed:
Agriculture: an interface for providing agriculture-related information to farmers.
Sports: can be used at school level for students.
Railways: for providing automated information regarding railways or flights.
Some existing QA systems in specific domains:
BASEBALL
LUNAR
START
SHRDLU
6. Motivation
As it becomes more and more difficult to find answers on the WWW using standard search engines, question answering technology will become increasingly important.
Greater relevance in finding the correct answer.
Potentially allows everyone to participate in today’s information revolution.
Benefits people who do not know a formal query language.
Finds answers to general knowledge questions.
The accuracy of the QA system can be measured.
Finds answers to questions such as: Who? What? Where? How? and क्या (what), कब (when), कौन (who), कैसे (how), etc.
7. Problem Statement
To design a graphical user interface that can generate bilingual natural language output.
To design a QAS using a general knowledge database.
To develop a system that automatically maps misspelled words and jumbled phrases in the query to the intended words.
To develop a system that gives correct and concise results for the questions asked.
To evaluate the performance and accuracy of the overall system.
To apply the best similarity measure in the system so that queries can be handled in many ways.
8. How QAS is different from SE
QA System | Search Engine
Query in Natural Language (Question) | Queries based on keywords
Present answers to users | Users find answers from retrieved results
Some NL process used to determine results | Mostly keywords and ranking to retrieve results
10. Architecture contd…
Three main phases of the architecture, along with the Knowledge Base (SWD + Entities + Trained Data):
Phase I: Accessing Natural Query Phase
Phase II: Feature Extraction Phase
Phase III: Classification Phase
11. Architecture contd…
Accessing NL Query Phase
The nature or type of the query is identified.
The query statement is read, processed, and tokenized.
The bag-of-words is removed from the query, resulting in tokens.
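The steps of this phase can be sketched roughly as follows. This is a minimal illustration, not the presented system's code: it assumes that the "bag-of-words" removed from the query refers to common stop words, and the stop-word list, the question-word map, and the function names are all hypothetical.

```python
# Hypothetical stop-word list; the real system's list is not given in the slides.
STOP_WORDS = {"the", "is", "of", "a", "an", "in", "to", "की", "है"}

# Hypothetical map from question words (English and Hindi) to a coarse query type.
QUESTION_WORDS = {
    "who": "person", "where": "place", "when": "time",
    "what": "thing", "how": "manner",
    "कौन": "person", "कब": "time", "क्या": "thing", "कैसे": "manner",
}

PUNCTUATION = "?.,!।"


def tokenize(query):
    """Lower-case the query, split on whitespace, and strip punctuation."""
    words = (w.strip(PUNCTUATION) for w in query.lower().split())
    return [w for w in words if w]


def query_type(tokens):
    """Return the type of the first question word found, else 'unknown'."""
    for token in tokens:
        if token in QUESTION_WORDS:
            return QUESTION_WORDS[token]
    return "unknown"


def remove_stop_words(tokens):
    """Drop tokens that appear in the (assumed) stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


tokens = tokenize("Who is the Prime Minister of India?")
print(query_type(tokens))
print(remove_stop_words(tokens))
```

Splitting on whitespace rather than on a word-character pattern keeps Devanagari words intact, since their vowel signs are combining marks that a simple `\w+` pattern would split on.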
12. Architecture contd…
Feature Extraction Phase
Features are extracted from the query, i.e. known keywords are extracted using similarity.
The data is trained here, resulting in a feature vector.
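One way to picture similarity-based keyword extraction is the sketch below. It is only an assumption about how such a step could work: the slides do not name the similarity function, so Python's standard `difflib` ratio is swapped in here, and the keyword vocabulary, the cutoff value, and the function name are hypothetical.

```python
from difflib import get_close_matches

# Hypothetical vocabulary of known keywords; each position is one feature.
KNOWN_KEYWORDS = ["capital", "india", "minister", "president", "prime"]


def extract_features(tokens, cutoff=0.8):
    """Build a binary feature vector over KNOWN_KEYWORDS.

    Each token is matched to its most similar known keyword, so a
    misspelled token can still switch on the intended feature. The
    cutoff of 0.8 is an illustrative value, not one from the slides.
    """
    vector = [0] * len(KNOWN_KEYWORDS)
    for token in tokens:
        match = get_close_matches(token, KNOWN_KEYWORDS, n=1, cutoff=cutoff)
        if match:
            vector[KNOWN_KEYWORDS.index(match[0])] = 1
    return vector


# "minster" is misspelled and the word order is jumbled.
print(extract_features(["india", "minster", "prime"]))
```

Because the vector records only which keywords are present, a jumbled phrase produces the same features as the correctly ordered one.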
13. Architecture contd…
Classification Phase
In the classification phase, the feature vector (FV) is compared to the feature space (i.e. the trained data), resulting in labels.
The labels directly represent the answer to the query.
In this way the answer is predicted after comparison, and the result is displayed.
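The comparison of a feature vector against trained data can be sketched as a nearest-match lookup. The slides do not state which classifier or similarity measure the system uses, so cosine similarity is an assumption here, as is applying the 0.9 threshold at this step; the trained vectors and label names are hypothetical placeholders.

```python
import math

# Hypothetical trained data: (feature vector, label) pairs.
TRAINED_DATA = [
    ([0, 1, 1, 0, 1], "LABEL_PRIME_MINISTER"),
    ([1, 1, 0, 0, 0], "LABEL_CAPITAL"),
    ([0, 1, 0, 1, 0], "LABEL_PRESIDENT"),
]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def classify(feature_vector, threshold=0.9):
    """Return the label of the most similar trained vector, or None
    when even the best match falls below the threshold."""
    best_label, best_score = None, 0.0
    for vector, label in TRAINED_DATA:
        score = cosine_similarity(feature_vector, vector)
        if score > best_score:
            best_label, best_score = label, score
    return best_label if best_score >= threshold else None


print(classify([0, 1, 1, 0, 1]))
print(classify([0, 1, 0, 0, 0]))
```

Returning `None` below the threshold lets the interface report that no answer was found instead of displaying a weak match.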
14. Implementation
A web based graphical user interface has been implemented to increase the system's scope and portability.
An autocomplete feature has been implemented alongside the query input text box.
The implemented system accepts queries with misspelled words and jumbled phrases.
A machine learning approach has been used to train the system, which has increased both its accuracy and its efficiency.
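As an illustration of the autocomplete feature, a simple prefix lookup over stored questions might look like the sketch below. The slides do not describe how autocomplete is implemented, so this is only one plausible approach; the question list and the function name are hypothetical.

```python
# Hypothetical stored questions the interface can suggest.
KNOWN_QUESTIONS = [
    "what is the capital of india",
    "who is the president of india",
    "who is the prime minister of india",
]


def autocomplete(prefix, limit=5):
    """Return up to `limit` stored questions that start with the typed prefix."""
    prefix = prefix.lower().strip()
    if not prefix:
        return []
    return [q for q in KNOWN_QUESTIONS if q.startswith(prefix)][:limit]


print(autocomplete("Who is the p"))
```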
15. Implementation contd…
The web based graphical user interface for the QAS is shown below with the help of a snapshot.
[Snapshot: Language Selection Screen]
22. Experimental Results
Test Set    Total Questions    Correct Answers    Overall Accuracy (%)
TS1         25                 23                 92
TS2         50                 44                 88
TS1 and TS2 are test sets.
TS1 contains questions asked by a user who knows the domain; TS2 contains questions asked by a user who is unaware of the domain.
Overall accuracy is computed when a complete test set is given as input to the QA system.
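The accuracy figures in the table are consistent with taking overall accuracy as the percentage of correctly answered questions in a test set. A small check under that assumption (the function name is hypothetical):

```python
def overall_accuracy(correct, total):
    """Percentage of questions in a test set that were answered correctly."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * correct / total


print(overall_accuracy(23, 25))  # TS1
print(overall_accuracy(44, 50))  # TS2
```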
24. CONCLUSION AND FUTURE WORK
The question answering system for Hindi natural language gives a broad picture of a QA system, with an overall accuracy of 92% and a threshold of 0.9.
The concepts of overall accuracy and similarity are used here.
This is far better than the concepts used in earlier systems.
The system has been made web based and provided with an AutoComplete feature, to increase its scope and to make it more user friendly, respectively.
Our system removes the limitations of existing work.
Future work: making the system multilingual by incorporating other languages, such as English, French, and Spanish, along with these two languages.
Further, different classifiers can be tested on our system.
25. References
Sunil A. Khillare, Bharat A. Shelke, and C. Namrata Mahender, "Comparative Study on Question Answering Systems and Techniques," International Journal of Advanced Research in Computer Science and Software Engineering, vol. 4, pp. 775-778, November 2014.
Jovita, Linda, Andrei Hartawan, Derwin Suhartono, "Using Vector Space Model in Question Answering System," International Conference on Computer Science and Computational Intelligence (ICCSCI 2015), ScienceDirect, pp. 305-311.
Show-Jane Yen, Yu-Chieh Wu, Jie-Chi Yang, Yue-Shi Lee, Chung-Jung Lee, Jui-Jung Liu, "A support vector machine-based context-ranking model for question answering," Information Sciences, ScienceDirect, pp. 77-87, 2013.
Asma Ben Abacha, Pierre Zweigenbaum, "MEANS: A medical question-answering system combining NLP techniques and semantic Web technologies," Information Processing and Management, ScienceDirect, pp. 570-594, 2015.
Sneha Bagde, Mohit Dua, and Zorawar Singh Virk, "Comparison of Different Similarity Functions on Hindi QA System," International Conference on ICT for Sustainable Development (ICT4SD), Springer, pp. 657-663, vol. 408, February 2016.
26. Smith Mahboob Alam Khalid, Valentin Jijkoun and Maarten de Rijke, "Machine Learning for Question Answering from Tabular Data," 18th International Workshop on Database and Expert Systems Applications, IEEE, 2007.
Sanjay K Dwivedi, Vaishali Singh, "Research and Reviews in Question answering system," International Conference on Computational Intelligence: Modelling Techniques and Applications, ScienceDirect, pp. 417-424, 2013.
Er. Amit Chaudhary, Er. Annu Battan, "Natural Language Interface to Databases - An Introduction," International Journal of Advanced Research in Computer Science and Software Engineering, Vol. 4, Issue 7, July 2014.
Rajender Kumar, Mohit Dua, Shivani Jindal, "D-HIRD: Domain Independent Hindi Language Interface to Relational Database," International Conference on Computation of Power, Energy, Information and Communication (ICCPEIC), IEEE, 2014.
Mohit Dua, Sandeep Kumar, Zorawar Singh Virk, "Hindi Language Graphical User Interface to Database Management System," International Conference on Machine Learning and Applications, IEEE, 2013.