IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308
__________________________________________________________________________________________
Volume: 03 Issue: 03 | Mar-2014, Available @ http://www.ijret.org 527
SEMANTIC ANALYZER FOR MARATHI TEXT
Pallavi Bagul1, Prachi Mahajan2, Archana Mishra3, Medinee Kulkarni4, Gauri Dhopavkar5
1 Computer Technology, YCCE Nagpur- 441110, Maharashtra, India
2 Computer Technology, YCCE Nagpur- 441110, Maharashtra, India
3 Computer Technology, YCCE Nagpur- 441110, Maharashtra, India
4 Computer Technology, YCCE Nagpur- 441110, Maharashtra, India
5 Computer Technology, YCCE Nagpur- 441110, Maharashtra, India
Abstract
This paper presents a Semantic Analyzer that checks the semantic correctness of a given input text. The system
analyzes the text by comparing it with the meanings of the words given in the WordNet. The Semantic Analyzer thus
developed not only detects and displays semantic errors in the text but also corrects them.
Keywords: Part of Speech (POS) Tagger, Morphological Analyzer, Syntactic Analyzer, Semantic Analyzer, Natural
Language (NL)
-----------------------------------------------------------------------***----------------------------------------------------------------------
1. INTRODUCTION
Natural Language (NL) is an essential tool for representing
information, yet a computer is not able to understand NL.
Understanding is a mental process: it is how human beings
recognize objects (mental and physical) and the links between
them. Since a computer does not have a human mentality, it
cannot, by definition, understand. NL processing (NLP) is a
general problem; to be more specific, we can divide it into
categories according to the increasing level of complexity of
the processing:
• Morphology and morphological processing
• Syntax and syntactical processing
• Semantics and semantic processing
Morphology is a sub-discipline of linguistics that studies word
structure. During morphological processing we consider the
words in a text separately and try to identify the
morphological classes to which these words belong. One of
the most widespread tasks here is lemmatizing or stemming,
which is used in many web search engines. In this case all
morphological variations of a given word (known as
word-forms) are collapsed to one lemma or stem [5].
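As a minimal sketch of how word-forms can be collapsed to one
stem, the Python snippet below strips a few English suffixes.
The suffix list and the name simple_stem are illustrative
assumptions and are not the two-level method of [5]; practical
lemmatizers rely on a lexicon and morphological rules.

```python
# Toy suffix-stripping stemmer. SUFFIXES is an illustrative assumption,
# not a complete description of English morphology.
SUFFIXES = ("ing", "ed", "es", "s")


def simple_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


forms = ["walk", "walks", "walked", "walking"]
stems = {simple_stem(w) for w in forms}
print(stems)  # all four word-forms collapse to the single stem 'walk'
```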
Syntax, as part of grammar, describes how words are grouped
and connected to each other in a sentence. Syntactic
processing usually entails the transformation of a linear
sequence of tokens into a hierarchical syntax tree. A token is
akin to an individual word or punctuation mark in a natural
language. The main problems at this level are part-of-speech
(POS) tagging, chunking or detecting syntactic categories
(verb phrases, noun phrases) and sentence assembling
(constructing the syntax tree) [8].
Semantics, understood as the study of meaning, covers the
most complex tasks, such as finding synonyms, word sense
disambiguation, constructing question-answering systems,
translating from one NL to another and populating a
knowledge base. One generally needs to complete
morphological and syntactic analysis before trying to solve
any semantic problem. Formalization of NL leads us to
solutions of all these problems [3].
2. APPROACHES
Broadly speaking, Semantic Analysis can be carried out in
three basic ways, which are classified as follows:
2.1 Supervised Semantic Analysis
The supervised models require a pre-annotated corpus which
is used for training the application so as to learn information
about the various words, their tags, frequencies, rule sets etc.
The performance of the model generally increases when we
increase the size of the corpus. Such annotated resources are
scarce and expensive to create, motivating the need for
unsupervised or semi-supervised techniques.
2.2 Unsupervised Semantic Analysis
Unsupervised approaches do not depend on a pre-annotated
corpus; instead they rely on the distributional similarity of
contexts to decide on the semantic relatedness of terms, but
this information may be sparse and is not always reliable [13].
Unsupervised methods also have their own challenges: they
are not always able to discover semantic equivalences of
lexical entries or logical forms and, conversely, they may
cluster semantically different or even opposite expressions.
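To make the idea of distributional similarity concrete, the
following Python sketch compares words by the cosine of their
context-count vectors. The three-sentence corpus, the window
size and the function names are assumptions chosen for
illustration; this is not the method of [13].

```python
import math
from collections import Counter


def context_vector(target, sentences, window=2):
    """Count the words that occur within `window` tokens of `target`."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        for i, tok in enumerate(tokens):
            if tok == target:
                lo, hi = max(0, i - window), i + window + 1
                counts.update(
                    t for j, t in enumerate(tokens[lo:hi], lo) if j != i
                )
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical toy corpus.
corpus = [
    "the red flower is pretty",
    "the red rose is pretty",
    "the train is late",
]
sim_close = cosine(context_vector("flower", corpus), context_vector("rose", corpus))
sim_far = cosine(context_vector("flower", corpus), context_vector("train", corpus))
print(sim_close > sim_far)
```

Words with opposite meanings, such as two colour terms, can
share almost the same contexts, which is one reason such
methods may group semantically different expressions together.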
2.3 Semi-Supervised Semantic Analysis
In this case, groups of unannotated texts with overlapping and
non-contradictory semantics provide a valuable source of
information. This form of weak supervision helps to discover
implicit clustering of lexical entries and predicates, which
presents a challenge for purely unsupervised techniques [12].
3. LITERATURE SURVEY
A considerable amount of work has already been done in the
field of semantic analysis for English text. Different
approaches, along with modifications, have been tried and
implemented. However, for South Asian languages such as
Marathi and Hindi, not much work has been done. The main
reasons are the unavailability of a sizeable annotated corpus
of sound quality and the very high level of ambiguity in
those languages. In the following sections, we describe some
POS tagging, morphological analysis, syntactic analysis and
semantic analysis models that have been implemented for
English and other languages, along with their performance.
In the year 2011, Akira Shimazu, Syozo Naito, and Hirosato
Nomura of Musashino Electrical Communication Laboratory,
Japan developed a Japanese Language Semantic Analyzer that
was based on an extended case frame model. The case frame
model consists of a relatively large collection of case relations,
modalities and conjunctive relations. The analyzer uses a
frame type knowledge base for analyzing. It also utilizes
plausibility scores for dealing with ambiguities and local scene
frames for the prediction of omitted case elements [3].
Ivan Titov and Mikhail Kozhevnikov proposed bootstrapping
semantic analyzers from non-contradictory texts in the year
2010. They argued that groups of unannotated texts with
overlapping and non-contradictory semantics provide a
valuable source of information. This form of weak
supervision helps to discover implicit clustering of lexical
entries and predicates, which presents a challenge for purely
unsupervised techniques [14]. They considered a generative
semantics-text correspondence model and demonstrated that
exploiting the non-contradiction relation between texts leads
to substantial improvements over natural baselines on the
problem of analyzing human-written weather forecasts.
A Semantic Analyzer for aiding emotion recognition in the
Chinese language was developed by Jiajun Yan, David B.
Bracewell, Fuji Ren and Shingo Kuroiwa in the year 2006.
The analyzer uses a decision tree to assign semantic
dependency relations between headwords and modifiers, and
it achieves an accuracy of 83.5%. The semantic information
is combined with rules for Chinese verbs containing emotion
to describe the emotion of the people in the sentence. The
rules give information on how to assign emotion to agents,
receivers, etc. depending on the verb in the sentence [4].
Semantic analysis is also used for retrieving data from text and
applying that to ER modeling. Semantic analysis involves a
process whereby meaning representations are created and
assigned to linguistic inputs. There are some semantic roles
defined which are helpful in interpreting possible elements of
the ER model. These semantic roles may also indicate the
types of entities from the given text. After recognizing the
meaning the particular roles may be defined to identify the
elements of ER modeling [9].
"Semantic analysis of text and speech" by Anssi Klapuri (2007)
describes several approaches to semantic analysis: i) the
statistical approach, ii) information retrieval and iii) domain
knowledge driven analysis [10].
4. METHODOLOGY
4.1 Block Diagram
Fig. 1 Block Diagram
4.2 POS Tagger
The input string (Raw Text) is tokenized and the WordNet [15]
is used for detecting the part of speech of each token in the
sentence. The WordNet stores the part of speech of each
word using 4 bits, where 1000 signifies a noun, 0100 an
adjective, 0010 an adverb and 0001 a verb. Combinations
also occur: 1100 means that the designated word can be used
both as a noun and as an adjective, and 0110 means that it
can be used both as an adjective and as an adverb.
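The 4-bit encoding can be read as a bitmask. The Python sketch
below decodes such a code into the parts of speech it allows;
the names POS_BITS and decode_pos are hypothetical and the
actual storage format inside the WordNet [15] may differ.

```python
# Bit layout as described in the text: noun, adjective, adverb, verb.
POS_BITS = (
    (0b1000, "Noun"),
    (0b0100, "Adjective"),
    (0b0010, "Adverb"),
    (0b0001, "Verb"),
)


def decode_pos(bits):
    """Return every part of speech whose bit is set in a 4-bit string."""
    value = int(bits, 2)
    return [name for mask, name in POS_BITS if value & mask]


print(decode_pos("1000"))  # ['Noun']
print(decode_pos("1100"))  # ['Noun', 'Adjective']
print(decode_pos("0110"))  # ['Adjective', 'Adverb']
```

A code with more than one bit set marks a word whose part of
speech is ambiguous.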
[Fig. 1 blocks, in order: Raw Text -> POS Tagger -> Morphological Analyzer -> Syntactic Analyzer -> Semantic Analyzer -> Semantically corrected text]
This ambiguity is resolved using Marathi grammar rules. For a
0110 ambiguity, if the next token is a noun or an adjective,
the ambiguous word becomes an adjective; if the next token
is a verb, the ambiguous word becomes an adverb. For a 1100
ambiguity, if the next token is a noun, the ambiguous word
becomes an adjective; otherwise it becomes a noun.
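The disambiguation rules above can be sketched as a small
Python function. The function name and the handling of cases
the text does not cover are assumptions. In particular, for
the 1100 case this sketch falls back to noun, on the
assumption that a noun/adjective code can only resolve to one
of those two classes.

```python
NOUN, ADJ, ADV, VERB = 0b1000, 0b0100, 0b0010, 0b0001


def resolve(code, next_code):
    """Resolve a two-way POS ambiguity from the tag of the next token.

    `next_code` is the single-bit tag of the next token, or None at the
    end of the sentence (an assumption; the text gives no rule for that).
    """
    if code == ADJ | ADV:  # 0110: adjective or adverb
        if next_code in (NOUN, ADJ):
            return ADJ
        if next_code == VERB:
            return ADV
        return code  # left unresolved: no rule is stated for this case
    if code == NOUN | ADJ:  # 1100: noun or adjective
        # Falling back to NOUN is an assumption of this sketch.
        return ADJ if next_code == NOUN else NOUN
    return code  # already unambiguous


print(resolve(0b0110, NOUN) == ADJ)
print(resolve(0b0110, VERB) == ADV)
print(resolve(0b1100, NOUN) == ADJ)
```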
4.3 Morphological Analyzer
The POS Tagger's output is then passed to the Morphological
Analyzer, which detects the root word along with the gender
and the tense. The first step in this module is to find the root
word of each token in the given sentence; this is done with
the help of the WordNet.
Using the WordNet, the gender of each token in the sentence
is detected. The WordNet is trained using a tourism-related
corpus. The subject, object and verb in the input sentence are
detected. For better accuracy, the gender of the sentence is
determined from the gender of the subject.
The tense of the given input sentence is detected using
Marathi grammar rules.
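A minimal sketch of the root-word and gender lookup is given
below in Python. The dictionary LEXICON is a hypothetical
stand-in for the WordNet lookups, populated only with the
sample sentence used in the Results section, and the subject
position is passed in by hand rather than detected.

```python
# Hypothetical mini-lexicon standing in for the WordNet lookups.
LEXICON = {
    "लाल": {"root": "लाल"},
    "फुल": {"root": "फुल", "gender": "neuter"},
    "निळा": {"root": "निळा"},
    "आहे": {"root": "असणे"},
}


def analyze(tokens, subject_index):
    """Return the root of each token and the gender of the subject."""
    # Unknown tokens are kept as their own root (an assumption).
    roots = [LEXICON.get(t, {}).get("root", t) for t in tokens]
    gender = LEXICON.get(tokens[subject_index], {}).get("gender")
    return roots, gender


tokens = ["लाल", "फुल", "निळा", "आहे"]
roots, gender = analyze(tokens, subject_index=1)
print(roots, gender)
```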
4.4 Syntactic Analyzer
The Morphological Analyzer's output is used by the Syntactic
Analyzer to detect whether the sentence is syntactically
correct. It checks whether the subject, object and verb are
placed at their proper positions in the sentence. The sentence
structure should be in accordance with Marathi grammar
rules, where the subject comes in first position, followed by
the object and then the verb. If the sentence is not in the
proper order, the order is changed.
Finally, we check whether the tense assigned to the sentence
is in accordance with the sentence structure; if not, the
sentence structure is modified such that it retains its meaning
and has the correct tense associated with it.
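The subject-object-verb ordering check can be sketched in
Python as follows. The role labels "S", "O" and "V" and the
example sentence are illustrative assumptions; detecting the
roles themselves and the tense check are outside this sketch.

```python
ROLE_ORDER = {"S": 0, "O": 1, "V": 2}


def to_sov(tagged):
    """Reorder (token, role) pairs into subject, object, verb order.

    The sort is stable, so tokens sharing a role keep their relative order.
    """
    return sorted(tagged, key=lambda pair: ROLE_ORDER[pair[1]])


def is_sov(tagged):
    """True when the pairs already follow subject, object, verb order."""
    return tagged == to_sov(tagged)


# Hypothetical input: verb first ("eats", "Ram", "mango").
sentence = [("खातो", "V"), ("राम", "S"), ("आंबा", "O")]
print(is_sov(sentence))
print(" ".join(token for token, _ in to_sov(sentence)))
```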
4.5 Semantic Analyzer
The output of the Syntactic Analyzer is used by the Semantic
Analyzer to check the semantic correctness of the input text
and to correct it if it is found to be incorrect. All the tokens
in the input sentence must semantically support each other.
This means that if two tokens in the sentence do not match,
the sentence is not semantically correct and needs to be
corrected.
If tokens in the sentence are antonyms of each other and
they are used in the same context, the sentence is not
semantically correct and needs to be corrected. For this, a
list of all the synonyms and antonyms has to be maintained
using the tourism corpus. Finally, the semantically corrected
text is presented as output to the user.
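The conflict check can be sketched in Python as below. The
CONFLICTS table is a hypothetical stand-in for the antonym
list built from the tourism corpus. The correction step, which
inserts a negation word before the final verb, mirrors the
single worked example in the Results section and is assumed,
not stated, to be the general rule.

```python
# Hypothetical table of root words that contradict each other
# when they describe the same subject.
CONFLICTS = {frozenset({"लाल", "निळा"})}
NEGATION = "नाही"


def find_conflict(roots):
    """Return the first pair of mutually conflicting roots, or None."""
    for i, a in enumerate(roots):
        for b in roots[i + 1:]:
            if frozenset({a, b}) in CONFLICTS:
                return a, b
    return None


def correct(tokens, roots):
    """Insert the negation before the last token when a conflict exists.

    The last token is assumed to be the verb (subject-object-verb order).
    """
    if find_conflict(roots) is None:
        return tokens
    return tokens[:-1] + [NEGATION] + tokens[-1:]


tokens = ["लाल", "फुल", "निळे", "आहे"]
roots = ["लाल", "फुल", "निळा", "असणे"]
print(" ".join(correct(tokens, roots)))
```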
5. RESULTS
The output of Semantic Analyzer cannot be obtained before
completing POS Tagging, Morphological Analysis and
Syntactic Analysis.
5.1 POS Tagger
Input sentence:
लाल फुल निळा आहे.
In this phase, each token (i.e. word) is assigned its suitable
part of speech:
लाल - Adjective, फुल - Noun, निळा - Adjective, आहे - Verb
5.2 Morphological Analyzer
In this phase, the root word of each token is derived and the
gender of the whole sentence is determined.
लाल - लाल, फुल - फुल, निळा - निळा, आहे - असणे
लाल फुल निळा आहे. - Neuter sentence
5.3 Syntactic Analyzer
In this phase, the sentence is checked for syntactic
correctness. If it is incorrect, it is corrected.
लाल फुल निळा आहे. - Syntactically incorrect sentence
Corrected sentence: लाल फुल निळे आहे.
5.4 Semantic Analyzer
In this phase, the sentence is checked for semantic
correctness. If it is incorrect, it is corrected.
लाल फुल निळे आहे. - Semantically incorrect sentence
Corrected sentence: लाल फुल निळे नाही आहे.
6. CONCLUSIONS
This paper presents a Semantic Analyzer for the Marathi
language which passes the text through the following phases
in order: POS Tagger, Morphological Analyzer, Syntactic
Analyzer and Semantic Analyzer. The POS tagger used in
this work follows a rule-based approach. The present work
performs semantic analysis on a single sentence at a time.
REFERENCES
[1] Adwait Ratnaparkhi, “A maximum entropy model for
part-of-speech tagging,” In Proceedings of the
Conference on Empirical Methods in Natural Language
Processing, University of Pennsylvania, 1996, pp. 133-142.
[2] M. Deroualt and B. Merialdo, “Natural Language
Modeling For Phoneme-To-Text Transposition”, IEEE
transactions on Pattern Analysis and Machine
Intelligence, 1986.
[3] Akira Shimazu, Syozo Naito, and Hirosato Nomura,
“Japanese Language Semantic Analyzer based on an
Extended Case Frame Model”, 3-9-11.
[4] Jiajun Yan, David B. Bracewell, Fuji Ren and Shingo
Kuroiwa, “A Semantic Analyzer for aiding emotion
recognition in Chinese”, 2006.
[5] Koskenniemi, K., “Two-Level Morphology: A General
Computational Model for Word Recognition and
Production,” University of Helsinki, Helsinki, 1983.
[6] Vishal Goyal, Gurpreet Singh Lehal, “Hindi
Morphological Analyzer and Generator,” IEEE
Computer Society Press, pp. 1156–1159, 2008.
[7] Niraj Aswani, Robert Gaizauskas, “Developing
Morphological Analysers for South Asian Languages:
Experimenting with the Hindi and Gujarati
Languages,” In Proceedings of LREC, 2010.
[8] J. P. Thorne, P. Bratley and H. Dewar, “The Syntactic
Analysis of English by Machine,” International
Conference on NLP, 1996.
[9] N. Omar, P. Hanna, P. Mc Kevitt, “Semantic Analysis in
the Automation of ER Modelling through Natural
Language Processing,” Computing & Informatics,
2006. ICOCI '06. International Conference, June 2006.
[10] Anssi Klapuri, “Semantic analysis of text and speech,”
SGN-9206 Signal processing graduate seminar II, Fall
2007.
[11] Debbarma, K. Patra, B. G. Debbarma, S. Kumari and
Purkayastha, “Morphological analysis of Kokborok for
universal networking language dictionary,” 1st
International Conference on Recent Advances in
Information Technology (RAIT), pages 474-477,
IEEE, 2012.
[12] Sugato Basu, Arindam Banerjee, and Raymond
Mooney, “Active semi-supervision for pairwise
constrained clustering,” In Proc. of the SIAM
International Conference on Data Mining (SDM),
pages 333–344, 2004.
[13] Hoifung Poon and Pedro Domingos, “Unsupervised
semantic parsing,” In Proceedings of the 2009
Conference on Empirical Methods in Natural Language
Processing (EMNLP-09), 2009.
[14] Ivan Titov and Mikhail Kozhevnikov, “Bootstrapping
Semantic Analyzers from Non-Contradictory Texts,”
Proceedings of the 48th Annual Meeting of the
Association for Computational Linguistics, pages 958–967,
11-16 July 2010.
[15] http://www.cfilt.iitb.ac.in/wordnet/webmwn/

Más contenido relacionado

La actualidad más candente

Improving Sentiment Analysis of Short Informal Indonesian Product Reviews usi...
Improving Sentiment Analysis of Short Informal Indonesian Product Reviews usi...Improving Sentiment Analysis of Short Informal Indonesian Product Reviews usi...
Improving Sentiment Analysis of Short Informal Indonesian Product Reviews usi...
TELKOMNIKA JOURNAL
 
The Process of Information extraction through Natural Language Processing
The Process of Information extraction through Natural Language ProcessingThe Process of Information extraction through Natural Language Processing
The Process of Information extraction through Natural Language Processing
Waqas Tariq
 
Use of ontologies in natural language processing
Use of ontologies in natural language processingUse of ontologies in natural language processing
Use of ontologies in natural language processing
ATHMAN HAJ-HAMOU
 

La actualidad más candente (20)

J1803015357
J1803015357J1803015357
J1803015357
 
Improving Sentiment Analysis of Short Informal Indonesian Product Reviews usi...
Improving Sentiment Analysis of Short Informal Indonesian Product Reviews usi...Improving Sentiment Analysis of Short Informal Indonesian Product Reviews usi...
Improving Sentiment Analysis of Short Informal Indonesian Product Reviews usi...
 
AN OVERVIEW OF EXTRACTIVE BASED AUTOMATIC TEXT SUMMARIZATION SYSTEMS
AN OVERVIEW OF EXTRACTIVE BASED AUTOMATIC TEXT SUMMARIZATION SYSTEMSAN OVERVIEW OF EXTRACTIVE BASED AUTOMATIC TEXT SUMMARIZATION SYSTEMS
AN OVERVIEW OF EXTRACTIVE BASED AUTOMATIC TEXT SUMMARIZATION SYSTEMS
 
Taxonomy extraction from automotive natural language requirements using unsup...
Taxonomy extraction from automotive natural language requirements using unsup...Taxonomy extraction from automotive natural language requirements using unsup...
Taxonomy extraction from automotive natural language requirements using unsup...
 
Myanmar news summarization using different word representations
Myanmar news summarization using different word representations Myanmar news summarization using different word representations
Myanmar news summarization using different word representations
 
A statistical model for gist generation a case study on hindi news article
A statistical model for gist generation  a case study on hindi news articleA statistical model for gist generation  a case study on hindi news article
A statistical model for gist generation a case study on hindi news article
 
SCTUR: A Sentiment Classification Technique for URDU
SCTUR: A Sentiment Classification Technique for URDUSCTUR: A Sentiment Classification Technique for URDU
SCTUR: A Sentiment Classification Technique for URDU
 
G1803013542
G1803013542G1803013542
G1803013542
 
Duration for Classification and Regression Treefor Marathi Textto- Speech Syn...
Duration for Classification and Regression Treefor Marathi Textto- Speech Syn...Duration for Classification and Regression Treefor Marathi Textto- Speech Syn...
Duration for Classification and Regression Treefor Marathi Textto- Speech Syn...
 
Keyword Extraction Based Summarization of Categorized Kannada Text Documents
Keyword Extraction Based Summarization of Categorized Kannada Text Documents Keyword Extraction Based Summarization of Categorized Kannada Text Documents
Keyword Extraction Based Summarization of Categorized Kannada Text Documents
 
Lexical Analysis to Effectively Detect User's Opinion
Lexical Analysis to Effectively Detect User's Opinion   Lexical Analysis to Effectively Detect User's Opinion
Lexical Analysis to Effectively Detect User's Opinion
 
The Process of Information extraction through Natural Language Processing
The Process of Information extraction through Natural Language ProcessingThe Process of Information extraction through Natural Language Processing
The Process of Information extraction through Natural Language Processing
 
Language Combinatorics: A Sentence Pattern Extraction Architecture Based on C...
Language Combinatorics: A Sentence Pattern Extraction Architecture Based on C...Language Combinatorics: A Sentence Pattern Extraction Architecture Based on C...
Language Combinatorics: A Sentence Pattern Extraction Architecture Based on C...
 
Use of ontologies in natural language processing
Use of ontologies in natural language processingUse of ontologies in natural language processing
Use of ontologies in natural language processing
 
Using ontology for natural language processing
Using ontology for natural language processingUsing ontology for natural language processing
Using ontology for natural language processing
 
Cl35491494
Cl35491494Cl35491494
Cl35491494
 
ANALYSIS OF MWES IN HINDI TEXT USING NLTK
ANALYSIS OF MWES IN HINDI TEXT USING NLTKANALYSIS OF MWES IN HINDI TEXT USING NLTK
ANALYSIS OF MWES IN HINDI TEXT USING NLTK
 
A Survey on Word Sense Disambiguation
A Survey on Word Sense DisambiguationA Survey on Word Sense Disambiguation
A Survey on Word Sense Disambiguation
 
The role of linguistic information for shallow language processing
The role of linguistic information for shallow language processingThe role of linguistic information for shallow language processing
The role of linguistic information for shallow language processing
 
Machine Learning Techniques with Ontology for Subjective Answer Evaluation
Machine Learning Techniques with Ontology for Subjective Answer EvaluationMachine Learning Techniques with Ontology for Subjective Answer Evaluation
Machine Learning Techniques with Ontology for Subjective Answer Evaluation
 

Destacado

New electromagnetic force sensor measuring the density of liquids
New electromagnetic force sensor measuring the density of liquidsNew electromagnetic force sensor measuring the density of liquids
New electromagnetic force sensor measuring the density of liquids
eSAT Publishing House
 

Destacado (20)

Snakes Myths & Facts in Marathi by Santosh Takale
Snakes Myths & Facts in Marathi by Santosh TakaleSnakes Myths & Facts in Marathi by Santosh Takale
Snakes Myths & Facts in Marathi by Santosh Takale
 
Condition assessment of concrete with ndt – case
Condition assessment of concrete with ndt – caseCondition assessment of concrete with ndt – case
Condition assessment of concrete with ndt – case
 
Contribution to the valorization of moroccan wood in industry of laminated wo...
Contribution to the valorization of moroccan wood in industry of laminated wo...Contribution to the valorization of moroccan wood in industry of laminated wo...
Contribution to the valorization of moroccan wood in industry of laminated wo...
 
Automatic identification of animal using visual and motion saliency
Automatic identification of animal using visual and motion saliencyAutomatic identification of animal using visual and motion saliency
Automatic identification of animal using visual and motion saliency
 
Effect of fly ash on the properties of cement
Effect of fly ash on the properties of cementEffect of fly ash on the properties of cement
Effect of fly ash on the properties of cement
 
An experiental investigation of effect of cutting parameters and tool materia...
An experiental investigation of effect of cutting parameters and tool materia...An experiental investigation of effect of cutting parameters and tool materia...
An experiental investigation of effect of cutting parameters and tool materia...
 
Low power and high performance detff using common
Low power and high performance detff using commonLow power and high performance detff using common
Low power and high performance detff using common
 
Effects of non ionized electromagnetic radiation on satellites
Effects of non ionized electromagnetic radiation on satellitesEffects of non ionized electromagnetic radiation on satellites
Effects of non ionized electromagnetic radiation on satellites
 
First order sigma delta modulator with low-power
First order sigma delta modulator with low-powerFirst order sigma delta modulator with low-power
First order sigma delta modulator with low-power
 
Experimental investigation of performance and
Experimental investigation of performance andExperimental investigation of performance and
Experimental investigation of performance and
 
Power balancing optimal selective forwarding
Power balancing optimal selective forwardingPower balancing optimal selective forwarding
Power balancing optimal selective forwarding
 
A study on qos aware routing in wireless mesh network
A study on qos aware routing in wireless mesh networkA study on qos aware routing in wireless mesh network
A study on qos aware routing in wireless mesh network
 
Image compression using negative format
Image compression using negative formatImage compression using negative format
Image compression using negative format
 
Security constrained optimal load dispatch using hpso technique for thermal s...
Security constrained optimal load dispatch using hpso technique for thermal s...Security constrained optimal load dispatch using hpso technique for thermal s...
Security constrained optimal load dispatch using hpso technique for thermal s...
 
Energy efficient routing used for wireless sensor networks exploitation in ge...
Energy efficient routing used for wireless sensor networks exploitation in ge...Energy efficient routing used for wireless sensor networks exploitation in ge...
Energy efficient routing used for wireless sensor networks exploitation in ge...
 
Parametric study of response of an asymmetric building for various earthquake...
Parametric study of response of an asymmetric building for various earthquake...Parametric study of response of an asymmetric building for various earthquake...
Parametric study of response of an asymmetric building for various earthquake...
 
Identification of dynamic rigidity for high speed
Identification of dynamic rigidity for high speedIdentification of dynamic rigidity for high speed
Identification of dynamic rigidity for high speed
 
An improved color image encryption algorithm with
An improved color image encryption algorithm withAn improved color image encryption algorithm with
An improved color image encryption algorithm with
 
Online stream mining approach for clustering network traffic
Online stream mining approach for clustering network trafficOnline stream mining approach for clustering network traffic
Online stream mining approach for clustering network traffic
 
New electromagnetic force sensor measuring the density of liquids
New electromagnetic force sensor measuring the density of liquidsNew electromagnetic force sensor measuring the density of liquids
New electromagnetic force sensor measuring the density of liquids
 

Similar a Semantic analyzer for marathi text

Association Rule Mining Based Extraction of Semantic Relations Using Markov L...
Association Rule Mining Based Extraction of Semantic Relations Using Markov L...Association Rule Mining Based Extraction of Semantic Relations Using Markov L...
Association Rule Mining Based Extraction of Semantic Relations Using Markov L...
IJwest
 
Enriching search results using ontology
Enriching search results using ontologyEnriching search results using ontology
Enriching search results using ontology
IAEME Publication
 
A hybrid composite features based sentence level sentiment analyzer
A hybrid composite features based sentence level sentiment analyzerA hybrid composite features based sentence level sentiment analyzer
A hybrid composite features based sentence level sentiment analyzer
IAESIJAI
 
SEMANTIC INTEGRATION FOR AUTOMATIC ONTOLOGY MAPPING
SEMANTIC INTEGRATION FOR AUTOMATIC ONTOLOGY MAPPING SEMANTIC INTEGRATION FOR AUTOMATIC ONTOLOGY MAPPING
SEMANTIC INTEGRATION FOR AUTOMATIC ONTOLOGY MAPPING
cscpconf
 

Similar a Semantic analyzer for marathi text (20)

Information extraction using discourse
Information extraction using discourseInformation extraction using discourse
Information extraction using discourse
 
Association Rule Mining Based Extraction of Semantic Relations Using Markov L...
Association Rule Mining Based Extraction of Semantic Relations Using Markov L...Association Rule Mining Based Extraction of Semantic Relations Using Markov L...
Association Rule Mining Based Extraction of Semantic Relations Using Markov L...
 
Association Rule Mining Based Extraction of Semantic Relations Using Markov ...
Association Rule Mining Based Extraction of  Semantic Relations Using Markov ...Association Rule Mining Based Extraction of  Semantic Relations Using Markov ...
Association Rule Mining Based Extraction of Semantic Relations Using Markov ...
 
Tools for Ontology Building from Texts: Analysis and Improvement of the Resul...
Tools for Ontology Building from Texts: Analysis and Improvement of the Resul...Tools for Ontology Building from Texts: Analysis and Improvement of the Resul...
Tools for Ontology Building from Texts: Analysis and Improvement of the Resul...
 
NBLex: emotion prediction in Kannada-English code-switchtext using naïve baye...
NBLex: emotion prediction in Kannada-English code-switchtext using naïve baye...NBLex: emotion prediction in Kannada-English code-switchtext using naïve baye...
NBLex: emotion prediction in Kannada-English code-switchtext using naïve baye...
 
Ontology Construction from Text: Challenges and Trends
Ontology Construction from Text: Challenges and TrendsOntology Construction from Text: Challenges and Trends
Ontology Construction from Text: Challenges and Trends
 
IRJET- Short-Text Semantic Similarity using Glove Word Embedding
IRJET- Short-Text Semantic Similarity using Glove Word EmbeddingIRJET- Short-Text Semantic Similarity using Glove Word Embedding
IRJET- Short-Text Semantic Similarity using Glove Word Embedding
 
A Survey on Speech Recognition with Language Specification
A Survey on Speech Recognition with Language SpecificationA Survey on Speech Recognition with Language Specification
A Survey on Speech Recognition with Language Specification
 
Implementation of Semantic Analysis Using Domain Ontology
Implementation of Semantic Analysis Using Domain OntologyImplementation of Semantic Analysis Using Domain Ontology