
Natural Language Processing, Marathi POS tagger


Presentation1

  1. Guided by: Mrs. Gauri M. Dhopavkar. Presented by: Ritikesh Bhaskarwar, Vimal Shah, Ashwin Borkar, Shashil Pohankar
  2. Department of Computer Technology, Yeshwantrao Chavan College of Engineering, Nagpur (An Autonomous Institution Affiliated to Rashtrasant Tukadoji Maharaj Nagpur University)
  3. Natural language processing
     • Natural language processing (NLP) is a field of computer science, artificial intelligence, and linguistics concerned with the interactions between computers and human (natural) languages.
     • NLP is a computerized approach to analysing text that is based on both a set of theories and a set of technologies.
  4. POS Tagging:
     • Part-of-Speech (POS) tagging is the process of assigning a part-of-speech such as noun, verb, pronoun, or another lexical class marker to each word in a sentence.
     • After the POS tags are identified, the next step is chunking, which divides sentences into non-overlapping, non-recursive phrases.
  5. The POS tagging example
     • Input: ते फूल खूप सुगंधी आहे ("That flower is very fragrant")
     • Output of the Marathi POS Tagger: ते-unidentified, फूल-noun, खूप-adjective, सुगंधी-adjective, आहे-verb
  6. Need of Marathi POS Tagging:
     • Lack of significant tools for Indian languages
     • Dependence of other NLP activities on POS tagging
     • Failure of existing techniques on Indian languages
  7. Overview of POS tagging
  8. Methods for POS Tagging: 1. Rule Based, 2. Stochastic
     • Rule-based POS tagging models apply a set of hand-written rules and use contextual information to assign POS tags to words.
     • A stochastic approach uses frequency, probability, or statistics. The simplest stochastic approach finds the most frequently used tag for a specific word in the annotated training data and uses this information to tag that word in unannotated text.
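
The simplest stochastic approach described on this slide can be sketched as a most-frequent-tag baseline. This is a minimal illustration rather than the presenters' implementation: the function names, the two-sentence toy corpus, and the "unidentified" fallback for unseen words are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def train_most_frequent_tag(tagged_sentences):
    """Count how often each word carries each tag and keep the most frequent tag."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, tag in sentence:
            counts[word][tag] += 1
    return {word: tag_counts.most_common(1)[0][0]
            for word, tag_counts in counts.items()}


def tag_sentence(words, model, default_tag="unidentified"):
    """Tag each word with its most frequent training tag; unseen words get default_tag."""
    return [(word, model.get(word, default_tag)) for word in words]


# Toy annotated corpus, invented for illustration only.
toy_corpus = [
    [("फूल", "noun"), ("सुगंधी", "adjective"), ("आहे", "verb")],
    [("फूल", "noun"), ("आहे", "verb")],
]

model = train_most_frequent_tag(toy_corpus)
tagged = tag_sentence(["ते", "फूल", "आहे"], model)
```

Because the baseline ignores context, a word that is ambiguous in the corpus always receives the same tag, which is the weakness the probabilistic models on the next slide address.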
  9. Methods for POS Tagging (contd.): 3. Hidden Markov Model, 4. Maximum Entropy Model
     • The Hidden Markov Model (HMM) trains on annotated corpora to estimate the transition and emission probabilities.
     • The Maximum Entropy Model (MEM) is based on the principle of maximum entropy, which states that when choosing between a number of different probabilistic models for a set of data, the most valid model is the one that makes the fewest arbitrary assumptions about the nature of the data.
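
As a rough sketch of the HMM idea above, the code below estimates transition and emission probabilities from a tagged corpus by relative frequency and then decodes the most likely tag sequence with the Viterbi algorithm. It is not taken from the presentation: the toy corpus, the `START` marker, and the small `floor` probability used for unseen events are assumptions made for the example.

```python
from collections import Counter, defaultdict

START = "<s>"  # hypothetical marker for the beginning of a sentence


def estimate_hmm(tagged_sentences):
    """Relative-frequency estimates of P(tag | previous tag) and P(word | tag)."""
    transition_counts = defaultdict(Counter)
    emission_counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        previous = START
        for word, tag in sentence:
            transition_counts[previous][tag] += 1
            emission_counts[tag][word] += 1
            previous = tag

    def normalise(table):
        result = {}
        for context, counter in table.items():
            total = sum(counter.values())
            result[context] = {key: count / total for key, count in counter.items()}
        return result

    return normalise(transition_counts), normalise(emission_counts)


def viterbi(words, transitions, emissions, floor=1e-6):
    """Return the most probable tag sequence; unseen events get probability `floor`."""
    tags = list(emissions)
    # Each column maps tag -> (best probability so far, previous tag on that path).
    best = [{t: (transitions.get(START, {}).get(t, floor)
                 * emissions[t].get(words[0], floor), None) for t in tags}]
    for word in words[1:]:
        column = {}
        for t in tags:
            prob, prev = max((best[-1][p][0] * transitions.get(p, {}).get(t, floor), p)
                             for p in tags)
            column[t] = (prob * emissions[t].get(word, floor), prev)
        best.append(column)
    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t][0])]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    return list(reversed(path))


# Toy annotated corpus, invented for illustration only.
toy_corpus = [
    [("फूल", "noun"), ("सुगंधी", "adjective"), ("आहे", "verb")],
    [("फूल", "noun"), ("आहे", "verb")],
]
transitions, emissions = estimate_hmm(toy_corpus)
path = viterbi(["फूल", "सुगंधी", "आहे"], transitions, emissions)
```

A real tagger would normally use smoothing and log probabilities instead of a fixed floor; the floor simply keeps the sketch short.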
  10. Architecture and Design:
     • A Marathi sentence is taken as input, then the tokens are created, followed by tagging and finding ambiguity.
     • Flow: INPUT → TOKENIZING → TAGGING → FINDING AMBIGUOUS WORDS → FINDING PROBABILITY → ASSIGN TAGS ACCORDING TO PROBABILITY → VIEW THE RESULT
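
The flow on this slide might be sketched as follows. This is only an illustration of the stages under stated assumptions: whitespace tokenization, a hypothetical `lexicon` mapping each word to its candidate tags, and hypothetical per-word tag counts used to resolve ambiguous words. None of these names or data come from the presenters' system.

```python
def tokenize(sentence):
    """TOKENIZING: split the input sentence on whitespace (an assumption)."""
    return sentence.split()


def resolve(token, candidates, tag_counts):
    """FINDING PROBABILITY / ASSIGN TAGS: pick the most probable candidate tag."""
    if not candidates:
        return "unidentified"          # word not in the lexicon
    if len(candidates) == 1:
        return candidates[0]           # unambiguous word
    counts = tag_counts.get(token, {})
    total = sum(counts.get(tag, 0) for tag in candidates)
    if total == 0:
        return "unidentified"          # ambiguous, and no evidence to decide
    return max(candidates, key=lambda tag: counts.get(tag, 0) / total)


def tag(sentence, lexicon, tag_counts):
    """Run the whole flow and return (token, tag) pairs for display."""
    return [(token, resolve(token, lexicon.get(token, []), tag_counts))
            for token in tokenize(sentence)]


# Hypothetical lexicon and counts, invented for illustration only.
lexicon = {
    "फूल": ["noun"],
    "खूप": ["adjective", "adverb"],    # an ambiguous word
    "सुगंधी": ["adjective"],
    "आहे": ["verb"],
}
tag_counts = {"खूप": {"adjective": 3, "adverb": 1}}

result = tag("ते फूल खूप सुगंधी आहे", lexicon, tag_counts)
```

With these toy inputs the ambiguous word is resolved to the tag it carries most often, and the word missing from the lexicon is reported as unidentified.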
  11. Details of Identified Modules:
     • Tokenizer: This module extracts the tokens of the input sentence. It also calls the other modules when required.
     • Tagging: This module assigns tags to tokens, searches for ambiguous words, finds their types, and marks them with special symbols.
  12. Details of Identified Modules (contd.)
     • Root word: This module finds the root word of each token by looking it up in the Marathi WordNet.
     • Probability: This module calculates the probability of each candidate tag and assigns the tag with the highest probability to the word.
     • Showing the results: This module shows the result. The words are shown with their tags.
  13. Experimentation and Results:
     1. • 1000: If the first bit is 1, the word is tagged as a noun.
        • 1100: In this case, the word can be used as both, so it is marked as unidentified.
     2. • 0100: If the second bit is 1, the word is tagged as an adjective.
        • 0110: In this case, the word can be used as other words.
     3. • 0010: If the third bit is 1, the word is tagged as an adverb.
        • 0001: If the fourth bit is 1, the word is tagged as a verb.
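
The four-bit codes on this slide can be decoded with a few lines of code. The bit-to-tag order (noun, adjective, adverb, verb) follows the slide; treating every code with zero or several bits set as "unidentified" is an assumption that generalizes the 1100 case, and the function names are hypothetical.

```python
TAG_BY_BIT = ("noun", "adjective", "adverb", "verb")  # first to fourth bit


def decode(code):
    """Return the list of tags whose bit is set in a 4-character bit string."""
    if len(code) != 4 or set(code) - {"0", "1"}:
        raise ValueError(f"expected four bits, got {code!r}")
    return [tag for bit, tag in zip(code, TAG_BY_BIT) if bit == "1"]


def label(code):
    """Single set bit -> that tag; anything else is treated as unidentified (assumption)."""
    tags = decode(code)
    return tags[0] if len(tags) == 1 else "unidentified"


labels = {code: label(code) for code in ("1000", "0100", "0010", "0001", "1100")}
```

Codes with several bits set are exactly the ambiguous words that the probability module is meant to resolve.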
  14. Advantages:
     • A POS tagger can be seen as a first step towards tightening the integration between speech recognition and natural language processing.
     • A POS tagger in the language model aids in the identification of boundary tones and speech repairs, redefining the speech recognition problem.
  15. Advantages (contd.):
     • A typical NLP system consists of tokenization, sentence delimitation, part-of-speech (POS) tagging, phrase chunking, parsing, and concept mapping. As one of the initial steps, POS tagging determines the part of speech for each token in a sentence.
     • Managers, educators, trainers, and salespeople can accurately assess the needs of a group and improve their questioning techniques, thus improving their skills to achieve more consistent results.
  16. Limitations:
     • The user cannot enter more than one sentence, i.e. cannot enter a paragraph.
     • The system cannot detect and report the gender of a word, i.e. morphological analysis is not done.
     • When ambiguity is encountered, the system searches for the POS of the ambiguous word. If there are few or no entries with the correct POS and more entries for other POS, it shows an incorrect POS for the ambiguous word.
  17. Applications:
     • Information Retrieval
     • Speech synthesis
     • Word Sense Disambiguation (WSD)
     • Machine Translation (MT): Text to Text, Speech to Speech
  18. Snapshots
  19. Conclusion and Future Scope:
     • The POS tagger described here is very simple and efficient for automatic tagging, but the morphological complexity of Marathi makes the task hard. The performance of the current system is good and the results achieved by this method are excellent. In future, we wish to improve the accuracy of our system by adding more tagged sentences to our training corpus.
  • FareenShaikh5

    Apr. 10, 2019
