Twitter is a popular social networking website where users post and interact with messages known as “tweets”. To automate the analysis of such data, the field of sentiment analysis has emerged. It aims to identify opinionated content on the Web and classify it according to its polarity, i.e., whether it carries a positive or negative connotation. We attempt to conduct sentiment analysis on tweets using several machine learning algorithms.
2. Introduction
With the rapid growth of web technologies, the number of people
expressing their views and opinions on the web is increasing. This
information is useful to businesses, governments and individuals alike.
With more than 500 million tweets per day, Twitter is becoming a major
source of information. The input to our model is the raw data extracted
from tweets. To obtain it, we automate the process of extracting tweets
and categorizing them into two categories, i.e., positive or negative.
User-generated content on Twitter covers many kinds of products, events,
people and political affairs.
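Raw tweet text usually contains links, user mentions and hashtag symbols that carry little sentiment. The sketch below shows one possible cleaning step applied before classification. The function name `clean_tweet` and the choice of what to strip are assumptions made for illustration, not the project's actual preprocessing.

```python
import re

# Patterns for the noise we assume should be removed (illustrative choice).
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
WHITESPACE_RE = re.compile(r"\s+")


def clean_tweet(text: str) -> str:
    """Strip URLs and @mentions, keep hashtag words, collapse whitespace."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = text.replace("#", "")  # keep the word, drop the hash symbol
    return WHITESPACE_RE.sub(" ", text).strip()


if __name__ == "__main__":
    print(clean_tweet("@bob I love #Python! https://t.co/abc"))
```

The cleaned string can then be passed to whichever classifier is used for the positive/negative decision.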
Tweets are well suited to sentiment analysis for the following
reasons:
1. Tweets are abstract in nature.
2. Analysis can be done in real time.
3. A vast variety of tweets is available for analysis.
4. Tweepy
Tweepy is an open-source Twitter library for Python. It enables Python
programs to communicate with the Twitter platform and use its API.
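As a rough illustration of tweet extraction with Tweepy, the sketch below searches recent tweets for a keyword. It assumes Tweepy 4.x and the Twitter API v2 with a bearer token; the project itself may have used an older Tweepy interface. The helper names `build_query` and `fetch_tweets` are hypothetical.

```python
def build_query(keyword: str, lang: str = "en") -> str:
    """Build a search query that skips retweets and filters by language."""
    return f"{keyword} -is:retweet lang:{lang}"


def fetch_tweets(bearer_token: str, keyword: str, limit: int = 10) -> list:
    """Return the text of recent tweets matching the keyword.

    Assumes Tweepy 4.x; the API accepts max_results between 10 and 100.
    """
    import tweepy  # imported here so build_query works without Tweepy installed

    client = tweepy.Client(bearer_token=bearer_token)
    response = client.search_recent_tweets(
        query=build_query(keyword), max_results=limit
    )
    # response.data is None when nothing matches
    return [tweet.text for tweet in (response.data or [])]
```

A call such as `fetch_tweets(token, "python")` requires valid API credentials, so it is not run here.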
TextBlob
TextBlob is a Python (2 and 3) library for processing textual data. It provides
a simple API for diving into common natural language processing (NLP) tasks
such as part-of-speech tagging, noun phrase extraction, sentiment analysis,
classification, translation, and more.
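A minimal sketch of how TextBlob's sentiment score could be mapped onto the two categories used in this project. TextBlob reports a polarity between -1.0 and 1.0; the threshold of 0.0 and the function names are assumptions for illustration, and a score of exactly zero is treated as positive here only to keep two classes.

```python
def label_from_polarity(polarity: float, threshold: float = 0.0) -> str:
    """Map a polarity score in [-1.0, 1.0] to 'positive' or 'negative'.

    The threshold is an assumed cut-off, not a value from the project.
    """
    return "positive" if polarity >= threshold else "negative"


def classify_tweet(text: str) -> str:
    """Classify one tweet using TextBlob's built-in sentiment analyzer."""
    from textblob import TextBlob  # imported here so the helper above stands alone

    return label_from_polarity(TextBlob(text).sentiment.polarity)
```

For example, `classify_tweet("I love this phone")` would return one of the two labels, depending on the polarity TextBlob assigns.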
NLTK
The Natural Language Toolkit (NLTK) is a platform used for building Python
programs that work with human language data. It contains text processing
libraries for tokenization, parsing, classification, stemming, tagging and
semantic reasoning.
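The tokenization and stemming mentioned above can be sketched with NLTK's `TweetTokenizer` and `PorterStemmer`, neither of which needs extra corpus downloads. The helper names and the decision to drop punctuation-only tokens are assumptions for illustration; note that this also drops emoticons, which may carry sentiment.

```python
def normalize_tokens(tokens: list) -> list:
    """Lowercase tokens and drop those with no letter or digit."""
    return [tok.lower() for tok in tokens if any(ch.isalnum() for ch in tok)]


def tokenize_and_stem(text: str) -> list:
    """Tokenize a tweet, normalize the tokens and reduce them to stems."""
    from nltk.stem import PorterStemmer
    from nltk.tokenize import TweetTokenizer

    # strip_handles removes @mentions; reduce_len shortens "soooo" to "sooo"
    tokenizer = TweetTokenizer(strip_handles=True, reduce_len=True)
    stemmer = PorterStemmer()
    return [stemmer.stem(tok) for tok in normalize_tokens(tokenizer.tokenize(text))]
```

The resulting stems could serve as features for a classifier.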
Chicksexer
Chicksexer is a Python package. It takes a person's name as a string and
returns a probability estimate of that person's gender.
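A short sketch of how the package might be called. It assumes `chicksexer.predict_gender` returns a dictionary mapping gender labels to probabilities, as the package's README describes; the wrapper names are hypothetical.

```python
def most_likely_gender(probabilities: dict) -> str:
    """Return the label with the highest probability."""
    return max(probabilities, key=probabilities.get)


def guess_gender(name: str) -> str:
    """Estimate the gender for a name using chicksexer (assumed API)."""
    import chicksexer  # imported here so the helper above stands alone

    # Assumption: the result looks like {'female': p1, 'male': p2}
    return most_likely_gender(chicksexer.predict_gender(name))
```

Such estimates are probabilistic and should be treated as approximate.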
5. Reason for Choosing Python
Python is open source.
All development tools for Python are free.
Development is possible on most OS platforms.
A huge number of resources are available.
6. Application Snapshots
This chapter highlights the results of the project.
Snapshots of each activity are shown, along with a
discussion describing how the activity works.
Each snapshot describes a single step of
“Sentiment Analysis of Twitter Data”.
15. Learning from the project
Python development fundamentals.
Familiarity with the Django framework.
Python programming language basics.
Use of XML for layout definition.
Use of mysqlite for database management in
Different techniques for debugging web applications.
16. Conclusion
The project exposed us to current technology in the area of
Twitter data analysis. It successfully demonstrated a
web-based “analysis system” in which tweets are used as the
analysis medium. We faced many challenges and problems
during development. Working on Python development was a
valuable learning experience.