Analytics: Key to go from generating big data to
deriving business value
Deepali Arora1, Piyush Malik2
1Dept. of Electrical and Computer Engineering, University of Victoria, P.O. Box 3055 STN CSC, Victoria, B.C
{darora}@ece.uvic.ca
2Business Analytics and Strategy, IBM Global Business Services, 4400 N 1st Street, San Jose, CA
{Piyush.Malik}@us.ibm.com
Abstract—The potential to extract actionable insights from big
data has attracted increasing attention from researchers in academia
and in several industrial sectors. The field has become
interesting and problems look even more exciting to solve ever
since organizations have been trying to tame large volumes
of complex and fast arriving big data streams through newer
computing paradigms. However, extracting meaningful and actionable
information from big data is a challenging and daunting
task. The ability to generate value from large volumes of data is
an art which, combined with analytical skills, needs to be mastered
in order to gain competitive advantage in business. The ability of
organizations to leverage the emerging technologies and integrate
big data into their enterprise architectures effectively depends
on the maturity level of the technology and business teams,
the capabilities they develop, and the strategies they adopt.
In this paper, through selected use cases, we demonstrate how
statistical analyses, machine learning algorithms, optimization
and text mining algorithms can be applied to extract meaningful
insights from the data available through social media, online
commerce, the telecommunication industry and smart utility meters, and
used for a variety of business benefits, including improving security.
The nature of the applied analytical techniques largely depends on
the underlying nature of the problem, so a one-size-fits-all solution
hardly exists. Deriving information from big data is also subject
to challenges associated with data security and privacy. These and
other challenges are discussed in the context of the selected problems
to illustrate the potential of big data analytics.
I. INTRODUCTION
The analysis of big data and the associated potential to extract
actionable information has gained the attention of researchers
in both academia and industry [1], [2], [3], [4]. Researchers
in both academia and industry have emphasized developing
new tools and techniques for better storing, managing, and
analyzing big data [5]. However, the business community
is looking for ways to improve its profits by leveraging
information hidden in big data through analytics [6]. Massive
amounts of data are generated on a daily basis from
various sources including (but not limited to) online shopping
transactions, gas and electric meters, electronic health
records, social networking interactions, weather and satellite
data, embedded sensors in industrial machinery as well as in
automobiles and aircraft, data center computing equipment,
and telecommunication industry equipment. According
to the International Data Corporation (IDC), cumulative digital
data is predicted to grow from 4.4 zettabytes (ZB) in 2013 to
44 ZB by the year 2020 [7]. Data is now considered the
“new oil” of the economy, defined mainly by four prominent
characteristics: volume, velocity, variety and veracity [8].
While a better understanding of the knowledge hidden within the
large datasets generated from various sources can potentially
help businesses, deriving useful information from these data
is a big challenge.
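As a quick arithmetic check of the IDC projection quoted above, the tenfold growth over seven years corresponds to a compound annual growth rate of roughly 39%. The short Python sketch below works this out; the function name is ours and is used only for illustration.

```python
def implied_annual_growth(start_zb: float, end_zb: float, years: int) -> float:
    """Compound annual growth rate implied by growing from start_zb to end_zb."""
    return (end_zb / start_zb) ** (1.0 / years) - 1.0


# Figures quoted from the IDC forecast cited in the text: 4.4 ZB (2013) to 44 ZB (2020).
rate = implied_annual_growth(4.4, 44.0, 2020 - 2013)
print(f"Implied annual growth: {rate:.1%}")
```

In other words, the forecast assumes that the volume of digital data grows by more than a third every year.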
Before any kind of actionable insight can
be derived from data using advanced analysis techniques, several
pre-processing steps are involved. These steps include data collection,
data preparation and cleansing, and data storage and management
[9]. The analysis of data can be broadly classified into
three categories based on the depth of analysis: 1) descriptive
analytics, which exploits historical trends to extract useful
information from the data, 2) predictive analytics, which focuses
on predicting the future probability of occurrence of patterns or
trends, and 3) prescriptive analytics, which focuses on decision
making by gaining insights into the system behavior [9].
Regardless of the depth of the analysis, extracting information
from data requires a solid understanding of techniques including
statistical analysis, optimization, machine learning,
and text-mining algorithms.
A number of studies have highlighted the tools/algorithms
that can be used to derive solutions for various problems
associated with big data [1], [2], [3], [4], [10]. For example, [1]
and [2] presented a brief review of the challenges and issues
surrounding big data. Some of the popular tools, frameworks
and technologies that can be used to aggregate, manage
and analyze big data include Hadoop and its ecosystem
of techniques and tools (such as Pig, Hive, HBase, Spark, and
High Performance Cluster Computing (HPCC)), in-memory
computing engines, NoSQL databases, and cloud-based data
service engines; these are still nascent and continually evolving
under the open source software movement. A brief overview
of how big data can be used to derive value for various
organizations including government, educational institutions
and industries is presented in [11]. Possibilities and challenges
in implementing big data related technologies in organizations,
including storage of the data, lack of skilled people and time
involved in processing of huge datasets are discussed in [12].
[3] and [4] presented a general overview of how big data
can be used to generate value for businesses. An in-depth
tutorial on big data analytics is presented by Hu et al. [13],
who assessed different techniques that can be used for data
acquisition and pre-processing, data storage and management,
and different analytics techniques to derive information.
2015 IEEE First International Conference on Big Data Computing Service and Applications
978-1-4799-8128-1/15 $31.00 © 2015 IEEE
DOI 10.1109/BigDataService.2015.62
While these studies provide a good overview of the big
data opportunity, issues and challenges involved, value to
businesses, and how various techniques can be used for data
storage, processing or analytics in general, none of these
studies has discussed the application of different algorithms
to derive value for specific applications, which is the main
focus of this paper. Using five different use cases,
we illustrate how big data analytics has been used to obtain
meaningful information. The use cases considered in this
paper include sentiment analysis for social media, preventing
customer churn in the telecommunication sector, enhancing
customers’ online shopping experience, generating value from
smart utility meters and improving data security. The objective
of this paper is not to present new algorithms for any of
the selected industry use cases but rather to provide a brief,
illustrative overview of the existing algorithms and
methods that can be applied to derive value from big data.
Detailed discussion of how different innovative algorithms can
be applied to realize value for each of these use cases and
challenges associated with them is beyond the scope of the
current paper and could be presented in an extended version
of this paper in the future.
This paper is organized as follows: applications of data
analytics to different domains are discussed in Section II,
Section III highlights some of the challenges around big data
analytics, and conclusions are presented in Section IV.
II. APPLICATION OF BIG DATA ANALYTICS
A. Sentiment analysis in social networks
The explosion of data in the form of blogs, online forums
and social media channels such as Facebook, Twitter,
LinkedIn, Instagram, Pinterest, YouTube, etc. has given consumers
a new way of expressing their opinions about any
product or service, which in turn may influence other
potential buyers. Investigation of users’ opinions or sentiments
about any product or service, expressed in textual form,
on these websites/blogs is referred to as sentiment analysis
[14]. Sentiment analysis combines natural language processing
with artificial intelligence capability and text analytics to
evaluate statements found across various social platforms to
determine whether they are positive or negative with respect to
a particular brand, product or service [15]. Sentiment analysis
thus provides business intelligence which can be used to
make impactful decisions. In addition, consumers routinely
look for online reviews before buying any product or service.
Developing techniques that can better automate the process of
analyzing user generated web content about a given product
or service is now the focus of research in both academia and
industry. Several companies are also involved in designing
algorithms/tools that can perform sentiment analysis either
online for free or at nominal costs. One such example is IBM
Watson’s user modeling service that uses linguistic analytics
to generate psychographic profiles and extract cognitive and
social characteristics based on users’ emails, text messages,
tweets, forum posts, etc. [16]. Other examples of
sentiment analysis tools include Google Analytics, Tweetstats,
Social Mention, and Twendz [17].
There are three main classification levels in sentiment analysis:
document-level, sentence-level, and aspect-level
sentiment analysis [18]. The methodologies that can
be used to detect sentiment are broadly classified into three main
categories: lexicon-based techniques, machine learning
techniques and hybrid approaches [19]. The lexicon-based
approach relies on a collection of known and pre-compiled
sentiment terms; machine learning approaches are based on
the application of different algorithms that can be trained; and
hybrid approaches combine the two
[20]. A number of studies have used lexicon-based
approaches [21], supervised [22] or
unsupervised [23], [24] machine learning approaches, and combined
machine learning and lexicon-based approaches [25], [26] to classify
sentiments into positive or negative categories.
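To make the lexicon-based approach concrete, the following Python sketch scores a piece of text against a small pre-compiled list of sentiment terms. The word lists, the single-word negation rule and the function names are illustrative assumptions and are not taken from any of the cited studies; practical systems rely on much larger curated lexicons.

```python
import re

# Hypothetical mini-lexicon, for illustration only.
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "poor", "hate", "terrible", "slow"}
NEGATIONS = {"not", "no", "never"}


def lexicon_score(text: str) -> int:
    """Sum +1/-1 for each sentiment term; a negation flips only the next word."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0
    negate = False
    for tok in tokens:
        if tok in NEGATIONS:
            negate = True
            continue
        polarity = (tok in POSITIVE) - (tok in NEGATIVE)
        if polarity:
            score += -polarity if negate else polarity
        negate = False  # the negation applies to the immediately following word only
    return score


def label(score: int) -> str:
    """Map a numeric score to a sentiment class."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(label(lexicon_score("I love this phone, great battery")))
print(label(lexicon_score("The service was not good")))
```

The approach needs no training data, but it inherits every gap and domain bias of the lexicon, which is one motivation for the machine learning and hybrid approaches mentioned above.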
Sentiment analysis has been used by researchers to find
people’s opinions expressed on social media sites, including
Twitter, about products/services launched by a company [27],
and in a real-world industrial application (based on the second
author’s experience) in which one of IBM’s clients leveraged
sentiments from social media to identify influencers of a
public policy. The general methodology in both these use
cases involved four main steps: gathering data, generating
features, designing a classifier that can differentiate between
different sentiments i.e., positive, negative or neutral, and
finally deriving a sentiment score.
However, deriving information from the user created web
content remains a daunting task as the sentiments may carry
varying meanings in different disciplines and cultures. Thus,
to derive meaningful results, data features such as individual
keywords and their frequency of occurrence; parts of
speech such as adjectives, adverbs; opinion words and phrases
including good or bad, likes, dislikes; and negations [28],
[29] should be carefully derived following feature selection
techniques [18]. Supervised machine learning approaches such
as classification algorithms can then be designed by converting
the sentiment analysis problem to a simple text classification
problem. For a standard text classification problem, a subset
of the data is used to form a training record set defining different
classes. These classes are related to the underlying feature
values. The classification model can then be used to predict the
class label for any new instance. Several classification models
are discussed in the literature [18]. Some of the commonly
used classifiers include the Naive Bayes classifier, support
vector machines (SVM), maximum entropy based classifier,
decision trees, and neural networks [18]. Similarly, unsupervised
techniques can also be used to derive users’ sentiments
about products/services [23], [24]. The power of integrating
sentiments and intelligence trends from social media was
recently hailed as the reason for IBM and Twitter to forge an
alliance to incorporate Twitter analytics into their consulting
business [30].
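The conversion of sentiment analysis into a text classification problem can be sketched with a multinomial Naive Bayes classifier, one of the classifiers named above. The Python code below is a minimal standard-library illustration that uses word counts as features and Laplace smoothing; the class name, the tiny training set and the whitespace tokenization are assumptions made for the example and do not reproduce the setup of any cited study.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesText:
    """Multinomial Naive Bayes over whitespace-separated words."""

    def __init__(self):
        self.class_docs = Counter()              # number of training documents per class
        self.word_counts = defaultdict(Counter)  # word frequencies per class
        self.vocab = set()

    def fit(self, docs, labels):
        for doc, lab in zip(docs, labels):
            self.class_docs[lab] += 1
            for word in doc.lower().split():
                self.word_counts[lab][word] += 1
                self.vocab.add(word)
        return self

    def predict(self, doc):
        total_docs = sum(self.class_docs.values())
        best_label, best_logp = None, -math.inf
        for lab, n_docs in self.class_docs.items():
            logp = math.log(n_docs / total_docs)  # class prior
            total_words = sum(self.word_counts[lab].values())
            for word in doc.lower().split():
                if word not in self.vocab:
                    continue  # words never seen in training carry no evidence
                # Laplace (add-one) smoothing avoids zero probabilities.
                count = self.word_counts[lab][word] + 1
                logp += math.log(count / (total_words + len(self.vocab)))
            if logp > best_logp:
                best_label, best_logp = lab, logp
        return best_label


# Hypothetical labelled examples standing in for a training record set.
train_docs = [
    "great phone love it",
    "excellent service very happy",
    "terrible battery hate it",
    "poor service very slow",
]
train_labels = ["positive", "positive", "negative", "negative"]

model = NaiveBayesText().fit(train_docs, train_labels)
print(model.predict("love the excellent battery"))
print(model.predict("slow and terrible"))
```

With realistic data the same structure applies, but feature selection (for example handling negations and parts of speech, as discussed above) and a much larger training set become essential.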
B. Preventing customer churn in telecommunication sector
The strong competition amongst telecommunication service
providers has compelled them to offer packages that could
potentially attract either more customers or at least help
them retain their existing ones. Since the cost of acquiring a
new customer is relatively high compared to retaining
existing customers [31], companies are developing new and
competitive ways to retain their customers and maintain long-term
relationships with them to avoid customer churn. Churners
are the customers who leave their existing telecommunication
service provider and switch to new ones for different reasons
[32]. Customers generally switch services for lower prices
or better services. Predicting customer churn is important for
companies as it directly affects their revenues. It can also help
companies take action by offering better service or attractive
packages to prevent their existing customers from switching
to a different service provider. The literature [33] reports that, on
average, telecommunication companies face around 2.2%
customer churn each month. Designing algorithms that can
predict and in turn prevent customer churn is important to the
telecommunication industry.
The problem of predicting churn and non-churn customers
has been addressed in a number of studies [31], [32]. However,
with increasing competition, companies are now turning towards
machine learning algorithms to gain early insights into
their customers’ behavior so that timely actions can be taken
to prevent customer churn. One simple approach to predicting whether
a customer will churn is to formulate the task as a
two-class classification problem that uses underlying feature values to
predict the outcome. Possible features that can be
used to define the churn and non-churn classes include the duration
of customers’ calls, services subscribed, usage patterns, and
demographics [31]. A comprehensive review of the approaches
that can be followed to predict churning customers is presented
in [31], [32].
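The two-class formulation can be illustrated with logistic regression, which is only one of many possible classifiers and is chosen here for brevity rather than because the cited studies use it. In the Python sketch below, the two features (scaled call usage and scaled number of complaints), the data values and the function names are hypothetical.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def churn_probability(w, b, x) -> float:
    """Estimated probability that a customer with features x churns."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Hypothetical features per customer: [scaled call usage, scaled complaints].
X = [[0.9, 0.0], [0.8, 0.1], [0.7, 0.2], [0.2, 0.8], [0.1, 0.9], [0.3, 0.7]]
y = [0, 0, 0, 1, 1, 1]  # 1 = churned, 0 = stayed

w, b = train_logistic(X, y)
print(round(churn_probability(w, b, [0.15, 0.85]), 2))  # low usage, many complaints
print(round(churn_probability(w, b, [0.85, 0.05]), 2))  # high usage, few complaints
```

Because the model outputs a probability rather than a hard label, a provider could rank customers by churn risk and target retention offers at those above a chosen threshold.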
Telecommunication service providers can also use information
about customers’ usage patterns, services subscribed,
and demographics to design and offer customized packages
to their users [34]. One possible approach is to use clustering
algorithms for customer segmentation based on the services
customers use [35], [36], where clustering refers to partitioning
data points into a small number of groups of similar points.
This allows companies to identify customers for future product
promotions, to retain their customers, and to
attract new customers by offering customized packages to
the targeted audiences based on their usage behaviors.
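A minimal k-means sketch illustrates how customers might be segmented by usage. The two features (monthly data volume and call hours), the sample values and the deterministic initialisation are assumptions made for this example; the cited studies may use different clustering algorithms and features.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=20):
    # Deterministic, spread-out initialisation for reproducibility;
    # random restarts or k-means++ are the usual choices in practice.
    centroids = [points[i * len(points) // k] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids


# Hypothetical customers: (monthly data in GB, monthly call hours).
customers = [
    (20.0, 1.0), (22.0, 2.0), (21.0, 1.5),  # heavy data users
    (1.0, 30.0), (2.0, 28.0), (1.5, 31.0),  # heavy voice users
]
labels, centroids = kmeans(customers, k=2)
print(labels)
print(centroids)
```

Each resulting segment can then be matched with a package that fits its dominant usage, for example a data-heavy plan for the first group and a voice-heavy plan for the second.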
A real-world example is Celcom, a telecommunication
service provider in Asia that uses predictive personalized
analytics to predict the churn probability of its customers. It
also offers personalized incentives and geolocation-based
cross-brand promotional offers and coupons, thereby
increasing engagement and loyalty among its client base [37].
C. Enhancing customers’ online shopping experience
With the advancements in technology and the introduction of
smartphones and tablets, online shopping has become convenient,
ubiquitous and so popular that it is predicted to
grow to $370 billion in 2017 [38]. Businesses are now using
advanced analytics to predict customer behaviors and to carry
out customer segmentation based on the characteristics
of the customer groups [39]. While data from online clicks
on stores’ inventory does yield information about what a user
is looking for, it still does not give companies complete
information about their consumers, as many of them still go
to retail malls to buy a product [40]. Retailers need to merge
both offline and online data to design algorithms for better
understanding of their customers’ behaviors and for designing
product recommendation engines for different audiences [41].
One of the approaches followed to predict customer behavior
is the use of transactional data. For example, [42]
developed a model using hierarchical clustering and a hidden
Markov model (HMM) to predict customer behavior based on
transactional data. [43] also used a Markov model to predict the
probability of click to conversion based on the time spent by
the customer on site. [44] compared the performance of aggregate
(developing one model for all customers), segmented
(developing models for different segments of customers) and
1-to-1 (developing models for individual users) marketing
approaches across a broad range of experimental settings
including multiple segmentation levels, real-world marketing
datasets, dependent variables, different types of classifiers,
segmentation/clustering techniques, and different predictive
measures. Their results showed that both 1-to-1 and segmentation
approaches significantly outperform the aggregate modelling
approaches. However, in the presence of little transactional
data, the segmentation models outperformed both 1-to-1 and
aggregate modelling approaches.
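The idea behind using a Markov model for click-to-conversion prediction can be sketched with a small absorbing Markov chain. The states, the transition probabilities and the function name below are hypothetical; the sketch does not model time spent on site and is not the model of [42] or [43]. It only shows how a conversion probability follows from transition probabilities between browsing states.

```python
# Hypothetical per-step transition probabilities between shopper states.
# "purchase" and "leave" are absorbing states and therefore have no outgoing row.
TRANSITIONS = {
    "browse": {"browse": 0.5, "cart": 0.2, "leave": 0.3},
    "cart": {"browse": 0.1, "cart": 0.2, "purchase": 0.5, "leave": 0.2},
}


def conversion_probability(transitions, target="purchase", iterations=200):
    """Probability of eventually reaching `target` from each transient state."""
    prob = {state: 0.0 for state in transitions}
    for _ in range(iterations):
        # Repeatedly apply p(s) = sum over next states of P(s, next) * p(next),
        # with p(target) = 1 and p(any other absorbing state) = 0.
        prob = {
            state: sum(
                p * (1.0 if nxt == target else prob.get(nxt, 0.0))
                for nxt, p in row.items()
            )
            for state, row in transitions.items()
        }
    return prob


result = conversion_probability(TRANSITIONS)
print({state: round(p, 3) for state, p in result.items()})
```

In practice the transition probabilities would be estimated from clickstream or transactional logs, and richer models such as HMMs add hidden states on top of this structure.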
Once a retailer knows the underlying behavior of a consumer,
it can use the products that the customer selected in
the past to design recommender systems that assist the customer
in selecting similar products [45]. The underlying assumption
is that consumers follow patterns similar to their past
spending habits and are likely to repeat them in the future. Using
different machine learning techniques such as classification,
genetic algorithms, clustering or K-nearest neighbor algorithms
[45], retailers can potentially identify different customer
segments and predict customers’ preferences and spending
abilities. This can help retailers advertise their
products to the right audiences.
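As an illustration of the K-nearest neighbor idea in a recommender setting, the Python sketch below finds the customers whose purchase histories are most similar to a given customer and suggests items those neighbors bought. The purchase data, the cosine similarity measure and the function names are assumptions made for this example and are not taken from [45].

```python
import math

# Hypothetical purchase histories: customer -> {product: quantity}.
PURCHASES = {
    "alice": {"laptop": 1, "mouse": 1, "keyboard": 1},
    "bob": {"laptop": 1, "mouse": 1, "monitor": 1},
    "carol": {"novel": 1, "bookmark": 1},
}


def cosine(a, b):
    """Cosine similarity between two sparse purchase vectors."""
    shared = set(a) & set(b)
    dot = sum(a[i] * b[i] for i in shared)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def recommend(user, purchases, k=1):
    """Suggest items bought by the k most similar customers but not by `user`."""
    target = purchases[user]
    neighbours = sorted(
        (u for u in purchases if u != user),
        key=lambda u: cosine(target, purchases[u]),
        reverse=True,
    )[:k]
    scores = {}
    for u in neighbours:
        sim = cosine(target, purchases[u])
        if sim <= 0:
            continue  # ignore customers with nothing in common
        for item, count in purchases[u].items():
            if item not in target:
                scores[item] = scores.get(item, 0.0) + sim * count
    return sorted(scores, key=scores.get, reverse=True)


print(recommend("alice", PURCHASES))
print(recommend("carol", PURCHASES))
```

The second call returns no suggestions because no other customer shares a purchase with that user, which is a small-scale version of the difficulty of modelling customers with little transactional data.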
Data mining techniques can also be used to market
products to consumers based on their demographic information
combined with their online activities. By combining
information about the geographic location of a user, the time of
day/week they visit a store, and the products they buy, and mapping
those attributes against the actual sales data, it is possible to
highlight hidden interactions between online and offline sales
activity of a consumer. However, combining online and offline
information is a real challenge for retailers [46].
While online retailers like Amazon and eBay are already
using sophisticated data analytic techniques to enhance customers’
online shopping experience, traditional brick and
mortar retailers are also now realizing the benefits of analytics
for increased profits. The acquisition of Kosmix labs by Walmart
in 2011 is one such example [47]. Recently, a mid-scale
retailer, Macy’s, has also leveraged big data analytics for better
inventory management based on customers’ segmentation
characteristics. It developed a unique omnichannel strategy
in which customers can order via different channels and pick up
their order in a store of their choice through a central online
fulfillment center. In-store customer localization abilities using
either WiFi or beacons as underlying technologies are also
emerging and would further assist in enhancing consumers’
shopping experiences in the future [48], [49].
D. Generating value from smart utility meters
With rapid deployment of smart electricity and gas meters,
especially in developed countries, utility companies are
also leaning towards extracting and utilizing the information
generated from smart meter data for increased profits, improved
customer satisfaction and better resource management
[50]. A meter is called smart or intelligent due to its ability
to measure electricity usage in real time at much smaller
time intervals than traditional meters (which keep a record
of cumulative electricity consumption) [50]. Smart meters also
allow utilities to remotely control the consumption of electricity and
to switch off supply when needed. To convert the data into
actionable insights, utility companies need to adapt techniques
for accurate and timely collection, transfer, storage, processing
and analyses of data. Many established companies including
IBM, SAP, Oracle, as well as startups like AutoGrid are
currently assisting utility companies in designing solutions for
better understanding the hidden potential of the data generated
from smart meters [51], [52].
Several machine learning algorithms have been proposed
in the literature for better management and control of data
for utility companies [50]. [53] suggested that by grouping
customers based on usage readings using clustering techniques,
utility companies can identify consumers for targeted
services. Knowledge of customer usage patterns can also
assist utility companies in designing better demand response
tariff plans. For example, utility companies can encourage
consumers with flexible consumption patterns to minimize
their usage during the peak hours by offering incentives [54].
Likewise, consumers with high energy usage patterns can be
penalized if they are unable to curtail their consumption by
limiting use of household appliances during the peak energy
usage hours. Machine learning algorithms such as independent
component analysis [55] and clustering techniques [56] have
also been used to identify the type of demand faced by
different consumer groups during the day [55]. Multiple linear
regression models have also been used to predict the usage of
power in households [56]. Support vector machine classifiers
have been used to distinguish user groups based on their usage
patterns [57]. Customers’ load profiles can potentially assist
in identifying and detecting irregularities or abnormalities
caused either due to faulty metering or human intervention
and fraud [51]. Finally, machine learning techniques can also
be used to predict congestion or instability conditions within
a network. This information can be used by utility companies
to identify overloaded or ageing components and carry out timely
preventive maintenance to avoid power losses and lost
revenues [58].
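The use of multiple linear regression to predict household power usage can be sketched as an ordinary least-squares fit. In the Python example below, the two predictors (outdoor temperature and number of occupants), the synthetic readings and the function names are assumptions made for illustration and are not the variables used in [56]; the fit solves the normal equations directly with the standard library.

```python
def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def fit_linear(X, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)


def predict(coef, x):
    return coef[0] + sum(c * v for c, v in zip(coef[1:], x))


# Synthetic readings: (temperature in degrees C, occupants) -> daily usage in kWh.
# They were generated from usage = 2 + 0.5 * temperature + 1.5 * occupants.
X = [(20, 1), (25, 2), (30, 2), (35, 3), (28, 4)]
y = [13.5, 17.5, 20.0, 24.0, 22.0]

coef = fit_linear(X, y)
print([round(c, 3) for c in coef])
print(round(predict(coef, (22, 3)), 2))
```

Real meter data is noisy, so the fitted coefficients would only approximate the underlying relationship, and large residuals for a household could also serve as a first indication of the irregularities mentioned above.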
Real-world examples that illustrate the use of data analysis by
utility companies include EnerNOC and Comverge, which
assist utility companies by designing tools such as
demand response programs that encourage customers
to reduce load demands during peak times, such as late
afternoon during a heat wave when the air conditioning load
stresses the grid’s capacity. In exchange for lowering power
consumption, consumers are offered rebates. Leveraging big
data technologies, the AutoGrid software service also analyzes
grid usage patterns to predict power demand a day ahead, thus
encouraging both utilities and consumers to participate in load-shedding
programs to prevent outages [59].
E. Improving Security
Cybercrime costs $118 billion annually and this figure
is expected to grow significantly [60]. With easy access to
information available online, sophisticated cybercrimes are
occurring at an alarming rate, and traditional security
solutions are no longer sufficient to defend against these escalating
threats. Incidents of hacking, identity theft and stealing
credit card data from retailers and banks are in the news quite
regularly, but recent sophisticated and organized breaches at
Sony involving an unreleased movie have shaken the world.
While a lot still needs to be done to prevent cyberterrorism,
big data analytics in security now offers promising solutions
for efficient detection of suspicious activities over the
network. It is expected that big data analytics will impact
various aspects of information security such as network monitoring,
user authentication and control, authorization, identity
management, fraud detection, and data loss prevention and control
[61]. By using big data analytics to detect threats and design
security solutions, enterprises are now able to protect their
systems from future threats.
A number of data mining techniques to detect cyber crimes
are proposed in the literature [61]. For example, classification
models such as Naive Bayes, support vector machines, neural
networks, and decision trees have long been used to detect spam
emails, i.e., unsolicited emails [62].
Support vector machine techniques have also been used to
prevent Denial of Service (DoS) attacks, where a DoS attack
refers to the process of making a system inaccessible to other
users [63], [64]. While [63] used Enhanced Multi Class
Support Vector Machines (EMCSVM) to predict various kinds
of DoS attacks, [64] proposed radial-basis function neural
network (RBFNN) and support vector machines (SVM), to
solve the DoS problem with an ability to detect or predict new
attacks based on the patterns similar to the attack patterns that
appeared in the past. Classification models have also been used
to detect malware [65], phishing URLs [66] and phishing emails
[67].
Data mining techniques have also been used for anomaly
detection to search for unusual patterns and network behaviors
[68]. While feature selection approaches are used to prioritize
features that can assist in differentiating normal behavior from
the one affected by the presence of anomalies, classifiers are
used to differentiate between patterns [69]. These anomalies
could be present either due to internal system failure or due
to external attacks. In case of external attacks, identifying
the intruders that carry out these malicious activities and
identifying the types of attacks are other major issues. Machine
learning approaches can now also be used for both intruder
detection [70] and finding the types of attacks [71].
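A very simple form of anomaly detection compares new observations with a baseline learned from normal behavior. The Python sketch below flags time intervals whose event count deviates from the baseline mean by more than a chosen number of standard deviations. The traffic counts, the threshold and the function names are hypothetical, and this z-score rule is a stand-in for the feature selection and classification pipelines described in [68] and [69], not a reproduction of them.

```python
from statistics import mean, stdev


def fit_baseline(normal_counts):
    """Estimate mean and standard deviation from traffic known to be normal."""
    return mean(normal_counts), stdev(normal_counts)


def find_anomalies(counts, baseline, threshold=3.0):
    """Return the indices of observations more than `threshold` deviations from the mean."""
    mu, sigma = baseline
    if sigma == 0:
        return []
    return [i for i, c in enumerate(counts) if abs(c - mu) / sigma > threshold]


# Hypothetical login attempts per minute during normal operation.
normal_traffic = [100, 104, 98, 101, 97, 103, 99, 102]
baseline = fit_baseline(normal_traffic)

# New observations: index 2 resembles a burst (possible attack), index 4 a drop (possible failure).
observed = [101, 99, 250, 100, 3]
print(find_anomalies(observed, baseline))
```

Distinguishing an internal failure from an external attack, as discussed above, requires additional features and typically a trained classifier on top of such a detector.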
Finally, as more companies turn towards cloud computing
for storage and processing of big data, the security of the cloud
becomes essential. Cloud computing is vulnerable to security
threats including insecure application programming interfaces,
malicious insiders, shared technology vulnerabilities,
data leakages and account hacking [72].
A number of companies are also working on designing
solutions to protect users from cybercrime. For instance,
IBM’s QRadar security intelligence platform is designed to
deliver the benefits of next-generation security information and
event management technology to various companies [73]. Enterprises
use QRadar solutions to collect and correlate billions
of events and network flows per day in deployments that span
multiple locations. By analyzing structured, enriched security
data alongside unstructured data from across the enterprise
using QRadar solutions, the malicious activities hidden deep
in the masses of an organization’s data can be potentially
detected.
III. CHALLENGES IN BIG DATA ANALYTICS
While big data analyses provide value to businesses, there
are general issues surrounding them that must be carefully dealt
with to exploit their full potential [31], [1]. One of the primary
concerns around big data is security and privacy. Access to
large data implies the potential to identify individuals and
to profile them on the basis of their behavior, likes, dislikes,
daily routine, etc. Thus companies must take extra precautions
to protect the confidentiality of users’ sensitive information.
Another major challenge is data access and storage. With
huge volumes of data being generated, it is not feasible to
store it on a single machine compelling companies to rely
on the cloud for storage. Cloud computing can be used
to manage and store these large datasets but again privacy
around cloud is an open research problem. The risk of storing
sensitive information on the cloud without sufficient security
measures has unfortunately been illustrated in a number
of instances. Eliminating a single point of failure by creating
multiple copies of data and storing on different nodes is also
a challenge as these nodes have to be synchronized to retrieve
data efficiently. Since data is available in different formats,
extracting them and combining in a format that can be easily
imported for analysis is another challenge. Finally, the skillset
(a combination of advanced statistical techniques,
data optimization methods, machine learning algorithms and
a thorough understanding of business value) required to extract
meaningful information from big data is seldom available.
While these challenges are applicable in general to all
industrial domains, there are also challenges specific to each
of the applications considered in this study, which are briefly
discussed below.
• Sentiment analysis: Sentiment analysis typically classifies text
into three main classes, i.e., positive, negative and neutral,
but given the subjectivity of text classification, in reality
text can be classified into many categories [74]. Therefore,
instead of simple two-class classifiers, multi-class
classifiers should be used for better results. Designing
a classifier for sentiment analysis when only a
limited amount of training data is available is
quite challenging [14]. Moreover, the training data used
for designing a classifier should be selected carefully, as
the same word may have different meanings in different
domains depending on the context [75]. Sarcastic or ironic
sentences often lead to wrong classification. Using only
words rather than sentences can also lead to
erroneous classification. Finally, drawing general conclusions
about any product/service based on the limited
number of tweets or posts available on the web can yield
misleading results, so the results must be checked for
statistical significance.
• Predicting customer churn: Cost constraints dictate that
telecommunication companies focus more on retaining
existing customers than on acquiring new ones, and
thus they offer promotions to existing customers
who are likely to churn. However, finding the real cause
of customer churn is not always easy, because identifying
the underlying variables that best describe a customer’s
behavioral profile is a challenging task and may not
always reveal users’ true intentions, thus leading to wrong
predictions. Moreover, integration of data from miscellaneous
sources such as customer base, call center inbound
and outbound calls, billing, etc., to gather information
about a customer is not always straightforward. With
intense competition, companies are now offering
service plans suitable for different customer segments,
but designing algorithms to group customers with similar
preferences based on partial information alone may not
yield feasible solutions.
• Enhancing online shopping experience: Despite its
popularity, online shopping still has to overcome certain
challenges to encourage customers. One of the main challenges
in predicting customers’ behavior is merging online
data with offline transaction data as these datasets may
not be managed by a single entity. Customers’ security
and privacy concerns around using their transactional
data for predicting their spending behavior also need
to be addressed satisfactorily. Analyzing data to predict
customers’ preference of products, to promote similar
products or relevant coupons to targeted audiences, is a
challenging issue which only gets worse with time due
to users’ changing shopping preferences.
• Smart utility meters: One of the major challenges faced
by utility companies is merging data that resides in
disparate databases among their various departments.
Credibility of data is another major challenge
that could have a devastating effect on a firm’s reputation.
The data generated by smart meters may contain
abnormalities due to faulty behavior caused either
by natural conditions or by human interference, and
making decisions based on faulty data can potentially
impact utility companies’ revenues. Lack of infrastructure
to support processing and analysis of the data generated by
smart meters is another major challenge faced by utility
companies. Predicting customers’ profile patterns, including
the number of people living in a household, the appliances
they use and the time of usage of different appliances,
based on their electricity usage bills for promotional
offers could also raise privacy concerns for users.
• Security: Although the application of big data analytics
to improving security looks promising, it has its own
challenges [76]. One of the major challenges faced by
organizations is data leakage caused by third-party
intervention. Data is even more vulnerable to loss if it is
housed in the cloud. Ownership of information hosted in the
cloud is another major issue faced by organizations, and
trust boundaries need to be established carefully between
the data owners and the data storage owners. With large
datasets stored in the cloud, proper security measures must
be taken to prevent re-identification of users based on the
information available through different datasets.
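The re-identification risk noted in the security use case can be made concrete with a k-anonymity style check. The following is a minimal sketch, not a method proposed in this paper: it assumes that records from different datasets have already been joined, and the field names (zip, age_band, plan), the sample records and the helper functions are hypothetical. It counts how many records share each combination of quasi-identifiers and flags the records that fall in groups smaller than k, since such records are the easiest to single out.

```python
from collections import Counter


def _group_key(record, quasi_identifiers):
    """Build the tuple of quasi-identifier values that defines a record's group."""
    return tuple(record[q] for q in quasi_identifiers)


def smallest_group_size(records, quasi_identifiers):
    """Return the size of the smallest group of records that share the same
    quasi-identifier values (0 for an empty dataset)."""
    groups = Counter(_group_key(r, quasi_identifiers) for r in records)
    return min(groups.values()) if groups else 0


def risky_records(records, quasi_identifiers, k):
    """Return the records whose quasi-identifier group has fewer than k members."""
    groups = Counter(_group_key(r, quasi_identifiers) for r in records)
    return [r for r in records if groups[_group_key(r, quasi_identifiers)] < k]


# Hypothetical joined dataset; all values are made up for illustration.
records = [
    {"zip": "95134", "age_band": "30-39", "plan": "prepaid"},
    {"zip": "95134", "age_band": "30-39", "plan": "postpaid"},
    {"zip": "95134", "age_band": "40-49", "plan": "postpaid"},
    {"zip": "98101", "age_band": "30-39", "plan": "prepaid"},
    {"zip": "98101", "age_band": "30-39", "plan": "prepaid"},
]

quasi_ids = ["zip", "age_band"]
print(smallest_group_size(records, quasi_ids))
print(risky_records(records, quasi_ids, k=2))
```

In this sample the combination (95134, 40-49) occurs only once, so that record would need to be generalized or suppressed before the data is shared. k-anonymity is only one of several privacy criteria and is used here purely for illustration.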
IV. CONCLUSIONS
The unprecedented growth in data in almost every sector
provides businesses a unique opportunity to use analytics
to uncover hidden insights that can be used for making
better decisions. In this paper, through five different use cases,
we have illustrated how analytics can be applied to derive
value from big data for various industrial applications. The
examples considered in this study include sentiment analysis
for social media, preventing churn of telecommunication
customers, enhancing customers’ online shopping experience,
generating value from smart utility meters and improving
security. While a number of different techniques have been
proposed in the existing literature to derive value for these
use cases, classification and clustering models have been the most
widely used for these applications. The continuing growth of
studies that attempt to derive value from big data suggests that
big data analytics can provide useful insights for businesses,
potentially also leading to increased revenues and business
advantages over competition. However, big data analytics also
faces challenges that need to be addressed together in
order to exploit the full potential of the hidden insights within
these large datasets.
REFERENCES
[1] A. Katal, M. Wazid, and R. Goudar, “Big data: Issues, challenges,
tools and good practices,” in Contemporary Computing (IC3), Sixth
International Conference on, Aug 2013, pp. 404–409.
[2] S. Sagiroglu and D. Sinanc, “Big data: A review,” in Collaboration
Technologies and Systems (CTS), International Conference on, May
2013, pp. 42–47.
[3] F. Muhtaroglu, S. Demir, M. Obali, and C. Girgin, “Business model
canvas perspective on big data applications,” in Big Data, IEEE Inter-
national Conference on, Oct 2013, pp. 32–37.
[4] A. Rajpurohit, “Big data for business managers; bridging the gap be-
tween potential and value,” in Big Data, IEEE International Conference
on, Oct 2013, pp. 29–31.
[5] Z. Liu, P. Yang, and L. Zhang, “A sketch of big data technologies,” in
Internet Computing for Engineering and Science, Seventh International
Conference on, Sept 2013, pp. 26–29.
[6] S. Dhar and S. Mazumdar, “Challenges and best practices for enterprise
adoption of big data technologies,” in Technology Management Confer-
ence (ITMC), 2014 IEEE International, June 2014, pp. 1–4.
[7] The digital universe of opportunities: Rich data and the increasing
value of the internet of things. [Online]. Available: http://www.emc.
com/leadership/digital-universe/2014iview/executive-summary.htm
[8] P. Malik, “Governing big data: Principles and practices,” IBM Journal
of Research and Development, vol. 57, no. 3/4, pp. 1:1–1:13, May 2013.
[9] H. Hu, Y. Wen, T.-S. Chua, and X. Li, “Toward scalable systems for
big data analytics: A technology tutorial,” Access, IEEE, vol. 2, pp.
652–687, 2014.
[10] M. Zaharia, M. Chowdhury, M. J. Franklin, S. Shenker, and I. Stoica,
“Spark: Cluster computing with working sets,” in Proceedings of the
2nd USENIX Conference on Hot Topics in Cloud Computing, ser.
HotCloud’10, 2010, pp. 10–15.
[11] N. Y. Xin and L. Y. Ling, “How we could realize big data value,”
in Instrumentation and Measurement, Sensor Network and Automation
(IMSNA), 2013 2nd International Symposium on, Dec 2013, pp. 425–
427.
[12] J. Wielki, “Implementation of the big data concept in organizations -
possibilities, impediments and challenges,” in Computer Science and
Information Systems (FedCSIS), 2013 Federated Conference on, Sept
2013, pp. 985–989.
[13] H. Hu, Y. Wen, T.-S. Chua, and X. Li, “Toward scalable systems for
big data analytics: A technology tutorial,” Access, IEEE, vol. 2, pp.
652–687, 2014.
[14] M. Taboada, J. Brooke, M. Tofiloski, K. Voll, and M. Stede, “Lexicon-
based methods for sentiment analysis,” Comput. Linguist., vol. 37, no. 2,
pp. 267–307, 2011.
[15] M. Hu and B. Liu, “Mining and summarizing customer reviews,” in
Proceedings of the Tenth ACM SIGKDD International Conference on
Knowledge Discovery and Data Mining, 2004, pp. 168–177.
[16] User modeling improves understanding of people’s prefer-
ences to help engage users on their own terms. [On-
line]. Available: http://www.ibm.com/smarterplanet/us/en/ibmwatson/
developercloud/user-modeling.html
[17] Five sentiment analysis tools that wont cost you a
cent. [Online]. Available: http://www.fieldassignment.com/2011/04/
free-sentiment-analysis-tools.html
[18] W. Medhat, A. Hassan, and H. Korashy, “Sentiment analysis
algorithms and applications: A survey,” Ain Shams Engineering Journal,
2014. [Online]. Available: http://www.sciencedirect.com/science/article/
pii/S2090447914000550
[19] E. Boiy, P. Hens, K. Deschacht, and M.-F. Moens, “Automatic
sentiment analysis in on-line text,” in Proceedings of the 11th
International Conference on Electronic Publishing, 2007, pp. 349–360.
[20] D. Maynard and A. Funk, “Automatic detection of political opinions in
tweets,” in The Semantic Web: ESWC 2011 Workshops, vol. 7117, 2012,
pp. 88–99.
[21] B. Liu, Sentiment Analysis and Opinion Mining. Morgan and Claypool
Publishers, 2012.
[22] B. Pang, L. Lee, and S. Vaithyanathan, “Thumbs up?: Sentiment
classification using machine learning techniques,” in Proceedings of
the ACL-02 Conference on Empirical Methods in Natural Language
Processing, 2002, pp. 79–86.
[23] M. Usha and M. Indra Devi, “Analysis of sentiments using unsupervised
learning techniques,” in Information Communication and Embedded
Systems, International Conference on, Feb 2013, pp. 241–245.
[24] G. Li and F. Liu, “A clustering-based approach on sentiment analysis,”
in Intelligent Systems and Knowledge Engineering, International Con-
ference on, Nov 2010, pp. 331–337.
[25] L. Zhang, R. Ghosh, M. Dekhil, M. Hsu, and B. Liu. (2011) Combining
lexicon-based and learning-based methods for twitter sentiment
analysis. [Online]. Available: http://www.hpl.hp.com/techreports/2011/
HPL-2011-89.html
[26] P. P. Balage Filho, L. V. Avanço, M. d. G. V. Nunes, and T. A. S.
Pardo, “NILC USP: An improved hybrid system for sentiment analysis
in twitter messages,” in Proceedings of the 8th International Workshop
on Semantic Evaluation. Association for Computational Linguistics
and Dublin City University, 2014, pp. 428–432.
[27] M. Neethu and R. Rajasree, “Sentiment analysis in twitter using machine
learning techniques,” in Computing, Communications and Networking
Technologies (ICCCNT), 2013 Fourth International Conference on, July
2013, pp. 1–5.
[28] C. C. Aggarwal and C. Zhai, Mining Text Data. Springer, 2012.
[29] Y. Mejova and P. Srinivasan, “Exploring feature definition and selection
for sentiment classifiers,” in ICWSM’11, 2011, pp. 1–6.
[30] Twitter, IBM announce a new data analytics partnership.
[Online]. Available: http://fortune.com/2014/10/29/
twitter-ibm-data-analytics-partnership/
[31] N. Kamalraj and A. Malathi, “A survey on churn prediction techniques in
communication sector,” International Journal of Computer Applications,
vol. 64, no. 5, pp. 39–42, February 2013.
[32] W. Bandara, A. Perera, and D. Alahakoon, “Churn prediction method-
ologies in the telecommunications sector: A survey,” in Advances in
ICT for Emerging Regions, International Conference on, Dec 2013, pp.
172–176.
[33] C.-P. Wei and I.-T. Chiu, “Turning telecommunications call details
to churn prediction: a data mining approach,” Expert Systems with
Applications, vol. 23, no. 2, pp. 103–112, 2002.
[34] C. Zhao, Y. Wu, and H. Gao, “Study on knowledge acquisition of
the telecom customers’ consuming behaviour based on data mining,”
in Wireless Communications, Networking and Mobile Computing, 4th
International Conference on, Oct 2008, pp. 1–5.
[35] J. Zhao, W. Zhang, and Y. Liu, “Improved k-means cluster algorithm in
telecommunications enterprises customer segmentation,” in Information
Theory and Information Security, IEEE International Conference on,
Dec 2010, pp. 167–169.
[36] L. Ye, C. Qiu-ru, X. Hai-xu, L. Yi-jun, and Y. Zhi-min, “Telecom
customer segmentation with k-means clustering,” in Computer Science
Education, 7th International Conference on, July 2012, pp. 648–651.
[37] Celcom loyalty deals. [Online]. Available: http://www2.nst.com.my/
nation/celcom-loyalty-deals-1.558917
[38] J. Li. (2013) Study: Online shopping behavior in the
digital era. [Online]. Available: http://www.iacquire.com/blog/
study-online-shopping-behavior-in-the-digital-era
[39] P. Yang, Q. lun Zheng, H. Peng, and Q. Tan, “A stepwise learning
approach to automatic discovery of interest data blocks,” in Machine
Learning and Cybernetics, 2004. Proceedings of 2004 International
Conference on, vol. 3, Aug. 2004, pp. 1441–1446.
[40] (2014) Making online shopping smarter with ad-
vanced analytics. [Online]. Available: www.cognizant.com/.../
Making-Online-Shopping-Smarter-with-Advanced-analytics.pdf
[41] R. Dewan, M. Freimer, and Y. Jiang, “Using online competitor’s inven-
tory information for pricing,” in System Sciences, 40th Annual Hawaii
International Conference on, Jan 2007, pp. 210a–210a.
[42] M. Mestre and P. Vitoria, “Tracking of consumer behaviour in e-
commerce,” in Information Fusion, 16th International Conference on,
July 2013, pp. 1214–1221.
[43] M. Gupta, H. Mittal, P. Singla, and A. Bagchi, “Characterizing compar-
ison shopping behavior: A case study,” in Data Engineering Workshops
(ICDEW), 2014 IEEE 30th International Conference on, March 2014,
pp. 115–122.
[44] T. Jiang and A. Tuzhilin, “Segmenting customers from population to
individuals: Does 1-to-1 keep your customers forever?” Knowledge and
Data Engineering, IEEE Transactions on, vol. 18, no. 10, pp. 1297–
1311, Oct 2006.
[45] H.-W. Yang, Z. geng Pan, X.-Z. Wang, and B. Xu, “A personalized
products selection assistance based on e-commerce machine learning,”
in Machine Learning and Cybernetics, 2004. Proceedings of 2004
International Conference on, vol. 4, Aug. 2004, pp. 2629–2633.
[46] P. Henry and H. Luo, “Wifi: what’s next?” Communications Magazine,
IEEE, vol. 40, no. 12, pp. 66–72, Dec 2002.
[47] Wal-mart paid 300 million-plus for kos-
mix. [Online]. Available: http://allthingsd.com/20110418/
exclusive-wal-mart-paid-300-million-plus-for-kosmix/
[48] Beacons, beacons, everywhere beacons. [Online].
Available: http://www.mediapost.com/publications/article/231059/
beacons-beacons-everywhere-beacons.html
[49] Stores sniff out smartphones to follow shoppers. [On-
line]. Available: http://www.technologyreview.com/news/520811/
stores-sniff-out-smartphones-to-follow-shoppers/
[50] D. Alahakoon and X. Yu, “Advanced analytics for harnessing the
power of smart meter big data,” in Intelligent Energy Systems, IEEE
International Workshop on, Nov 2013, pp. 40–45.
[51] Generating big value from big data in energy and utilities.
[Online]. Available: http://www-01.ibm.com/software/data/bigdata/
industry-energy.html3
[52] Utilities and big data: Using analytics for increased customer
satisfaction. [Online]. Available: http://www.oracle.com/us/industries/
utilities/big-data-analytics-customer-wp-2075868.pdf
[53] S. Valero, M. Ortiz, C. Senabre, C. Alvarez, F. Franco, and A. Gabaldon,
“Methods for customer and demand response policies selection in new
electricity markets,” Generation, Transmission Distribution, IET, vol. 1,
no. 1, pp. 104–110, January 2007.
[54] A. Albert and R. Rajagopal, “Smart meter driven segmentation: What
your consumption says about you,” Power Systems, IEEE Transactions
on, vol. 28, no. 4, pp. 4019–4030, Nov 2013.
[55] H. Liao and D. Niebur, “Load profile estimation in electric transmission
networks using independent component analysis,” Power Systems, IEEE
Transactions on, vol. 18, no. 2, pp. 707–715, May 2003.
[56] C. Beckel, L. Sadamori, T. Staake, and S. Santini, “Revealing household
characteristics from smart meter data,” Energy, 2014.
[57] S. K. T. J. Nagi, K. S. Yap and S. K. Ahmed, “Power load forecasting
using hybrid self-organizing maps and support vector machines,” in
2nd International Power Engineering and Optimization Conference, June
2008.
[58] F. Zhao, G. Wang, C. Deng, and Y. Zhao, “A real-time intelligent
abnormity diagnosis platform in electric power system,” in Advanced
Communication Technology (ICACT), 2014 16th International Confer-
ence on, Feb 2014, pp. 83–87.
[59] M. LaMonica. Bringing big data to smart meters.
[Online]. Available: http://www.technologyreview.com/view/506476/
bringing-big-data-to-smart-meters/
[60] Cyber security analytics. [Online]. Available: http://www.teradata.com/
Cyber-Security-Analytics/
[61] T. Mahmood and U. Afzal, “Security analytics: Big data analytics for
cybersecurity: A review of trends, techniques and tools,” in Information
Assurance (NCIA), 2013 2nd National Conference on, Dec 2013, pp.
129–134.
[62] P. Panigrahi, “A comparative study of supervised machine learning
techniques for spam e-mail filtering,” in Computational Intelligence
and Communication Networks, Fourth International Conference on, Nov
2012, pp. 506–512.
[63] T. Subbulakshmi, S. Shalinie, V. GanapathiSubramanian, K. BalaKrish-
nan, D. AnandKumar, and K. Kannathal, “Detection of ddos attacks
using enhanced support vector machines with real time generated
dataset,” in Advanced Computing (ICoAC), 2011 Third International
Conference on, Dec 2011, pp. 17–22.
[64] G. Tsang, P. Chan, D. Yeung, and E. Tsang, “Denial of service detection
by support vector machines and radial-basis function neural network,” in
Machine Learning and Cybernetics, Proceedings of 2004 International
Conference on, vol. 7, Aug 2004, pp. 4263–4268.
[65] M. Mas’ud, S. Sahib, M. Abdollah, S. Selamat, and R. Yusof, “Analysis
of features selection and machine learning classifier in android malware
detection,” in Information Science and Applications, International Con-
ference on, May 2014, pp. 1–5.
[66] J. James, L. Sandhya, and C. Thomas, “Detection of phishing urls
using machine learning techniques,” in Control Communication and
Computing, International Conference on, Dec 2013, pp. 304–309.
[67] A. Almomani, B. Gupta, S. Atawneh, A. Meulenberg, and E. Almomani,
“A survey of phishing email filtering techniques,” Communications
Surveys Tutorials, IEEE, vol. 15, no. 4, pp. 2070–2090, Fourth Quarter 2013.
[68] B. Thuraisingham, “Data mining for security applications,” in Machine
Learning and Applications, 2004. Proceedings. 2004 International Con-
ference on, Dec 2004, pp. 3–4.
[69] A. Aziz, A. Hassanien, S.-O. Hanaf, and M. Tolba, “Multi-layer hybrid
machine learning techniques for anomalies detection and classification
approach,” in Hybrid Intelligent Systems (HIS), 2013 13th International
Conference on, Dec 2013, pp. 215–220.
[70] L. Khan, M. Awad, and B. Thuraisingham, “A new intrusion detection
system using support vector machines and hierarchical clustering,” The
VLDB Journal, vol. 16, no. 4, pp. 507–521, Oct. 2007.
[71] T. Subbulakshmi, S. Shalinie, V. GanapathiSubramanian, K. BalaKrish-
nan, D. AnandKumar, and K. Kannathal, “Detection of ddos attacks
using enhanced support vector machines with real time generated
dataset,” in Advanced Computing, Third International Conference on,
Dec 2011, pp. 17–22.
[72] M. Khorshed, A. Ali, and S. Wasimi, “Trust issues that create threats for
cyber attacks in cloud computing,” in Parallel and Distributed Systems,
IEEE 17th International Conference on, Dec 2011, pp. 900–905.
[73] IBM security intelligence with big data. [Online]. Available: http:
//www-03.ibm.com/security/solution/intelligence-big-data/
[74] J. T. Mr. Saifee Vohra, “Applications and challenges for sentiment
analysis: A survey,” International Journal of Engineering Research and
Technology, vol. 2, 2013.
[75] H. R. P, “Opinion mining and sentiment analysis - challenges and
applications,” International Journal of Application or Innovation in
Engineering and Management (IJAIEM), vol. 3, 2014.
[76] A. A. Cardenas, P. K. Manadhata, and S. P. Rajan, “Big data analytics
for security,” IEEE Security and Privacy, vol. 11, no. 6, pp. 74–76, 2013.
452

More Related Content

What's hot

The effect of technology-organization-environment on adoption decision of bi...
The effect of technology-organization-environment on  adoption decision of bi...The effect of technology-organization-environment on  adoption decision of bi...
The effect of technology-organization-environment on adoption decision of bi...IJECEIAES
 
DEALING CRISIS MANAGEMENT USING AI
DEALING CRISIS MANAGEMENT USING AIDEALING CRISIS MANAGEMENT USING AI
DEALING CRISIS MANAGEMENT USING AIIJCSEA Journal
 
Agent-SSSN: a strategic scanning system network based on multiagent intellige...
Agent-SSSN: a strategic scanning system network based on multiagent intellige...Agent-SSSN: a strategic scanning system network based on multiagent intellige...
Agent-SSSN: a strategic scanning system network based on multiagent intellige...IJERA Editor
 
A REVIEW ON CLASSIFICATION OF DATA IMBALANCE USING BIGDATA
A REVIEW ON CLASSIFICATION OF DATA IMBALANCE USING BIGDATAA REVIEW ON CLASSIFICATION OF DATA IMBALANCE USING BIGDATA
A REVIEW ON CLASSIFICATION OF DATA IMBALANCE USING BIGDATAIJMIT JOURNAL
 
IRJET - Big Data Analysis its Challenges
IRJET - Big Data Analysis its ChallengesIRJET - Big Data Analysis its Challenges
IRJET - Big Data Analysis its ChallengesIRJET Journal
 
IRJET- Analysis of Big Data Technology and its Challenges
IRJET- Analysis of Big Data Technology and its ChallengesIRJET- Analysis of Big Data Technology and its Challenges
IRJET- Analysis of Big Data Technology and its ChallengesIRJET Journal
 
Electronics health records and business analytics a cloud based approach
Electronics health records and business analytics a cloud based approachElectronics health records and business analytics a cloud based approach
Electronics health records and business analytics a cloud based approachIAEME Publication
 
TLNBusinessAnalytics_researchPoster_Final
TLNBusinessAnalytics_researchPoster_FinalTLNBusinessAnalytics_researchPoster_Final
TLNBusinessAnalytics_researchPoster_FinalYi Qi
 
Mining Social Media Data for Understanding Drugs Usage
Mining Social Media Data for Understanding Drugs  UsageMining Social Media Data for Understanding Drugs  Usage
Mining Social Media Data for Understanding Drugs UsageIRJET Journal
 
Secured Scheduling Technique of Network Resource Management in Vehicular Comm...
Secured Scheduling Technique of Network Resource Management in Vehicular Comm...Secured Scheduling Technique of Network Resource Management in Vehicular Comm...
Secured Scheduling Technique of Network Resource Management in Vehicular Comm...Gagan Bansal
 
SEAMLESS AUTOMATION AND INTEGRATION OF MACHINE LEARNING CAPABILITIES FOR BIG ...
SEAMLESS AUTOMATION AND INTEGRATION OF MACHINE LEARNING CAPABILITIES FOR BIG ...SEAMLESS AUTOMATION AND INTEGRATION OF MACHINE LEARNING CAPABILITIES FOR BIG ...
SEAMLESS AUTOMATION AND INTEGRATION OF MACHINE LEARNING CAPABILITIES FOR BIG ...ijdpsjournal
 
Big data – A Review
Big data – A ReviewBig data – A Review
Big data – A ReviewIRJET Journal
 
Poster ECIS 2016
Poster ECIS 2016Poster ECIS 2016
Poster ECIS 2016Rui Silva
 
Concept of information
Concept of informationConcept of information
Concept of informationreeta nagari
 
A case study of using the hybrid model of scrum and six sigma in software dev...
A case study of using the hybrid model of scrum and six sigma in software dev...A case study of using the hybrid model of scrum and six sigma in software dev...
A case study of using the hybrid model of scrum and six sigma in software dev...IJECEIAES
 
IRJET- Methodologies used on News Articles :A Survey
IRJET- Methodologies used on News Articles :A SurveyIRJET- Methodologies used on News Articles :A Survey
IRJET- Methodologies used on News Articles :A SurveyIRJET Journal
 
CHALLENGES FOR MANAGING COMPLEX APPLICATION PORTFOLIOS: A CASE STUDY OF SOUTH...
CHALLENGES FOR MANAGING COMPLEX APPLICATION PORTFOLIOS: A CASE STUDY OF SOUTH...CHALLENGES FOR MANAGING COMPLEX APPLICATION PORTFOLIOS: A CASE STUDY OF SOUTH...
CHALLENGES FOR MANAGING COMPLEX APPLICATION PORTFOLIOS: A CASE STUDY OF SOUTH...IJMIT JOURNAL
 
Impact of big data congestion in IT: An adaptive knowledgebased Bayesian network
Impact of big data congestion in IT: An adaptive knowledgebased Bayesian networkImpact of big data congestion in IT: An adaptive knowledgebased Bayesian network
Impact of big data congestion in IT: An adaptive knowledgebased Bayesian networkIJECEIAES
 

What's hot (18)

The effect of technology-organization-environment on adoption decision of bi...
The effect of technology-organization-environment on  adoption decision of bi...The effect of technology-organization-environment on  adoption decision of bi...
The effect of technology-organization-environment on adoption decision of bi...
 
DEALING CRISIS MANAGEMENT USING AI
DEALING CRISIS MANAGEMENT USING AIDEALING CRISIS MANAGEMENT USING AI
DEALING CRISIS MANAGEMENT USING AI
 
Agent-SSSN: a strategic scanning system network based on multiagent intellige...
Agent-SSSN: a strategic scanning system network based on multiagent intellige...Agent-SSSN: a strategic scanning system network based on multiagent intellige...
Agent-SSSN: a strategic scanning system network based on multiagent intellige...
 
A REVIEW ON CLASSIFICATION OF DATA IMBALANCE USING BIGDATA
A REVIEW ON CLASSIFICATION OF DATA IMBALANCE USING BIGDATAA REVIEW ON CLASSIFICATION OF DATA IMBALANCE USING BIGDATA
A REVIEW ON CLASSIFICATION OF DATA IMBALANCE USING BIGDATA
 
IRJET - Big Data Analysis its Challenges
IRJET - Big Data Analysis its ChallengesIRJET - Big Data Analysis its Challenges
IRJET - Big Data Analysis its Challenges
 
IRJET- Analysis of Big Data Technology and its Challenges
IRJET- Analysis of Big Data Technology and its ChallengesIRJET- Analysis of Big Data Technology and its Challenges
IRJET- Analysis of Big Data Technology and its Challenges
 
Electronics health records and business analytics a cloud based approach
Electronics health records and business analytics a cloud based approachElectronics health records and business analytics a cloud based approach
Electronics health records and business analytics a cloud based approach
 
TLNBusinessAnalytics_researchPoster_Final
TLNBusinessAnalytics_researchPoster_FinalTLNBusinessAnalytics_researchPoster_Final
TLNBusinessAnalytics_researchPoster_Final
 
Mining Social Media Data for Understanding Drugs Usage
Mining Social Media Data for Understanding Drugs  UsageMining Social Media Data for Understanding Drugs  Usage
Mining Social Media Data for Understanding Drugs Usage
 
Secured Scheduling Technique of Network Resource Management in Vehicular Comm...
Secured Scheduling Technique of Network Resource Management in Vehicular Comm...Secured Scheduling Technique of Network Resource Management in Vehicular Comm...
Secured Scheduling Technique of Network Resource Management in Vehicular Comm...
 
SEAMLESS AUTOMATION AND INTEGRATION OF MACHINE LEARNING CAPABILITIES FOR BIG ...
SEAMLESS AUTOMATION AND INTEGRATION OF MACHINE LEARNING CAPABILITIES FOR BIG ...SEAMLESS AUTOMATION AND INTEGRATION OF MACHINE LEARNING CAPABILITIES FOR BIG ...
SEAMLESS AUTOMATION AND INTEGRATION OF MACHINE LEARNING CAPABILITIES FOR BIG ...
 
Big data – A Review
Big data – A ReviewBig data – A Review
Big data – A Review
 
Poster ECIS 2016
Poster ECIS 2016Poster ECIS 2016
Poster ECIS 2016
 
Concept of information
Concept of informationConcept of information
Concept of information
 
A case study of using the hybrid model of scrum and six sigma in software dev...
A case study of using the hybrid model of scrum and six sigma in software dev...A case study of using the hybrid model of scrum and six sigma in software dev...
A case study of using the hybrid model of scrum and six sigma in software dev...
 
IRJET- Methodologies used on News Articles :A Survey
IRJET- Methodologies used on News Articles :A SurveyIRJET- Methodologies used on News Articles :A Survey
IRJET- Methodologies used on News Articles :A Survey
 
CHALLENGES FOR MANAGING COMPLEX APPLICATION PORTFOLIOS: A CASE STUDY OF SOUTH...
CHALLENGES FOR MANAGING COMPLEX APPLICATION PORTFOLIOS: A CASE STUDY OF SOUTH...CHALLENGES FOR MANAGING COMPLEX APPLICATION PORTFOLIOS: A CASE STUDY OF SOUTH...
CHALLENGES FOR MANAGING COMPLEX APPLICATION PORTFOLIOS: A CASE STUDY OF SOUTH...
 
Impact of big data congestion in IT: An adaptive knowledgebased Bayesian network
Impact of big data congestion in IT: An adaptive knowledgebased Bayesian networkImpact of big data congestion in IT: An adaptive knowledgebased Bayesian network
Impact of big data congestion in IT: An adaptive knowledgebased Bayesian network
 

Similar to Full Paper: Analytics: Key to go from generating big data to deriving business value

Encroachment in Data Processing using Big Data Technology
Encroachment in Data Processing using Big Data TechnologyEncroachment in Data Processing using Big Data Technology
Encroachment in Data Processing using Big Data TechnologyMangaiK4
 
Big data analytics in Business Management and Businesss Intelligence: A Lietr...
Big data analytics in Business Management and Businesss Intelligence: A Lietr...Big data analytics in Business Management and Businesss Intelligence: A Lietr...
Big data analytics in Business Management and Businesss Intelligence: A Lietr...IRJET Journal
 
Selection of Articles using Data Analytics for Behavioral Dissertation Resear...
Selection of Articles using Data Analytics for Behavioral Dissertation Resear...Selection of Articles using Data Analytics for Behavioral Dissertation Resear...
Selection of Articles using Data Analytics for Behavioral Dissertation Resear...PhD Assistance
 
Big Data Analytics : Existing Systems and Future Challenges – A Review
Big Data Analytics : Existing Systems and Future Challenges – A ReviewBig Data Analytics : Existing Systems and Future Challenges – A Review
Big Data Analytics : Existing Systems and Future Challenges – A ReviewIRJET Journal
 
Learning Resources Week 2 Frankfort-Nachmias, C., & Leon-Guerr.docx
Learning Resources Week 2 Frankfort-Nachmias, C., & Leon-Guerr.docxLearning Resources Week 2 Frankfort-Nachmias, C., & Leon-Guerr.docx
Learning Resources Week 2 Frankfort-Nachmias, C., & Leon-Guerr.docxjesssueann
 
Learning Resources Week 2 Frankfort-Nachmias, C., & Leon-Guerr.docx
Learning Resources Week 2 Frankfort-Nachmias, C., & Leon-Guerr.docxLearning Resources Week 2 Frankfort-Nachmias, C., & Leon-Guerr.docx
Learning Resources Week 2 Frankfort-Nachmias, C., & Leon-Guerr.docxgauthierleppington
 
IRJET- Strength and Workability of High Volume Fly Ash Self-Compacting Concre...
IRJET- Strength and Workability of High Volume Fly Ash Self-Compacting Concre...IRJET- Strength and Workability of High Volume Fly Ash Self-Compacting Concre...
IRJET- Strength and Workability of High Volume Fly Ash Self-Compacting Concre...IRJET Journal
 
IRJET- Implementing Social CRM System for an Online Grocery Shopping Platform...
IRJET- Implementing Social CRM System for an Online Grocery Shopping Platform...IRJET- Implementing Social CRM System for an Online Grocery Shopping Platform...
IRJET- Implementing Social CRM System for an Online Grocery Shopping Platform...IRJET Journal
 
IRJET- A Survey on Mining of Tweeter Data for Predicting User Behavior
IRJET- A Survey on Mining of Tweeter Data for Predicting User BehaviorIRJET- A Survey on Mining of Tweeter Data for Predicting User Behavior
IRJET- A Survey on Mining of Tweeter Data for Predicting User BehaviorIRJET Journal
 
Selection of Articles Using Data Analytics for Behavioral Dissertation Resear...
Selection of Articles Using Data Analytics for Behavioral Dissertation Resear...Selection of Articles Using Data Analytics for Behavioral Dissertation Resear...
Selection of Articles Using Data Analytics for Behavioral Dissertation Resear...PhD Assistance
 
The Comparison of Big Data Strategies in Corporate Environment
The Comparison of Big Data Strategies in Corporate EnvironmentThe Comparison of Big Data Strategies in Corporate Environment
The Comparison of Big Data Strategies in Corporate EnvironmentIRJET Journal
 
SMUPI-BIS: a synthesis model for users’ perceived impact of business intelli...
SMUPI-BIS: a synthesis model for users’ perceived impact of  business intelli...SMUPI-BIS: a synthesis model for users’ perceived impact of  business intelli...
SMUPI-BIS: a synthesis model for users’ perceived impact of business intelli...nooriasukmaningtyas
 
MACHINE LEARNING ALGORITHMS FOR HETEROGENEOUS DATA: A COMPARATIVE STUDY
MACHINE LEARNING ALGORITHMS FOR HETEROGENEOUS DATA: A COMPARATIVE STUDYMACHINE LEARNING ALGORITHMS FOR HETEROGENEOUS DATA: A COMPARATIVE STUDY
MACHINE LEARNING ALGORITHMS FOR HETEROGENEOUS DATA: A COMPARATIVE STUDYIAEME Publication
 
A SURVEY OF BIG DATA ANALYTICS
A SURVEY OF BIG DATA ANALYTICSA SURVEY OF BIG DATA ANALYTICS
A SURVEY OF BIG DATA ANALYTICSijistjournal
 
A Review of Big Data Analytics in Sector of Higher Education
A Review of Big Data Analytics in Sector of Higher EducationA Review of Big Data Analytics in Sector of Higher Education
A Review of Big Data Analytics in Sector of Higher EducationIJERA Editor
 
IRJET- A Scrutiny on Research Analysis of Big Data Analytical Method and Clou...
IRJET- A Scrutiny on Research Analysis of Big Data Analytical Method and Clou...IRJET- A Scrutiny on Research Analysis of Big Data Analytical Method and Clou...
IRJET- A Scrutiny on Research Analysis of Big Data Analytical Method and Clou...IRJET Journal
 
LEVERAGING CLOUD BASED BIG DATA ANALYTICS IN KNOWLEDGE MANAGEMENT FOR ENHANC...
LEVERAGING CLOUD BASED BIG DATA ANALYTICS  IN KNOWLEDGE MANAGEMENT FOR ENHANC...LEVERAGING CLOUD BASED BIG DATA ANALYTICS  IN KNOWLEDGE MANAGEMENT FOR ENHANC...
LEVERAGING CLOUD BASED BIG DATA ANALYTICS IN KNOWLEDGE MANAGEMENT FOR ENHANC...ijdpsjournal
 
LEVERAGING CLOUD BASED BIG DATA ANALYTICS IN KNOWLEDGE MANAGEMENT FOR ENHANCE...
LEVERAGING CLOUD BASED BIG DATA ANALYTICS IN KNOWLEDGE MANAGEMENT FOR ENHANCE...LEVERAGING CLOUD BASED BIG DATA ANALYTICS IN KNOWLEDGE MANAGEMENT FOR ENHANCE...
LEVERAGING CLOUD BASED BIG DATA ANALYTICS IN KNOWLEDGE MANAGEMENT FOR ENHANCE...ijdpsjournal
 
LEVERAGING CLOUD BASED BIG DATA ANALYTICS IN KNOWLEDGE MANAGEMENT FOR ENHANCE...
LEVERAGING CLOUD BASED BIG DATA ANALYTICS IN KNOWLEDGE MANAGEMENT FOR ENHANCE...LEVERAGING CLOUD BASED BIG DATA ANALYTICS IN KNOWLEDGE MANAGEMENT FOR ENHANCE...
LEVERAGING CLOUD BASED BIG DATA ANALYTICS IN KNOWLEDGE MANAGEMENT FOR ENHANCE...ijdpsjournal
 

Full Paper: Analytics: Key to go from generating big data to deriving business value

Deepali Arora1, Piyush Malik2
1Dept. of Electrical and Computer Engineering, University of Victoria, P.O. Box 3055 STN CSC, Victoria, B.C. {darora}@ece.uvic.ca
2Business Analytics and Strategy, IBM Global Business Services, 4400 N 1st Street, San Jose, CA {Piyush.Malik}@us.ibm.com

Abstract—The potential to extract actionable insights from big data has gained increasing attention from researchers in academia as well as in several industrial sectors. The field has become interesting and problems look even more exciting to solve ever since organizations have been trying to tame large volumes of complex and fast arriving big data streams through newer computing paradigms. However, extracting meaningful and actionable information from big data is a challenging and daunting task. The ability to generate value from large volumes of data is an art which, combined with analytical skills, needs to be mastered in order to gain competitive advantage in business. The ability of organizations to leverage the emerging technologies and integrate big data into their enterprise architectures effectively depends on the maturity level of the technology and business teams, the capabilities they develop, and the strategies they adopt. In this paper, through selected use cases, we demonstrate how statistical analyses, machine learning algorithms, optimization and text mining algorithms can be applied to extract meaningful insights from the data available through social media, online commerce, the telecommunication industry and smart utility meters, and used for a variety of business benefits, including improving security. The nature of the applied analytical techniques largely depends on the underlying nature of the problem, so a one-size-fits-all solution hardly exists. Deriving information from big data is also subject to challenges associated with data security and privacy.
These and other challenges are discussed in the context of the selected problems to illustrate the potential of big data analytics.

I. INTRODUCTION

The analysis of big data and the associated potential to extract actionable information has gained the attention of researchers in both academia and industry [1], [2], [3], [4]. Researchers in both communities have emphasized developing new tools and techniques for better storing, managing, and analyzing big data [5]. The business community, however, is looking for ways to improve profits by leveraging information hidden in big data through analytics [6]. Massive amounts of data are generated on a daily basis from various sources including (but not limited to) online shopping transactions, gas and electric meters, electronic health records, social networking interactions, weather and satellite data, embedded sensors in industrial machinery as well as in automobiles and aircraft, data center computing equipment, and telecommunication industry equipment. According to the International Data Corporation (IDC), cumulative digital data is predicted to grow from 4.4 zettabytes (ZB) in 2013 to 44 ZB by the year 2020 [7]. Data is now considered the “new oil” of the economy, defined mainly by four prominent characteristics: volume, velocity, variety and veracity [8]. While a better understanding of the knowledge hidden within the large datasets generated from various sources can potentially help businesses, deriving useful information from these data is a big challenge.

Before any kind of actionable insight can be derived from data using advanced analysis techniques, several pre-processing steps are involved. These steps include data collection, data preparation and cleansing, data storage, and management [9].
The analysis of data can be broadly classified into three categories based on the depth of analysis: 1) descriptive analytics, which exploits historical trends to extract useful information from the data, 2) predictive analytics, which focuses on predicting the future probability of occurrence of patterns or trends, and 3) prescriptive analytics, which focuses on decision making by gaining insights into the system behavior [9]. Regardless of the depth of the analysis, extracting information from data requires a solid understanding of techniques comprising statistical analysis, optimization, machine learning, text-mining algorithms, etc.

A number of studies have highlighted the tools/algorithms that can be used to derive solutions for various problems associated with big data [1], [2], [3], [4], [10]. For example, [1] and [2] presented a brief review of the challenges and issues surrounding big data. Some of the popular tools, frameworks and technologies that can be used to aggregate, manage and analyze big data include Hadoop and its ecosystem of techniques and tools (such as Pig, Hive, HBase, Spark and High Performance Cluster Computing (HPCC)), in-memory computing engines, NoSQL databases, cloud based data service engines, etc. These are still nascent and continually evolving under the open source software movement. A brief overview of how big data can be used to derive value for various organizations including government, educational institutions and industries is presented in [11]. Possibilities and challenges in implementing big data related technologies in organizations, including storage of the data, lack of skilled people and the time involved in processing huge datasets, are discussed in [12]. [3] and [4] presented a general overview of how big data can be used to generate value for businesses. An in-depth tutorial on big data analytics is presented by Hu et al.
[13], who assessed different techniques that can be used for data acquisition and pre-processing, data storage and management, and different analytics techniques to derive information. While these studies provide a good overview of the big data opportunity, the issues and challenges involved, the value to businesses, and how various techniques can be used for data storage, processing or analytics in general, none of them has discussed the application of different algorithms to derive value for specific applications, which is the main focus of this paper.

2015 IEEE First International Conference on Big Data Computing Service and Applications 978-1-4799-8128-1/15 $31.00 © 2015 IEEE DOI 10.1109/BigDataService.2015.62

In this paper, using five different use cases, we illustrate how big data analytics has been used to obtain meaningful information. The use cases considered include sentiment analysis for social media, preventing customer churn in the telecommunication sector, enhancing customers’ online shopping experience, generating value from smart utility meters and improving data security. The objective of this paper is not to present new algorithms for any of the selected industry use cases but rather to provide a brief, illustrative overview of the existing algorithms and methods that can be applied to derive value from big data. A detailed discussion of how different innovative algorithms can be applied to realize value for each of these use cases, and of the challenges associated with them, is beyond the scope of the current paper and could be presented in an extended version in the future.

This paper is organized as follows: application of data analytics to different domains is discussed in Section 2, Section 3 highlights some of the challenges around big data analytics and, finally, conclusions are presented in Section 4.

II. APPLICATION OF BIG DATA ANALYTICS

A. Sentiment analysis in social networks

The explosion of data in the form of blogs, online forums and social media channels such as Facebook, Twitter, LinkedIn, Instagram, Pinterest, YouTube, etc. has given consumers a new way of expressing their opinions about any product or service, and consequently may influence other potential buyers. Investigation of users’ opinions or sentiments about any product or service, expressed in textual form on these websites/blogs, is referred to as sentiment analysis [14]. Sentiment analysis combines natural language processing with artificial intelligence capability and text analytics to evaluate statements found across various social platforms and determine whether they are positive or negative with respect to a particular brand, product or service [15]. Sentiment analysis thus provides business intelligence which can be used to make impactful decisions. In addition, consumers routinely look for online reviews before buying any product or service. Developing techniques that can better automate the process of analyzing user generated web content about a given product or service is now the focus of research in both academia and industry. Several companies are also involved in designing algorithms/tools that can perform sentiment analysis online, either for free or at nominal cost. One such example is IBM Watson’s user modeling service, which uses linguistic analytics to generate psychographic profiles and extract cognitive and social characteristics based on users’ emails, text messages, tweets, forum posts, etc. [16]. Other examples of sentiment analysis tools include Google Analytics, Tweetstats, Social Mention, and Twendz [17].
There are three main classification levels in sentiment analysis: document-level, sentence-level, and aspect-level sentiment analysis [18]. The methodologies that can be used at these levels are broadly classified into three main categories: lexicon based techniques, machine learning techniques and hybrid approaches [19]. The lexicon-based approach relies on a collection of known, pre-compiled sentiment terms; machine learning approaches are based on the application of different algorithms that can be trained; and hybrid approaches combine the two [20]. A number of studies have used lexicon based approaches [21], machine learning based supervised [22] or unsupervised [23], [24] approaches, and combined machine learning and lexicon based approaches [25], [26] to classify sentiments into positive or negative categories.

Sentiment analysis has been used by researchers to find people’s opinions, expressed on social media sites including Twitter, about products/services launched by a company [27], and in a real world industrial application (based on the second author’s experience) in which one of IBM’s clients leveraged sentiments from social media to identify influencers of a public policy. The general methodology in both of these use cases involved four main steps: gathering data, generating features, designing a classifier that can differentiate between sentiments (i.e., positive, negative or neutral), and finally deriving a sentiment score. However, deriving information from user created web content remains a daunting task, as sentiments may carry varying meanings in different disciplines and cultures.
Thus, to derive meaningful results, data features such as individual keywords and their frequency of occurrence; parts of speech such as adjectives and adverbs; opinion words and phrases including good or bad, likes, dislikes; and negations [28], [29] should be carefully derived following feature selection techniques [18]. Supervised machine learning approaches such as classification algorithms can then be designed by converting the sentiment analysis problem to a simple text classification problem. For a standard text classification problem, a subset of the data is used to form a training record set defining the different classes. These classes are related to the underlying feature values. The classification model can then be used to predict the class label for any new instance. Several classification models are discussed in the literature [18]. Commonly used classifiers include the Naive Bayes classifier, support vector machines (SVM), maximum entropy based classifiers, decision trees, and neural networks [18]. Similarly, unsupervised techniques can also be used to derive users’ sentiments about products/services [23], [24].

The power of integrating sentiments and intelligence trends from social media was recently hailed as the reason for IBM and Twitter to forge an alliance to incorporate Twitter analytics into IBM’s consulting business [30].
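
The supervised text classification formulation described in this subsection can be illustrated with a minimal sketch of a multinomial Naive Bayes classifier with add-one smoothing. It uses simple word counts as features. The class name `NaiveBayesSentiment`, the whitespace tokenizer and the four training sentences are assumptions made for illustration only; they are not taken from the studies cited above, and a practical system would add the feature selection and negation handling discussed in the text.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


class NaiveBayesSentiment:
    """Multinomial Naive Bayes over word counts with add-one smoothing."""

    def fit(self, texts, labels):
        # Number of training documents per class (used for the class prior).
        self.class_counts = Counter(labels)
        # Word frequencies per class (used for the word likelihoods).
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior plus the sum of log likelihoods of the known words.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                if word not in self.vocab:
                    continue  # ignore words never seen during training
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical, hand-written training records (not real review data).
train_texts = [
    "great phone love the screen",
    "love the battery great value",
    "terrible service bad battery",
    "bad screen terrible value",
]
train_labels = ["positive", "positive", "negative", "negative"]

model = NaiveBayesSentiment().fit(train_texts, train_labels)
prediction = model.predict("love the great screen")
```

Log probabilities are summed rather than multiplying raw probabilities, which avoids numerical underflow on longer documents. A sentiment score for a brand could then be derived, for instance, as the share of posts classified as positive.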
B. Preventing customer churn in telecommunication sector

The strong competition amongst telecommunication service providers has compelled them to offer packages that could potentially attract more customers or at least help them retain their existing ones. Since the cost of acquiring a new customer is relatively high compared to that of retaining an existing one [31], companies are developing new and competitive ways to retain their customers and maintain long term relationships with them to avoid customer churn. Churners are customers who leave their existing telecommunication service provider and switch to a new one for various reasons [32]. Customers generally switch for lower prices or better service. Predicting customer churn is important for companies as it directly affects their revenues. It can also help companies take action by offering better service or attractive packages to prevent existing customers from switching to a different service provider. The literature reveals [33] that, on average, telecommunication companies face around 2.2% customer churn each month. Designing algorithms that can predict, and in turn prevent, customer churn is therefore important to the telecommunication industry.

The problem of predicting churn and non-churn customers has been addressed in a number of studies [31], [32]. However, with increasing competition, companies are now turning towards machine learning algorithms to gain early insights into their customers’ behavior so that timely actions can be taken to prevent customer churn. One simple approach to predicting whether a user is a churn or non-churn customer is to formulate it as a two class classification problem that uses the underlying feature values to predict the outcome. Some of the possible features that can be used to define the churn and non-churn classes include duration of customers’ calls, services subscribed, usage pattern, and demographics [31].
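
As a minimal sketch of this two class formulation, the example below fits a logistic regression model by batch gradient descent. Logistic regression is chosen here only for illustration and is not necessarily the classifier used in the cited studies. The two features (call duration and number of subscribed services, both scaled to the range 0 to 1), the six customer records and the function names are hypothetical.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this customer under the current model.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def churn_probability(w, b, x):
    """Estimated probability that a customer with features x churns."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Synthetic records: [scaled call duration, scaled number of services].
# Label 1 marks a churner, label 0 a retained customer.
X = [[0.9, 0.8], [0.8, 0.6], [0.7, 0.9], [0.2, 0.1], [0.1, 0.3], [0.3, 0.2]]
y = [0, 0, 0, 1, 1, 1]

w, b = train_logistic_regression(X, y)
p_low_usage = churn_probability(w, b, [0.15, 0.2])
p_high_usage = churn_probability(w, b, [0.85, 0.8])
```

In this toy data, low usage is associated with churn, so the model assigns a higher churn probability to the low-usage customer. The probability output, rather than a hard label, lets a provider rank customers and target retention offers at those most at risk.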
A comprehensive review of the approaches that can be followed to predict churning customers is presented in [31], [32].

Telecommunication service providers can also use information about customers’ usage patterns, services subscribed and demographics to design and offer customized packages to their users [34]. One possible approach is to use clustering algorithms for customer segmentation based on the services they use [35], [36], where clustering refers to partitioning data points into a small number of clusters whose members are similar to one another. This allows companies to identify customers for future product promotions, to retain their customers, and to attract new customers by offering customized packages to targeted audiences based on their usage behaviors. A real world example is Celcom, a telecommunication service provider in Asia that uses predictive personalized analytics to predict the churn probability of its customers. It also offers personalized incentives and geolocation based cross brand promotional offers and coupons, thereby increasing engagement and loyalty among its customer base [37].

C. Enhancing customers’ online shopping experience

With advancements in technology and the introduction of smartphones and tablets, online shopping has become so convenient, ubiquitous and popular that it is predicted to grow to $370 billion in 2017 [38]. Businesses are now using advanced analytics to predict customer behavior and to carry out customer segmentation based on the characteristics of customer groups [39]. While data from online clicks on a store’s inventory does yield information about what a user is looking for, it still does not give companies complete information about their consumers, as many of them still go to retail malls to buy a product [40].
Retailers need to merge offline and online data to design algorithms that better capture their customers’ behavior and to build product recommendation engines for different audiences [41].

One approach to predicting customer behavior is to use transactional data. For example, [42] developed a model using hierarchical clustering and a hidden Markov model (HMM) to predict customer behavior from transactional data. [43] also used a Markov model, in this case to predict the probability of click-to-conversion based on the time spent by the customer on the site. [44] compared the performance of aggregate (one model for all customers), segmented (separate models for different customer segments) and 1-to-1 (a model for each individual user) marketing approaches across a broad range of experimental settings, including multiple segmentation levels, real-world marketing datasets, dependent variables, different types of classifiers, segmentation/clustering techniques, and different predictive measures. Their results showed that both the 1-to-1 and segmentation approaches significantly outperform aggregate modelling. However, when little transactional data is available, the segmentation models outperformed both the 1-to-1 and aggregate approaches.

Once a retailer understands the underlying behavior of a consumer, it can design recommender systems that use the products the customer selected in the past to assist them in selecting similar products [45]. The underlying assumption is that consumers follow patterns similar to their past spending habits and are likely to repeat them in the future. Using machine learning techniques such as classification, genetic algorithms, clustering or K-nearest neighbor algorithms [45], retailers can potentially identify different customer segments and predict customers’ preferences and spending ability.
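As an illustration of the nearest-neighbor idea, the sketch below recommends products bought by the customers whose purchase histories most resemble the target customer’s. It is a simplified example, not the system described in [45]: the Jaccard similarity measure, the function names `jaccard` and `recommend`, the similarity-weighted voting, and the purchase histories are all assumptions introduced here.

```python
from collections import Counter

def jaccard(a, b):
    """Similarity of two purchase sets: shared items divided by all items."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend(target, baskets, k=2, top_n=2):
    """Suggest items bought by the k customers most similar to `target`."""
    own = baskets[target]
    neighbours = sorted(
        (c for c in baskets if c != target),
        key=lambda c: (-jaccard(own, baskets[c]), c),  # ties broken by name
    )[:k]
    votes = Counter()
    for c in neighbours:
        for item in baskets[c] - own:  # only items the target has not bought
            votes[item] += jaccard(own, baskets[c])
    ranked = sorted(votes, key=lambda item: (-votes[item], item))
    return ranked[:top_n]

# Hypothetical purchase histories, invented for illustration
baskets = {
    "ana":   {"tent", "stove", "boots"},
    "ben":   {"tent", "stove", "lantern"},
    "chloe": {"tent", "boots", "lantern", "map"},
    "dev":   {"novel", "lamp"},
}
suggestions = recommend("ana", baskets)  # ['lantern', 'map']
```

Here the two nearest neighbors of "ana" are "ben" and "chloe", so the lantern (bought by both) ranks ahead of the map. The sketch relies on the assumption stated above that past purchases predict future ones; a production recommender would also need to handle new customers with no history.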
Identifying customer segments and predicting preferences can help retailers advertise their products to the right audiences.

Data mining techniques can also be used to market products to consumers based on their demographic information combined with their online activities. By combining information about the geographic location of a user, the time of day or week they visit a store, and the products they buy, and mapping those attributes against actual sales data, it is possible to highlight hidden interactions between the online and offline sales activity of a consumer. However, combining online and offline information is a real challenge for retailers [46].

While online retailers like Amazon and eBay already use sophisticated data analytic techniques to enhance customers’ online shopping experience, traditional brick-and-mortar retailers are now also realizing the benefits of analytics for increased profits. The acquisition of Kosmix labs by Walmart in 2011 is one such example [47]. More recently, Macy’s, a mid-scale retailer, has leveraged big data analytics for better inventory management based on customers’ segmentation characteristics. It developed an omnichannel strategy in which customers can order via different channels and pick up their order in a store of their choice through a central online fulfillment center. In-store customer localization using either WiFi or beacons as the underlying technology is also emerging and would further enhance consumers’ shopping experiences in future [48], [49].

D. Generating value from smart utility meters

With the rapid deployment of smart electricity and gas meters, especially in developed countries, utility companies are also leaning towards extracting and utilizing the information in smart meter data for increased profits, improved customer satisfaction and better resource management [50]. A meter is called smart or intelligent because it can measure electricity usage in real time at much smaller time intervals than traditional meters (which keep only a record of cumulative electricity consumption) [50]. Smart meters also make it possible to control electricity consumption remotely and to switch off supply when needed. To convert the data into actionable insights, utility companies need to adopt techniques for accurate and timely collection, transfer, storage, processing and analysis of data. Many established companies, including IBM, SAP and Oracle, as well as startups like AutoGrid, are currently assisting utility companies in designing solutions to better understand the hidden potential of the data generated by smart meters [51], [52].

Several machine learning algorithms have been proposed in the literature for better management and control of data for utility companies [50]. [53] suggested that by grouping customers according to their usage readings using clustering techniques, utility companies can identify consumers for targeted services.
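The grouping of customers by usage readings can be sketched with k-means, used here as a stand-in because the text does not name a specific clustering technique. The load profiles, the choice of four time-of-day features, the deterministic farthest-point initialisation, and the function names are all assumptions made for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Component-wise mean of a non-empty list of profiles."""
    return tuple(sum(col) / len(points) for col in zip(*points))

def kmeans(points, k, iterations=20):
    """Lloyd's algorithm with a deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next centroid: the point farthest from all centroids chosen so far
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    labels = []
    for _ in range(iterations):
        labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = mean(members)
    return labels, centroids

# Hypothetical average consumption in kWh for (morning, afternoon, evening, night)
profiles = [
    (0.4, 0.3, 2.1, 0.5),  # evening-heavy households
    (0.5, 0.4, 2.4, 0.6),
    (0.3, 0.2, 1.9, 0.4),
    (1.8, 2.0, 0.6, 0.2),  # daytime-heavy households
    (2.0, 2.2, 0.5, 0.3),
    (1.7, 1.9, 0.7, 0.2),
]
labels, centroids = kmeans(profiles, k=2)  # labels: [0, 0, 0, 1, 1, 1]
```

The two clusters separate evening-heavy from daytime-heavy households, which is the kind of segment a utility might target with different services. Real smart meter data would involve far more readings per customer, and the number of clusters would have to be chosen with a validation criterion.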
Knowledge of customer usage patterns can also assist utility companies in designing better demand response tariff plans. For example, utility companies can offer incentives to encourage consumers with flexible consumption patterns to minimize their usage during peak hours [54]. Likewise, consumers with high energy usage can be penalized if they do not curtail their consumption by limiting the use of household appliances during peak hours. Machine learning algorithms such as independent component analysis [55] and clustering techniques [56] have also been used to identify the type of demand faced by different consumer groups during the day [55]. Multiple linear regression models have also been used to predict household power usage [56]. Support vector machine classifiers have been used to distinguish user groups based on their usage patterns [57]. Customers’ load profiles can potentially assist in detecting irregularities or abnormalities caused by faulty metering or by human intervention and fraud [51]. Finally, machine learning techniques can also be used to predict congestion or instability conditions within a network. Utility companies can use this information to identify overloaded or ageing components and carry out timely preventive maintenance to avoid power losses and lost revenues [58].

Real-world examples of data analysis for utility companies include EnerNOC and Comverge, which assist utility companies by designing tools such as demand response programs that encourage customers to reduce load during peak times, such as late afternoon during a heat wave when the air conditioning load stresses the grid’s capacity. In exchange for lowering power consumption, consumers are offered rebates.
Leveraging big data technologies, AutoGrid’s software service also analyzes grid usage patterns to predict power demand a day ahead, thus encouraging both utilities and consumers to participate in load-shedding programs to prevent outages [59].

E. Improving Security

Cybercrime costs $118 billion annually, and this figure is expected to grow significantly [60]. With information easily accessible online, sophisticated cybercrimes are occurring at an alarming rate, and traditional security solutions are no longer sufficient to defend against these escalating threats. Incidents of hacking, identity theft and theft of credit card data from retailers and banks are in the news quite regularly, but the recent sophisticated and organized breach at Sony involving an unreleased movie has shaken the world. While a lot still needs to be done to prevent cyberterrorism, big data analytics now offers promising solutions for efficient detection of suspicious activity on a network. It is expected that big data analytics will impact various aspects of information security, such as network monitoring, user authentication and control, authorization, identity management, fraud detection, and data loss prevention and control [61]. By using big data analytics to detect threats and design security solutions, enterprises are now able to protect their systems from future threats.

A number of data mining techniques for detecting cybercrime have been proposed in the literature [61]. For example, classification models such as Naive Bayes, support vector machines, neural networks and decision trees have long been used to detect spam emails [62] (spamming means sending unsolicited emails). Support vector machine techniques have also been used to prevent Denial of Service (DoS) attacks, where a DoS attack is an attempt to make a system inaccessible to its legitimate users [63], [64].
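Returning to the spam filtering example, a Naive Bayes classifier can be sketched in a few lines. The version below is a multinomial model with add-one (Laplace) smoothing, which is one common variant and not necessarily the one evaluated in [62]. The training messages, the whitespace tokenisation, and the function names `train_nb` and `classify` are assumptions made for illustration.

```python
import math
from collections import Counter

def train_nb(messages):
    """Count words per class from a list of (text, label) pairs."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    class_counts = Counter()
    for text, label in messages:
        class_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, class_counts, vocab

def classify(text, model):
    """Return the label with the higher log-posterior under the model."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    scores = {}
    for label in word_counts:
        n_words = sum(word_counts[label].values())
        score = math.log(class_counts[label] / total)  # log prior
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are ignored
                count = word_counts[label][word]
                score += math.log((count + 1) / (n_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical training messages, invented for illustration
training = [
    ("win free prize now", "spam"),
    ("free money claim now", "spam"),
    ("claim your free prize", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
    ("please review the agenda", "ham"),
]
model = train_nb(training)
first = classify("free prize inside", model)       # 'spam'
second = classify("monday meeting agenda", model)  # 'ham'
```

The smoothing term keeps a word that appears in only one class from driving the other class’s probability to zero. A practical filter would be trained on a large labelled corpus and would use richer features than single words.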
While [63] used Enhanced Multi Class Support Vector Machines (EMCSVM) to predict various kinds of DoS attacks, [64] proposed a radial-basis function neural network (RBFNN) and support vector machines (SVM) to solve the DoS problem, with the ability to detect or predict new attacks whose patterns resemble attack patterns seen in the past. Classification models have also been used to detect malware [65], phishing URLs [66] and phishing emails [67].

Data mining techniques have also been used for anomaly detection, to search for unusual patterns and network behaviors [68]. Feature selection approaches are used to prioritize the features that help differentiate normal behavior from behavior affected by anomalies, while classifiers are used to differentiate between patterns [69]. These anomalies could be present due either to internal system failure or to external attacks. In the case of external attacks, identifying the intruders that carry out these malicious activities and identifying the types of attack are further major issues. Machine learning approaches can now be used both for intruder detection [70] and for identifying the types of attack [71].

Finally, as more companies turn towards cloud computing for storage and processing of big data, the security of the cloud becomes essential. Cloud computing is vulnerable to security threats including insecure application programming interfaces, malicious insiders, shared technology vulnerabilities, data leakage and account hacking [72].

A number of companies are also working on designing solutions to protect users from cybercrime. For instance, IBM’s QRadar security intelligence platform is designed to deliver the benefits of next-generation security information and event management technology to various companies [73]. Enterprises use QRadar solutions to collect and correlate billions of events and network flows per day in deployments that span multiple locations. By analyzing structured, enriched security data alongside unstructured data from across the enterprise using QRadar solutions, malicious activities hidden deep in the masses of an organization’s data can potentially be detected.

III. CHALLENGES IN BIG DATA ANALYTICS

While big data analyses provide value to businesses, there are general issues surrounding them that must be carefully dealt with to exploit their full potential [31], [1]. One of the primary concerns around big data is security and privacy. Access to large data implies the potential to identify individuals and to profile them on the basis of their behavior, likes, dislikes, daily routine, etc. Companies must therefore take extra precautions to protect the confidentiality of users’ sensitive information. Another major challenge is data access and storage.
With huge volumes of data being generated, it is not feasible to store it all on a single machine, which compels companies to rely on the cloud for storage. Cloud computing can be used to manage and store these large datasets, but privacy in the cloud remains an open research problem. The risk of storing sensitive information on the cloud without sufficient security measures has unfortunately been illustrated in a number of instances. Eliminating a single point of failure by creating multiple copies of the data and storing them on different nodes is also a challenge, as these nodes have to be kept synchronized for data to be retrieved efficiently. Since data is available in different formats, extracting it and combining it into a format that can be easily imported for analysis is another challenge. Finally, the skill set required to extract meaningful information from big data (a combination of advanced statistical techniques, data optimization methods, machine learning algorithms and a thorough understanding of business value) is seldom available.

While these challenges apply in general to all industrial domains, there are also challenges specific to each of the applications considered in this study, which are briefly discussed below.

• Sentiment analysis: Sentiment analysis typically classifies text into three main classes, i.e., positive, negative and neutral, but given the subjectivity of text, in reality it can fall into many more categories [74]. Therefore, instead of simple two-class classifiers, multi-class classifiers should be used for better results. Designing a classifier for sentiment analysis when only a limited amount of training data is available is quite challenging [14]. Moreover, the training data should be selected carefully, as the same word may have different meanings in different domains depending on the context [75]. Sarcastic or ironic sentences often lead to wrong classification.
Using only words rather than sentences can also lead to erroneous classification. Finally, drawing general conclusions about a product or service from the limited number of tweets or posts available on the web can yield misleading results, so the results must be checked for statistical significance.

• Predicting customer churn: Cost constraints dictate that telecommunication companies focus more on retaining existing customers than on acquiring new ones, so they offer promotions to existing customers who are likely to churn. However, finding the real cause of customer churn is not always easy: identifying the underlying variables that best describe a customer’s behavioral profile is a challenging task, and those variables may not always reveal users’ true intentions, leading to wrong predictions. Moreover, integrating data from miscellaneous sources such as the customer base, call center inbound and outbound calls, billing, etc., to gather information about a customer is not always straightforward. In a highly competitive market, companies now offer service plans suited to different customer segments, but algorithms that group customers with similar preferences based on partial information alone may not yield feasible solutions.

• Enhancing online shopping experience: Despite its popularity, online shopping still has to overcome certain challenges to encourage customers. One of the main challenges in predicting customers’ behavior is merging online data with offline transaction data, as these datasets may not be managed by a single entity. Customers’ security and privacy concerns around the use of their transactional data for predicting their spending behavior also need to be addressed satisfactorily.
Analyzing data to predict customers’ product preferences, in order to promote similar products or relevant coupons to targeted audiences, is a challenging problem that only gets harder over time as users’ shopping preferences change.

• Smart utility meters: One of the major challenges faced by utility companies is merging data that resides in disparate databases across their various departments. Credibility of data is another major challenge, one that could have a devastating effect on a firm’s reputation. Since the data generated by smart meters may contain abnormalities due to faulty behavior caused either by natural conditions or by human interference, decisions based on faulty data can potentially affect utility companies’ revenues. Lack of infrastructure to support the processing and analysis of data generated by smart meters is another major challenge faced by utility companies. Predicting customers’ profile patterns for promotional offers, including the number of people living in a household, the appliances they use and the times at which they use them, based on their electricity usage bills could also raise privacy concerns for users.

• Security: Although the application of big data analytics to improving security looks promising, it has its own challenges [76]. One of the major challenges faced by organizations is data leakage caused by third-party intervention. Data is even more vulnerable to loss if it is housed in the cloud. Ownership of information hosted on the cloud is another major issue faced by organizations, and trust boundaries need to be established carefully between the data owners and the data storage owners. With large datasets stored on the cloud, proper security measures must be taken to prevent re-identification of users from the information available across different datasets.

IV. CONCLUSIONS

The unprecedented growth in data in almost every sector provides businesses with a unique opportunity to use analytics to uncover hidden insights that can be used for making better decisions. In this paper, through five different use cases, we have illustrated how analytics can be applied to derive value from big data for various industrial applications. The examples considered in this study are sentiment analysis for social media, preventing churn of telecommunication customers, enhancing customers’ online shopping experience, generating value from smart utility meters, and improving security. While a number of different techniques have been proposed in the existing literature to derive value for these use cases, classification and clustering models have been the most widely used for these applications.
The continuing growth of studies that attempt to derive value from big data suggests that big data analytics can provide useful insights for businesses, potentially also leading to increased revenues and business advantages over the competition. However, big data analytics also faces challenges that need to be addressed together in order to exploit the full potential of the insights hidden within these large datasets.

REFERENCES

[1] A. Katal, M. Wazid, and R. Goudar, “Big data: Issues, challenges, tools and good practices,” in Contemporary Computing (IC3), Sixth International Conference on, Aug 2013, pp. 404–409.
[2] S. Sagiroglu and D. Sinanc, “Big data: A review,” in Collaboration Technologies and Systems (CTS), International Conference on, May 2013, pp. 42–47.
[3] F. Muhtaroglu, S. Demir, M. Obali, and C. Girgin, “Business model canvas perspective on big data applications,” in Big Data, IEEE International Conference on, Oct 2013, pp. 32–37.
[4] A. Rajpurohit, “Big data for business managers; bridging the gap between potential and value,” in Big Data, IEEE International Conference on, Oct 2013, pp. 29–31.
[5] Z. Liu, P. Yang, and L. Zhang, “A sketch of big data technologies,” in Internet Computing for Engineering and Science, Seventh International Conference on, Sept 2013, pp. 26–29.
[6] S. Dhar and S. Mazumdar, “Challenges and best practices for enterprise adoption of big data technologies,” in Technology Management Conference (ITMC), 2014 IEEE International, June 2014, pp. 1–4.
[7] The digital universe of opportunities: Rich data and the increasing value of the internet of things. [Online]. Available: http://www.emc.com/leadership/digital-universe/2014iview/executive-summary.htm
[8] P. Malik, “Governing big data: Principles and practices,” IBM Journal of Research and Development, vol. 57, no. 3/4, pp. 1:1–1:13, May 2013.
[9] H. Hu, Y. Wen, T.-S. Chua, and X. Li, “Toward scalable systems for big data analytics: A technology tutorial,” Access, IEEE, vol. 2, pp. 652–687, 2014.
[10] M. Zaharia, M. Chowdhury, M. J. Franklin, S. Shenker, and I. Stoica, “Spark: Cluster computing with working sets,” in Proceedings of the 2nd USENIX Conference on Hot Topics in Cloud Computing, ser. HotCloud’10, 2010, pp. 10–15.
[11] N. Y. Xin and L. Y. Ling, “How we could realize big data value,” in Instrumentation and Measurement, Sensor Network and Automation (IMSNA), 2013 2nd International Symposium on, Dec 2013, pp. 425–427.
[12] J. Wielki, “Implementation of the big data concept in organizations - possibilities, impediments and challenges,” in Computer Science and Information Systems (FedCSIS), 2013 Federated Conference on, Sept 2013, pp. 985–989.
[13] H. Hu, Y. Wen, T.-S. Chua, and X. Li, “Toward scalable systems for big data analytics: A technology tutorial,” Access, IEEE, vol. 2, pp. 652–687, 2014.
[14] M. Taboada, J. Brooke, M. Tofiloski, K. Voll, and M. Stede, “Lexicon-based methods for sentiment analysis,” Comput. Linguist., vol. 37, no. 2, pp. 267–307, 2011.
[15] M. Hu and B. Liu, “Mining and summarizing customer reviews,” in Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2004, pp. 168–177.
[16] User modeling improves understanding of people’s preferences to help engage users on their own terms. [Online]. Available: http://www.ibm.com/smarterplanet/us/en/ibmwatson/developercloud/user-modeling.html
[17] Five sentiment analysis tools that won’t cost you a cent. [Online]. Available: http://www.fieldassignment.com/2011/04/free-sentiment-analysis-tools.html
[18] W. Medhat, A. Hassan, and H. Korashy, “Sentiment analysis algorithms and applications: A survey,” Ain Shams Engineering Journal, 2014. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S2090447914000550
[19] E. Boiy, P. Hens, K. Deschacht, and M.-F. Moens, “Automatic sentiment analysis in on-line text,” in Proceedings of the 11th International Conference on Electronic Publishing, 2007, pp. 349–360.
[20] D. Maynard and A. Funk, “Automatic detection of political opinions in tweets,” in The Semantic Web: ESWC 2011 Workshops, vol. 7117, 2012, pp. 88–99.
[21] B. Liu, Sentiment Analysis and Opinion Mining. Morgan and Claypool Publishers, 2012.
[22] B. Pang, L. Lee, and S. Vaithyanathan, “Thumbs up?: Sentiment classification using machine learning techniques,” in Proceedings of the ACL-02 Conference on Empirical Methods in Natural Language Processing, 2002, pp. 79–86.
[23] M. Usha and M. Indra Devi, “Analysis of sentiments using unsupervised learning techniques,” in Information Communication and Embedded Systems, International Conference on, Feb 2013, pp. 241–245.
[24] G. Li and F. Liu, “A clustering-based approach on sentiment analysis,” in Intelligent Systems and Knowledge Engineering, International Conference on, Nov 2010, pp. 331–337.
[25] L. Zhang, R. Ghosh, M. Dekhil, M. Hsu, and B. Liu. (2011) Combining lexicon-based and learning-based methods for Twitter sentiment analysis. [Online]. Available: http://www.hpl.hp.com/techreports/2011/HPL-2011-89.html
[26] P. P. Balage Filho, L. V. Avanço, M. d. G. V. Nunes, and T. A. S. Pardo, “NILC USP: An improved hybrid system for sentiment analysis in Twitter messages,” in Proceedings of the 8th International Workshop on Semantic Evaluation. Association for Computational Linguistics and Dublin City University, 2014, pp. 428–432.
[27] M. Neethu and R. Rajasree, “Sentiment analysis in Twitter using machine learning techniques,” in Computing, Communications and Networking Technologies (ICCCNT), 2013 Fourth International Conference on, July 2013, pp. 1–5.
[28] C. C. Aggarwal and C. Zhai, Mining Text Data. Springer, 2012.
[29] Y. Mejova and P. Srinivasan, “Exploring feature definition and selection for sentiment classifiers,” in ICWSM’11, 2011, pp. 1–6.
[30] Twitter, IBM announce a new data analytics partnership. [Online]. Available: http://fortune.com/2014/10/29/twitter-ibm-data-analytics-partnership/
[31] N. Kamalraj and A. Malathi, “A survey on churn prediction techniques in communication sector,” International Journal of Computer Applications, vol. 64, no. 5, pp. 39–42, February 2013.
[32] W. Bandara, A. Perera, and D. Alahakoon, “Churn prediction methodologies in the telecommunications sector: A survey,” in Advances in ICT for Emerging Regions, International Conference on, Dec 2013, pp. 172–176.
[33] C.-P. Wei and I.-T. Chiu, “Turning telecommunications call details to churn prediction: a data mining approach,” Expert Systems with Applications, vol. 23, no. 2, pp. 103–112, 2002.
[34] C. Zhao, Y. Wu, and H. Gao, “Study on knowledge acquisition of the telecom customers’ consuming behaviour based on data mining,” in Wireless Communications, Networking and Mobile Computing, 4th International Conference on, Oct 2008, pp. 1–5.
[35] J. Zhao, W. Zhang, and Y. Liu, “Improved k-means cluster algorithm in telecommunications enterprises customer segmentation,” in Information Theory and Information Security, IEEE International Conference on, Dec 2010, pp. 167–169.
[36] L. Ye, C. Qiu-ru, X. Hai-xu, L. Yi-jun, and Y. Zhi-min, “Telecom customer segmentation with k-means clustering,” in Computer Science Education, 7th International Conference on, July 2012, pp. 648–651.
[37] Celcom loyalty deals. [Online]. Available: http://www2.nst.com.my/nation/celcom-loyalty-deals-1.558917
[38] J. Li. (2013) Study: Online shopping behavior in the digital era. [Online]. Available: http://www.iacquire.com/blog/study-online-shopping-behavior-in-the-digital-era
[39] P. Yang, Q.-L. Zheng, H. Peng, and Q. Tan, “A stepwise learning approach to automatic discovery of interest data blocks,” in Machine Learning and Cybernetics, 2004. Proceedings of 2004 International Conference on, vol. 3, Aug. 2004, pp. 1441–1446.
[40] (2014) Making online shopping smarter with advanced analytics. [Online]. Available: www.cognizant.com/.../Making-Online-Shopping-Smarter-with-Advanced-analytics.pdf
[41] R. Dewan, M. Freimer, and Y. Jiang, “Using online competitor’s inventory information for pricing,” in System Sciences, 40th Annual Hawaii International Conference on, Jan 2007, pp. 210a–210a.
[42] M. Mestre and P. Vitoria, “Tracking of consumer behaviour in e-commerce,” in Information Fusion, 16th International Conference on, July 2013, pp. 1214–1221.
[43] M. Gupta, H. Mittal, P. Singla, and A. Bagchi, “Characterizing comparison shopping behavior: A case study,” in Data Engineering Workshops (ICDEW), 2014 IEEE 30th International Conference on, March 2014, pp. 115–122.
[44] T. Jiang and A. Tuzhilin, “Segmenting customers from population to individuals: Does 1-to-1 keep your customers forever?” Knowledge and Data Engineering, IEEE Transactions on, vol. 18, no. 10, pp. 1297–1311, Oct 2006.
[45] H.-W. Yang, Z.-G. Pan, X.-Z. Wang, and B. Xu, “A personalized products selection assistance based on e-commerce machine learning,” in Machine Learning and Cybernetics, 2004. Proceedings of 2004 International Conference on, vol. 4, Aug. 2004, pp. 2629–2633.
[46] P. Henry and H. Luo, “WiFi: what’s next?” Communications Magazine, IEEE, vol. 40, no. 12, pp. 66–72, Dec 2002.
[47] Wal-Mart paid 300 million-plus for Kosmix. [Online]. Available: http://allthingsd.com/20110418/exclusive-wal-mart-paid-300-million-plus-for-kosmix/
[48] Beacons, beacons, everywhere beacons. [Online]. Available: http://www.mediapost.com/publications/article/231059/beacons-beacons-everywhere-beacons.html
[49] Stores sniff out smartphones to follow shoppers. [Online]. Available: http://www.technologyreview.com/news/520811/stores-sniff-out-smartphones-to-follow-shoppers/
[50] D. Alahakoon and X. Yu, “Advanced analytics for harnessing the power of smart meter big data,” in Intelligent Energy Systems, IEEE International Workshop on, Nov 2013, pp. 40–45.
[51] Generating big value from big data in energy and utilities. [Online]. Available: http://www-01.ibm.com/software/data/bigdata/industry-energy.html3
[52] Utilities and big data: Using analytics for increased customer satisfaction. [Online]. Available: http://www.oracle.com/us/industries/utilities/big-data-analytics-customer-wp-2075868.pdf
[53] S. Valero, M. Ortiz, C. Senabre, C. Alvarez, F. Franco, and A. Gabaldon, “Methods for customer and demand response policies selection in new electricity markets,” Generation, Transmission & Distribution, IET, vol. 1, no. 1, pp. 104–110, January 2007.
[54] A. Albert and R. Rajagopal, “Smart meter driven segmentation: What your consumption says about you,” Power Systems, IEEE Transactions on, vol. 28, no. 4, pp. 4019–4030, Nov 2013.
[55] H. Liao and D. Niebur, “Load profile estimation in electric transmission networks using independent component analysis,” Power Systems, IEEE Transactions on, vol. 18, no. 2, pp. 707–715, May 2003.
[56] C. Beckel, L. Sadamori, T. Staake, and S. Santini, “Revealing household characteristics from smart meter data,” Energy, 2014.
[57] S. K. T. J. Nagi, K. S. Yap and S. K. Ahmed, “Power load forecasting using hybrid self-organizing maps and support vector machines,” in 2nd International Power Engineering and Optimization Conference, June 2008.
[58] F. Zhao, G. Wang, C. Deng, and Y. Zhao, “A real-time intelligent abnormity diagnosis platform in electric power system,” in Advanced Communication Technology (ICACT), 2014 16th International Conference on, Feb 2014, pp. 83–87.
[59] M. LaMonica. Bringing big data to smart meters. [Online]. Available: http://www.technologyreview.com/view/506476/bringing-big-data-to-smart-meters/
[60] Cyber security analytics. [Online]. Available: http://www.teradata.com/Cyber-Security-Analytics/
[61] T. Mahmood and U. Afzal, “Security analytics: Big data analytics for cybersecurity: A review of trends, techniques and tools,” in Information Assurance (NCIA), 2013 2nd National Conference on, Dec 2013, pp. 129–134.
[62] P. Panigrahi, “A comparative study of supervised machine learning techniques for spam e-mail filtering,” in Computational Intelligence and Communication Networks, Fourth International Conference on, Nov 2012, pp. 506–512.
[63] T. Subbulakshmi, S. Shalinie, V. GanapathiSubramanian, K. BalaKrishnan, D. AnandKumar, and K. Kannathal, “Detection of DDoS attacks using enhanced support vector machines with real time generated dataset,” in Advanced Computing (ICoAC), 2011 Third International Conference on, Dec 2011, pp. 17–22.
[64] G. Tsang, P. Chan, D. Yeung, and E. Tsang, “Denial of service detection by support vector machines and radial-basis function neural network,” in Machine Learning and Cybernetics, Proceedings of 2004 International Conference on, vol. 7, Aug 2004, pp. 4263–4268.
[65] M. Mas’ud, S. Sahib, M. Abdollah, S. Selamat, and R. Yusof, “Analysis of features selection and machine learning classifier in Android malware detection,” in Information Science and Applications, International Conference on, May 2014, pp. 1–5.
[66] J. James, L. Sandhya, and C. Thomas, “Detection of phishing URLs using machine learning techniques,” in Control Communication and Computing, International Conference on, Dec 2013, pp. 304–309.
[67] A. Almomani, B. Gupta, S. Atawneh, A. Meulenberg, and E. Almomani, “A survey of phishing email filtering techniques,” Communications Surveys & Tutorials, IEEE, vol. 15, no. 4, pp. 2070–2090, Fourth Quarter 2013.
[68] B. Thuraisingham, “Data mining for security applications,” in Machine Learning and Applications, 2004. Proceedings. 2004 International Conference on, Dec 2004, pp. 3–4.
[69] A. Aziz, A. Hassanien, S.-O. Hanaf, and M. Tolba, “Multi-layer hybrid machine learning techniques for anomalies detection and classification approach,” in Hybrid Intelligent Systems (HIS), 2013 13th International Conference on, Dec 2013, pp. 215–220.
[70] L. Khan, M. Awad, and B. Thuraisingham, “A new intrusion detection system using support vector machines and hierarchical clustering,” The VLDB Journal, vol. 16, no. 4, pp. 507–521, Oct. 2007.
[71] T. Subbulakshmi, S. Shalinie, V. GanapathiSubramanian, K. BalaKrishnan, D. AnandKumar, and K. Kannathal, “Detection of DDoS attacks using enhanced support vector machines with real time generated dataset,” in Advanced Computing, Third International Conference on, Dec 2011, pp. 17–22.
[72] M. Khorshed, A. Ali, and S. Wasimi, “Trust issues that create threats for cyber attacks in cloud computing,” in Parallel and Distributed Systems, IEEE 17th International Conference on, Dec 2011, pp. 900–905.
[73] IBM security intelligence with big data. [Online]. Available: http://www-03.ibm.com/security/solution/intelligence-big-data/
[74] J. T. Mr. Saifee Vohra, “Applications and challenges for sentiment analysis: A survey,” International Journal of Engineering Research and Technology, vol. 2, 2013.
[75] H. R. P, “Opinion mining and sentiment analysis - challenges and applications,” International Journal of Application or Innovation in Engineering and Management (IJAIEM), vol. 3, 2014.
[76] A. A. Cardenas, P. K. Manadhata, and S. P. Rajan, “Big data analytics for security,” IEEE Security and Privacy, vol. 11, no. 6, pp. 74–76, 2013.