The presentation covers an introduction to the topic, the various dimensions of big data, its evolution from Big Data 1.0 to Big Data 3.0, its impact on various industries, its uses, and the challenges it faces. The concluding slide gives a brief outlook on the future of big data.
2. Introduction
• Big Data represents a technological advancement focused on the massive volumes of data
being generated at breakneck speed and in great variety.
• It came to the forefront as one of the rapidly growing IT pillars of the future, like
blockchain, and was driven by IoT and the pervasive use of social media
• It led to a shift in companies' attitudes: they now focus on making optimal use of data
and becoming data-driven
• Data comes in 2 forms: a) structured b) unstructured
• It is growing at an exponential rate and is currently a $50+ billion market
• Its roots can be traced back to as early as 1995 when it started taking shape.
3. Dimensions of Big Data
Volume
• Volume refers to the amount of data an organization or an individual collects
and/or generates
• Data sets exceeding 1 TB are referred to as big data.
Variety
• Data are mostly classified into 3 types: unstructured, semi-structured and
structured.
• Unstructured - Text, photo, video, audio, sensor data, and clickstream data
• Semi-structured - Extensible Business Reporting Language (XBRL)
• Structured - Traditional databases (relational databases, NoSQL databases)
...
4. Velocity
• Velocity means the rate at which the data is being generated
• As technology has advanced, the velocity of data has also increased
• The enhanced capability of data generation from connected devices will continue
to accelerate the velocity
Veracity (Introduced by IBM)
• Veracity refers to the uncertainty and unreliability of data sources.
• This uncertainty arises from latency, redundancy, inaccuracy and deception in
data.
...
5. Variability and Complexity (Introduced by SAS)
• Variability is the variation in the rate of data flow, which can fluctuate with
unpredictable peaks and troughs.
• Complexity refers to the number of data sources. Reducing this complexity is necessary.
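The ideas of velocity and variability can be illustrated with a minimal sketch. It assumes events arrive with non-negative timestamps in seconds, counts them per one-second bucket (velocity), and uses the coefficient of variation of those counts as one possible measure of variability. The function names and the sample arrival times are hypothetical, chosen only for illustration.

```python
from collections import Counter
from statistics import mean, pstdev


def events_per_second(timestamps):
    """Count events in each one-second bucket between the first and last event."""
    if not timestamps:
        return []
    # Assumes non-negative timestamps, so int() truncation gives the bucket index.
    counts = Counter(int(t) for t in timestamps)
    first, last = min(counts), max(counts)
    return [counts.get(second, 0) for second in range(first, last + 1)]


def variability(rates):
    """Coefficient of variation: standard deviation of the rate divided by its mean."""
    average = mean(rates)
    return pstdev(rates) / average if average else 0.0


# Hypothetical arrival times: a quiet start, a gap, then a burst in second 3.
arrivals = [0.1, 0.7, 1.2, 3.0, 3.1, 3.2, 3.3, 3.5, 3.9, 4.4]
rates = events_per_second(arrivals)
print(rates)               # events per second for seconds 0..4
print(variability(rates))  # higher values mean a less even data flow
```

A flow with the same rate in every second would give a variability of 0, while bursts followed by quiet periods push the value up.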
Value (Introduced by Oracle)
• The value of data cannot be judged initially. Data is not of high value in its
initial form, but data analytics can transform it into a high-value asset.
• It is up to IT professionals and managers to extract the value from the
given data
7. Evolution of Big Data
The advent of the World Wide Web (WWW) in the early 1990s led to the
explosive growth of data and the development of big data analytics, which evolved
through three major stages.
Big Data 1.0
• Arrival of e-commerce in 1994
• Online firms were the main contributors of web content, and web mining
techniques were developed to analyze users’ online activities
• Mining processes helped to discover web users’ browsing patterns
• Connectivity through hyperlink
• Classification of web pages
• The application of mining techniques in image processing and computer vision was
limited
...
8. Big Data 2.0
• Social media analytics supports social media content mining, usage
mining, and structure mining activities
• Sentiment analysis
• Lexical-based methods and machine-learning methods, to overcome the
sentiment analysis flaws
• Social networking sites were the central point to socialize
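A lexical-based approach to sentiment analysis can be sketched as follows. This is a minimal illustration, not a production method: the tiny word list, the scores, and the simple negation rule are all assumptions made for the example, and real lexicons contain thousands of scored words.

```python
import re

# Hypothetical toy lexicon: positive words score above zero, negative words below.
LEXICON = {"good": 1, "great": 2, "love": 2, "bad": -1, "terrible": -2, "hate": -2}
# Assumed negation rule: a negator flips the sign of the word right after it.
NEGATORS = {"not", "never", "no"}


def sentiment_score(text):
    """Sum the lexicon scores of the words in text, flipping negated words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0
    for index, token in enumerate(tokens):
        value = LEXICON.get(token, 0)
        if index > 0 and tokens[index - 1] in NEGATORS:
            value = -value
        score += value
    return score


print(sentiment_score("I love this great phone"))   # positive total
print(sentiment_score("The screen is not good"))    # negation flips "good"
```

Machine-learning methods replace the fixed word list with a model trained on labelled posts, which is one way to address the flaws of a fixed lexicon such as sarcasm or domain-specific wording.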
Big Data 3.0
• Introduction of IoT applications
• Devices use sensors that have unique identifiers and the ability to share
data and collaborate over the internet without human intervention
• Streaming analytics became the trend and was seen as far better than social media analytics
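Streaming analytics processes each reading as it arrives instead of storing everything first. A minimal sketch, assuming a single sensor that emits numeric readings, is a sliding-window average that is updated per reading. The class name and the sample readings are hypothetical.

```python
from collections import deque


class SlidingAverage:
    """Keep the average of the most recent `size` readings from a stream."""

    def __init__(self, size):
        if size < 1:
            raise ValueError("size must be at least 1")
        self.window = deque(maxlen=size)
        self.total = 0.0

    def update(self, value):
        """Add one reading and return the average over the current window."""
        if len(self.window) == self.window.maxlen:
            # The oldest reading is about to be evicted by append().
            self.total -= self.window[0]
        self.window.append(value)
        self.total += value
        return self.total / len(self.window)


# Hypothetical temperature readings from one IoT sensor.
average = SlidingAverage(size=3)
for reading in [10, 20, 30, 40]:
    print(average.update(reading))
```

Because only the last few readings are kept, memory use stays constant no matter how long the stream runs.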
9. An Illustrative Example : Merchant Reviews
• A good recent example of a Big Data application is
merchant reviews. With multiple big-name sites emerging with customer
reviews as their product, these merchant review sites have been the subject of
much research.
• Customer-written reviews are perceived as the most credible.
• Other users can rate the reviews as helpful or not, which further refines the most
useful data.
• This data is regularly studied and run through various models so that
companies can translate it into business value.
• Most reviews with higher scores are perceived as less helpful than those with
lower scores.
• The number of words in a review shows a direct relationship to helpfulness.
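One way such a relationship between review length and helpfulness could be checked is with a Pearson correlation between word count and helpful votes. The sketch below is only an illustration: the three reviews and their vote counts are made up, and it is not the model used in the research referred to above.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Hypothetical data: (review text, number of "helpful" votes).
reviews = [
    ("Good", 0),
    ("Fast delivery, works fine", 2),
    ("Battery lasts two days and the screen is bright", 5),
]
word_counts = [len(text.split()) for text, _ in reviews]
helpful_votes = [votes for _, votes in reviews]
r = pearson(word_counts, helpful_votes)
print(word_counts, helpful_votes, r)
```

A value of r close to 1 would be consistent with longer reviews being rated as more helpful, although a real study would need far more data and would control for other factors such as the review score.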
10. Impact of Big Data
• Create New Business
• Develop New Products/Services
• Improving Business Operations
• Cost Savings
• Better Decision Making
• Higher Service Quality
11. Personalised Marketing
• Personalised products/services, coupons, promotional offers
• Macy’s and Target analyse shoppers’ preferences and sentiments to improve
the shopping experience
• Banks – increase revenue, increase client retention, better services
• U.S. Bank used both online and offline channels to enhance customer relationship
management, leading to a rise in conversion rate of up to 100%
Better Pricing
• Big data helps to set prices appropriately
• Use of open source technology helps in cost optimization and customer
satisfaction, e.g., eBay’s use of open source Hadoop technology
...
12. Cost Reduction
• Faster and more effective reaction to supply chain issues
• Better demand forecasts, real-time tracking, optimised distribution network
management and reduction in operational costs, e.g., retail industry
• GE helped the oil and gas industry (better efficiency with higher productivity)
and Southwest Airlines (fuel-saving opportunities)
Improved Customer Service
• Integration of data from multiple channels helps firms to understand the
customer better, e.g., Hertz in the U.S.
• Real-time transaction analysis to detect fraudulent activities and inform
the customers
• Use of speech analytics and social media analytics helped Southwest Airlines
to provide better service offerings
13. Challenges in Big Data
• Data Quality - Data quality means the relevance of data to the key
management decisions that have to be made. Low-quality, unstructured data can
lead to false analysis and insights and thus affect the management processes.
Internal control systems should be in place to assure the quality and
reliability of the data collected.
• Data Security - Data breaches, data leaks and weak security can cause huge
financial losses for the company as well as damage to brand reputation. Highly
efficient firewalls and detection systems should be in place to ensure the security of
confidential data.
• Privacy - The amount of user data collected by firms can also raise concerns about
user privacy and consent. On the flip side, the collection and use of personal data
can improve the quality of services and reduce costs, which is beneficial
for both the firms and the customers. Hence, there is a trade-off between
customer privacy and deeper customer insights for product development.
...
14. Challenges in Big Data (continued)
• High Investment - Data analytics is a very efficient technology, but its
applicability is limited to certain aspects of business. Thus, firms should
conduct a proper cost-benefit analysis before investing huge sums of money and
resources in big data analytics. Future cash inflows and projections should
justify the investment.
• Data Management - Data analytics requires highly efficient hardware and
software resources for seamless functioning. Traditional DBMS and systems may
not be compatible with big data applications. Data warehouse management is
also very important, as petabytes of big data are stored there.
• Required Talent and Expertise - The key element of successful data
analytics is the human resource that will manage, filter, and organize loads of
unstructured data. Firms will have to invest heavily in talent acquisition channels
and competitive salaries to attract qualified data scientists. Internal training
programs should be conducted to train employees.
15. Future in Big Data
• There is a prediction that the data generated would reach 175 zettabytes by 2025
• Machine learning would help in forming more powerful unsupervised
algorithms and greater personalisation, and cognitive services will greatly improve
computers’ ability to learn from data
• Demand for data scientists and chief data officers would be high, as more people
are needed to analyse the increasing volume of available data
• Privacy would be a hot issue: as data volumes increase, safeguarding data against
invasions and cyberattacks becomes more difficult, because data protection standards
cannot keep up with the rate of data expansion
• In contrast to big data, fast data and actionable data would come into play, as they allow
for processing real-time streams