2. Introduction
Big data is a collection of problems that test the limits of
current IT and computational technologies, as well as
existing algorithms.
Big data is traditionally described as a large volume of
data (in excess of 1 terabyte) that arrives at a high rate,
in a wide variety of forms, and with varying veracity
(trustworthiness).
The vast amount of data, both structured and
unstructured, that inundates a company on a daily basis
is referred to as big data. What matters is what companies
do with the data. Big data can be analysed for
insights that lead to better business decisions
and strategic moves.
3. 3 V's of Big Data
Data Volume: terabytes, records, transactions, tables, files
Data Velocity: real time, streams, near real time, batches
Data Variety: mixed, structured, unstructured, semi-structured
4. 3 V's of Big Data
Volume: the amount of data and the form of the data.
Velocity: the rate at which data is collected and analysed.
Variety: the types of data that are collected.
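The three definitions above can be made concrete with a small profiling sketch. It is illustrative only: the function name `profile_three_vs`, the `(timestamp, payload)` record shape, and the use of the Python payload type as a stand-in for "variety" are assumptions, not part of the slides.

```python
from collections import Counter
from datetime import datetime, timedelta


def profile_three_vs(records):
    """Summarise (timestamp, payload) pairs by volume, velocity and variety."""
    volume = len(records)  # volume: how many records arrived

    # velocity: records per second over the observed time window
    timestamps = [ts for ts, _ in records]
    window = (max(timestamps) - min(timestamps)).total_seconds() if records else 0.0
    velocity = volume / window if window > 0 else None

    # variety: how many records of each payload type were seen
    # (assumption: dict ~ structured, str ~ unstructured text, bytes ~ binary)
    variety = Counter(type(payload).__name__ for _, payload in records)
    return {"volume": volume, "velocity_per_sec": velocity, "variety": dict(variety)}


start = datetime(2024, 1, 1)
sample = [
    (start, {"order_id": 1}),
    (start + timedelta(seconds=1), "free-text comment"),
    (start + timedelta(seconds=2), {"order_id": 2}),
    (start + timedelta(seconds=4), b"\x89PNG"),
]
profile = profile_three_vs(sample)
```

For the four sample records, `profile` reports a volume of 4, a velocity of 1.0 record per second, and three payload types.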
5. Storing Big Data
Analyse your data characteristics
Select data sources for analysis
Eliminate redundant data
Establish the role of NoSQL
Overview of Big Data stores
Data models: key-value, graph, document,
column-family
Hadoop Distributed File System
HBase
Hive
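The four data models named on this slide (key-value, document, column-family, graph) can be compared by storing the same fact in each shape. The sketch below uses plain Python structures only; it does not call any real database API, and the user names and the helper `followers_of` are hypothetical.

```python
import json

# Key-value: an opaque value fetched by one key; the store does not look inside it.
key_value = {"user:u1": json.dumps({"name": "Ada", "city": "London"})}

# Document: a nested, self-describing record whose fields can be queried.
document = {
    "_id": "u1",
    "name": "Ada",
    "address": {"city": "London"},
    "follows": ["u2"],
}

# Column-family: row key -> column family -> column -> value.
column_family = {
    "u1": {
        "profile": {"name": "Ada", "city": "London"},
        "social": {"follows:u2": "1"},
    }
}

# Graph: nodes plus typed edges between them.
graph = {
    "nodes": {"u1": {"name": "Ada"}, "u2": {"name": "Grace"}},
    "edges": [("u1", "FOLLOWS", "u2")],
}


def followers_of(graph, node_id):
    """Return the ids of nodes with a FOLLOWS edge pointing at node_id."""
    return [src for src, label, dst in graph["edges"]
            if label == "FOLLOWS" and dst == node_id]


ada_from_kv = json.loads(key_value["user:u1"])  # the client must decode the value
grace_followers = followers_of(graph, "u2")
```

The relationship query is a one-line edge scan in the graph shape, while the key-value shape needs the client to decode the value before reading any field.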
6. Selecting Big Data Stores
Choosing the appropriate data stores based on
your data characteristics
Moving code to data
Implementing a polyglot data storage solution
Aligning business objectives with the most suitable
data repository
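The "code to data" idea on this slide can be sketched as follows: instead of copying every partition to one place, a small function runs where each partition lives and only its compact result is combined. This is a single-process illustration under stated assumptions; the partition contents and the names `count_locally` and `run_on_partitions` are hypothetical, and a real cluster framework would handle scheduling and transport.

```python
from collections import Counter
from functools import reduce

# Each inner list stands in for a data partition held on a different node.
partitions = [
    ["error", "ok", "ok"],
    ["ok", "error", "error"],
    ["ok"],
]


def count_locally(partition):
    """The 'code' shipped to a partition: returns a small summary, not the raw data."""
    return Counter(partition)


def run_on_partitions(partitions, local_fn, combine_fn):
    """Apply local_fn to every partition, then merge the partial results."""
    partials = [local_fn(p) for p in partitions]
    return reduce(combine_fn, partials)


totals = run_on_partitions(partitions, count_locally, lambda a, b: a + b)
```

Only the per-partition counters are combined, so the amount of data moved depends on the number of distinct values rather than on the number of records.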