2. What is Big Data?
• Big data is a buzzword, or catch-phrase, used to describe a massive
volume of both structured and unstructured data that is so large
that it's difficult to process using traditional database and
software techniques.
• In most enterprise scenarios the data is too big, moves too fast, or
exceeds current processing capacity.
3. DIMENSIONS OF ‘BIG DATA’
Volume: The amount of information being collected is so large that traditional
database management tools struggle to store and process it.
Velocity: Data is being created, and has to be processed, at a very high rate.
Variety: Data arrives in many different forms, e.g. shared online videos and
images, and data from social networks.
4. An Example of Big Data
An example of big data might be petabytes (1,024 terabytes)
or exabytes (1,024 petabytes) of data consisting of billions to trillions
of records of millions of people, all from different sources like:
• Social networks
• Banking and financial services
• E-commerce services
• Web-centric services
• Internet search indexes
• Scientific searches
• Document searches
• Medical records
• Weblogs
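The unit arithmetic on this slide can be checked with a short sketch. It uses the slide's binary convention (each step is a factor of 1,024). The 1 KiB record size is a hypothetical figure chosen only for illustration, not a number from the slides.

```python
# Binary storage units as used on the slide: each step is a factor of 1,024.
TB = 1024 ** 4          # bytes in one terabyte (binary convention)
PB = 1024 * TB          # 1 petabyte = 1,024 terabytes
EB = 1024 * PB          # 1 exabyte  = 1,024 petabytes


def records_that_fit(capacity_bytes: int, record_bytes: int) -> int:
    """Return how many fixed-size records fit in the given capacity."""
    return capacity_bytes // record_bytes


# Hypothetical record size (1 KiB), assumed purely for illustration.
RECORD_BYTES = 1024

print(PB // TB)                            # 1024
print(EB // TB)                            # 1048576
print(records_that_fit(PB, RECORD_BYTES))  # 1099511627776
```

Under that assumed record size, a single petabyte already holds about 1.1 trillion records, which matches the "billions to trillions of records" scale mentioned above.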
5. Big data technology
Big data technology must support search, development, governance and
analytics services for all data types—from transaction and application data to
machine and sensor data to social, image and geospatial data, and more.
Common characteristics of big data insights include:
• Addresses speed and scalability, mobility and security, flexibility and
stability
• Integration of both structured and unstructured data
• Time to information is critical for extracting value from various data
sources, including mobile devices, radio-frequency identification (RFID), the
Web and a growing list of automated sensory technologies
Benefits of Big Data include:
• More accurate data
• Improved business decisions
• Improved marketing strategy and targeting
• Increased revenue due to an increased customer base and decreased costs
7. Not every data management/analysis problem is best solved
exclusively using a traditional DBMS
A NoSQL database provides a mechanism for storage and retrieval of
data that is modeled by means other than the tabular relations used in
relational databases.
“Schema-less Models”: Increasing Flexibility for Data Manipulation
• NoSQL data systems provide a more relaxed approach to data modeling, often
referred to as schema-less modeling.
• Semantics of the data are embedded within a flexible connection topology
and a corresponding storage model.
• This provides greater flexibility for managing large data sets while
simultaneously reducing the dependence on the more formal database structure
imposed by relational database systems.
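A minimal sketch of what "schema-less" means in practice, using plain Python dictionaries as stand-ins for stored records. The `customers` collection and the helper names `fields_in_use` and `find` are hypothetical and do not belong to any particular NoSQL product.

```python
# A hypothetical in-memory "collection": a list of dict records.
# No table definition exists; each record carries its own set of fields.
customers = [
    {"id": 1, "name": "Asha", "email": "asha@example.com"},
    {"id": 2, "name": "Ben", "phones": ["555-0100", "555-0101"]},
    {"id": 3, "name": "Chen", "address": {"city": "Pune", "zip": "411001"}},
]


def fields_in_use(docs):
    """Return the set of field names that appear in at least one record."""
    names = set()
    for doc in docs:
        names.update(doc.keys())
    return names


def find(docs, field, value):
    """Return records whose `field` equals `value`; missing fields never match."""
    return [d for d in docs if field in d and d[field] == value]


print(sorted(fields_in_use(customers)))
# ['address', 'email', 'id', 'name', 'phones']
print(find(customers, "name", "Ben"))
```

In a relational table, every row would need the same columns, so adding `phones` or `address` would require a schema change first. Here a new field simply appears on the records that need it, and the reading code has to tolerate its absence.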
8. NoSQL Database Types
I. Document databases pair each key with a complex data structure
known as a document. Documents can contain many different key-value
pairs, or key-array pairs, or even nested documents.
II. Graph stores are used to store information about networks, such as
social connections. Graph stores include Neo4j and HyperGraphDB.
III. Key-value stores are the simplest NoSQL databases. Every single item
in the database is stored as an attribute name (or "key"), together
with its value. Examples of key-value stores are Riak and Voldemort.
Some key-value stores, such as Redis, allow each value to have a type,
such as "integer", which adds functionality.
IV. Wide-column stores such as Cassandra and HBase are optimized for
queries over large datasets, and store columns of data together,
instead of rows.
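The four data models above can be pictured with ordinary Python structures. This is only a simplified sketch of the shapes of the data, following the slide's descriptions. It does not reflect how Neo4j, Redis, Cassandra or any other product stores data internally, and all names and values are hypothetical.

```python
# Document model: each key maps to a nested document.
document_store = {
    "user:1": {"name": "Asha", "tags": ["admin", "ops"], "address": {"city": "Pune"}},
}

# Graph model: each person maps to the set of people they are connected to.
graph_store = {"asha": {"ben"}, "ben": {"asha", "chen"}, "chen": {"ben"}}

# Key-value model: an opaque key maps to a single value.
kv_store = {"session:42": "asha", "visits:asha": 7}

# Column-oriented layout: values of one column are kept together,
# keyed by row id, instead of keeping each row together.
column_store = {
    "name": {"r1": "Asha", "r2": "Ben"},
    "age": {"r1": 31, "r2": 45},
}


def friends_of_friends(graph, person):
    """Return people two hops away who are not already direct connections."""
    direct = graph.get(person, set())
    result = set()
    for friend in direct:
        result |= graph.get(friend, set())
    return result - direct - {person}


def column_average(store, column):
    """Average one column by reading only that column's values."""
    values = store[column].values()
    return sum(values) / len(values)


print(friends_of_friends(graph_store, "asha"))  # {'chen'}
print(column_average(column_store, "age"))      # 38.0
```

The two helper functions hint at why the models differ: a graph layout makes relationship traversal natural, while a column layout lets an aggregate touch a single column without reading whole rows.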
9. Some of the key technologies and concepts associated with Big Data:
• Hadoop
• HDFS
• MapReduce
• MongoDB
• Cassandra
• PIG
• HIVE
• HBase
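Of the items listed, MapReduce is a programming model rather than a product, so it can be sketched in a few lines. The example below is a single-process word count in plain Python, assuming the usual map, shuffle and reduce phases. It does not use Hadoop or any of its APIs, and the function names are illustrative.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over the input lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


print(word_count(["big data", "Big data tools"]))
# {'big': 2, 'data': 2, 'tools': 1}
```

In a real cluster the map and reduce calls would run in parallel on different machines, with the input lines read from a distributed file system such as HDFS. Here everything runs in one process to keep the idea visible.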
10. The Benefits of NoSQL
When compared to relational databases, NoSQL databases are often more scalable
and can provide superior performance, and their data model addresses several
issues that the relational model is not designed to address:
• Large volumes of structured, semi-structured, and unstructured data.
• Object-oriented programming that is easy to use and flexible.
• Efficient, scale-out architecture instead of expensive, monolithic architecture.
11. Cont…
NoSQL databases differ from traditional relational database management systems
in that they do not require data to fit a schema. Using a NoSQL database gives
organizations access to a range of benefits, including the following:
• Elastic scaling: organizations are able to scale out and take advantage of
new nodes according to their data storage needs.
• No need for data to fit a schema: both structured and unstructured data can
be stored, as there is no fixed data model. This flexibility gives
organizations access to much larger quantities of data.
• Ability to cope with hardware failure: NoSQL databases were designed with
redundancy in mind, on the assumption that hardware failures will occur.
• Quick and easy development: it is easy to change how data is stored using
refactoring or batch processing.
These benefits mean a NoSQL database is well suited to organizations that
need a database which can cope with large amounts of disparate data.
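The elastic scaling and hardware-failure points can be sketched together: keys are spread over a list of nodes by hashing, and each key is also copied to a neighbouring node so that a read can survive one failure. This is a minimal illustration under simplifying assumptions, not the placement scheme of any specific database. The node names and the functions `replicas_for` and `read` are hypothetical.

```python
import hashlib


def stable_hash(key: str) -> int:
    """Hash a key to an integer that is the same across runs and machines."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


def replicas_for(key, nodes, copies=2):
    """Pick `copies` distinct nodes for a key, starting at its hashed position."""
    start = stable_hash(key) % len(nodes)
    count = min(copies, len(nodes))
    return [nodes[(start + i) % len(nodes)] for i in range(count)]


def read(key, nodes, failed, copies=2):
    """Return the first live replica holding the key, or None if all are down."""
    for node in replicas_for(key, nodes, copies):
        if node not in failed:
            return node
    return None


nodes = ["node-a", "node-b", "node-c"]
primary, backup = replicas_for("user:1", nodes)

# Redundancy: losing the primary still leaves the backup copy readable.
print(read("user:1", nodes, failed={primary}) == backup)  # True

# Scaling out: the same function works unchanged with one more node.
print(len(replicas_for("user:1", nodes + ["node-d"])))    # 2
```

One limitation of this simple modulo placement is that adding a node changes where many existing keys belong. Production systems commonly reduce that movement with techniques such as consistent hashing, which is outside the scope of this sketch.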
12. Five challenges of NoSQL
1. Maturity: For the most part, RDBMS systems are stable and richly functional. In
comparison, most NoSQL alternatives are in pre-production versions with many
key features yet to be implemented.
2. Support: All RDBMS vendors go to great lengths to provide a high level of
enterprise support. In contrast, most NoSQL systems are open source projects,
although there are usually one or more firms offering support for each NoSQL
database.
3. Analytics and business intelligence: NoSQL databases offer few facilities for
ad hoc query and analysis. Even a simple query requires significant programming
expertise, and commonly used BI (Business Intelligence) tools do not provide
connectivity to NoSQL.
4. Administration: NoSQL today requires a lot of skill to install and a lot of effort to
maintain.
5. Expertise: There are literally millions of developers throughout the world, and in
every business segment, who are familiar with RDBMS concepts and
programming. In contrast, almost every NoSQL developer is in a learning mode.
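The analytics challenge (item 3) can be made concrete with a small comparison. The sketch assumes a bare key-value store with no query layer, modeled here as a Python dictionary, and uses the standard-library `sqlite3` module for the relational side. The order data is invented for illustration.

```python
import sqlite3

# Hypothetical sample data: (customer, order amount).
orders = [("asha", 120.0), ("ben", 80.0), ("asha", 30.0)]

# Relational side: the question is answered by one declarative statement.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", orders)
sql_totals = dict(
    conn.execute("SELECT customer, SUM(amount) FROM orders GROUP BY customer")
)
conn.close()

# Key-value side: a plain dict standing in for a store without a query language.
kv = {f"order:{i}": {"customer": c, "amount": a} for i, (c, a) in enumerate(orders)}


def totals_by_customer(store):
    """Scan every value and aggregate by hand, as application code must do."""
    totals = {}
    for value in store.values():
        customer = value["customer"]
        totals[customer] = totals.get(customer, 0.0) + value["amount"]
    return totals


kv_totals = totals_by_customer(kv)
print(sql_totals == kv_totals)  # True
```

Both sides produce the same totals, but the key-value side needed a purpose-built scan. Each new question (a different grouping, a filter, a join) would mean more application code, which is the cost the slide describes.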
13. Conclusion
Big data is a key to innovation and has high potential for value creation.
There are huge opportunities, for example in healthcare, location-related
data, retail, manufacturing, and social data. There are also challenges, for
example concerning data volume, data quality, data capture, and data
management issues such as privacy, security, or governance.
NoSQL databases are becoming an increasingly important part of the database
landscape, and when used appropriately, can offer real benefits. However,
enterprises should proceed with caution with full awareness of the legitimate
limitations and issues that are associated with these databases.