2. OUTLINE :
• INTRODUCTION
• CHARACTERISTICS OF BIG DATA
• PROCESSES IN BIG DATA
• BIG DATA SOURCES
• TYPES OF TOOLS
• BIG DATA STATISTICS
• APPLICATIONS OF BIG DATA
• ADVANTAGES AND DISADVANTAGES
• CONCLUSION
3. INTRODUCTION :
• Big data refers to data sets that are too large or complex
for traditional data-processing application software to
adequately deal with.
• Firms like Google, eBay, LinkedIn and Facebook were
built around big data from the beginning.
• Big data generates value from storing and processing very
large quantities of digital information that cannot be
analyzed with traditional computing systems.
5. VOLUME :
• A typical PC might have had 10 GB of storage in 2000.
• Today, Facebook ingests 500 terabytes of new
data every day.
• A Boeing 737 generates 240 terabytes of flight
data during a single flight across the US.
• Thus, big data sources are vast in
volume.
6. VELOCITY :
• Clickstreams and ad impressions capture user behaviour at
millions of events per second.
• Machine-to-machine processes exchange data between
billions of devices.
• Infrastructure and sensors generate massive log data in real
time.
• Online gaming systems support millions of concurrent
users, each producing multiple inputs per second.
7. VARIETY :
• Big data is not just numbers but also a collection of
3D data, audio and video, and unstructured text,
including log files and social media.
• Traditional database systems were designed only to
store small amounts of information.
• Big data analysis includes different types of data.
8. PROCESSES IN BIG DATA :
I. INTEGRATING DISPARATE DATA STORES :
• Mapping data to the programming framework.
• Connecting and extracting data from storage.
• Transforming data for processing.
• Subdividing data in preparation for Hadoop MapReduce.
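The integration steps above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not Hadoop code: the two in-memory "stores" (`CSV_STORE`, `JSONL_STORE`), the record fields, and the helper names `extract`, `transform`, and `subdivide` are all hypothetical.

```python
import csv
import io
import json

# Two hypothetical "disparate" stores: a CSV export and a JSON-lines log.
CSV_STORE = "user,amount\nalice,10\nbob,5\n"
JSONL_STORE = '{"user": "alice", "amount": 7}\n{"user": "carol", "amount": 3}\n'

def extract():
    """Connect to each store and yield its raw records as dicts."""
    yield from csv.DictReader(io.StringIO(CSV_STORE))
    for line in JSONL_STORE.splitlines():
        yield json.loads(line)

def transform(record):
    """Map a raw record onto one common shape: (user, integer amount)."""
    return (record["user"], int(record["amount"]))

def subdivide(records, split_size):
    """Cut the records into fixed-size input splits for parallel workers."""
    return [records[i:i + split_size] for i in range(0, len(records), split_size)]

records = [transform(r) for r in extract()]
splits = subdivide(records, split_size=2)
print(splits)
```

Each split could then be handed to a separate worker; in a real cluster the framework chooses the split size and placement.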
II. EMPLOYING HADOOP MAPREDUCE :
• Creating the components of Hadoop MapReduce jobs.
• Distributing data processing across server farms.
• Executing Hadoop MapReduce jobs.
• Monitoring the progress of job flows.
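To make the MapReduce model concrete, here is a minimal single-process sketch of a word-count job in plain Python. It does not use the Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) are hypothetical and only mirror the phases that a real framework would distribute across servers.

```python
from collections import defaultdict

def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Group all emitted values by key, as happens between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reducer: combine all values for one key into a single result."""
    return key, sum(values)

def run_job(lines):
    """Run map, shuffle, and reduce in sequence on a list of input lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())

counts = run_job(["big data is big", "data is everywhere"])
print(counts)
```

In a distributed setting, each input line (or split) would be mapped on a different machine, and the shuffle step would move pairs over the network so that every key ends up at exactly one reducer.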
10. DATA GENERATION POINTS :
• Mobile devices
• Microphones
• Readers/Scanners
• Science facilities
• Programs/Software
• Social media
• Cameras
11. TYPES OF TOOLS :
Where is data processing hosted?
Distributed servers/cloud (e.g. Amazon EC2).
Where is data stored?
Distributed storage (e.g. Amazon S3).
How is data stored and indexed?
High-performance schema-free databases (e.g. MongoDB).
What operations are performed on data?
Analytic/semantic processing.
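As a rough illustration of what "schema-free" storage with an index means, the sketch below keeps documents of different shapes in one collection and indexes them on a single field. It is a toy in-memory stand-in, not the MongoDB API; the class `DocumentStore` and the field names are assumptions made for the example.

```python
from collections import defaultdict

class DocumentStore:
    """Toy in-memory stand-in for a schema-free store (not the MongoDB API)."""

    def __init__(self, indexed_field):
        self.docs = []
        self.indexed_field = indexed_field
        self.index = defaultdict(list)  # field value -> positions in self.docs

    def insert(self, doc):
        """Store any dict; no fixed set of columns is enforced."""
        self.docs.append(doc)
        if self.indexed_field in doc:
            self.index[doc[self.indexed_field]].append(len(self.docs) - 1)

    def find(self, value):
        """Look up documents through the index instead of scanning them all."""
        return [self.docs[i] for i in self.index.get(value, [])]

store = DocumentStore(indexed_field="source")
store.insert({"source": "camera", "frame": 17, "resolution": "1080p"})
store.insert({"source": "social", "text": "hello", "likes": 3})
store.insert({"source": "camera", "frame": 18})
camera_docs = store.find("camera")
print(len(camera_docs))
```

The three documents share only the `source` field, which is the point: each record carries its own structure, and the index makes lookups on the chosen field cheap.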
12. BIG DATA STATISTICS :
• Facebook generates 10TB of data daily.
• Twitter generates 7TB of data daily.
• IBM claims 90% of today’s stored data was
generated in the last two years.
• Every day, we create 2.5 quintillion bytes of data.
• Walmart handles more than 1 million customer
transactions every hour.
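Daily and hourly totals are easier to compare as per-second rates. The short calculation below converts two of the figures above, assuming decimal units (1 TB = 10^12 bytes) and a perfectly even spread over time, which real traffic will not have.

```python
SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_HOUR = 60 * 60

bytes_per_day = 2.5e18  # 2.5 quintillion bytes per day, from the slide
tb_per_second = bytes_per_day / SECONDS_PER_DAY / 1e12  # decimal terabytes

transactions_per_hour = 1_000_000  # Walmart figure, from the slide
transactions_per_second = transactions_per_hour / SECONDS_PER_HOUR

print(f"{tb_per_second:.1f} TB/s")
print(f"{transactions_per_second:.0f} transactions/s")
```

Under these assumptions, the global figure works out to roughly 29 TB every second, and the Walmart figure to roughly 278 transactions every second.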
14. ADVANTAGES OF BIG DATA :
• It helps improve science and research. It improves
healthcare and public health through the availability of
patient records.
• It helps in financial trading, sports, polling, security/law
enforcement, etc.
• Anyone can access vast amounts of information via surveys
and get answers to queries.
• New data is added every second.
• One platform can carry a virtually unlimited amount of information.
15. DISADVANTAGES OF BIG DATA :
➨Storing big data on traditional storage can cost a
lot of money.
➨Much of big data is unstructured.
➨Big data analysis can violate principles of privacy.
➨Big data analysis results are sometimes
misleading.
➨Rapid updates in big data can cause mismatches with
real figures.
16. CONCLUSION :
• Big data brings new and exciting
opportunities to companies that use the
platforms available.
• In this information era, big data technology
has become important for businesses.
• It offers many opportunities in the
days ahead.