Dr.V.Bhuvaneswari
Assistant Professor
Department of Computer Applications
Bharathiar University
Coimbatore
bhuvanes_v@yahoo.com, bhuvana_v@buc.edu.in
visit at www.budca.in/faculty.php
BIG DATA ROADMAP
Big Data Roadmap
• Big Data – What?
◦ Timeline – Big Data Predictions
◦ Data Explosion
• Big Data Myths
• Big Data
• 5Vs of Big Data
• Why Big Data
• Data as Data Science
Dr. V. Bhuvaneswari, Asst. Professor, Dept. of Comp. Appl., Bharathiar University
Data Landscape
Timeline – Big Data Predictions
1944 - Yale Library in 2040 will have "approximately 200,000,000 volumes".
1961 - Scientific journals will grow exponentially rather than linearly, doubling every fifteen years and increasing by a factor of ten during every half-century.
1975 - The Ministry of Posts and Telecommunications in Japan introduced words as the unifying unit of measurement.
1997 - Michael Cox and David Ellsworth published the first article in the ACM Digital Library to use the term "big data".
Big Data emerged in 1997, exploded to greater heights in 2010, and became popular in 2012.
DATA EXPLOSION & ATTENTION
Big Data Explosion
• 12+ TBs of tweet data every day
• 25+ TBs of log data every day
• ? TBs of data every day
• 2+ billion people on the Web by end of 2011
• 30 billion RFID tags today (1.3B in 2005)
• 4.6 billion camera phones worldwide
• 100s of millions of GPS-enabled devices sold annually
• 76 million smart meters in 2009, 200M by 2014
Data Growth – in Units
Data Deluge
Figure labels: Desktop, Hobbyist, The Future?, Internet, Big Data
Byte : one grain of rice
Kilobyte : cup of rice
Megabyte : 8 bags of rice
Gigabyte : 3 Semi trucks with rice
Terabyte : 2 Container Ships
Petabyte : A state full of rice bags
Exabyte : Several states filled with rice bags
Zettabyte : Fills the Pacific Ocean
Yottabyte : An Earth-sized rice ball!
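As a rough companion to the rice analogy, the unit ladder itself can be sketched in a few lines of Python. This sketch assumes decimal (SI) prefixes, where each unit is 1,000 times the previous one. The function name `human_readable` is purely illustrative and does not come from the slides.

```python
# Illustrative sketch: assumes decimal (SI) units, each 1000x the previous one.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]

def human_readable(num_bytes):
    """Express a byte count in the largest unit that keeps the value >= 1."""
    value = float(num_bytes)
    for unit in UNITS:
        # Stop at the first unit where the value is below 1000,
        # or at the last unit if the number is larger than the table.
        if value < 1000 or unit == UNITS[-1]:
            return f"{value:g} {unit}"
        value /= 1000

print(human_readable(1500))           # a little over one kilobyte
print(human_readable(12 * 1000**4))   # twelve terabytes
print(human_readable(40 * 1000**7))   # forty zettabytes
```

Each step up the ladder multiplies the size by a thousand, which is why the analogy jumps from a cup of rice to container ships so quickly.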
BIG DATA FACTS
• Every 2 days we create as much information as we did from the beginning of time until 2003.
• Over 90% of all the data in the world was created in the past 2 years.
• It is expected that by 2020 the amount of digital information in existence will have grown from 3.2 zettabytes today to 40 zettabytes.
• Every minute we send 204 million emails, generate 1.8 million Facebook likes, send 278 thousand tweets, and upload 200,000
Big Data – Popularity
 2010 – Social Networking – Post Internet
era – Facebook – Big Data
 2012 – Election of US President – Hot
 After 2012 – Industries – Hadoop
Big Data in India
 Election process – BJP Govt
 Current electioneering scenario
 Election Campaign through social media
requires permission - Election
commission
BIG DATA GOT SO BIG?
Big data is a commodity comparable to gold.
Big data is Hot! Now What?
Businesses Freak Out Over Big Data – Information Week
Big Data Grows Up – Forbes
2012 : The Year of Big
Big Data Powers Revolution in Decision Making – Wall Street Journal
Business Opportunities in Big Data – Inc
THE HINDU 2015
BIG DATA MYTHS
Big Data
• Is New
• Is Only About Massive Data Volume
• Means Hadoop
• Needs a Data Warehouse
• Means Unstructured Data
• Is for Social Media & Sentiment Analysis
What is Big Data?
Let Us Clarify
Big Data
Big Data is
• A complete subject with tools, techniques and frameworks.
• Technology that deals with large and complex datasets that vary in format and structure and do not fit into memory.
• Not just about huge volumes of data; it provides an opportunity to find new insights into existing data and guidelines to capture and analyze future data.
Big Data : A Definition
• Big data is the realization of greater business intelligence by storing, processing, and analyzing data that was previously ignored due to the limitations of traditional data management technologies.
Source: Harness the Power of Big Data: The IBM Big Data Platform
BIG DATA as Platform
Source: IBM
Characteristics – Big Data
4 Vs of Big Data
6Vs of Big Data
Volume
Velocity
Variety
Veracity
Value
Validity
What is in Big Data ?
Why Big Data Analytics?
The 5 Key Big Data Use Cases
• Big Data Exploration: Find, visualize, understand all big data to improve decision making
• Enhanced 360° View of the Customer: Extend existing customer views (MDM, CRM, etc.) by incorporating additional internal and external information sources
• Security/Intelligence Extension: Lower risk, detect fraud and monitor cyber security in real time
• Data Warehouse Augmentation: Integrate big data and data warehouse capabilities to increase operational efficiency
• Operations Analysis: Analyze a variety of machine data for improved business results
Big Data Technologies
Conventional approaches
RDBMS
OS FILE SYSTEM
SQL QUERIES
CUSTOM FRAMEWORK
* C / C++
* PERL
* PYTHON
ISSUES IN LEGACY SYSTEMS
Limited Storage Capacity
Limited Processing Capacity
No Scalability
Single point of Failure
Sequential Processing
RDBMSs can handle only structured data
Requires preprocessing of data
Information is collected according to current business needs
Mr. HADOOP says he has a solution to our BIG problem!
What is Hadoop?
Apache Hadoop is a framework that allows for the distributed processing of large datasets across clusters of commodity computers using a simple programming model.
Concept
Moving computation is more efficient than moving large data.
TWO DAEMONS OF HADOOP
HDFS ARCHITECTURE
TERMINOLOGY REVIEW
Cluster
Rack 1: Node 1, Node 2, ..., Node n
Rack 2: Node 1, Node 2, ..., Node n
...
HADOOP CLUSTER ARCHITECTURE
WHAT IS HDFS
• Hadoop Distributed File System
• Highly fault-tolerant, distributed, reliable, scalable file system for data storage.
• Stores multiple copies of data on different nodes.
• A file is split up into blocks and stored on multiple machines.
• A Hadoop cluster typically has a single NameNode and a number of DataNodes.
HDFS BLOCKS
• Files are broken into large blocks.
◦ Typically 128 MB block size
• Blocks are replicated for reliability.
◦ One replica on the local node
◦ Another replica on a remote rack
◦ Third replica on the local rack
◦ Additional replicas are randomly placed
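The placement order listed above can be illustrated with a toy Python sketch. It follows the order exactly as the slide gives it; the placement policy in a given Hadoop release or configuration may differ, so treat this as an illustration of rack-aware placement rather than Hadoop's actual code. The function `place_replicas`, the cluster layout, and the rack and node names are all hypothetical.

```python
import random

def place_replicas(cluster, local_rack, local_node, replication=3, seed=None):
    """Pick (rack, node) targets in the order the slide lists them.

    cluster: dict mapping rack name -> list of node names (hypothetical layout).
    """
    rng = random.Random(seed)
    targets = [(local_rack, local_node)]            # 1st replica: the local node

    remote_racks = [r for r in cluster if r != local_rack]
    if replication >= 2 and remote_racks:
        rack = rng.choice(remote_racks)             # 2nd replica: a remote rack
        targets.append((rack, rng.choice(cluster[rack])))

    if replication >= 3:
        others = [n for n in cluster[local_rack] if n != local_node]
        if others:                                  # 3rd replica: local rack, different node
            targets.append((local_rack, rng.choice(others)))

    # Additional replicas: random nodes that do not hold a copy yet.
    free = [(r, n) for r, nodes in cluster.items() for n in nodes
            if (r, n) not in targets]
    rng.shuffle(free)
    targets.extend(free[:max(0, replication - len(targets))])
    return targets[:replication]

cluster = {"rack1": ["n1", "n2", "n3"], "rack2": ["n4", "n5", "n6"]}
print(place_replicas(cluster, "rack1", "n1", replication=3, seed=0))
```

Spreading copies across racks means the loss of a single node, or of a whole rack, still leaves at least one replica reachable.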
HDFS BLOCKS contd.,
ADVANTAGES OF HDFS BLOCKS
Fixed Size
Chunk of file < block size: only the needed space is used.
Eg : 420 MB file is split as
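The arithmetic behind this example can be sketched in Python, assuming the typical 128 MB block size quoted earlier. The helper name `split_into_blocks` is illustrative only and is not a Hadoop API.

```python
BLOCK_SIZE_MB = 128  # typical block size quoted on the slide; configurable in practice

def split_into_blocks(file_size_mb, block_size_mb=BLOCK_SIZE_MB):
    """Return the size in MB of each block needed to store the file."""
    full_blocks, remainder = divmod(file_size_mb, block_size_mb)
    blocks = [block_size_mb] * full_blocks
    if remainder:
        blocks.append(remainder)  # the last block holds only the leftover data
    return blocks

blocks = split_into_blocks(420)
print(blocks)       # [128, 128, 128, 36]
print(len(blocks))  # 4
```

Under that assumption, a 420 MB file occupies three full 128 MB blocks plus one final block holding the remaining 36 MB, rather than a full 128 MB.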
HDFS Operation Principle
NAME NODE
DATA NODE
SECONDARY NAME NODE
HDFS – BLOCK REPLICATION ARCHITECTURE
MAP REDUCE
MAP REDUCE - ANALOGY
MAP REDUCE – ANALOGY CONTD.
MAP REDUCE EXAMPLE
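The slide's own example is a figure that is not reproduced in this text. As a stand-in, here is a minimal word-count sketch in plain Python, the customary introductory MapReduce example. It runs in a single process and only mimics the map, shuffle and reduce phases; it does not use Hadoop, and all function names are illustrative.

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single count."""
    return key, sum(values)

def word_count(lines):
    # Run the three phases in sequence over all input lines.
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

lines = ["big data is big", "hadoop stores big data"]
print(word_count(lines))
# {'big': 3, 'data': 2, 'is': 1, 'hadoop': 1, 'stores': 1}
```

On a real cluster the map calls would run in parallel on the nodes that hold each input block, and the framework would handle the shuffle between nodes.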
MAP EXECUTION
MAP EXECUTION – DISTRIBUTED TWO NODE ENVIRONMENT
HADOOP JOB WORK INTERACTION
Data Science - Big Data Technology
• Collect, Load, Transform
◦ ETL, SCRIBE, FLUME
• Store
◦ HADOOP, SPARK, STORM
• Process, Analyze and Reason
◦ Computational algorithms
◦ Statistical methods and models
◦ Tools: PIG, HIVE, R, PYTHON, JAVA, SCALA, CLOJURE, MAHOUT
• Visualization
◦ DASHBOARD, APP
Big Data - Market
Big Data Market Size
Potential Talent Pool - Big Data
India will require a minimum of 1 lakh (100,000) data scientists in the next couple of years, in addition to data analysts and data managers, to support the Big Data space.
Big Data Use Cases
Big Data Use Case
SBI
• State Bank of India (SBI) recently ran its newly acquired data-mining software to check the purity of its data.
• It made an interesting find: close to one crore (10 million) account holders have not provided any nomination for their savings accounts. What is worse, over half of them are senior citizens.
• To analyse trends in banking, SBI has hired a whole team of statisticians and economists.
• Identify default patterns and high-value customers.
Big Data Applications - India
• Big Data – Elections
• SBI uses big data mining to check defaults
• Karnataka Govt – Identify water leakage
Big Data Challenges
Privacy protection
All Big Data stages: collect, store, process, knowledge
Integration with enterprise landscape
All systems store data in RDBMS, DW
Does not support bulk loading to Big Data store
Limited number of analytics from Mahout
Big data technologies lack visualization support and deliverable methods
Leveraging cloud computing for big data applications
Addressing real-time needs with varied format and volume
Big data analytics

  • 1.
  • 2. Dr.V.Bhuvaneswari Assistant Professor Department of Computer Applications Bharathiar University Coimbatore bhuvanes_v@yahoo.com, bhuvana_v@buc.edu.in visit at www.budca.in/faculty.php BIG DATA ROADMAP
  • 3. Big Data Roadmap: Big Data – What? (Timeline – Big Data Predictions; Data Explosion); Big Data Myths; Big Data; 5Vs of Big Data; Why Big Data; Data as Data Science. Dr.V.Bhuvaneswari, Asst.Professor, Dept. of Comp. Appll., Bharathiar University
  • 4. Data Landscape
  • 5. Timeline – Big Data Predictions. 1944: the Yale Library in 2040 will have “approximately 200,000,000 volumes”. 1961: scientific journals will grow exponentially rather than linearly, doubling every fifteen years and increasing by a factor of ten during every half-century. 1975: the Ministry of Posts and Telecommunications in Japan introduced words as the unifying unit of measurement. 1997: Michael Cox and David Ellsworth published the first article in the ACM Digital Library to use the term “big data”. Big Data emerged in 1997, exploded to greater heights in 2010, and became popular in 2012.
  • 6. DATA EXPLOSION & ATTENTION
  • 7. Big Data Explosion: 12+ TBs of tweet data every day; 25+ TBs of log data every day; ? TBs of data every day; 2+ billion people on the Web by end of 2011; 30 billion RFID tags today (1.3B in 2005); 4.6 billion camera phones worldwide; 100s of millions of GPS-enabled devices sold annually; 76 million smart meters in 2009, 200M by 2014.
  • 8. Data Growth – in Units
  • 10. Desktop; Hobbyist; The Future?; Internet; Big Data. Byte: one grain of rice. Kilobyte: a cup of rice. Megabyte: 8 bags of rice. Gigabyte: 3 semi trucks of rice. Terabyte: 2 container ships. Petabyte: a state full of rice bags. Exabyte: states filled with rice bags. Zettabyte: fills the Pacific Ocean. Yottabyte: an Earth-sized rice ball!
  • 11. BIG DATA FACTS. Every 2 days we create as much information as we did from the beginning of time until 2003. Over 90% of all the data in the world was created in the past 2 years. By 2020 the amount of digital information in existence is expected to have grown from 3.2 zettabytes today to 40 zettabytes. Every minute we send 204 million emails, generate 1.8 million Facebook likes, send 278 thousand Tweets, and upload 200,000 ...
  • 12. Big Data – Popularity. 2010: social networking, post-Internet era, Facebook, Big Data. 2012: election of the US President, Big Data is hot. After 2012: industries, Hadoop. Big Data in India: election process (BJP Govt); current electioneering scenario; election campaigns through social media require permission from the Election Commission.
  • 13. BIG DATA GOT SO BIG? Big data is a commodity comparable to gold. Big data is Hot! Now What? Businesses Freak Out Over Big Data – Information Week. Big Data Grows Up – Forbes. 2012: The Year of Big. Big Data Powers Revolution in Decision Making – Wall Street Journal. Business Opportunities in Big Data – Inc. THE HINDU 2015, www.thehindu.com, THE WORLD’S FAVOURITE NEWSPAPER - Since 1879
  • 14. BIG DATA MYTHS. Big Data: is new; is only about massive data volume; means Hadoop; needs a data warehouse; means unstructured data; is for social media and sentiment analysis.
  • 15. What is Big Data? Let Us Clarify
  • 16. Big Data is: a complete subject with tools, techniques and frameworks; a technology that deals with large and complex datasets which vary in data format and structure and do not fit into memory; not just about huge volumes of data, since it provides an opportunity to find new insight into existing data and guidelines to capture and analyze future data.
  • 17. Big Data: A Definition. Big data is the realization of greater business intelligence by storing, processing, and analyzing data that was previously ignored due to the limitations of traditional data management technologies. Source: Harness the Power of Big Data: The IBM Big Data Platform
  • 18. BIG DATA as Platform. Source: IBM
  • 19. Characteristics – Big Data
  • 20. 4 Vs of Big Data
  • 21. 6Vs of Big Data: Volume, Velocity, Variety, Veracity, Value, Validity
  • 22. What is in Big Data? Why Big Data Analytics?
  • 23. The 5 Key Big Data Use Cases. Big Data Exploration: find, visualize, understand all big data to improve decision making. Enhanced 360° View of the Customer: extend existing customer views (MDM, CRM, etc.) by incorporating additional internal and external information sources. Security/Intelligence Extension: lower risk, detect fraud and monitor cyber security in real time. Data Warehouse Augmentation: integrate big data and data warehouse capabilities to increase operational efficiency. Operations Analysis: analyze a variety of machine data for improved business results.
  • 24.
  • 25.
  • 26. Big Data Technologies
  • 27. Conventional approaches: RDBMS; OS file system; SQL queries; custom frameworks (C/C++, Perl, Python)
  • 28. ISSUES IN LEGACY SYSTEMS: limited storage capacity; limited processing capacity; no scalability; single point of failure; sequential processing; RDBMSs can handle only structured data; requires preprocessing of data; information is collected according to current business needs
  • 29. Mr. HADOOP says he has a solution to our BIG problem!
  • 30. Mr. HADOOP says he has a solution to our BIG problem!
  • 31. What is Apache Hadoop? A framework that allows for the distributed processing of large datasets across clusters of commodity computers using a simple programming model. Concept: moving computation is more efficient than moving large data.
  • 32.
  • 33. TWO DAEMONS OF HADOOP
  • 34. HDFS ARCHITECTURE
  • 35. TERMINOLOGY REVIEW: a cluster is made up of racks (Rack 1, Rack 2, ...), and each rack holds nodes (Node 1, Node 2, ..., Node n)
  • 36. HADOOP CLUSTER ARCHITECTURE
  • 37. WHAT IS HDFS? Hadoop Distributed File System: a highly fault-tolerant, distributed, reliable, scalable file system for data storage. It stores multiple copies of data on different nodes. A file is split up into blocks and stored on multiple machines. A Hadoop cluster typically has a single NameNode and a number of DataNodes.
  • 38. HDFS BLOCKS: Files are broken into large blocks, typically 128 MB in size. Blocks are replicated for reliability: one replica on the local node, another replica on a node in a remote rack, and a third replica on a different node in that same remote rack. Additional replicas are placed randomly.
  • 39. HDFS BLOCKS contd., ADVANTAGES OF HDFS BLOCKS: fixed size; a chunk of a file smaller than the block size uses only the space it needs. E.g., a 420 MB file is split as ...
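
The block arithmetic behind slides 38 and 39 can be sketched in a few lines of Python. This is an illustration only, not Hadoop code: the function names `split_into_blocks` and `raw_storage_mb` are hypothetical, the 128 MB block size is taken from slide 38, and the replication factor of 3 is assumed from the three replicas listed there.

```python
def split_into_blocks(file_size_mb, block_size_mb=128):
    """Return the sizes (in MB) of the blocks a file would be split into.

    Hypothetical helper for illustration; 128 MB is the typical block
    size quoted on slide 38, not a value read from a real cluster.
    """
    if file_size_mb < 0 or block_size_mb <= 0:
        raise ValueError("sizes must be non-negative and block size positive")
    full_blocks, remainder = divmod(file_size_mb, block_size_mb)
    sizes = [block_size_mb] * full_blocks
    if remainder:
        # The last block holds only the leftover data, so it uses
        # only the space it needs rather than a full block.
        sizes.append(remainder)
    return sizes


def raw_storage_mb(file_size_mb, replication=3):
    """Total storage used across the cluster, assuming 3 replicas per block."""
    return file_size_mb * replication


blocks = split_into_blocks(420)
print(blocks)               # [128, 128, 128, 36]
print(raw_storage_mb(420))  # 1260
```

Under these assumptions a 420 MB file occupies three full 128 MB blocks plus one 36 MB block, and the final block does not reserve the unused 92 MB.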
  • 40. HDFS Operation Principle
  • 41. NAME NODE
  • 42. DATA NODE
  • 43. SECONDARY NAME NODE
  • 44. HDFS – BLOCK REPLICATION ARCHITECTURE
  • 45.
  • 46. MAP REDUCE
  • 47. MAP REDUCE - ANALOGY
  • 48. MAP REDUCE – ANALOGY CONTD.,
  • 49. MAP REDUCE EXAMPLE
  • 50. MAP EXECUTION
  • 51. MAP EXECUTION – DISTRIBUTED TWO NODE ENVIRONMENT
  • 52. HADOOP JOB WORK INTERACTION
  • 53.
  • 54. Data Science - Big Data Technology. Collect, Load, Transform: ETL, SCRIBE, FLUME. Store: HADOOP, SPARK, STORM. Process, Analyze and Reasoning: computational algorithms, statistical methods and models; PIG, HIVE, R, PYTHON, JAVA, SCALA, CLOJURE, MAHOUT. Visualization: DASHBOARD, APP.
  • 55.
  • 56. Big Data - Market
  • 58. Potential Talent Pool - Big Data: India will require a minimum of 1 lakh data scientists in the next couple of years, in addition to data analysts and data managers, to support the Big Data space.
  • 59.
  • 60. Big Data Use Cases
  • 61.
  • 62. Big Data Use Case
  • 63. SBI: State Bank of India (SBI) recently ran its newly acquired data-mining software to check the purity of its data. It made an interesting find: close to one crore account holders have not provided any nomination for their savings accounts. What is worse, over half of them are senior citizens. To analyse trends in banks, SBI has hired a whole team of statisticians and economists. Goals: identify default patterns and high-value customers.
  • 64. Big Data Applications - India: Big Data in elections; SBI uses big data mining to check defaults; Karnataka Govt uses it to identify water leakage.
  • 65. Big Data Challenges. Privacy protection: applies to all Big Data stages (collect, store, process, knowledge). Integration with the enterprise landscape: all systems store data in RDBMS or DW; no support for bulk loading into the Big Data store. Limited number of analytics from Mahout. Big Data technologies lack visualization support and deliverable methods. Leveraging cloud computing for Big Data applications. Addressing real-time needs with varied format and volume.
  • 66.
  • 67.