BIG DATA
Basic Concepts
Introduction
Big data is a collection of large datasets that cannot be
processed using traditional computing techniques.
Big data includes the data produced by different devices
and applications.
The term describes a collection of data that is huge in
size and still growing exponentially with time.
Sources
  • Black Box Data
  • Social Media Data
  • Stock Exchange Data
  • Power Grid Data
  • Transport Data
  • Search Engine Data
  • Structured data
  • Unstructured data
Black Box Data
  • Voices of the flight crew
  • Recordings of microphones and earphones
  • Performance information of the aircraft
Social Media Data
  • Facebook data
  • Twitter data
  • Pinterest data
Stock Exchange Data
  • Shares purchased by customers
  • Shares sold by customers
  • Complete stock data
Transport Data
  • Model of vehicle
  • Capacity of vehicle
  • Distance-related data
Benefits
  • Understand market conditions
  • Control online reputation
  • New product development
  • Time reductions
3 V’s
Volume: The amount of data we deal with is very large, on the order of petabytes.
Velocity: Data is increasing at a very fast rate. It is estimated that the volume of data doubles every 2 years.
Variety: Data comes in all formats: structured, numeric data in traditional databases, and unstructured text documents, video, audio, email, and stock ticker data.
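The doubling estimate quoted for velocity implies exponential growth. The following is a minimal Python sketch of that arithmetic, assuming a hypothetical starting volume of 1 petabyte and the two-year doubling period from the slide. The function name projected_volume and the starting figure are illustrative assumptions, not part of the original deck.

```python
def projected_volume(initial_pb: float, years: float,
                     doubling_period_years: float = 2.0) -> float:
    """Return the projected data volume in petabytes after `years`.

    Assumes the volume doubles once every `doubling_period_years`.
    The two-year default is the slide's estimate, not a measured value.
    """
    return initial_pb * 2 ** (years / doubling_period_years)


if __name__ == "__main__":
    # Hypothetical starting point: 1 PB today.
    for years in (2, 4, 10):
        print(f"after {years} years: {projected_volume(1.0, years):g} PB")
```

Under these assumptions, 1 PB grows to 2 PB after two years and to 32 PB after ten, since ten years contain five doubling periods.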
Technologies
Operational Big Data
This includes systems like MongoDB that provide operational capabilities for real-time, interactive workloads where data is primarily captured and stored.
Analytical Big Data
This includes systems like Massively Parallel Processing (MPP) database systems and MapReduce that provide analytical capabilities for retrospective and complex analysis that may touch most or all of the data.
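The difference between the two workload types can be illustrated with a small sketch. This is a plain Python stand-in, not MongoDB or an MPP database: the OrderStore class and its method names are hypothetical and exist only to contrast single-record access (operational) with a full scan that aggregates every record (analytical).

```python
from collections import defaultdict


class OrderStore:
    """Hypothetical in-memory store used only to contrast the two workloads."""

    def __init__(self):
        self._orders = {}

    def record_order(self, order_id, customer, amount):
        """Operational: capture and store one record at a time."""
        self._orders[order_id] = {"customer": customer, "amount": amount}

    def get_order(self, order_id):
        """Operational: fetch a single record by its key."""
        return self._orders[order_id]

    def total_by_customer(self):
        """Analytical: scan every stored record and aggregate the amounts."""
        totals = defaultdict(float)
        for order in self._orders.values():
            totals[order["customer"]] += order["amount"]
        return dict(totals)
```

The operational methods touch one record per call, while total_by_customer reads all of them, which is the access pattern that analytical systems are built to spread across many machines.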
Challenges
  • Capturing data
  • Curation
  • Storage
  • Searching
  • Sharing
  • Transfer
  • Analysis
  • Presentation
Solution
Storage: To store this huge amount of data, Hadoop uses HDFS (Hadoop Distributed File System).
Processing: The MapReduce paradigm is applied to data distributed over the network to find the required output.
Analyze: Pig and Hive can be used to analyze the data.
Cost: Hadoop is open source, so the cost of the software is no longer an issue.
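The MapReduce paradigm mentioned under Processing can be sketched on a single machine. The example below is a word count in plain Python and does not use the Hadoop API. The function names (map_phase, shuffle, reduce_phase, word_count) are illustrative assumptions, and each text chunk stands in for a block of data that a real cluster would hold on a separate node.

```python
from collections import defaultdict


def map_phase(chunk):
    """Emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values collected for one key into a single count."""
    return key, sum(values)


def word_count(chunks):
    """Run map, shuffle and reduce over a list of text chunks."""
    mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


if __name__ == "__main__":
    # Each string stands in for a data block stored on a different node.
    print(word_count(["big data", "Big data tools"]))
```

In a real deployment the map calls would run in parallel where the data is stored and only the grouped intermediate pairs would cross the network. Here everything runs sequentially to keep the sketch self-contained.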
