UNIT : II
Characteristics of Data
• Composition: deals with the structure of data, i.e. sources of
data, types of data, nature of data.
• Condition: deals with the state of data.
• Context: deals with the generation of data and the sensitivity of data.
Evolution of Big Data
• In the 1970s: data was essentially primitive and
structured.
• In the 1980s and 1990s: relational databases evolved,
so this was the era of data-intensive applications.
• In 2000 and beyond: the WWW and IoT have led to
structured, unstructured and multimedia data.
Big Data
How do we define Big Data?
• It's anything beyond imagination.
• Today's BIG may be tomorrow's NORMAL.
• Terabytes, Petabytes or Zettabytes of data.
• It is about the 3 Vs.
• In 2001, industry analyst Doug Laney characterized what is now called
“Big Data” by the three V’s (3Vs): Volume, Velocity and Variety.
• In 2012, Gartner updated this definition: “Big Data” is high-volume,
high-velocity and high-variety information assets that demand
cost-effective, innovative forms of information processing for enhanced
insight and decision making.
• Big Data is an evolving term that describes any voluminous amount
of structured, semi-structured and unstructured data that has the
potential to be mined for information.
Challenges with Big Data
Capture
Storage
Curation
Search
Analysis
Transfer
Visualization
Privacy
Characteristics of Big Data
Big Data is characterized by three properties:
Extremely large Volume of data
Extremely high Velocity of data
Extremely wide Variety of data
Other characteristics of data which
are not definitional for Big Data
• Veracity and Validity: deal with abnormality, accuracy and
correctness of data
• Volatility: deals with how long data remains valid
• Variability: deals with data flow, which can be highly inconsistent
Why Big Data?
More Data
More Accurate Analysis
More Confidence in
decision making
Greater impact in terms of enhancing
operational efficiency,
reducing cost & time,
innovating new products and new services,
optimizing offerings, etc.
Are we only consumers, or also
information producers?
Consider one scenario:
1. A text message inviting you to a party.
2. Use of a credit/debit card at the petrol pump.
3. A point-of-sale system at Archie's shop.
4. Photographs & posts on social networking
sites.
5. Likes & comments on your post.
BI Versus Big Data
Business Intelligence (BI)
1. All of the enterprise's data is
housed in a central server
2. A typical database server
scales vertically
3. BI data is analyzed in an offline
mode
4. BI is about structured data
5. Move data to code
Big Data
1. Data resides in a
distributed file system
2. A distributed file system
scales horizontally
3. Big Data is analyzed in both
real time and
offline mode
4. Big Data is about a variety
of data
5. Move code to data
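The "move code to data" idea can be sketched in a few lines of Python. This is a single-process illustration only: the node names, the `partitions` dictionary and the `run_on_nodes` helper are hypothetical stand-ins for a real cluster, not part of any Big Data framework. The point it shows is that a small function travels to each partition and only a small partial result travels back.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical cluster: each "node" keeps its own partition of the records.
partitions: Dict[str, List[int]] = {
    "node-a": [3, 8, 1],
    "node-b": [7, 2],
    "node-c": [5, 9, 4, 6],
}


def run_on_nodes(
    task: Callable[[List[int]], Tuple[int, int]],
    parts: Dict[str, List[int]],
) -> Dict[str, Tuple[int, int]]:
    """Apply the same small function to every partition where it lives.

    In a real system the function would be shipped over the network;
    here the loop simply simulates that step.
    """
    return {node: task(records) for node, records in parts.items()}


def partial_sum_and_count(records: List[int]) -> Tuple[int, int]:
    """The 'code' that moves: it returns two numbers, not the raw records."""
    return sum(records), len(records)


# Only the small partial results are combined centrally.
partials = run_on_nodes(partial_sum_and_count, partitions)
total = sum(s for s, _ in partials.values())
count = sum(c for _, c in partials.values())
mean = total / count
print(f"mean of all records: {mean}")
```

In the BI model the nine records would first be copied to a central server; here each node returns just a `(sum, count)` pair.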
Typical Data Warehouse Environment
ERP
(Enterprise Resource
Planning)
CRM
(Customer Relationship
Management)
Third party apps
Legacy System
Data
Warehouse
Reporting/
Dashboarding
OLAP
Ad hoc querying
Modeling
Typical Hadoop Environment
Web Logs
Images and Videos
Docs and PDFs
Social Media
HDFS
Operational System
Data Warehouse
Data Mart
ODS
(Operational Data Store)
Data Mart
Hadoop
MapReduce
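To make the MapReduce label in this environment concrete, here is a minimal word-count sketch in plain Python. It imitates the map, shuffle and reduce phases inside one process and does not use the Hadoop API; the function names and the sample lines are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_phase(line: str) -> List[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by their key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: collapse the values of one key into a single count."""
    return key, sum(values)


# Hypothetical input, standing in for lines read from files in HDFS.
lines = ["big data big insight", "data moves fast"]

mapped = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(counts)
```

In a real Hadoop job the map calls run in parallel on the nodes that hold the input blocks, and the framework performs the shuffle between machines.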
Functional Requirements of Big Data
Big Data
(1) Collection
(2) Integration
(3) Analysis
(4) Actions / Decisions
Big Data Stack
• The Big Data technical stack describes a layered
architecture.
• It is a way to think about Big Data.
• It deals with
– Storage
– Analytics
– Reporting
– Applications
• Let's watch this video....
Big Data Stack
Layer 0
Layer 1
Layer 2
Layer 3
Layer 4
Big Data Stack
Layer 0 (Redundant Physical Infrastructure):
Deals with hardware, network and so on.
• Performance: How responsive do you need the system to be?
It depends on the performance of your machines; very fast infrastructure tends
to be very expensive.
• Availability: Do you need a 100% uptime guarantee of
service? Highly available infrastructure is very expensive.
• Scalability: How big does your infrastructure need to be?
How much disk space is needed?
• Flexibility: How quickly can you add more resources to the
infrastructure?
• Cost: What can you afford?
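The availability and scalability questions above come down to simple arithmetic, sketched here in Python. The function names are hypothetical, and the defaults (three copies of every block, 25% of raw capacity kept free) are assumptions chosen for illustration, not recommendations from the slides.

```python
HOURS_PER_YEAR = 365 * 24  # assumes a non-leap year


def allowed_downtime_hours(uptime_percent: float) -> float:
    """Hours per year a service may be down at the given uptime target."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return (100 - uptime_percent) / 100 * HOURS_PER_YEAR


def raw_disk_tb(logical_tb: float, replication: int = 3,
                headroom: float = 0.25) -> float:
    """Raw disk needed when every block is stored `replication` times
    and a `headroom` fraction of the raw capacity is kept free."""
    if replication < 1 or not 0 <= headroom < 1:
        raise ValueError("replication must be >= 1 and headroom in [0, 1)")
    return logical_tb * replication / (1 - headroom)


for target in (99.0, 99.9, 99.99):
    hours = allowed_downtime_hours(target)
    print(f"{target}% uptime allows {hours:.2f} hours of downtime per year")

print(f"100 TB of data needs about {raw_disk_tb(100):.0f} TB of raw disk")
```

Each extra "nine" of uptime cuts the allowed downtime by a factor of ten, which is one reason highly available infrastructure is costly.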
Big Data Stack
Layer 1 (Security Infrastructure):
Security and privacy requirements for Big Data are similar to the
requirements for conventional data environments.
• Data Access: Data should be available only to authorized persons.
• Application Access: Most APIs offer protection from
unauthorized usage or access.
• Data Encryption: It is the most challenging aspect of security in a Big Data
environment.
• Threat Detection: The inclusion of mobile devices and social
networks exponentially increases both the amount of data and the
opportunities for security threats.
Big Data Stack
Layer 2 (Operational Databases):
• A Big Data environment needs a
fast and scalable database engine.
• Using an RDBMS for Big Data is often not a practical
solution.
• Choose the proper database.
• Your database must support ACID (Atomicity, Consistency, Isolation, Durability).
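As a small illustration of the "A" in ACID (atomicity), the sketch below uses Python's built-in `sqlite3` module. The `accounts` table, the names and the `transfer` helper are hypothetical; SQLite is used only because it ships with Python, not because it is a Big Data engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(db: sqlite3.Connection, src: str, dst: str, amount: int) -> bool:
    """Move `amount` from src to dst; both updates succeed or neither does."""
    try:
        with db:  # commits on success, rolls back if an exception is raised
            db.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                       (amount, dst))
            db.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                       (amount, src))
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance.
        return False


ok = transfer(conn, "alice", "bob", 30)       # valid transfer
failed = transfer(conn, "alice", "bob", 500)  # would overdraw alice
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, failed, balances)
```

In the second call the credit to `bob` is applied first and then undone when the debit fails, so no partial transfer is ever visible.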
Big Data Stack
Layer 3 (Organizing Data Services and Tools):
Organizing data services and tools capture, validate and assemble
various Big Data elements into contextually relevant collections.
Because Big Data is massive,
tools need to provide integration, translation, normalization and scale.
Technologies in this layer are as follows:
• A Distributed File System
• Serialization Service
• Coordination Services
• Extract, Transform and Load (ETL) Tools
• Workflow Services
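The capture, validate and assemble steps of this layer can be sketched as a tiny ETL (extract, transform, load) run in Python. The sample CSV text, the field names and the choice of JSON lines as the serialized output are all assumptions made for illustration; real ETL tools work against files, queues and databases.

```python
import csv
import io
import json
from typing import Dict, List

# Hypothetical raw input with inconsistent spacing, case and a bad value.
RAW = "name,amount,currency\n alice ,10.50,usd\nBob,n/a,USD\ncarol,7,eur\n"


def extract(text: str) -> List[Dict[str, str]]:
    """Extract: parse CSV text into one dictionary per row."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows: List[Dict[str, str]]) -> List[Dict[str, object]]:
    """Transform: validate the amount and normalize names and currencies."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # validation: drop rows whose amount is not numeric
        clean.append({
            "name": row["name"].strip().title(),
            "amount": amount,
            "currency": row["currency"].strip().upper(),
        })
    return clean


def load(rows: List[Dict[str, object]]) -> str:
    """Load: serialize each record as one JSON line (stand-in for a real sink)."""
    return "\n".join(json.dumps(row, sort_keys=True) for row in rows)


records = transform(extract(RAW))
output = load(records)
print(output)
```

The row with the non-numeric amount is rejected during validation, and the remaining rows come out in one consistent shape.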
Big Data Stack
Layer 4 (Analytical Data Warehouses):
• Data warehouses and data marts contain normalized data
gathered from a variety of sources and assembled to facilitate
analysis of the business.
• They are used for the creation of reports and the visualization of disparate data
items.
Big Data Analytics:
It requires proper analytical tools.
This architecture lists three classes of tools:
• Reporting and dashboards: these tools provide a
“user-friendly” representation of information.
• Visualization:
• Analytics and Advanced Analytics:
Big Data Applications:
Need to choose categories of applications.
