Hassnain Ali 15081598-066
Nadeem Tahir 15081598-106
What is Big Data?
“Big data is the data characterized by 4 key
attributes: volume, variety, velocity and
value.”
-- Oracle
Let’s look at
Big Data
in a different way.
Byte : one grain of rice
Kilobyte : a cup of rice
Megabyte : 8 bags of rice
Gigabyte : 3 semi trucks
Terabyte : 2 container ships
Petabyte : blankets Manhattan
Exabyte : blankets the West Coast states
Zettabyte : fills the Pacific Ocean
Yottabyte : an Earth-size rice ball!

Hobbyist : Byte, Kilobyte
Desktop : Megabyte, Gigabyte
Internet : Terabyte, Petabyte
Big Data : Exabyte, Zettabyte
The Future? : Yottabyte
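The unit ladder above can also be written as plain arithmetic. The following is a minimal sketch, assuming decimal (SI) units in which each step is 1000 times the previous one; the helper names `unit_size` and `humanize` are illustrative and not part of the original slides.

```python
# Decimal (SI) storage units; each one is 1000 times the previous one.
UNITS = ["byte", "kilobyte", "megabyte", "gigabyte", "terabyte",
         "petabyte", "exabyte", "zettabyte", "yottabyte"]


def unit_size(name: str) -> int:
    """Return the number of bytes in one unit, e.g. 'megabyte' -> 10**6."""
    return 1000 ** UNITS.index(name)


def humanize(n_bytes: int) -> str:
    """Express a byte count in the largest unit that keeps the value >= 1."""
    index = 0
    value = float(n_bytes)
    while value >= 1000 and index < len(UNITS) - 1:
        value /= 1000
        index += 1
    return f"{value:.1f} {UNITS[index]}s"


if __name__ == "__main__":
    for position, name in enumerate(UNITS):
        print(f"1 {name} = 10^{3 * position} bytes")
    print(humanize(2_500_000_000_000))
```

Binary units (1024 per step) would give slightly different numbers; the decimal convention is chosen here only because it keeps the powers of ten easy to read.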
Big Data is not about the size of the data,
it’s about the value within the data.
We are generating huge
amounts of data.
Data with a
lot of information.
… and a lot of noise.
The ability to hear the signal
from the noise is the key…
to unlocking the human conversation
that is taking place around us.
Did it work?
Most people don’t know
what to do with all the data
that they already have…
Get Big
by starting
small
Big Data isn’t big, if you know
how to use it.
Storing Big Data
• Data is playing an increasingly important role in
business and science.
• Storing, searching, sharing, analysing and visualising big
data has become a challenge.
• Storage in particular is often overlooked as an
issue. Note that a MySQL database is sometimes not
enough.
• Hadoop offers an out of the box distributed filesystem for
storing data files. However, the challenge appears when
someone needs DB capabilities, frequent updates or real
Problems Nowadays
• Traditional relational databases can reach their performance
limits.
• Data keeps arriving at high velocity, in high volume, and in high
variety.
• Common ways to increase performance stop helping after a while:
buying a faster server, adding more RAM, using materialised
views, fine-tuning queries...
• Furthermore, “alter table” does not work well on tables with lots of
data. Backups and data availability become issues.
NoSQL
• The term is too broad and too new to define precisely.
• Typically no fixed schema
• Typically no joins between tables
• No common query language (like SQL)
• Often relaxed ACID (atomicity, consistency, isolation, durability) guarantees
• On the other hand you gain horizontal scalability and high performance.
Also, most NoSQL systems are Map/Reduce ready and/or integrate with
Hadoop.
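To make the Map/Reduce idea concrete, here is a minimal single-process sketch of a word count. It only imitates the map, shuffle and reduce phases in plain Python; it is not Hadoop's API, and the function names are illustrative.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_phase(document: str) -> List[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group emitted values by key, as a framework would do
    between the map and reduce phases."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(documents: Iterable[str]) -> Dict[str, int]:
    """Run the three phases in sequence over a collection of documents."""
    emitted = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(key, values)
                for key, values in shuffle(emitted).items())
```

In a real cluster the map and reduce calls would run on different machines, which is where the horizontal scalability comes from; this sketch keeps everything in one process for readability.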
MongoDB example:
A document is represented in a JSON-like format:
{
  "_id" : 12345678,
  "Link" : "http://news.scotsman.com/abc.html",
  "Title" : "Blah blah blah",
  "Content" : "More blah blah",
  "OutletID" : 14,
  "Date" : ISODate("2011-11-17T20:33:15.097Z"),
  "Hash" : 550973592,
  "Tags" : ["International", "News", "Scotland"]
}
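The document above can be mirrored as a plain Python dictionary to show what "no schema" means in practice. This is a sketch only: the second document, its `Author` field, and the `find_by_tag` helper are hypothetical, and no database driver is involved.

```python
from datetime import datetime, timezone

# Two documents that could live in the same collection even though
# they do not share the same set of fields.
articles = [
    {
        "_id": 12345678,
        "Title": "Blah blah blah",
        "OutletID": 14,
        "Date": datetime(2011, 11, 17, 20, 33, 15, 97000, tzinfo=timezone.utc),
        "Tags": ["International", "News", "Scotland"],
    },
    {
        "_id": 12345679,            # hypothetical second document
        "Title": "Another story",
        "Author": "A. Writer",      # field that the first document lacks
        "Tags": ["Sport"],
    },
]


def find_by_tag(docs, tag):
    """Return every document whose Tags list contains the given tag.
    Documents without a Tags field are skipped rather than rejected."""
    return [doc for doc in docs if tag in doc.get("Tags", [])]


if __name__ == "__main__":
    for doc in find_by_tag(articles, "News"):
        print(doc["_id"], doc["Title"])
```

The point of the sketch is that queries have to tolerate missing fields, because nothing forces every document to have the same shape.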
MongoDB - Replication
Master/Slave
Single Server
MongoDB - Sharding
If a new shard is added, data is balanced automatically.
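As a rough illustration of automatic balancing, the toy model below treats each shard as a list of chunk IDs and moves chunks from the fullest shard to the emptiest one until the counts are even. It is a simplified sketch under that assumption, not MongoDB's actual balancer, and the function name is made up for the example.

```python
from typing import Dict, List


def add_shard_and_balance(shards: Dict[str, List[int]],
                          new_shard: str) -> Dict[str, List[int]]:
    """Add an empty shard, then move chunks one at a time from the
    fullest shard to the emptiest one until the chunk counts differ
    by at most one. The input mapping is left unchanged."""
    balanced = {name: list(chunks) for name, chunks in shards.items()}
    balanced[new_shard] = []
    while True:
        fullest = max(balanced, key=lambda name: len(balanced[name]))
        emptiest = min(balanced, key=lambda name: len(balanced[name]))
        if len(balanced[fullest]) - len(balanced[emptiest]) <= 1:
            return balanced
        balanced[emptiest].append(balanced[fullest].pop())


if __name__ == "__main__":
    cluster = {"shard0": [0, 1, 2, 3, 4, 5], "shard1": [6, 7, 8, 9, 10, 11]}
    for name, chunks in add_shard_and_balance(cluster, "shard2").items():
        print(name, sorted(chunks))
```

Moving whole chunks rather than individual records keeps the amount of bookkeeping small, which is the reason the sketch is organised around chunk counts.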
Data Processing
• Without data processing, organizations cannot make use of the
massive amounts of data that could help them gain a competitive
edge and give them insight into sales, marketing strategies and
consumer needs. Companies large and small need to
understand the necessity of data processing.
• Data processing occurs when data is collected and translated
into usable information.
The Six Stages of Data Processing
• Data Collection
• Data Preparation
• Data Input
• Processing
• Data Output/Interpretation
• Data Storage
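The six stages listed above can be sketched as a small pipeline with one function per stage. The sales records, region names and function names are invented for illustration, and a JSON string stands in for real storage.

```python
import json
import statistics
from typing import Dict, List


def collect() -> List[str]:
    """1. Data collection: raw records from a hypothetical source."""
    return ["  north, 120 ", "south,95", "north,80", "bad row", "south, 115"]


def prepare(raw: List[str]) -> List[str]:
    """2. Data preparation: trim whitespace and drop malformed rows."""
    cleaned = [row.strip() for row in raw]
    return [row for row in cleaned if row.count(",") == 1]


def load(rows: List[str]) -> List[Dict[str, object]]:
    """3. Data input: convert cleaned text into typed records."""
    records = []
    for row in rows:
        region, amount = row.split(",")
        records.append({"region": region.strip(), "sales": float(amount)})
    return records


def process(records: List[Dict[str, object]]) -> Dict[str, float]:
    """4. Processing: compute the average sales figure per region."""
    by_region: Dict[str, List[float]] = {}
    for record in records:
        by_region.setdefault(record["region"], []).append(record["sales"])
    return {region: statistics.mean(values)
            for region, values in by_region.items()}


def interpret(summary: Dict[str, float]) -> str:
    """5. Data output/interpretation: turn numbers into a statement."""
    best = max(summary, key=summary.get)
    return f"Highest average sales: {best} ({summary[best]:.1f})"


def store(summary: Dict[str, float]) -> str:
    """6. Data storage: serialize the result (a JSON string here)."""
    return json.dumps(summary, sort_keys=True)


if __name__ == "__main__":
    summary = process(load(prepare(collect())))
    print(interpret(summary))
    print(store(summary))
```

Keeping each stage as its own function makes it easy to replace one of them, for example swapping the in-memory `collect` for a file reader, without touching the rest.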
The Future of Data Processing
The future of data processing lies in the cloud. Cloud technology
builds on the convenience of current electronic data processing
methods and increases their speed and effectiveness. Faster,
higher-quality data means more data for each organization to
utilize and more valuable insights to extract.
Big data tools:
1. Apache Hadoop
2. Microsoft HDInsight
3. NoSQL
4. Hive
5. Sqoop
6. PolyBase
7. Big data in Excel
8. Presto
Big Data Techniques
Quantitative Analysis
Quantitative analysis is a data analysis technique that focuses on quantifying
the patterns and correlations found in the data. Based on statistical practices,
this technique involves analyzing a large number of observations from a dataset.
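One common way to quantify a correlation is the Pearson coefficient. The sketch below computes it in plain Python; the choice of Pearson is an assumption made for the example, since the slides do not name a specific measure.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long samples.
    Returns a value between -1 (inverse) and +1 (direct relationship)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return covariance / math.sqrt(spread_x * spread_y)


if __name__ == "__main__":
    ad_spend = [1.0, 2.0, 3.0, 4.0]      # made-up observations
    revenue = [2.1, 3.9, 6.2, 7.8]
    print(round(pearson(ad_spend, revenue), 3))
```

A value near +1 or -1 only says that the two series move together; it does not by itself show that one causes the other.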
Qualitative Analysis
Qualitative analysis is a data analysis technique that focuses
on describing various data qualities using words. It involves
analyzing a smaller sample in greater depth compared to
quantitative data analysis. These analysis results cannot be
generalized to an entire dataset due to the small sample size.
Data Mining
Data mining, also known as data discovery, is a specialized form of
data analysis that targets large datasets. In relation to Big Data
analysis, data mining generally refers to automated, software-based
techniques that sift through massive datasets to identify patterns and
trends.
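A very small example of automated pattern finding is counting which items appear together most often. The sketch below is a simplified co-occurrence count, not a full data mining algorithm; the shopping baskets and the `min_support` threshold are made up for illustration.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Iterable, List, Tuple


def frequent_pairs(transactions: Iterable[List[str]],
                   min_support: int) -> Dict[Tuple[str, str], int]:
    """Count how often each pair of items occurs in the same transaction
    and keep the pairs seen at least min_support times."""
    counts: Counter = Counter()
    for items in transactions:
        # Sort and de-duplicate so ("bread", "milk") and ("milk", "bread")
        # are counted as the same pair.
        for pair in combinations(sorted(set(items)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


if __name__ == "__main__":
    baskets = [
        ["bread", "milk"],
        ["bread", "butter", "milk"],
        ["butter", "jam"],
        ["milk", "bread"],
    ]
    print(frequent_pairs(baskets, min_support=3))
```

On massive datasets the number of candidate pairs grows quickly, which is why practical tools prune candidates early instead of counting everything as this sketch does.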
Statistical Analysis
Statistical analysis uses statistical methods based on mathematical formulas as a
means for analyzing data. Statistical analysis is most often quantitative, but can also be
qualitative. This type of analysis is commonly used to describe datasets via
summarization, such as providing the mean, median, or mode of statistics associated
with the dataset.
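The summary measures named above are available in Python's standard `statistics` module. A minimal sketch, with a made-up sample and an illustrative `summarize` helper:

```python
import statistics
from typing import Dict, Sequence


def summarize(values: Sequence[float]) -> Dict[str, float]:
    """Describe a dataset with its mean, median and mode."""
    return {
        "mean": statistics.mean(values),      # arithmetic average
        "median": statistics.median(values),  # middle value when sorted
        "mode": statistics.mode(values),      # most frequent value
    }


if __name__ == "__main__":
    print(summarize([2, 3, 3, 5, 7]))
```

The three measures can disagree noticeably on skewed data, so reporting more than one of them usually gives a fairer picture of the dataset.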
Machine Learning
Humans are good at spotting patterns and relationships within data.
Unfortunately, we cannot process large amounts of data very quickly.
Machines, on the other hand, are very adept at processing large amounts of
data quickly, but only if they know how.
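One way to read "only if they know how" is that a program needs a procedure for learning a pattern from examples. The sketch below fits a straight line by ordinary least squares, which is about the simplest such procedure; the data points and function names are assumptions for illustration.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Learn slope and intercept of y = slope * x + intercept from
    example points using ordinary least squares."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) examples")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    if spread_x == 0:
        raise ValueError("all x values are identical; no line can be fitted")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / spread_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model: Tuple[float, float], x: float) -> float:
    """Apply the learned line to an input that was not in the examples."""
    slope, intercept = model
    return slope * x + intercept


if __name__ == "__main__":
    model = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
    print(model, predict(model, 10))
```

The relationship between x and y is never written into the program; it is recovered from the examples, which is the essential difference from hand-coded rules.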
Semantic Analysis
A fragment of text or speech data can carry different meanings in different
contexts, whereas a complete sentence may retain its meaning even if
structured in different ways. For machines to extract valuable
information, text and speech data need to be understood by the machines
in the same way as humans understand them. Semantic analysis represents practices for
extracting meaningful information from textual and speech data.
Visual Analysis
Visual analysis is a form of data analysis that involves the
graphic representation of data to enable or enhance its visual
perception. Based on the premise that humans can
understand and draw conclusions from graphics more quickly
than from text, visual analysis acts as a discovery tool in the
field of Big Data.
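Even a text-only bar chart shows why a graphic is read faster than a table of numbers. The sketch below assumes positive values and uses only the standard library; the category names and counts are made up.

```python
from typing import Dict


def bar_chart(data: Dict[str, float], width: int = 40) -> str:
    """Render positive values as horizontal bars scaled to `width`
    characters, one line per label."""
    largest = max(data.values())
    label_width = max(len(label) for label in data)
    lines = []
    for label, value in data.items():
        bar = "#" * round(width * value / largest)
        lines.append(f"{label.ljust(label_width)} | {bar} {value}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(bar_chart({"News": 120, "Sport": 45, "International": 80}))
```

A plotting library would normally be used for this; the plain-text version is kept here so the example runs without any extra installation.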
Intro to big data and how it works
