Big Data
Structuring, Modeling, Managing
COURSE BY DAAN GERITS
CONTENT
1 ANATOMY
Dive into the techniques that make data systems scale
2 DATA AT SCALE
What is so different about working with data the traditional way vs
the big data way?
3 DATA MODELS
An overview of the most popular types of data models
4 ADVICE
So what to make of all this?
Course by Daan Gerits
Data expert at design is dead
Co-Founder of Fitchain.io
data unicorn,
technopreneur,
founder
Daan Gerits
@daangerits
Co-Founder of Bigdata.be
01
ANATOMY
Discover the techniques that make data systems scale
• Replication and Partitioning
• System load
Replication
What?
Copy data across physical nodes
Why?
Improve reliability and fault tolerance
How?
Create replicas of the data and keep them in sync
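A minimal sketch of replication, assuming a hypothetical in-memory `Node` class and synchronous replication (every write is copied to all live replicas before it returns). Real systems replicate over a network and often do so asynchronously; the names here are illustrative only.

```python
class Node:
    """A hypothetical storage node holding its own copy of the data."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.alive = True


class ReplicatedStore:
    """Keeps replicas in sync by applying each write to every live node."""

    def __init__(self, nodes):
        self.nodes = nodes

    def put(self, key, value):
        # Synchronous replication: copy the write to every live replica.
        for node in self.nodes:
            if node.alive:
                node.data[key] = value

    def get(self, key):
        # Any live replica can serve the read, which is where the
        # fault tolerance comes from.
        for node in self.nodes:
            if node.alive:
                return node.data[key]
        raise RuntimeError("no live replica available")


nodes = [Node("a"), Node("b"), Node("c")]
store = ReplicatedStore(nodes)
store.put("users.214.name", "Daan Gerits")
nodes[0].alive = False  # simulate a node failure
value_after_failure = store.get("users.214.name")
```

The read still succeeds after the first node fails because the other replicas hold the same data.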
Partitioning
What?
Partition the data and distribute it across physical nodes
Why?
Scale data systems
How?
Logical partitioning key
Same partitioning key goes to same node
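The rule "same partitioning key goes to same node" can be sketched with hash partitioning. This is one possible scheme under stated assumptions, not necessarily the one the course has in mind: `node_for`, `put`, and the three in-memory nodes are hypothetical.

```python
import hashlib


def node_for(partition_key, node_count):
    """Map a partitioning key to a node index; equal keys give equal results."""
    # A stable hash is used because Python's built-in hash() for strings
    # is randomized between interpreter runs.
    digest = hashlib.sha256(partition_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % node_count


nodes = [dict() for _ in range(3)]  # three hypothetical nodes


def put(partition_key, field, value):
    # All fields of one partition key end up on the same node.
    node = nodes[node_for(partition_key, len(nodes))]
    node.setdefault(partition_key, {})[field] = value


put("customers/214", "name", "Daan Gerits")
put("customers/214", "birthday", "18/05/1983")
put("customers/583", "name", "Wim Van Leuven")
```

Plain modulo hashing moves most keys when the node count changes, which is why production systems often use consistent hashing or fixed partition counts instead.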
Load
Read Heavy
Most of the operations are read operations
Write Heavy
Most of the operations are write operations
Balanced
Roughly as many read operations as write operations
How you store the data depends on how you query the data
02
DATA AT SCALE
To seasoned data professionals a lot of the techniques and
approaches do not seem so different to what they have done
during the past decades. So what is so different?
At the core of big data is the ability to deal with the volume, variety and velocity of data.
Big Data is all about new ways of thinking about data
THINK DIFFERENT
OPERATIONAL
Automate your processes through the use of data
BUSINESS
Change the metrics you use to measure success
PERSONAL
Data makes people important again. This doesn’t stop with the customer
TRADITIONAL APPROACH
[Diagram: Supply, a single Model, three Requests]
BIG DATA APPROACH
[Diagram: Supply, three Models, three Requests]
03
DATA MODELS
How you want to retrieve your data has an impact on how you
store your data. These data models provide almost standard
approaches to do so.
HOW DATA IS STORED
GRAPH
Data model built out of nodes and their connections
COLUMN FAMILY
Seriously powerful but complex data model, ideal for sparse data
KEY-VALUE
A very simple data model mapping a key to a value
DOCUMENT
A data model where the structure of every value can be different
KEY-VALUE
KEY                       VALUE
users.214.name            Daan Gerits
users.214.birthdate       18/05/1983
users.214.roles           [user, admin]
users.214.isSubscribed    true
users.214.social.twitter  @daangerits
KEY-VALUE
Fast lookups
- But no way to query the data by value
- Prefix scanning is possible if keys are ordered
Flexible value types
- Key and value can be anything, even collections and more complex data structures
Easy to scale
- Little to no dependencies between key-value pairs
- Keeping keys ordered can become difficult at scale
Use cases
- Caches
- Configuration
KEY-VALUE
SCAN <prefix>
Scan through all pairs where the key matches the given prefix. This is only possible if the keys are ordered.
GET <key>
Get a key-value pair by its key
SET <key> <value>
Set the value of the given key
DELETE <key>
Remove the pair with the given key
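The four operations above can be sketched as a small in-memory store. This is an illustration under assumptions, not any particular product's API: the class name `KeyValueStore` and its method names are hypothetical, and keys are kept in a sorted list so that `scan` can rely on ordering.

```python
import bisect


class KeyValueStore:
    def __init__(self):
        self._values = {}
        self._sorted_keys = []  # kept ordered so prefix scans are possible

    def set(self, key, value):
        if key not in self._values:
            bisect.insort(self._sorted_keys, key)
        self._values[key] = value

    def get(self, key):
        return self._values[key]

    def delete(self, key):
        del self._values[key]
        self._sorted_keys.remove(key)

    def scan(self, prefix):
        # Jump to the first key >= prefix, then walk while the prefix matches.
        start = bisect.bisect_left(self._sorted_keys, prefix)
        for key in self._sorted_keys[start:]:
            if not key.startswith(prefix):
                break
            yield key, self._values[key]


store = KeyValueStore()
store.set("users.214.name", "Daan Gerits")
store.set("users.214.roles", ["user", "admin"])
store.set("users.583.name", "Wim Van Leuven")
matches = list(store.scan("users.214."))
```

Because the keys are ordered, all keys sharing a prefix sit next to each other, so the scan stops at the first non-matching key instead of visiting every pair.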
DOCUMENT
KEY     DOCUMENT
daan    {
          "name": "Daan Gerits",
          "birthday": "18/05/1983"
        }
wim     {
          "name": "Wim Van Leuven",
          "company": "Highestpoint"
        }
DOCUMENT
Queryable
- Technology-specific query language
- Separate index needs to be kept in sync
Flexible value types
- Key can be anything
- Value is a structured type (JSON, BSON, XML, …)
Scalability requires caution
- Relationships between documents
- Scaling search can become a hurdle
Use cases
- Search engines
- Entity data stores
DOCUMENT
FIND <query>
Find all documents matching the given query
GET <key>
Get the document matching the given key
CREATE <key> <document>
Create a new document with the given key
UPDATE <key> <field> <value>
Update the given field within the document with the given key
DELETE <key>
Remove the document with the given key
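A minimal sketch of these document operations, assuming a hypothetical `DocumentStore` class and a very simple query form (a dict of field to required value). Real document stores have their own query languages and keep separate indexes; here `find` is a full scan.

```python
class DocumentStore:
    def __init__(self):
        self._docs = {}

    def create(self, key, document):
        if key in self._docs:
            raise KeyError(f"document {key!r} already exists")
        self._docs[key] = dict(document)

    def get(self, key):
        return self._docs[key]

    def update(self, key, field, value):
        self._docs[key][field] = value

    def delete(self, key):
        del self._docs[key]

    def find(self, query):
        # Hypothetical query form: every listed field must equal the value.
        # Without a separate index this scans all documents.
        return [
            key
            for key, doc in self._docs.items()
            if all(doc.get(field) == value for field, value in query.items())
        ]


store = DocumentStore()
store.create("daan", {"name": "Daan Gerits", "birthday": "18/05/1983"})
store.create("wim", {"name": "Wim Van Leuven", "company": "Highestpoint"})
store.update("wim", "birthday", "10/05/1973")
found = store.find({"company": "Highestpoint"})
```

The two documents have different fields, which is the point of the model: the structure of every value can differ.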
GRAPH
Nodes:
1 - Name: Daan, Type: Tutor
2 - Name: Els, Type: Tutor
3 - Name: bigdata, Type: Course
4 - Name: Amy, Type: Student
Edges (labels shown in the diagram): teaches, teaches, friend of, enrolled in
GRAPH
Relationships are first-class citizens
- Graph traversal in a specific language
- Updating relationships is cheap
Easy concepts
- Node with properties
- Edge
Very hard to scale
- Golden Ratio
- Scaling requires deep knowledge of the data
Use cases
- Social modeling
- Metadata stores
GRAPH
LINK <type> <src-node-id> <target-node-id>
Create a new link with the given characteristics
UNLINK <type> <src-node-id> <target-node-id>
Remove the link with the given characteristics
GET <node-id>
Get the node with the given node id
SET <node-id> <properties>
Set the properties of the node with the given id
DELETE <node-id>
Remove the node with the given id
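These graph operations can be sketched with a dictionary of nodes and a set of typed links. The `Graph` class is hypothetical, the `targets` helper is an addition for illustration, and which nodes each edge connects in the usage below is an assumption, since the slide's diagram only preserves the edge labels.

```python
class Graph:
    def __init__(self):
        self.nodes = {}     # node id -> properties
        self.links = set()  # (type, source id, target id)

    def set(self, node_id, properties):
        self.nodes[node_id] = dict(properties)

    def get(self, node_id):
        return self.nodes[node_id]

    def link(self, link_type, src, target):
        self.links.add((link_type, src, target))

    def unlink(self, link_type, src, target):
        self.links.discard((link_type, src, target))

    def delete(self, node_id):
        del self.nodes[node_id]
        # Also drop every link that touches the removed node.
        self.links = {e for e in self.links if node_id not in (e[1], e[2])}

    def targets(self, link_type, src):
        """One-hop traversal helper (not one of the slide's operations)."""
        return sorted(t for (lt, s, t) in self.links if lt == link_type and s == src)


g = Graph()
g.set(1, {"Name": "Daan", "Type": "Tutor"})
g.set(2, {"Name": "Els", "Type": "Tutor"})
g.set(3, {"Name": "bigdata", "Type": "Course"})
g.set(4, {"Name": "Amy", "Type": "Student"})
# Assumed endpoints for the edge labels.
g.link("teaches", 1, 3)
g.link("teaches", 2, 3)
g.link("enrolled in", 4, 3)
taught_by_daan = g.targets("teaches", 1)
```

Adding or removing a relationship touches only the link set, which matches the claim that updating relationships is cheap.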
COLUMN FAMILY
KEY            | DEFAULT                     | INVOICES
               | name            birthday    | 2018/001            20../...  2019/483
customers/214  | Daan Gerits     18/05/1983  | {total: 980.03, …}  ...       {total: 38.73, …}
customers/583  | Wim Van Leuven  10/05/1973  | {total: 20.83, …}   ...       {total: 378.60, …}
COLUMN FAMILY
Seemingly trivial concepts
- Table, RowKey, Column Family, Column
Hard to reason about
- Dynamic column names
- Optimize for retrieval
Very fast
- All data, including related data, in one request
Use cases
- Analytical stores
COLUMN FAMILY
SCAN <prefix>
Scan through all records where the key matches the given prefix.
GET <key> <column_family> [, <column_family>]
Get the given column families for the given key
SET <key> <value>
Set the value of the given key
DELETE <key>
Remove the record with the given key
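A rough sketch of the column-family operations, modelled as nested dictionaries (row key, then column family, then column name). The `ColumnFamilyStore` class is hypothetical, and its `set` takes an explicit family and column, which is an assumption that goes beyond the slide's `SET <key> <value>`. The sample rows are loosely based on the table above.

```python
class ColumnFamilyStore:
    def __init__(self):
        # row key -> column family -> column name -> value
        self._rows = {}

    def set(self, key, family, column, value):
        # Column names are dynamic: any new name simply creates a column.
        self._rows.setdefault(key, {}).setdefault(family, {})[column] = value

    def get(self, key, *families):
        row = self._rows[key]
        return {family: dict(row.get(family, {})) for family in families}

    def delete(self, key):
        del self._rows[key]

    def scan(self, prefix):
        return [key for key in sorted(self._rows) if key.startswith(prefix)]


store = ColumnFamilyStore()
store.set("customers/214", "default", "name", "Daan Gerits")
store.set("customers/214", "invoices", "2018/001", {"total": 980.03})
store.set("customers/214", "invoices", "2019/483", {"total": 38.73})
store.set("customers/583", "default", "name", "Wim Van Leuven")
row = store.get("customers/214", "default", "invoices")
```

A single `get` returns the customer together with all related invoices, which illustrates "all data including related data in one request".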
04
ADVICE
So how to deal with all of this?
The data model for writing can differ from the data model for reading
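One way to read this advice, sketched under assumptions: writes go to an append-only log of invoices, while a separate read model is kept up to date so that the question "what is the total per customer?" can be answered without a join at query time. The function and field names are hypothetical.

```python
from collections import defaultdict

# Write model: an append-only log of invoices, cheap to write.
invoice_log = []

# Read model: totals per customer, shaped for the question being asked.
total_per_customer = defaultdict(float)


def record_invoice(customer_key, invoice_id, total):
    invoice_log.append(
        {"customer": customer_key, "invoice": invoice_id, "total": total}
    )
    # Keep the read model in sync at write time.
    total_per_customer[customer_key] += total


record_invoice("customers/214", "2018/001", 980.03)
record_invoice("customers/214", "2019/483", 38.73)
```

The cost moves to write time: every new question may need its own read model, which is why the questions should be known first.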
Always start from the questions you need to answer
If you need a join, you most likely did it wrong!
Questions?
@daangerits

Más contenido relacionado

La actualidad más candente

Benefits of the Azure Cloud
Benefits of the Azure CloudBenefits of the Azure Cloud
Benefits of the Azure CloudCaserta
 
The Five Graphs of Government: How Federal Agencies can Utilize Graph Technology
The Five Graphs of Government: How Federal Agencies can Utilize Graph TechnologyThe Five Graphs of Government: How Federal Agencies can Utilize Graph Technology
The Five Graphs of Government: How Federal Agencies can Utilize Graph TechnologyGreta Workman
 
The Emerging Data Lake IT Strategy
The Emerging Data Lake IT StrategyThe Emerging Data Lake IT Strategy
The Emerging Data Lake IT StrategyThomas Kelly, PMP
 
Dataiku, Pitch at Data-Driven NYC, New York City, September 17th 2013
Dataiku, Pitch at Data-Driven NYC, New York City, September 17th 2013Dataiku, Pitch at Data-Driven NYC, New York City, September 17th 2013
Dataiku, Pitch at Data-Driven NYC, New York City, September 17th 2013Dataiku
 
Big Data Science: Intro and Benefits
Big Data Science: Intro and BenefitsBig Data Science: Intro and Benefits
Big Data Science: Intro and BenefitsChandan Rajah
 
NoSQL and Data Modeling for Data Modelers
NoSQL and Data Modeling for Data ModelersNoSQL and Data Modeling for Data Modelers
NoSQL and Data Modeling for Data ModelersKaren Lopez
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data ScienceCaserta
 
Setting Up the Data Lake
Setting Up the Data LakeSetting Up the Data Lake
Setting Up the Data LakeCaserta
 
Defining and Applying Data Governance in Today’s Business Environment
Defining and Applying Data Governance in Today’s Business EnvironmentDefining and Applying Data Governance in Today’s Business Environment
Defining and Applying Data Governance in Today’s Business EnvironmentCaserta
 
Organising the Data Lake - Information Management in a Big Data World
Organising the Data Lake - Information Management in a Big Data WorldOrganising the Data Lake - Information Management in a Big Data World
Organising the Data Lake - Information Management in a Big Data WorldDataWorks Summit/Hadoop Summit
 
Graphs for Enterprise Architects
Graphs for Enterprise ArchitectsGraphs for Enterprise Architects
Graphs for Enterprise ArchitectsNeo4j
 
Stanford DeepDive Framework
Stanford DeepDive FrameworkStanford DeepDive Framework
Stanford DeepDive FrameworkRan Zhang
 
Full-Stack Data Science: How to be a One-person Data Team
Full-Stack Data Science: How to be a One-person Data TeamFull-Stack Data Science: How to be a One-person Data Team
Full-Stack Data Science: How to be a One-person Data TeamGreg Goltsov
 
Optimizing the
 Data Supply Chain
 for Data Science
Optimizing the
 Data Supply Chain
 for Data ScienceOptimizing the
 Data Supply Chain
 for Data Science
Optimizing the
 Data Supply Chain
 for Data ScienceVital.AI
 
Total Data Industry Report
Total Data Industry ReportTotal Data Industry Report
Total Data Industry ReportRan Zhang
 
Data Wrangling and the Art of Big Data Discovery
Data Wrangling and the Art of Big Data DiscoveryData Wrangling and the Art of Big Data Discovery
Data Wrangling and the Art of Big Data DiscoveryInside Analysis
 
Technical Demonstration - Denodo Platform 7.0
Technical Demonstration - Denodo Platform 7.0Technical Demonstration - Denodo Platform 7.0
Technical Demonstration - Denodo Platform 7.0Denodo
 
The 3 Key Barriers Keeping Companies from Deploying Data Products
The 3 Key Barriers Keeping Companies from Deploying Data Products The 3 Key Barriers Keeping Companies from Deploying Data Products
The 3 Key Barriers Keeping Companies from Deploying Data Products Dataiku
 

La actualidad más candente (20)

Benefits of the Azure Cloud
Benefits of the Azure CloudBenefits of the Azure Cloud
Benefits of the Azure Cloud
 
The Five Graphs of Government: How Federal Agencies can Utilize Graph Technology
The Five Graphs of Government: How Federal Agencies can Utilize Graph TechnologyThe Five Graphs of Government: How Federal Agencies can Utilize Graph Technology
The Five Graphs of Government: How Federal Agencies can Utilize Graph Technology
 
The Emerging Data Lake IT Strategy
The Emerging Data Lake IT StrategyThe Emerging Data Lake IT Strategy
The Emerging Data Lake IT Strategy
 
Dataiku, Pitch at Data-Driven NYC, New York City, September 17th 2013
Dataiku, Pitch at Data-Driven NYC, New York City, September 17th 2013Dataiku, Pitch at Data-Driven NYC, New York City, September 17th 2013
Dataiku, Pitch at Data-Driven NYC, New York City, September 17th 2013
 
Big Data Science: Intro and Benefits
Big Data Science: Intro and BenefitsBig Data Science: Intro and Benefits
Big Data Science: Intro and Benefits
 
NoSQL and Data Modeling for Data Modelers
NoSQL and Data Modeling for Data ModelersNoSQL and Data Modeling for Data Modelers
NoSQL and Data Modeling for Data Modelers
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Setting Up the Data Lake
Setting Up the Data LakeSetting Up the Data Lake
Setting Up the Data Lake
 
Defining and Applying Data Governance in Today’s Business Environment
Defining and Applying Data Governance in Today’s Business EnvironmentDefining and Applying Data Governance in Today’s Business Environment
Defining and Applying Data Governance in Today’s Business Environment
 
Organising the Data Lake - Information Management in a Big Data World
Organising the Data Lake - Information Management in a Big Data WorldOrganising the Data Lake - Information Management in a Big Data World
Organising the Data Lake - Information Management in a Big Data World
 
Graphs for Enterprise Architects
Graphs for Enterprise ArchitectsGraphs for Enterprise Architects
Graphs for Enterprise Architects
 
Data management
Data managementData management
Data management
 
Stanford DeepDive Framework
Stanford DeepDive FrameworkStanford DeepDive Framework
Stanford DeepDive Framework
 
Full-Stack Data Science: How to be a One-person Data Team
Full-Stack Data Science: How to be a One-person Data TeamFull-Stack Data Science: How to be a One-person Data Team
Full-Stack Data Science: How to be a One-person Data Team
 
Big data-ppt
Big data-pptBig data-ppt
Big data-ppt
 
Optimizing the
 Data Supply Chain
 for Data Science
Optimizing the
 Data Supply Chain
 for Data ScienceOptimizing the
 Data Supply Chain
 for Data Science
Optimizing the
 Data Supply Chain
 for Data Science
 
Total Data Industry Report
Total Data Industry ReportTotal Data Industry Report
Total Data Industry Report
 
Data Wrangling and the Art of Big Data Discovery
Data Wrangling and the Art of Big Data DiscoveryData Wrangling and the Art of Big Data Discovery
Data Wrangling and the Art of Big Data Discovery
 
Technical Demonstration - Denodo Platform 7.0
Technical Demonstration - Denodo Platform 7.0Technical Demonstration - Denodo Platform 7.0
Technical Demonstration - Denodo Platform 7.0
 
The 3 Key Barriers Keeping Companies from Deploying Data Products
The 3 Key Barriers Keeping Companies from Deploying Data Products The 3 Key Barriers Keeping Companies from Deploying Data Products
The 3 Key Barriers Keeping Companies from Deploying Data Products
 

Similar a Course 4 : Big Data Structuring, Integration and Management Systems by Daan Gerits

Conceptual vs. Logical vs. Physical Data Modeling
Conceptual vs. Logical vs. Physical Data ModelingConceptual vs. Logical vs. Physical Data Modeling
Conceptual vs. Logical vs. Physical Data ModelingDATAVERSITY
 
Qiagram
QiagramQiagram
Qiagramjwppz
 
Metadata Strategies - Data Squared
Metadata Strategies - Data SquaredMetadata Strategies - Data Squared
Metadata Strategies - Data SquaredDATAVERSITY
 
Introduction To Data Warehousing
Introduction To Data WarehousingIntroduction To Data Warehousing
Introduction To Data WarehousingAlex Meadows
 
Data Science Operationalization: The Journey of Enterprise AI
Data Science Operationalization: The Journey of Enterprise AIData Science Operationalization: The Journey of Enterprise AI
Data Science Operationalization: The Journey of Enterprise AIDenodo
 
ExpertsLive NL 2022 - Microsoft Purview - What's in it for my organization?
ExpertsLive NL 2022 - Microsoft Purview - What's in it for my organization?ExpertsLive NL 2022 - Microsoft Purview - What's in it for my organization?
ExpertsLive NL 2022 - Microsoft Purview - What's in it for my organization?Albert Hoitingh
 
Physical Database Requirements.pdf
Physical Database Requirements.pdfPhysical Database Requirements.pdf
Physical Database Requirements.pdfseifusisay06
 
Data Structures - The Cornerstone of Your Data’s Home
Data Structures - The Cornerstone of Your Data’s HomeData Structures - The Cornerstone of Your Data’s Home
Data Structures - The Cornerstone of Your Data’s HomeDATAVERSITY
 
Agile Data Warehouse Modeling: Introduction to Data Vault Data Modeling
Agile Data Warehouse Modeling: Introduction to Data Vault Data ModelingAgile Data Warehouse Modeling: Introduction to Data Vault Data Modeling
Agile Data Warehouse Modeling: Introduction to Data Vault Data ModelingKent Graziano
 
Relational Database explanation with detail.pdf
Relational Database explanation with detail.pdfRelational Database explanation with detail.pdf
Relational Database explanation with detail.pdf9wldv5h8n
 
Building a New Platform for Customer Analytics
Building a New Platform for Customer Analytics Building a New Platform for Customer Analytics
Building a New Platform for Customer Analytics Caserta
 
Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)James Serra
 
Fast Focus: SQL Server Graph Database & Processing
Fast Focus: SQL Server Graph Database & ProcessingFast Focus: SQL Server Graph Database & Processing
Fast Focus: SQL Server Graph Database & ProcessingKaren Lopez
 
a scalable two phase top down specialization approach for data anonymization ...
a scalable two phase top down specialization approach for data anonymization ...a scalable two phase top down specialization approach for data anonymization ...
a scalable two phase top down specialization approach for data anonymization ...swathi78
 
Why Data Modeling Is Fundamental
Why Data Modeling Is FundamentalWhy Data Modeling Is Fundamental
Why Data Modeling Is FundamentalDATAVERSITY
 
Data Management, Metadata Management, and Data Governance – Working Together
Data Management, Metadata Management, and Data Governance – Working TogetherData Management, Metadata Management, and Data Governance – Working Together
Data Management, Metadata Management, and Data Governance – Working TogetherDATAVERSITY
 
Essential Reference and Master Data Management
Essential Reference and Master Data ManagementEssential Reference and Master Data Management
Essential Reference and Master Data ManagementDATAVERSITY
 
INFORMATION TECHNOLOGY PRESENTATION ON INFORMATON MANAGEMENT.pptx
INFORMATION TECHNOLOGY PRESENTATION ON INFORMATON MANAGEMENT.pptxINFORMATION TECHNOLOGY PRESENTATION ON INFORMATON MANAGEMENT.pptx
INFORMATION TECHNOLOGY PRESENTATION ON INFORMATON MANAGEMENT.pptxodane3
 

Similar a Course 4 : Big Data Structuring, Integration and Management Systems by Daan Gerits (20)

Conceptual vs. Logical vs. Physical Data Modeling
Conceptual vs. Logical vs. Physical Data ModelingConceptual vs. Logical vs. Physical Data Modeling
Conceptual vs. Logical vs. Physical Data Modeling
 
Database 1 Introduction
Database 1   IntroductionDatabase 1   Introduction
Database 1 Introduction
 
Qiagram
QiagramQiagram
Qiagram
 
Metadata Strategies - Data Squared
Metadata Strategies - Data SquaredMetadata Strategies - Data Squared
Metadata Strategies - Data Squared
 
Introduction To Data Warehousing
Introduction To Data WarehousingIntroduction To Data Warehousing
Introduction To Data Warehousing
 
Data Science Operationalization: The Journey of Enterprise AI
Data Science Operationalization: The Journey of Enterprise AIData Science Operationalization: The Journey of Enterprise AI
Data Science Operationalization: The Journey of Enterprise AI
 
ExpertsLive NL 2022 - Microsoft Purview - What's in it for my organization?
ExpertsLive NL 2022 - Microsoft Purview - What's in it for my organization?ExpertsLive NL 2022 - Microsoft Purview - What's in it for my organization?
ExpertsLive NL 2022 - Microsoft Purview - What's in it for my organization?
 
Physical Database Requirements.pdf
Physical Database Requirements.pdfPhysical Database Requirements.pdf
Physical Database Requirements.pdf
 
Data Structures - The Cornerstone of Your Data’s Home
Data Structures - The Cornerstone of Your Data’s HomeData Structures - The Cornerstone of Your Data’s Home
Data Structures - The Cornerstone of Your Data’s Home
 
Agile Data Warehouse Modeling: Introduction to Data Vault Data Modeling
Agile Data Warehouse Modeling: Introduction to Data Vault Data ModelingAgile Data Warehouse Modeling: Introduction to Data Vault Data Modeling
Agile Data Warehouse Modeling: Introduction to Data Vault Data Modeling
 
Relational Database explanation with detail.pdf
Relational Database explanation with detail.pdfRelational Database explanation with detail.pdf
Relational Database explanation with detail.pdf
 
Building a New Platform for Customer Analytics
Building a New Platform for Customer Analytics Building a New Platform for Customer Analytics
Building a New Platform for Customer Analytics
 
Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)
 
Fast Focus: SQL Server Graph Database & Processing
Fast Focus: SQL Server Graph Database & ProcessingFast Focus: SQL Server Graph Database & Processing
Fast Focus: SQL Server Graph Database & Processing
 
a scalable two phase top down specialization approach for data anonymization ...
a scalable two phase top down specialization approach for data anonymization ...a scalable two phase top down specialization approach for data anonymization ...
a scalable two phase top down specialization approach for data anonymization ...
 
Why Data Modeling Is Fundamental
Why Data Modeling Is FundamentalWhy Data Modeling Is Fundamental
Why Data Modeling Is Fundamental
 
Data Management, Metadata Management, and Data Governance – Working Together
Data Management, Metadata Management, and Data Governance – Working TogetherData Management, Metadata Management, and Data Governance – Working Together
Data Management, Metadata Management, and Data Governance – Working Together
 
Essential Reference and Master Data Management
Essential Reference and Master Data ManagementEssential Reference and Master Data Management
Essential Reference and Master Data Management
 
INFORMATION TECHNOLOGY PRESENTATION ON INFORMATON MANAGEMENT.pptx
INFORMATION TECHNOLOGY PRESENTATION ON INFORMATON MANAGEMENT.pptxINFORMATION TECHNOLOGY PRESENTATION ON INFORMATON MANAGEMENT.pptx
INFORMATION TECHNOLOGY PRESENTATION ON INFORMATON MANAGEMENT.pptx
 
mongo db EMERSON EDUARDO RODRIGUES
mongo db EMERSON EDUARDO RODRIGUESmongo db EMERSON EDUARDO RODRIGUES
mongo db EMERSON EDUARDO RODRIGUES
 

Último

Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Researchmichael115558
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...amitlee9823
 
hybrid Seed Production In Chilli & Capsicum.pptx
hybrid Seed Production In Chilli & Capsicum.pptxhybrid Seed Production In Chilli & Capsicum.pptx
hybrid Seed Production In Chilli & Capsicum.pptx9to5mart
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteedamy56318795
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...amitlee9823
 
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceDelhi Call girls
 
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...amitlee9823
 
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangaloreamitlee9823
 
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...amitlee9823
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNKTimothy Spann
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...amitlee9823
 
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...amitlee9823
 
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...amitlee9823
 
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...amitlee9823
 
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...only4webmaster01
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...amitlee9823
 

Último (20)

(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 
CHEAP Call Girls in Rabindra Nagar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Rabindra Nagar  (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Rabindra Nagar  (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Rabindra Nagar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Predicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science ProjectPredicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science Project
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
hybrid Seed Production In Chilli & Capsicum.pptx
hybrid Seed Production In Chilli & Capsicum.pptxhybrid Seed Production In Chilli & Capsicum.pptx
hybrid Seed Production In Chilli & Capsicum.pptx
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
 
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
 
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
 
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
 
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
 
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
 
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
 
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
 
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 

Course 4 : Big Data Structuring, Integration and Management Systems by Daan Gerits

  • 1. Big Data: Structuring, Modeling, Managing. Course by Daan Gerits
  • 2. CONTENT
      1. ANATOMY: Dive into the techniques that make data systems scale
      2. DATA AT SCALE: What is so different in working with data the traditional way vs the bigdata way?
      3. DATA MODELS: An overview of the most popular types of data models
      4. ADVICE: So what to make of all this?
  • 3. Daan Gerits (@daangerits): data expert at design is dead, co-founder of Fitchain.io, co-founder of Bigdata.be. Data unicorn, technopreneur, founder.
  • 4. 01 ANATOMY: Discover the techniques that make data systems scale
      - Replication and Partitioning
      - System load
  • 5. Replication
      - What? Copy data across physical nodes
      - Why? Improve reliability and fault tolerance
      - How? Create replicas of the data and keep them in sync
  • 6. Partitioning
      - What? Partition the data and distribute it across physical nodes
      - Why? Scale data systems
      - How? Choose a logical partitioning key; records with the same partitioning key go to the same node
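The two slides above can be combined in a small sketch. It is only an illustration, assuming hash-based partitioning and a fixed list of nodes; the node names, `partition_for`, and `replicas_for` are hypothetical and do not come from the course.

```python
import hashlib

# Hypothetical node names, used only for illustration.
NODES = ["node-a", "node-b", "node-c"]


def partition_for(key: str, num_partitions: int) -> int:
    """Map a partitioning key to a partition index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def replicas_for(key: str, nodes: list, replication_factor: int = 2) -> list:
    """Return the primary node for the key plus the following nodes as replicas."""
    primary = partition_for(key, len(nodes))
    return [nodes[(primary + i) % len(nodes)] for i in range(replication_factor)]


if __name__ == "__main__":
    # The same key always lands on the same nodes.
    print(replicas_for("users.214", NODES))
```

A stable hash (rather than Python's built-in `hash`, which is randomized per process for strings) keeps the key-to-node mapping identical across restarts. Real systems usually add consistent hashing or a partition map so that adding a node does not move most keys.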
  • 7. Load
      - Read Heavy: most of the operations are read operations
      - Write Heavy: most of the operations are write operations
      - Balanced: # read operations == # write operations
  • 8. How you store the data depends on how you query the data
  • 9. 02 DATA AT SCALE: To seasoned data professionals, a lot of the techniques and approaches do not seem so different from what they have done during the past decades. So what is so different?
  • 10. At the core of big data is the ability to deal with the volume, variety and velocity of data.
  • 11. Big Data is all about new ways of thinking about data
  • 12. THINK DIFFERENT
      - OPERATIONAL: Automate your processes through the use of data
      - BUSINESS: Change the metrics you use to measure success
      - PERSONAL: Data makes people important again. This doesn’t stop with the customer
  • 13. TRADITIONAL APPROACH (diagram labels: Supply, Model, Request, Request, Request)
  • 14. BIG DATA APPROACH (diagram labels: Supply, Model, Model, Model, Request, Request, Request)
  • 15. 03 DATA MODELS: How you want to retrieve your data has an impact on how you store your data. These data models provide almost standard approaches to do so.
  • 16. HOW DATA IS STORED
      - GRAPH: Data model built out of nodes and their connections
      - COLUMN FAMILY: Seriously powerful but complex data model, ideal for sparse data
      - KEY-VALUE: A very simple data model mapping a key to a value
      - DOCUMENT: A data model where the structure of every value can be different
  • 17. KEY-VALUE
      KEY                        VALUE
      users.214.name             Daan Gerits
      users.214.birthdate        18/05/1983
      users.214.roles            [user, admin]
      users.214.isSubscribed     true
      users.214.social.twitter   @daangerits
  • 18. KEY-VALUE
      - Fast lookups: but no way to query the data; scanning is possible if keys are ordered
      - Flexible value types: key and value can be anything, even collections and more complex data structures
      - Easy to scale: little to no dependencies between key-value pairs, although ordering can become difficult to scale
      - Use cases: caches, configuration
  • 19. KEY-VALUE
      - SCAN <prefix>: Scan through all pairs where the key matches the given prefix. This is only possible if the keys are ordered
      - GET <key>: Get a key-value pair by its key
      - SET <key> <value>: Set the value of the given key
      - DELETE <key>: Remove the pair with the given key
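The four key-value operations can be sketched as an in-memory store. This is a minimal illustration, not a real product's API: the class name `OrderedKeyValueStore` is hypothetical, and keys are assumed to be strings kept in sorted order so that SCAN can find a prefix without visiting every pair.

```python
import bisect


class OrderedKeyValueStore:
    """Toy key-value store whose string keys are kept in sorted order."""

    def __init__(self):
        self._keys = []    # sorted list of keys, needed for prefix scans
        self._values = {}  # key -> value, gives fast lookups

    def set(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def get(self, key):
        return self._values.get(key)

    def delete(self, key):
        if key in self._values:
            del self._values[key]
            del self._keys[bisect.bisect_left(self._keys, key)]

    def scan(self, prefix):
        # All keys sharing a prefix are contiguous in sorted order,
        # so jump to the first candidate and stop at the first mismatch.
        start = bisect.bisect_left(self._keys, prefix)
        pairs = []
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break
            pairs.append((key, self._values[key]))
        return pairs


if __name__ == "__main__":
    store = OrderedKeyValueStore()
    store.set("users.214.name", "Daan Gerits")
    store.set("users.214.roles", ["user", "admin"])
    store.set("users.300.name", "Someone else")
    print(store.scan("users.214."))
```

Without the sorted key list, GET, SET and DELETE would still work, but SCAN would have to check every key, which is why the slide ties scanning to ordered keys.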
  • 20. DOCUMENT
      KEY    DOCUMENT
      daan   { "name": "Daan Gerits", "birthday": "18/05/1983" }
      wim    { "name": "Wim Van Leuven", "company": "Highestpoint" }
  • 21. DOCUMENT
      - Queryable: technology-specific query language; a separate index needs to be kept in sync
      - Flexible value types: key can be anything; value is a structured type (JSON, BSON, XML, …)
      - Scalability requires caution: relationships between documents; scaling search can become a hurdle
      - Use cases: search engines, entity data stores
  • 22. DOCUMENT
      - FIND <query>: Find all documents matching the given query
      - GET <key>: Get the document matching the given key
      - CREATE <key> <document>: Create a new document with the given key
      - UPDATE <key> <field> <value>: Update the given field within the document with the given key
      - DELETE <key>: Remove the document with the given key
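As a rough sketch of the document operations, the store below keeps dictionaries in memory. The class `DocumentStore` is hypothetical, and FIND is simplified to exact matches on top-level fields with a full scan; real document databases use their own query languages and maintain indexes for this.

```python
class DocumentStore:
    """Toy document store: every value is a dict and documents may differ in shape."""

    def __init__(self):
        self._docs = {}

    def create(self, key, document):
        if key in self._docs:
            raise KeyError(f"document {key!r} already exists")
        self._docs[key] = dict(document)

    def get(self, key):
        return self._docs.get(key)

    def update(self, key, field, value):
        self._docs[key][field] = value

    def delete(self, key):
        self._docs.pop(key, None)

    def find(self, **criteria):
        # Simplified query: return keys of documents whose fields equal all criteria.
        return [
            key
            for key, doc in self._docs.items()
            if all(doc.get(field) == value for field, value in criteria.items())
        ]


if __name__ == "__main__":
    store = DocumentStore()
    store.create("daan", {"name": "Daan Gerits", "birthday": "18/05/1983"})
    store.create("wim", {"name": "Wim Van Leuven", "company": "Highestpoint"})
    print(store.find(company="Highestpoint"))
```

The two documents have different fields, which is the point of the model: `find` simply skips documents that lack the queried field.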
  • 23. GRAPH
      - Node 1: Name: Daan, Type: Tutor
      - Node 2: Name: Els, Type: Tutor
      - Node 3: Name: bigdata, Type: Course
      - Node 4: Name: Amy, Type: Student
      - Edge labels in the diagram: teaches, teaches, friend of, enrolled in
  • 24. GRAPH
      - Relationships are first-class citizens: graph traversal in a specific language; updating relationships is cheap
      - Easy concepts: node with properties; edge
      - Very hard to scale: Golden Ratio; scaling requires deep knowledge of the data
      - Use cases: social modeling, metadata stores
  • 25. GRAPH
      - LINK <type> <src-node-id> <target-node-id>: Create a new link with the given characteristics
      - UNLINK <type> <src-node-id> <target-node-id>: Remove the link with the given characteristics
      - GET <node-id>: Get the node with the given node id
      - SET <node-id> <properties>: Set the properties of the node with the given id
      - DELETE <node-id>: Remove the node with the given id
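The graph operations might look like this in a toy in-memory form. The class `Graph` and the `neighbors` helper are hypothetical additions for illustration, and the edge endpoints used in the example (who teaches or is enrolled in what) are assumptions, since the transcript only preserves the edge labels.

```python
from collections import defaultdict


class Graph:
    """Toy property graph with typed, directed edges."""

    def __init__(self):
        self._nodes = {}                # node id -> properties
        self._edges = defaultdict(set)  # (source id, edge type) -> target ids

    def set(self, node_id, properties):
        self._nodes[node_id] = dict(properties)

    def get(self, node_id):
        return self._nodes.get(node_id)

    def link(self, edge_type, src, target):
        self._edges[(src, edge_type)].add(target)

    def unlink(self, edge_type, src, target):
        self._edges[(src, edge_type)].discard(target)

    def delete(self, node_id):
        # Remove the node and every edge that starts or ends at it.
        self._nodes.pop(node_id, None)
        for (src, edge_type), targets in list(self._edges.items()):
            if src == node_id:
                del self._edges[(src, edge_type)]
            else:
                targets.discard(node_id)

    def neighbors(self, node_id, edge_type):
        # Hypothetical traversal helper: follow one edge type from one node.
        return sorted(self._edges.get((node_id, edge_type), set()))


if __name__ == "__main__":
    g = Graph()
    g.set(1, {"Name": "Daan", "Type": "Tutor"})
    g.set(3, {"Name": "bigdata", "Type": "Course"})
    g.set(4, {"Name": "Amy", "Type": "Student"})
    g.link("teaches", 1, 3)      # assumed endpoints
    g.link("enrolled in", 4, 3)  # assumed endpoints
    print(g.neighbors(1, "teaches"))
```

Linking and unlinking touch only one set, which matches the slide's point that updating relationships is cheap. Deleting a node has to visit every edge set here, a hint at why graphs become harder once the data is spread over several machines.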
  • 26. COLUMN FAMILY
      KEY             | DEFAULT: name   | DEFAULT: birthday | INVOICES: 2018/001   | INVOICES: 20../... | INVOICES: 2019/483
      customers/214   | Daan Gerits     | 18/05/1983        | { total: 980.03, … } | ...                | { total: 38.73, … }
      customer/583    | Wim Van Leuven  | 10/05/1973        | { total: 20.83, … }  | ...                | { total: 378.60, … }
  • 27. COLUMN FAMILY
      - Seemingly trivial concepts: Table, RowKey, Column Family, Column
      - Hard to reason about: dynamic column names; optimize for retrieval
      - Very fast: all data, including related data, in one request
      - Use cases: analytical stores
  • 28. COLUMN FAMILY
      - SCAN <prefix>: Scan through all records where the key matches the given prefix
      - GET <key> <column_family> [, <column_family>]: Get the given column families for the given key
      - SET <key> <value>: Set the value of the given key
      - DELETE <key>: Remove the record with the given key
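A column-family row can be pictured as a nested map: row key, then column family, then dynamically named columns. The sketch below assumes that shape; the class `ColumnFamilyStore` is hypothetical, and its `set` takes a family and column name explicitly, which is more detailed than the `SET <key> <value>` form on the slide.

```python
class ColumnFamilyStore:
    """Toy wide-column store: row key -> column family -> column -> value."""

    def __init__(self):
        self._rows = {}

    def set(self, key, family, column, value):
        # Column names are dynamic: any new column can be added to any row.
        self._rows.setdefault(key, {}).setdefault(family, {})[column] = value

    def get(self, key, *families):
        # Return only the requested column families for one row.
        row = self._rows.get(key, {})
        return {family: dict(row.get(family, {})) for family in families}

    def scan(self, prefix):
        # Visit rows in key order and keep those whose key starts with the prefix.
        return [(key, self._rows[key]) for key in sorted(self._rows) if key.startswith(prefix)]

    def delete(self, key):
        self._rows.pop(key, None)


if __name__ == "__main__":
    store = ColumnFamilyStore()
    store.set("customers/214", "DEFAULT", "name", "Daan Gerits")
    store.set("customers/214", "INVOICES", "2018/001", {"total": 980.03})
    store.set("customers/214", "INVOICES", "2019/483", {"total": 38.73})
    # The customer and all of its invoices come back in a single request.
    print(store.get("customers/214", "DEFAULT", "INVOICES"))
```

Using the invoice number as the column name is what makes the rows sparse: each customer row has a different set of columns in the INVOICES family.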
  • 29. 04 ADVICE: So how to deal with all of this?
  • 30. Data model for writing can differ from data model for reading
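One possible reading of this advice is to append writes to a log and serve reads from a separate structure shaped for the question being asked. The sketch below is an assumption about what that could look like, not the author's design: `InvoiceLedger`, `record_invoice`, and `total_for` are hypothetical names, and amounts are kept in cents to avoid floating-point rounding.

```python
from collections import defaultdict


class InvoiceLedger:
    """Toy example: an append-only write model and a precomputed read model."""

    def __init__(self):
        self._events = []                      # write model: every invoice as recorded
        self._total_cents = defaultdict(int)   # read model: total per customer

    def record_invoice(self, customer, invoice_id, total_cents):
        # Write path: append the event, then update the view the readers use.
        self._events.append((customer, invoice_id, total_cents))
        self._total_cents[customer] += total_cents

    def total_for(self, customer):
        # Read path: a single lookup, no scan over the events.
        return self._total_cents.get(customer, 0)


if __name__ == "__main__":
    ledger = InvoiceLedger()
    ledger.record_invoice("customers/214", "2018/001", 98003)
    ledger.record_invoice("customers/214", "2019/483", 3873)
    print(ledger.total_for("customers/214"))
```

The read model answers exactly one question (the total per customer). A different question would get its own view built from the same events.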
  • 31. Always start from the questions you need to answer
  • 32. If you need a join, you most likely did it wrong!