Data Mesh in Practice
How Europe’s Leading Online Platform for Fashion Goes Beyond the Data Lake
Max Schultze - max.schultze@zalando.de - @mcs1408
Arif Wider - awider@thoughtworks.com - @arifwider
12-06-2020
Who are we?
Max Schultze
● Lead Data Engineer
● MSc in Computer Science
● Took part in early development of Apache Flink
● Retired semi-professional Magic: the Gathering player
Arif Wider
● Lead Technology Consultant
● Head of AI, ThoughtWorks Germany
● Scala & FP enthusiast
● Coffee geek
7000+ technologists with 43 offices in 14 countries
Partner for technology driven business transformation
Barcelona - Madrid - London - Manchester - Berlin - Hamburg - Munich - Cologne
#1 in Agile and Continuous Delivery
100+ books written
©ThoughtWorks 2020
WHAT TO EXPECT
Zalando Analytics Cloud Journey
What’s this Data Mesh?
Data Mesh in Practice
Zalando Analytics Cloud Journey
Legacy Analytics
[Diagram: DWH (legacy); Messaging Bus and Data Lake (evolving)]
Zalando’s Data Lake
Web Tracking
Event Bus
DWH
Data Center
Ingestion
Storage
Serving
Data Catalog
Metastore
Fast Query Layer
Processing Platform
Centralization Challenges
Datasets provided by data-agnostic infrastructure team
● Lack of ownership
Pipeline responsibility on data-agnostic infrastructure team
● Lack of quality
Organizational scaling
● Central team becomes the bottleneck
[Illustration: a table with anonymous columns Field_A, Field_B and rows Record_1 to Record_3]
A Recurring Pattern
Product teams generating data
Data engineers maintaining the data platform
Decision makers, data scientists consuming data
Why is that?
[Diagram: central data platform; checkout service; checkout events]
What is Data Mesh?
Old wine applied to new bottles…
→ Product Thinking
→ Domain-Driven Distributed Architecture
→ Infrastructure as a Platform
… creates value from Data
Data as a Product
Data Product
What is my market?
What are the desires of my customers?
What “price” is justified?
How to do marketing?
What’s the USP?
Are my customers happy?
Domain-Driven Distributed Data Architecture
→ The Data Product is the fundamental building block
[Diagram: Domain, Aggregated Domain]
● Discoverable
● Addressable
● Self-describing
● Trustworthy
● Interoperable (governed by open standard)
● Secure (governed by global access control)
Self-Service Data Infrastructure
Data Infra as a Platform
Storage, pipeline, catalogue, access control, etc.
Data infra engineers
Global Governance & Open Standards
Enable interoperability
An Ecosystem of Data Products
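The data product properties listed above could be captured in a small descriptor. The following is a minimal sketch only: the names `DataProduct` and `Catalogue`, the field names, the Parquet default, and the bucket address are all hypothetical and do not come from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataProduct:
    """Hypothetical descriptor; field names are illustrative, not from the talk."""
    name: str                          # discoverable: listed under this name
    address: str                       # addressable: one stable location
    schema: dict                       # self-describing: field name -> type
    owner_team: str                    # trustworthy: a domain team is accountable
    data_format: str = "parquet"       # interoperable: an agreed open format (assumed)
    access_policy: str = "restricted"  # secure: checked by global access control


class Catalogue:
    """Toy in-memory catalogue standing in for a central metadata service."""

    def __init__(self):
        self._products = {}

    def register(self, product: DataProduct) -> None:
        # Refuse products that do not describe themselves.
        if not product.schema:
            raise ValueError("a data product must describe its own schema")
        self._products[product.name] = product

    def search(self, term: str) -> list:
        # Return the names of all registered products containing the term.
        return sorted(n for n in self._products if term in n)

    def resolve(self, name: str) -> str:
        # Map a product name to its address.
        return self._products[name].address


catalogue = Catalogue()
catalogue.register(DataProduct(
    name="checkout.orders",
    address="s3://example-checkout-team/orders/",  # made-up bucket
    schema={"order_id": "string", "total_cents": "int"},
    owner_team="checkout",
))
print(catalogue.search("checkout"))
print(catalogue.resolve("checkout.orders"))
```

The descriptor is owned by the domain team, while the catalogue is the kind of shared service a data infrastructure platform might provide.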
It’s a mindset shift
FROM | TO
Centralized ownership | Decentralized ownership
Pipelines as first class concern | Domain Data as first class concern
Data as a by-product | Data as a Product
Siloed Data Engineering Team | Cross-functional Domain-Data Teams
Centralized Data Lake / Warehouse | Ecosystem of Data Products
Data Mesh in Practice
Recap:
● From Bottleneck to Infra Platform
● From Data Monolith to Interoperable Services
[Diagram: central data platform; Data Infra as a Platform (storage, pipeline, catalogue, access control, etc.)]
Central Services with Global Interoperability
[Diagram components: Data Lake Storage, Metadata Layer, Processing Platform]
● Bring Your Own Bucket (BYOB)
● Central Processing Platform
● Simplify Data Sharing
Decentralized ownership does not imply decentralized infrastructure!
Interoperability is created through convenient solutions of a self-service platform.
Decentral Storage | Central Infrastructure
Decentral Ownership | Central Governance
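One way to picture "decentral storage, central governance" is a central registry that knows which team owns which bucket and who may read it. This is a toy sketch under stated assumptions: `MetadataLayer`, its methods, and the bucket names are hypothetical, and no real cloud API is called.

```python
class MetadataLayer:
    """Toy central registry; a real one would sit in front of cloud storage."""

    def __init__(self):
        self._owners = {}     # bucket -> owning team
        self._grants = set()  # (bucket, consumer team) pairs

    def register_bucket(self, bucket: str, owner_team: str) -> None:
        # BYOB: the team keeps its own bucket and only announces it centrally.
        if bucket in self._owners:
            raise ValueError(f"{bucket} is already registered")
        self._owners[bucket] = owner_team

    def share(self, bucket: str, owner_team: str, consumer_team: str) -> None:
        # Only the owning team may share its data.
        if self._owners.get(bucket) != owner_team:
            raise PermissionError(f"{owner_team} does not own {bucket}")
        self._grants.add((bucket, consumer_team))

    def can_read(self, bucket: str, team: str) -> bool:
        # Owners can always read; others need an explicit grant.
        return self._owners.get(bucket) == team or (bucket, team) in self._grants


layer = MetadataLayer()
layer.register_bucket("example-checkout-bucket", owner_team="checkout")
layer.share("example-checkout-bucket", owner_team="checkout", consumer_team="pricing")
print(layer.can_read("example-checkout-bucket", "pricing"))
print(layer.can_read("example-checkout-bucket", "logistics"))
```

Storage stays with the producing team; the registry and the access decision are the centrally operated parts.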
Recap:
● Datasets provided through pipelines of data-agnostic infrastructure teams
Who is allowed to share data?
What are the criteria to enable data consumers?
How to ensure data quality?
How to Ensure Data Quality?
Make conscious decisions
● Opt-in instead of default storage
● Classification of data usage
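The two decisions above, opt-in storage and usage classification, could be expressed as explicit configuration. This sketch makes several assumptions: the `StreamConfig` type, the `archive_to_lake` flag, and the three `UsageClass` levels are invented for illustration, and the talk does not name concrete classification levels.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class UsageClass(Enum):
    """Hypothetical usage levels; the talk does not list concrete ones."""
    EXPLORATION = "exploration"
    REPORTING = "reporting"
    BUSINESS_CRITICAL = "business_critical"


@dataclass
class StreamConfig:
    name: str
    archive_to_lake: bool = False              # opt-in: nothing is stored by default
    usage_class: Optional[UsageClass] = None   # must be chosen when opting in


def should_archive(cfg: StreamConfig) -> bool:
    """Archive only streams whose owners opted in and classified the usage."""
    if not cfg.archive_to_lake:
        return False
    if cfg.usage_class is None:
        raise ValueError(f"{cfg.name}: opted in but usage is not classified")
    return True


default_stream = StreamConfig(name="checkout.debug-events")
opted_in = StreamConfig(
    name="checkout.orders",
    archive_to_lake=True,
    usage_class=UsageClass.REPORTING,
)
print(should_archive(default_stream))
print(should_archive(opted_in))
```

Making the default `False` forces the producing team to decide consciously that a dataset is worth storing and supporting.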
Data Quality - A Contract between Consumer and Producer
Behavioral changes for data producers
● Data is a product, not a by-product
● Dedicate resources to
○ Understand usage
○ Ensure quality
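A contract between producer and consumer can be made checkable. The sketch below is one possible reading, not the approach described in the talk: the `Contract` type, the `max_null_ratio` threshold, and the sample records are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contract:
    """Hypothetical contract: expected fields and a tolerated share of nulls."""
    required_fields: dict   # field name -> expected Python type
    max_null_ratio: float   # e.g. 0.25 means at most 25% missing values


def violations(records: list, contract: Contract) -> list:
    """Return human-readable contract violations for a batch of records."""
    problems = []
    for field_name, expected_type in contract.required_fields.items():
        values = [r.get(field_name) for r in records]
        nulls = sum(v is None for v in values)
        # Check the share of missing values against the agreed threshold.
        if records and nulls / len(records) > contract.max_null_ratio:
            problems.append(f"{field_name}: too many nulls ({nulls}/{len(records)})")
        # Check that every present value has the agreed type.
        if any(v is not None and not isinstance(v, expected_type) for v in values):
            problems.append(f"{field_name}: expected {expected_type.__name__}")
    return problems


contract = Contract(
    required_fields={"order_id": str, "total_cents": int},
    max_null_ratio=0.25,
)
batch = [
    {"order_id": "a1", "total_cents": 1999},
    {"order_id": "a2", "total_cents": None},
    {"order_id": 3, "total_cents": 500},
]
for problem in violations(batch, contract):
    print(problem)
```

A producing team could run such a check before publishing a batch, which is one concrete way to dedicate resources to ensuring quality.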
Into the Future
● Domain Enterprise Architecture
○ Definition of domain responsibilities
○ Appointment of domain-specific experts
● “Off the shelf” data products
○ Decentralized archiving
○ Template-driven data preparation
Data Mesh in Practice
How Europe’s Leading Online Platform for Fashion Goes Beyond the Data Lake
Max Schultze - max.schultze@zalando.de - @mcs1408
Arif Wider - awider@thoughtworks.com - @arifwider
