Four Data Architecture Mega-Patterns for Agility
Copyright 2021 by DataKitchen, Inc. All Rights Reserved.
Agenda
Four Data Architecture Mega-Patterns for Agility
1. DataOps
2. Data Fabric
3. Data Mesh
4. Functional Data Engineering
An Example that Combines all Four Patterns
Conclusion and More Information
Our Focus Is The River Of Work Right In Front Of Us
• The Model,
• The Algorithm,
• The Data Pipeline,
• The Data Visualization,
• The Governance,
• The Data Itself
What is my next task?
Next Task Focus Is Making Us Blind To Failure
• The Model,
• The Algorithm,
• The Data Pipeline,
• The Data Visualization,
• The Governance,
• The Data Itself
Task Focus Not Working
Look Upstream At The Source Of The Problem
• Develop
• Deploy
• Iterate
• Monitor
• Test
• Collaborate
How You Do It
How? Focus On Four Key Upstream Processes
Decrease The Cycle Time: Continuously Deploy Innovation
Lower Error Rates: Increasing Customer Data Trust
Improve Collaboration: Fewer Meetings & Less Bureaucracy
Measure Your Team: And Show Everyone Your Success
DataOps Aligns People, Processes, and Technology
Rapid experimentation and innovation enable faster delivery
Low error rates
Collaboration across complex sets of people, technology, and environments
Clear measurement and monitoring of results
Agenda
What Problems Do We Need To Solve With
Architecture for AI and Data Analytics?
Four Data Architecture Mega-Patterns for Agility
1. DataOps
2. Data Fabric
3. Data Mesh
4. Functional Data Engineering
An Example that Combines all Four Patterns
Conclusion and More Information
Gartner Data Fabric
“Data fabric focuses on composability, allowing users to build a flexible, agile, scalable architecture that will be able to supply data to humans or machine users.
Data fabric is a design concept, not just a set of technology components.”
Data Fabric Toolchain Elements
Store:
Transform: SQL Code, ETL
Virtualize: layer
Govern: Catalog
Includes Data Virtualization in Reference Fabric Design
Includes Data Streaming in Reference Fabric Design
Data Fabric: Beware Magic of ‘AI Inside’
[Diagram: the Store, Transform (SQL Code, ETL), Virtualize (layer), and Govern (Catalog) toolchain elements with four “AI” labels. Callout: “Magic AI: Danger Will Robinson”]
Data Fabric: Beware Magic of ‘AI Inside’
Think of ‘AI Inside’ of Data Fabric like autonomous driving:
• Level 1: Simple, keep your hands on the wheel
• Level 5: Cross Boston, in the snow, at night
We are at Level 1 of AI in the Data Fabric
AI + New Tools Agility
People & Tools in a
DataOps
Architecture
Agility
AI + New Tools
Canonical ‘Factory’ Data Architecture / Fabric
DataOps Functional Architecture
[Diagram. Cloud/On-Prem data center with Production, Test, and Dev environments. Elements shown: Source Data, Raw Lake, Data Engineering, Refined Data, Data Science, Data Viz., Data Governance, Data Customers. Each of the three environments is labeled “Orchestrate, Monitor, Test”. DataOps Platform: Storage & Version Control, History & Metadata, Auth & Permissions, Environment Secrets, DataOps Metrics & Reports, Automated Deployment, Environment Creation and Management. Also shown: DataOps Team and a Second Cloud/On-Prem Data Center.]
DataOps Physical Architecture
[Diagram. Cloud/On-Prem Data Center with Production, Test, and Dev environments, each with an Agent. Elements shown: Source Data, Raw Lake, Data Engineering, Refined Data, Data Science, Data Viz., Data Governance, Data Customers. DataOps Platform: Storage, Metadata, Auth, Secrets, Metrics. Also shown: a Second Cloud/On-Prem Data Center with its own Agent, and the DataOps Team.]
DataOps Spans Environments
[Diagram. Cloud/On-Prem #1: Production, Test, and Dev environments, three Agents, DataOps Pipeline, DataOps Team. Cloud/On-Prem #2: Production and Dev environments, two Agents, DataOps Pipeline. DataOps Platform: Storage & Version Control, History & Metadata, Auth & Permissions, Environment Secrets, DataOps Metrics & Reports.]
Data Fabric – A New Fashion Trend?
• It's Hot Stuff:
Gartner View, Forrester View. Top 10 downloaded report 2020, top inquiry
• What is a data fabric?:
• All the stuff you do with centralized data infrastructure:
ETL, DB, governance, store, lake, warehouse, stream/batch transformation.
• Plus, some fancy new stuff
1. AI component - magic pixie dust of self-driving data
2. Data virtualization/semantic layer
• However, it is missing other parts of the data value chain:
models, visualizations, self service. It’s more ‘hub’ than ‘spoke’
• Why? Moniker that covers the latest trends in data management.
• Caveat: The goal of implementing a data fabric is agility - agility is a second-order effect from
better tools. The primary driver is people & process following DataOps.
Data Mesh 101
Why Data Mesh?
• Centralized Systems Fail
• Skill-based roles are unable to respond to rapid
customer needs
• Data domain knowledge matters
• Universal, one size fits all patterns fail
• General Data Analytic Project Failure
• Inspired by domain driven design (DDD) in software
The main idea is to take a best practice from developing software & apply it to data analytics. (Sound familiar?)
The Human Side of a Data Mesh: Main Idea
• The organization structure builds walls & barriers to change
• When you make a change, you need to update each component & coordinate between several different teams
The organization creates walls & changes need to cross the traditional organizational boundaries
No, Data Engineers Are Not Perfectly Fungible
Data Mesh = Organization Mesh
The use of domain-driven / data mesh design as the primary means:
1. Assignment of full end-to-end ownership of a domain to one cross-functional team that gets the necessary support to fulfil that responsibility.
2. Structure data
3. Build composable systems
Data Organization Keys
Letting the small team continually own the data set, & not move from project to project, is key
‘You own the product’ thinking provides the right incentives between the producers & consumers
Source: thoughtworks.com/insights/blog/data-mesh-its-not-about-tech-its-about-ownership-and-communication
The Human Side of a Data Mesh: Main Idea
● Take the ideas of microservices where a team owns the dev, test, deploy & running of the microservice (5-9 people)
● Organize around the domain, not the technology
● The Operational & Data products are created by the same team
● Domain data as a product - domain data teams must consider their data assets & artifacts as their products & others as their customers
● Data Engineers must live, work & understand a finite number of data sets to really add value
The organization creates walls & changes need to cross the traditional organizational boundaries
What Data is in a Domain?
Domains Aligned with Sources / Types of Data
• ‘Mastered’ Data:
• Entities of business / subject areas
• Customers, products, etc.
• ‘Sources’ of Data:
• Business reality: facts on the ground
• Weblogs, user interaction history
Domains Aligned with Consumption of Data
• Integrated Data / Ready for Consumption
• Facts / Dimensions / Star Schemas
• Aggregated Views
• Product View
• Never Done, Always Improving
• Customer Usage Focus
What are the Domain’s Components?
1. Data
2. Artifacts created from that data: models, views, reports, dashboards, etc.
3. Code that acts upon that data: pipelines, toolchains, etc.
4. Team used to create/update/run that Domain
5. Metadata: catalogs, lineage, test results, processing history, etc.
Data Domain 1
Domain Must Be Composable & Controllable
[Diagram: Data Domain 1, Data Domain 2, Data Domain 3]
Domain Interfaces
Data Domain
The Where: How to find & access data securely; e.g., DB connect string
The What: Description of the data; e.g., data catalog URL
The When: Processing Results, Timing, Test Results, Status, etc.
The How: Steps, Code/Config, toolchain & processing pipeline
The With: Raw Data (or other Data Domain), hopefully immutable
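The five interfaces can be written down as a small descriptor that each domain publishes. The following is only a sketch under stated assumptions: the `DomainInterface` class, its field names, and the `example.com` URLs are hypothetical placeholders, not part of any real product or of the original deck.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class DomainInterface:
    """Hypothetical descriptor for one data domain's five interfaces."""
    name: str
    where: str   # how to find and access the data, e.g. a DB connect string
    what: str    # description of the data, e.g. a data catalog URL
    when: str    # processing results, timing, test results, status
    how: str     # steps, code/config, toolchain and processing pipeline
    with_: str   # raw data (or upstream domain) this domain is built from

    def missing(self) -> list[str]:
        """Return the names of interfaces that were left empty."""
        return [key for key, value in asdict(self).items() if not value]


physician_mdm = DomainInterface(
    name="physician-mdm",
    where="jdbc:redshift://endpoint:port/database",
    what="https://example.com/catalog/physician-mdm",   # placeholder URL
    when="https://example.com/runs/latest",             # placeholder URL
    how="https://example.com/pipelines/physician-mdm",  # placeholder URL
    with_="",  # upstream raw data not declared yet
)

print(physician_mdm.missing())
```

A check like `missing()` gives consumers a quick way to see whether a domain has declared every interface before they depend on it.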
Domain Interfaces as URLs
The How: https://cloud.datakitchen.io/#/recipes/dc/Production/agile-analytic-ops/variations/prod-env-DevSprint-build-now
The When: https://cloud.datakitchen.io/#/orders/dc/Production/runs/60e82aa8-2518-11eb-8653-c2e92ba8ebec
The Where: jdbc:redshift://endpoint:port/database
The What: https://dkimplementation.atlassian.net/wiki/spaces/DC/pages/9306114/Dimension+Tables
Data Domain
What Do You Want Out of a Domain?
A series of independent domains of data that are:
1. Trusted
2. Usable by the team’s customers
3. Discoverable / Findable
4. Understandable & well-described
5. Secure & permissioned
6. URL/API driven & able to interoperate with other domains
7. Have a ‘single throat to choke’ for the customer to easily:
• Report problems & get updates on fixes
• Ask for new insights / improvements & get them into production quickly
Data Mesh Change in Focus
1. Domains & the grouping of your work into small teams & partitions over ‘one platform to rule them all’
2. What services you are providing your customer, rather than what data you are loading
3. Discovering & using over extracting & loading
4. Decentralization & the freedom to innovate over central control
5. Ecosystem of data products linked together over a centralized lake / warehouse
An Example of Domains
US Commercial Pharma Domains
• NPP (Non-Personal Promotion): emails, web site visits, even radio ads
• Physician: doctor (& other outlets) sales, claims data, anonymized patient data
• Payer: Payer/Plan, rebates, formulary
Launch: NPP Domain
Growth: Physician Domain
Mature: Payer Domain
Commercial Pharma Analytics
What About the Data?
What about the data in each domain?
• Each domain has separate data sources
• Overlapping entities (e.g., physicians) exist in each domain
• Each domain has different cycle times of product (e.g., daily, weekly, hourly, etc.)
• Each data domain has its unique characteristics.
• For instance, subnational physician data from IQVIA - purchased by pharma companies - may not 1:1 match claims data, which may not match payer data. This is due to data supplier issues & timing projection algorithms.
Sub-national Weekly data
Sub-national Payer Data
Sub-national Institutional (DDD) Data
National Prescription Audit Data
Sales Force Alignment Data
Longitudinal Patient Data
Sub-national Profit and Loss Data
Sub-national Claims and Co-pay Data
Payer and Plan Formulary Data
Census Data
Stocking Data
Source of Business
AMA Data
Retail OTC Data
Buy and Bill Data
Field Calls and Promotional Activity Data
Rep Expenses and Vacancy Data
Hotline Verification Data
Contract and Payer Rebates Data
Veeva CRM Data
ERP Data
NPP Data
Forecast Data
Primary Research Data
Pharma Sales & Marketing Teams
NPP Domain Marketing & Sales Team
One part of the pharma brand team focused on ads, digital & other non-personal promotions. This team matters most pre-launch & during the growth phase of a product.
Physician Domain Marketing & Sales Team
Another part of the pharma team focused on in-person sales. Those are the good-looking people you see in doctors' waiting rooms. Sales calls, samples, doctor visits, messages, call alignments, etc. This team matters the most during the first years of a pharma launch.
Payer Domain Marketing & Sales Team
A third part is focused on Payer Marketing. This part is - in essence - controlling the price of a pharmaceutical product due to the rebate given to any payer. They are concerned about the rebate contract, being on formulary & tier & copays. Payer Marketing matters more during the 'mature' phase of a pharma product lifecycle.
Domain Layers
1. Mastering & small foundation files are a domain layer
There are 1M physicians in the US, but the company master of physicians is only 40K. This work is done by separate teams working independently.
2. Of course, the main data warehouse is a domain layer
There are facts & dimensions, along with multiple tables used for specific analyst needs.
3. Self-service & data science are domain layers
They can keep their own cached data sets (e.g., Tableau extract) or have their own small data sets that they mix with the central data in Alteryx (or other) tools. Data Science teams have their own segmentation models dependent on specific views or extracts of data.
Mastered Data Sets (IT)
Integrated Data Sets (Data Engineers)
Self Service Tools (Analyst)
Domain Layers
[Diagram. Layers shown: Raw, Sourced Data (Various), consisting of the data sets listed under “What About the Data?”; Mastered Data Sets (IT); Integrated Data Sets (Data Engineers); Self Service Tools (Analyst); Business Customer. Domains shown: Mastering Domain: Physician MDM; Mastering Domain: Target Lists, Product Market Baskets; Brand Team Reporting Domain; Field Sales Reporting Domain.]
Domain Layers Processing Relationships
[Diagram. Layers shown: Raw, Sourced Data (Various), consisting of the data sets listed under “What About the Data?”; Mastered Data Sets (IT); Integrated Data Sets (Data Engineers); Self Service Tools (Analyst); Business Customer. Domains shown: Mastering Domain: Physician MDM; Mastering Domain: Target Lists, Product Market Baskets; Brand Team Reporting Domain; Field Sales Reporting Domain.]
Domain Layers Processing Steps
[Diagram. Layers shown: Raw, Sourced Data (Various), consisting of the data sets listed under “What About the Data?”; Mastered Data Sets (IT); Integrated Data Sets (Data Engineers); Self Service Tools (Analyst); Business Customer. Domains shown: Mastering Domain: Physician MDM; Brand Team Reporting Domain.]
Benefits of Approach
• Yes, you can do all four of these Data Architecture Mega-Patterns for Agility!
• Benefits
• Support over $10 Billion in sales
• Integrated 100s of data sets
• Very, very few errors or missed SLAs
• > 50,000 automated tests
• > 100 schema/data changes per week
• Staff of seven data and DataOps engineers
• Low total yearly costs: hardware/hosting/software/staffing
• DataKitchen software enables those four patterns: Recipes, Tests, Kitchens and especially Ingredients can handle all the needs
Built With Functional Programming
• Start with immutable (never changing) data
• Pure functions (you put some data in & get some data out)
• Idempotency (you can run it over again & get the same thing)
• No side effects
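These four properties can be illustrated with a small task that rebuilds one date partition from raw input. This is a minimal sketch, not the deck's implementation: the function names, the JSON file layout, and the `date=` partition naming are all assumptions made for the example. The transform is a pure function, the raw file is only ever read, and the task's one write is confined to overwriting its own partition.

```python
import json
import tempfile
from pathlib import Path


def summarize(rows: list[dict]) -> dict:
    """Pure function: the output depends only on the input rows."""
    total = sum(row["units"] for row in rows)
    return {"row_count": len(rows), "total_units": total}


def run_task(raw_dir: Path, out_dir: Path, run_date: str) -> Path:
    """Idempotent task: rebuild the whole partition for run_date from raw data."""
    # The raw file is treated as immutable: it is read, never modified.
    rows = json.loads((raw_dir / f"{run_date}.json").read_text())
    target = out_dir / f"date={run_date}" / "summary.json"
    target.parent.mkdir(parents=True, exist_ok=True)
    # Overwrite rather than append, so a rerun replaces the partition.
    target.write_text(json.dumps(summarize(rows), sort_keys=True))
    return target


with tempfile.TemporaryDirectory() as tmp:
    raw, out = Path(tmp) / "raw", Path(tmp) / "out"
    raw.mkdir()
    (raw / "2021-06-01.json").write_text(json.dumps([{"units": 3}, {"units": 4}]))
    first = run_task(raw, out, "2021-06-01").read_text()
    second = run_task(raw, out, "2021-06-01").read_text()
    print(first == second)
```

Because the task overwrites its partition instead of appending to it, running it twice for the same date leaves exactly one copy of the result.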
Functional Approach Benefits
Reproducibility
• Foundational to the scientific method and data science / AI
• Critical from a legal standpoint and sanity standpoint
Complexity Reduction
Cloud Native
• Storage and compute are cheap
Faster Time To Value
Functional Data Mesh Systems
Think of all your data & analytic work as a “Big Function” in a domain
• In that function are your data & AI toolchain
• Everybody works that function (whether they know it or not!)
• Re-running a task for the same date should always produce the same output
• Data can be repaired by rerunning the new code
• A ‘big red/green light’ on the system telling you everything is OK
[Diagram: Production Data, Data Domain, Analytic Customers. Production Team: “Yeah! All my tests & monitors are passing!” Happy Customers!]
Functional Data Systems Are Easier to Test & Deploy
[Diagram: a Development Team with a Data Domain on Test Data; a Production Team with a Data Domain on Production Data.]
Yeah! All my tests & monitors are passing!
I did not break any code!
I can safely push to production!
A safe controlled process
Just flip the DNS entry for the production URL!
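The "flip the DNS entry" step can be sketched as a pointer swap that is gated on test results. This is an illustration under stated assumptions: a plain dictionary stands in for the DNS record, and `promote`, the build names, and the test names are hypothetical.

```python
def promote(alias: dict, candidate: str, test_results: dict) -> bool:
    """Point the production alias at the candidate build only if all tests pass."""
    if not test_results or not all(test_results.values()):
        return False  # leave production untouched
    alias["previous"] = alias.get("production")  # kept so a rollback is one more flip
    alias["production"] = candidate
    return True


alias = {"production": "domain-build-41"}
ok = promote(alias, "domain-build-42", {"row_count": True, "schema": True})
blocked = promote(alias, "domain-build-43", {"row_count": True, "schema": False})
print(ok, blocked, alias["production"])
```

Since each build is complete and independent, promotion changes only the pointer, and the previous build stays available for rollback.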
Agenda
Four Data Architecture Mega-Patterns for Agility
1. DataOps
2. Data Fabric
3. Data Mesh
4. Functional Data Engineering
An Example that Combines all Four Patterns
Why DataKitchen supports these four patterns easily!
Domain Layers Processing Relationships
How do we update the data?
• Each domain layer has its own update processing
• Each layer has its own toolchain (e.g., SQL, Python, Informatica, etc.)
• Each layer has a series of sub-steps (i.e., a ‘DAG’)
• Each layer wants to know if the build is completed, the tests applied & if the data is correct
What causes the update of each domain?
• Time / Schedule
• Order of operations: a meta-orchestrated coupling of each domain, where one part may need to be done before or after another.
• Event-orchestrated coupling: when new data arrives, kick off a change.
You Need a ‘Master DAG’ to run them all
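A ‘Master DAG’ of this kind can be sketched with the Python standard library's `graphlib`. The domain names come from the slides, but the dependency edges, the `run_domain` stand-in, and the rule that a domain is skipped when an upstream domain has not passed are assumptions made for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical dependencies: each domain maps to the domains it must wait for.
depends_on = {
    "physician_mdm": set(),
    "target_lists": {"physician_mdm"},
    "brand_team_reporting": {"physician_mdm", "target_lists"},
    "field_sales_reporting": {"physician_mdm"},
}


def run_domain(name: str) -> bool:
    """Stand-in for a domain's own DAG plus its tests; True means tests passed."""
    return name != "target_lists"  # simulate one failing domain


def run_master_dag(graph: dict) -> dict:
    """Run domains in dependency order, skipping any whose upstream did not pass."""
    status = {}
    for name in TopologicalSorter(graph).static_order():
        blockers = [dep for dep in graph[name] if status[dep] != "passed"]
        if blockers:
            status[name] = "skipped"
        else:
            status[name] = "passed" if run_domain(name) else "failed"
    return status


status = run_master_dag(depends_on)
print(status)
```

Gating each domain on the test status of its upstream domains, not just on their completion, is what keeps a bad mastering run from flowing into downstream reports.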
Inter-Domain Communication Links
Field Sales Reporting Domain
Inter-Domain Communication: Question / Steps Asked
Domain Query: “When was the last time you were updated? Successful or failure? Warnings?”
Domain Query: “Is the data or artifacts in your domain good? Can you prove it with some test results?”
Process Linkage: “Ok, you start. I am done.”
Process Linkage: “Ok, you start. I am done & here are a bunch of parameters you need to keep going.”
Event Linkage: “Here is an event: e.g., processing completed, error, warnings, etc.”
Data Linkage: “We share a common table (e.g., a dimension table) in our domain.”
Development Linkage: “Can I re-create your domain in development?” “Can I see the code you used to create it?” “Can I modify that code in development?” “Is there a path to production?”
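Two of these links, the domain query and the process linkage with parameters, can be sketched as follows. Everything here is hypothetical: `DomainStatus`, `hand_off`, and the event dictionaries are illustrative names, not an API from DataKitchen or any other tool.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class DomainStatus:
    """Hypothetical record a domain exposes about its most recent run."""
    last_updated: datetime
    outcome: str  # "success", "failure", or "warning"
    test_results: dict = field(default_factory=dict)

    def is_good(self) -> bool:
        """Domain query: is the data good, and do the test results prove it?"""
        return (
            self.outcome == "success"
            and bool(self.test_results)
            and all(self.test_results.values())
        )


def hand_off(upstream: DomainStatus, start_downstream, params: dict) -> dict:
    """Process linkage: 'I am done, you start', passing parameters along."""
    if not upstream.is_good():
        return {"event": "error", "reason": "upstream tests not passing"}
    return start_downstream(**params)


status = DomainStatus(
    last_updated=datetime(2021, 6, 1, tzinfo=timezone.utc),
    outcome="success",
    test_results={"row_count_matches": True, "no_null_keys": True},
)
result = hand_off(
    status,
    lambda run_date: {"event": "started", "run_date": run_date},
    {"run_date": "2021-06-01"},
)
print(result)
```

Requiring non-empty test results means a domain that ran without any tests is not treated as proven good.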
DataKitchen Supported Inter-Domain Communication Links
Field Sales Reporting Domain
Inter-Domain Communication: DataKitchen Support
Domain Query: YES
Domain Query: YES
Process Linkage: YES
Process Linkage: YES
Event Linkage: YES
Data Linkage: NO
Development Linkage: YES
Domain Development Process
The development process is essential.
• Code changes or new data sets may affect downstream parts of the mesh.
• DataKitchen encapsulates the development & production environments
Key Questions
• How does a developer change one part & not break things?
• How do you allow local change to a domain & global governance & control?
[Diagram: Production Domains and Development of Domains, each containing Mastering Domain: Physician MDM and Brand Team Reporting Domain. Callout: “How do I change this part & not break things?”]
DataKitchen Software's Role (Recipes)
DataKitchen DataOps Capability
Intelligent, test-informed, system-wide production orchestration (meta-orchestration)
What workflow tools like Airflow, Control-M, or Azure Data Factory do not have
• Integrated Production Testing & Monitoring
• A set of connectors to the complex chain of data engineering, science, analytics, self-service, governance & database tools.
• DataKitchen Recipes Meta-Orchestration or a ‘DAG of DAGs’
[Diagram: DataKitchen Recipe; Mastering Domain: Physician MDM; Brand Team Reporting Domain]
DataKitchen Domain Interfaces As URLs
Data Domain
The How: DataKitchen Recipe
https://cloud.datakitchen.io/#/recipes/dc/Production/agile-analytic-ops/variations/prod-env-DevSprint-build-now
The When: DataKitchen OrderRun information
https://cloud.datakitchen.io/#/orders/dc/Production/runs/60e82aa8-2518-11eb-8653-c2e92ba8ebec
DataKitchen Ingredients Allow Composition
• DataKitchen Ingredients allow reusable components that can be incorporated into other processing
• Each domain can change independently, with a centralized process to make sure the entire system is correct
• While DataKitchen Kitchens let people work independently, Ingredients let people work dependently:
• Recipes can reuse the data or artifacts that other Recipe Variations produce
• Recipes need to incorporate other Recipe Variations when they run
Conclusion
Data Fabric, Data Mesh, and Functional Data Engineering are exciting new paradigms
However, the DataOps part is of paramount importance!
• The lineages & composition between domains are important
• Managing central process control & governance with local domain independence is very important
DataKitchen Features (e.g., Recipes, Tests, Kitchens & Ingredients) can handle all the needs of the DataOps part of the mesh
Accelerate These Patterns With DataKitchen Software
DataKitchen DataOps Software Platform
that delivers new business insights by
enabling the development and
deployment of innovative, high quality
data analytic pipelines. Rapidly
Learn More!
Sign The DataOps Manifesto:
http://dataopsmanifesto.org
Free DataOps Cookbook:
https://datakitchen.io/the-dataops-cookbook/
Free DataOps Transformation Book
https://datakitchen.io/recipes-for-dataops-success-guide-to-dataops-transformation/
 
Why Data Virtualization? An Introduction
Why Data Virtualization? An IntroductionWhy Data Virtualization? An Introduction
Why Data Virtualization? An Introduction
 
2022 Trends in Enterprise Analytics
2022 Trends in Enterprise Analytics2022 Trends in Enterprise Analytics
2022 Trends in Enterprise Analytics
 
Future of Data Strategy (ASEAN)
Future of Data Strategy (ASEAN)Future of Data Strategy (ASEAN)
Future of Data Strategy (ASEAN)
 
Qo Introduction V2
Qo Introduction V2Qo Introduction V2
Qo Introduction V2
 
Veritas + MongoDB
Veritas + MongoDBVeritas + MongoDB
Veritas + MongoDB
 

Más de DATAVERSITY

The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
DATAVERSITY
 
Data Strategy Best Practices
Data Strategy Best PracticesData Strategy Best Practices
Data Strategy Best Practices
DATAVERSITY
 

Más de DATAVERSITY (20)

Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
 
Data at the Speed of Business with Data Mastering and Governance
Data at the Speed of Business with Data Mastering and GovernanceData at the Speed of Business with Data Mastering and Governance
Data at the Speed of Business with Data Mastering and Governance
 
Exploring Levels of Data Literacy
Exploring Levels of Data LiteracyExploring Levels of Data Literacy
Exploring Levels of Data Literacy
 
Building a Data Strategy – Practical Steps for Aligning with Business Goals
Building a Data Strategy – Practical Steps for Aligning with Business GoalsBuilding a Data Strategy – Practical Steps for Aligning with Business Goals
Building a Data Strategy – Practical Steps for Aligning with Business Goals
 
Make Data Work for You
Make Data Work for YouMake Data Work for You
Make Data Work for You
 
Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?
 
Data Catalogs Are the Answer – What Is the Question?
Data Catalogs Are the Answer – What Is the Question?Data Catalogs Are the Answer – What Is the Question?
Data Catalogs Are the Answer – What Is the Question?
 
Data Modeling Fundamentals
Data Modeling FundamentalsData Modeling Fundamentals
Data Modeling Fundamentals
 
Showing ROI for Your Analytic Project
Showing ROI for Your Analytic ProjectShowing ROI for Your Analytic Project
Showing ROI for Your Analytic Project
 
How a Semantic Layer Makes Data Mesh Work at Scale
How a Semantic Layer Makes  Data Mesh Work at ScaleHow a Semantic Layer Makes  Data Mesh Work at Scale
How a Semantic Layer Makes Data Mesh Work at Scale
 
Is Enterprise Data Literacy Possible?
Is Enterprise Data Literacy Possible?Is Enterprise Data Literacy Possible?
Is Enterprise Data Literacy Possible?
 
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
 
Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?
 
Data Governance Trends - A Look Backwards and Forwards
Data Governance Trends - A Look Backwards and ForwardsData Governance Trends - A Look Backwards and Forwards
Data Governance Trends - A Look Backwards and Forwards
 
Data Governance Trends and Best Practices To Implement Today
Data Governance Trends and Best Practices To Implement TodayData Governance Trends and Best Practices To Implement Today
Data Governance Trends and Best Practices To Implement Today
 
2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics
 
Data Strategy Best Practices
Data Strategy Best PracticesData Strategy Best Practices
Data Strategy Best Practices
 
Who Should Own Data Governance – IT or Business?
Who Should Own Data Governance – IT or Business?Who Should Own Data Governance – IT or Business?
Who Should Own Data Governance – IT or Business?
 
Data Management Best Practices
Data Management Best PracticesData Management Best Practices
Data Management Best Practices
 
MLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive AdvantageMLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive Advantage
 

Último

Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
gajnagarg
 
Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...
Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...
Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...
HyderabadDolls
 
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
gajnagarg
 
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
Health
 
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
nirzagarg
 
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi ArabiaIn Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
ahmedjiabur940
 
Computer science Sql cheat sheet.pdf.pdf
Computer science Sql cheat sheet.pdf.pdfComputer science Sql cheat sheet.pdf.pdf
Computer science Sql cheat sheet.pdf.pdf
SayantanBiswas37
 
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Klinik kandungan
 
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
HyderabadDolls
 
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
nirzagarg
 
Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...
Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...
Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...
HyderabadDolls
 

Último (20)

Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
 
Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...
Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...
Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...
 
Digital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham WareDigital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham Ware
 
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
 
20240412-SmartCityIndex-2024-Full-Report.pdf
20240412-SmartCityIndex-2024-Full-Report.pdf20240412-SmartCityIndex-2024-Full-Report.pdf
20240412-SmartCityIndex-2024-Full-Report.pdf
 
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book nowVadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
 
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptxRESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
 
Fun all Day Call Girls in Jaipur 9332606886 High Profile Call Girls You Ca...
Fun all Day Call Girls in Jaipur   9332606886  High Profile Call Girls You Ca...Fun all Day Call Girls in Jaipur   9332606886  High Profile Call Girls You Ca...
Fun all Day Call Girls in Jaipur 9332606886 High Profile Call Girls You Ca...
 
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
 
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
 
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi ArabiaIn Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
 
Computer science Sql cheat sheet.pdf.pdf
Computer science Sql cheat sheet.pdf.pdfComputer science Sql cheat sheet.pdf.pdf
Computer science Sql cheat sheet.pdf.pdf
 
Kings of Saudi Arabia, information about them
Kings of Saudi Arabia, information about themKings of Saudi Arabia, information about them
Kings of Saudi Arabia, information about them
 
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
 
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
 
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
 
Statistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numbersStatistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numbers
 
Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...
Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...
Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...
 

DataOps - The Foundation for Your Agile Data Architecture

  • 1. Four Data Architecture Mega-Patterns for Agility
  • 2. Copyright 2021 by DataKitchen, Inc. All Rights Reserved. Agenda: Four Data Architecture Mega-Patterns for Agility: 1. DataOps 2. Data Fabric 3. Data Mesh 4. Functional Data Engineering. An Example that Combines all Four Patterns. Conclusion and More Information.
  • 3. Our Focus Is The River Of Work Right In Front Of Us • The Model, • The Algorithm, • The Data Pipeline, • The Data Visualization, • The Governance, • The Data Itself. What is my next task?
  • 4. Next Task Focus Is Making Us Blind To Failure • The Model, • The Algorithm, • The Data Pipeline, • The Data Visualization, • The Governance, • The Data Itself. Task Focus Not Working
  • 5. Look Upstream At The Source Of The Problem • Develop • Deploy • Iterate • Monitor • Test • Collaborate. How You Do It
  • 6. How? Focus On Four Key Upstream Processes • Decrease The Cycle Time: Continuously Deploy Innovation • Lower Error Rates: Increasing Customer Data Trust • Improve Collaboration: Less Meetings & Bureaucracy • Measure Your Team: And show everyone your success
  • 7. DataOps Aligns People, Processes, and Technology • Rapid experimentation and innovation enables faster delivery • Low error rates • Collaboration across complex sets of people, technology, and environments • Clear measurement and monitoring of results
  • 8. Agenda: What Problems Do We Need To Solve With Architecture for AI and Data Analytics? Four Data Architecture Mega-Patterns for Agility: 1. DataOps 2. Data Fabric 3. Data Mesh 4. Functional Data Engineering. An Example that Combines all Four Patterns. Conclusion and More Information.
  • 9. Copyright 2021 by DataKitchen, Inc. All Rights Reserved.
  • 10. Gartner Data Fabric: “Data fabric focuses on composability, allowing users to build a flexible, agile, scalable architecture that will be able to supply data to humans or machine users. Data fabric is a design concept, not just a set of technology components.”
  • 11. Data Fabric Toolchain Elements Store: Transform: SQL Code, ETL Govern: Catalog
  • 12. Data Fabric Toolchain Elements Store: Transform: SQL Code, ETL Virtualize: layer Govern: Catalog Includes Data Virtualization in Reference Fabric Design Includes Data Streaming in Reference Fabric Design
  • 13. Data Fabric: Beware Magic of ‘AI Inside’ Store: Transform: SQL Code, ETL Virtualize: layer Govern: Catalog AI AI AI AI Magic AI: Danger Will Robinson
  • 14. Data Fabric: Beware Magic of ‘AI Inside’ Think of ‘AI Inside’ of Data Fabric like autonomous driving: • Level 1: Simple, keep your hands on the wheel • Level 5: Cross Boston, in the snow, at night. We are at Level 1 of AI in the Data Fabric.
  • 15. AI + New Tools Agility
  • 16. People & Tools in a DataOps Architecture Agility AI + New Tools
  • 17. Canonical ‘Factory’ Data Architecture / Fabric
  • 18. DataOps Functional Architecture Cloud/On-Prem Production Environment Test Dev Source Data Data Customers Raw Lake Data Engineering Refined Data Data Science Data Viz. Data Governance Orchestrate, Monitor, Test Orchestrate, Monitor, Test Orchestrate, Monitor, Test DataOps Platform Storage & Version Control History & Metadata Auth & Permissions Environment Secrets DataOps Metrics & Reports Automated Deployment Environment Creation and Management DataOps Team Second Cloud/On-Prem Data Center
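
The "Orchestrate, Monitor, Test" label that slide 18 attaches to each stage can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Step` class, `run_pipeline` function, the stage names, and the sample rows are hypothetical and are not DataKitchen's API. The idea shown is that every stage runs its transform and then its own tests, and the run stops before bad data reaches the next stage.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple

Rows = List[Dict[str, Any]]


@dataclass
class Step:
    """One pipeline stage: a transform plus the tests that guard its output."""
    name: str
    transform: Callable[[Rows], Rows]
    tests: List[Callable[[Rows], bool]] = field(default_factory=list)


def run_pipeline(steps: List[Step], rows: Rows) -> Tuple[Rows, List[Tuple[str, str, bool]]]:
    """Orchestrate the steps in order, record every test result, stop on failure."""
    log: List[Tuple[str, str, bool]] = []
    for step in steps:
        rows = step.transform(rows)
        for test in step.tests:
            passed = test(rows)
            log.append((step.name, test.__name__, passed))  # the "monitor" part
            if not passed:
                raise ValueError(f"{step.name}: test {test.__name__} failed")
    return rows, log


# Hypothetical tests and transforms, for illustration only.
def has_rows(rows: Rows) -> bool:
    return len(rows) > 0


def no_null_ids(rows: Rows) -> bool:
    return all(r.get("id") is not None for r in rows)


def drop_null_ids(rows: Rows) -> Rows:
    return [r for r in rows if r.get("id") is not None]


def amounts_to_float(rows: Rows) -> Rows:
    return [{**r, "amount": float(r["amount"])} for r in rows]


raw = [{"id": 1, "amount": "10.5"}, {"id": None, "amount": "3"}]
steps = [
    Step("raw_lake", drop_null_ids, [has_rows, no_null_ids]),
    Step("refined_data", amounts_to_float, [has_rows]),
]
refined, log = run_pipeline(steps, raw)
```

The same `steps` list could be run unchanged in Dev, Test, and Production, which is the point of keeping the tests attached to the stage rather than to an environment.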
  • 19. DataOps Physical Architecture Cloud/On-Prem Data Center Production Environment Test Dev Source Data Data Customers Agent Agent Agent DataOps Platform Storage Metadata Auth Secrets Metrics Raw Lake Data Engineering Refined Data Data Science Data Viz. Data Governance Second Cloud/On-Prem Data Center Agent DataOps Team
  • 20. DataOps Spans Environments Cloud/On-Prem #1 Production Environment Test Dev Agent Agent Agent DataOps Team DataOps Pipeline Cloud/On-Prem #2 Production Environment Dev Agent Agent DataOps Pipeline DataOps Platform Storage & Version Control History & Metadata Auth & Permissions Environment Secrets DataOps Metrics & Reports
  • 21. Data Fabric – A New Fashion Trend? • It's Hot Stuff: Gartner View, Forrester View. Top 10 downloaded report 2020, top inquiry • What is a data fabric? • All the stuff you do with centralized data infrastructure: ETL, DB, governance, store, lake, warehouse, stream/batch transformation. • Plus, some fancy new stuff: 1. AI component - magic pixie dust of self-driving data 2. Data virtualization/semantic layer • However, it is missing other parts of the data value chain: models, visualizations, self service. It’s more ‘hub’ than ‘spoke’ • Why? It is a moniker that covers the latest trends in data management. • Caveat: The goal of implementing a data fabric is agility - agility is a second-order effect from better tools. The primary driver is people & process following DataOps.
  • 22. Agenda: Four Data Architecture Mega-Patterns for Agility: 1. DataOps 2. Data Fabric 3. Data Mesh 4. Functional Data Engineering. An Example that Combines all Four Patterns. Conclusion and More Information.
  • 23. Data Mesh 101: Why Data Mesh? • Centralized Systems Fail • Skill-based roles are unable to respond to rapid customer needs • Data domain knowledge matters • Universal, one-size-fits-all patterns fail • General Data Analytic Project Failure • Inspired by domain-driven design (DDD) in software. The main idea is to take best practices from software development & apply them to data analytics. (Sound familiar?)
  • 24. The Human Side of a Data Mesh: Main Idea • The organization structure builds walls & barriers to the changes • When you make a change, you need to update each component & coordinate between several different teams. The organization creates walls & changes need to cross the traditional organizational boundaries
  • 25. No, Data Engineers Are Not Perfectly Fungible. Data Mesh = Organization Mesh. The use of domain-driven / data mesh design as the primary means: 1. Assignment of full end-to-end ownership of a domain to one cross-functional team that gets the necessary support to fulfil that responsibility. 2. Structure data 3. Build composable systems. Data Organization Keys: Letting a small team continually own the data set, rather than moving from project to project, is key. ‘You own the product’ thinking provides the right incentives between the producers & consumers. Source: thoughtworks.com/insights/blog/data-mesh-its-not-about-tech-its-about-ownership-and-communication
  • 26. The Human Side of a Data Mesh: Main Idea ● Take the ideas of microservices where a team owns the dev, test, deploy & running of the microservice (5-9 people) ● Organize around the domain, not the technology ● The Operational & Data products are created by the same team ● Domain data as a product - domain data teams must consider their data assets & artifacts as their products & others as their customers ● Data Engineers must live, work & understand a finite number of data sets to really add value. The organization creates walls & changes need to cross the traditional organizational boundaries
  • 27. What Data is in a Domain? Domains Aligned with Sources / Types of Data • ‘Mastered’ Data: • Entities of business / subject areas • Customers, products, etc. • ‘Sources’ of Data: • Business reality: facts on the ground • Weblogs, user interaction history. Domains Aligned with Consumption of Data • Integrated Data / Ready for Consumption • Facts / Dimensions / Star Schemas • Aggregated Views • Product View • Never Done, Always Improving • Customer Usage Focus
  • 28. What are the Domain’s Components? 1. Data 2. Artifacts created from that data: models, views, reports, dashboards, etc. 3. Code that acts upon that data: pipelines, toolchains, etc. 4. Team used to create/update/run that Domain 5. Metadata: catalogs, lineage, test results, processing history, etc. Data Domain 1
  • 29. Domain Must Be Composable & Controllable Data Domain 1 Data Domain 2 Data Domain 3
  • 30. Domain Interfaces (Data Domain) • The Where: How to find & access data securely; e.g., DB connect string • The What: Description of the data; e.g., data catalog URL • The When: Processing Results, Timing, Test Results, Status, etc. • The How: Steps, Code/Config, toolchain & processing pipeline • The With: Raw Data (or other Data Domain), hopefully immutable
  • 31. Domain Interfaces as URLs (Data Domain): https://cloud.datakitchen.io/#/recipes/dc/Production/agile-analytic-ops/variations/prod-env-DevSprint-build-now https://cloud.datakitchen.io/#/orders/dc/Production/runs/60e82aa8-2518-11eb-8653-c2e92ba8ebec jdbc:redshift://endpoint:port/database https://dkimplementation.atlassian.net/wiki/spaces/DC/pages/9306114/Dimension+Tables
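
One way to picture slides 30 and 31 together is a small record that holds one address per interface. The sketch below is an assumption-laden illustration: the `DomainInterface` class is hypothetical, the pairing of each slide URL with a particular interface is a guess (the slide does not label them), and the `with_` value is a made-up placeholder because the slide shows no URL for "The With".

```python
from dataclasses import asdict, dataclass
from typing import List


@dataclass(frozen=True)
class DomainInterface:
    """Hypothetical descriptor: one address per interface named on slide 30."""
    where: str  # how to find & access the data securely (e.g. a DB connect string)
    what: str   # description of the data (e.g. a data catalog or wiki URL)
    when: str   # processing results, timing, test results, status
    how: str    # steps, code/config, toolchain & processing pipeline
    with_: str  # raw data (or another data domain) this domain is built from

    def missing(self) -> List[str]:
        """Names of interfaces that have no address yet."""
        return [name for name, value in asdict(self).items() if not value]


# URLs copied from slide 31; which URL serves which interface is an assumption.
physician_domain = DomainInterface(
    where="jdbc:redshift://endpoint:port/database",
    what="https://dkimplementation.atlassian.net/wiki/spaces/DC/pages/9306114/Dimension+Tables",
    when="https://cloud.datakitchen.io/#/orders/dc/Production/runs/60e82aa8-2518-11eb-8653-c2e92ba8ebec",
    how="https://cloud.datakitchen.io/#/recipes/dc/Production/agile-analytic-ops/variations/prod-env-DevSprint-build-now",
    with_="raw-data-location-placeholder",  # hypothetical, not from the slides
)

# A domain that has not published its run status or pipeline yet.
draft_domain = DomainInterface(
    where="jdbc:redshift://endpoint:port/database",
    what="",
    when="",
    how="",
    with_="raw-data-location-placeholder",
)
```

A check such as `draft_domain.missing()` gives a consumer or a governance job a quick way to see which parts of a domain are not yet discoverable.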
  • 32. What Do You Want Out of a Domain? A series of independent domains of data that are: 1. Trusted 2. Usable by the team’s customers 3. Discoverable / Findable 4. Understandable & well-described 5. Secure & permissioned 6. URL/API Driven: & can inter-operate with other domains 7. Have a ‘single throat to choke’ for the customer to easily: • Report problems & get updates on fixes • Ask for new insights / improvements & get them into production quickly
  • 33. Data Mesh Change in Focus 1. Domains & the grouping of your work into small teams & partitions over ‘one platform to rule them all’ 2. What services you are providing your customer, rather than what data you are loading 3. Discovering & using over extracting & loading 4. Decentralization & the freedom to innovate over central control 5. Ecosystem of data products linked together over a centralized lake / warehouse
  • 34. An Example of Domains: US Commercial Pharma Domains • NPP (Non-Personal Promotion): emails, web site visits, even radio ads • Physician: doctor (& other outlets) sales, claims data, anonymized patient data • Payer: Payer/Plan, rebates, formulary. Launch: NPP Domain Growth: Physician Domain Mature: Payer Domain Commercial Pharma Analytics
  • 35. What About the Data? What about the data in each domain? • Each domain has separate data sources • Overlapping entities (e.g., physicians) exist in each domain • Each domain has different cycle times of product (i.e., daily, weekly, hourly, etc.) • Each data domain has its unique characteristics. • For instance, subnational physician data from IQVIA - purchased by pharma companies - may not 1:1 match claims data, which may not match payer data. This is due to data supplier issues & timing projection algorithms. Sub-national Weekly data Sub-national Payer Data Sub-national Institutional (DDD) Data National Prescription Audit Data Sales Force Alignment Data Longitudinal Patient Data Sub-national Profit and Loss Data Sub-national Claims and Co-pay Data Payer and Plan Formulary Data Census Data Stocking Data Source of Business AMA Data Retail OTC Data Buy and Bill Data Field Calls and Promotional Activity Data Rep Expenses and Vacancy Data Hotline Verification Data Contract and Payer Rebates Data Veeva CRM Data ERP Data NPP Data Forecast Data Primary Research Data
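
The point on slide 35 that overlapping entities may not match 1:1 across domains can be made concrete with a small set comparison. This is a minimal sketch, assuming each domain can export the physician identifiers it knows about; the `reconcile` function and the IDs are hypothetical and do not come from the slides.

```python
from typing import Dict, Iterable, List


def reconcile(left_ids: Iterable[str], right_ids: Iterable[str]) -> Dict[str, List[str]]:
    """Split two domains' entity IDs into matched and unmatched groups."""
    left, right = set(left_ids), set(right_ids)
    return {
        "matched": sorted(left & right),
        "only_left": sorted(left - right),
        "only_right": sorted(right - left),
    }


# Hypothetical physician IDs as seen by two different domains.
physician_domain_ids = ["P001", "P002", "P003", "P004"]
claims_domain_ids = ["P002", "P003", "P005"]

report = reconcile(physician_domain_ids, claims_domain_ids)
match_rate = len(report["matched"]) / len(set(physician_domain_ids))
```

Tracking a figure like `match_rate` on every load is one simple way for a domain team to notice when a supplier change makes the overlap drift.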
  • 36. Pharma Sales & Marketing Teams. NPP Domain Marketing & Sales Team: One part of the pharma brand team focused on ads, digital & other non-personal promotions. This team matters most pre-launch & during the growth phase of a product. Physician Domain Marketing & Sales Team: Another part of the pharma team focused on in-person sales. Those are the good-looking people you see in doctors' waiting rooms. Sales calls, samples, doctor visits, messages, call alignments, etc. This team matters the most during the first years of a pharma launch. Payer Domain Marketing & Sales Team: A third part is focused on Payer Marketing. This part is - in essence - controlling the price of a pharmaceutical product due to the rebate given to any payer. They are concerned about the rebate contract, being on formulary & tier & copays. Payer Marketing matters more during the 'mature' phase of a pharma product lifecycle.
  • 37. Domain Layers 1. Mastering & small foundation files are a domain layer. There are 1M physicians in the US, but the company master of physicians is only 40K. This work is done by separate teams working independently. 2. Of course, the main data warehouse is a domain layer. There are facts & dimensions, along with multiple tables used for the specific analyses needed. 3. Self-Service & Data Science are domain layers. They can keep their own cached data sets (e.g., Tableau extract) or have their own small data sets that they mix with the central data in Alteryx (or other) tools. Data Science teams have their own segmentation models dependent on specific views or extracts of data. Mastered Data Sets (IT) Integrated Data Sets (Data Engineers) Self Service Tools (Analyst)
  • 38. Domain Layers Sub-national Weekly data Sub-national Payer Data Sub-national Institutional (DDD) Data National Prescription Audit Data Sales Force Alignment Data Longitudinal Patient Data Sub-national Profit and Loss Data Sub-national Claims and Co-pay Data Payer and Plan Formulary Data Census Data Stocking Data Source of Business AMA Data Retail OTC Data Buy and Bill Data Field Calls and Promotional Activity Data Rep Expenses and Vacancy Data Hotline Verification Data Contract and Payer Rebates Data Veeva CRM Data ERP Data NPP Data Forecast Data Primary Research Data Mastering Domain: Physician MDM Mastering Domain: Target Lists, Product Market Baskets Brand Team Reporting Domain Field Sales Reporting Domain Raw, Sourced Data (Various) Mastered Data Sets (IT) Integrated Data Sets (Data Engineers) Self Service Tools (Analyst) Business Customer
  • 39. Domain Layers Processing Relationships. Diagram: the same data sources and layers as slide 38 (Raw, Sourced Data (Various); Mastered Data Sets (IT); Integrated Data Sets (Data Engineers); Self Service Tools (Analyst); Business Customer), with Mastering Domain: Physician MDM; Mastering Domain: Target Lists, Product Market Baskets; Brand Team Reporting Domain; Field Sales Reporting Domain.
  • 40. Domain Layers Processing Steps. Diagram: the same data sources and layers as slide 38 (Raw, Sourced Data (Various); Mastered Data Sets (IT); Integrated Data Sets (Data Engineers); Self Service Tools (Analyst); Business Customer), showing Mastering Domain: Physician MDM and Brand Team Reporting Domain.
  • 41. Benefits of Approach • Yes, you can do all four of these Data Architecture Mega-Patterns for Agility! • Benefits • Support over $10 Billion in sales • Integrated 100s of data sets • Very, very few errors or missed SLAs • > 50,000 automated tests • > 100 schema/data changes per week • Staff of seven data and DataOps engineers • Low total yearly costs (hardware/hosting/software/staffing) • DataKitchen software enables those four patterns: Recipes, Tests, Kitchens and especially Ingredients can handle all the needs
  • 42. Agenda Four Data Architecture Mega-Patterns for Agility 1. DataOps 2. Data Fabric 3. Data Mesh 4. Functional Data Engineering An Example that Combines all Four Patterns Conclusion and More Information
  • 43. Built With Functional Programming • Start with immutable (never changing) data • Pure functions (you put some data in & get some data out) • Idempotency (you can run it over again & get the same thing) • No side effects
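The four properties on this slide can be illustrated with a minimal Python sketch. The `SalesRow` type and `total_units_by_physician` function are hypothetical names invented for this example; they are not from the deck or from DataKitchen software.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)  # frozen=True makes each row immutable after creation
class SalesRow:
    physician_id: str  # hypothetical field names, for illustration only
    units: int


def total_units_by_physician(rows: Tuple[SalesRow, ...]) -> Dict[str, int]:
    """Pure function: reads its input and returns a new dict, nothing else."""
    totals: Dict[str, int] = {}
    for row in rows:
        totals[row.physician_id] = totals.get(row.physician_id, 0) + row.units
    return totals


rows = (SalesRow("p1", 3), SalesRow("p2", 5), SalesRow("p1", 4))
first = total_units_by_physician(rows)
second = total_units_by_physician(rows)  # rerun on the same input
```

Because the input is an immutable tuple of frozen rows and the function writes only to its own local dict, running it twice gives equal results and leaves `rows` unchanged.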
  • 44. Functional Approach Benefits: Reproducibility • Foundational to the scientific method and data science / AI • Critical from a legal standpoint and sanity standpoint. Complexity Reduction. Cloud Native • Storage and compute are cheap. Faster Time To Value.
  • 45. Functional Data Mesh Systems. Think of all your data & analytic work as a “Big Function” in a domain • In that function are your data & AI toolchain • Everybody works on that function (whether they know it or not!) • Re-running a task for the same date should always produce the same output • Data can be repaired by rerunning the new code • A ‘big red/green light’ on the system tells you everything is OK. Diagram: a Data Domain takes in Production Data and serves Analytic Customers (“Happy Customers!”); the Production Team says, “Yeah! All my tests & monitors are passing!”
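One common way to get "same date, same output" is to rebuild and overwrite a whole date partition on every run instead of appending to it. The sketch below assumes an in-memory dict standing in for partitioned storage; `rebuild_partition`, `buggy_transform`, and `fixed_transform` are hypothetical names, and the "bug" is invented purely to show a repair.

```python
from datetime import date
from typing import Callable, Dict, List

Row = Dict[str, int]


def rebuild_partition(
    store: Dict[date, List[Row]],
    run_date: date,
    raw_rows: List[Row],
    transform: Callable[[Row], Row],
) -> None:
    """Recompute the whole partition for run_date and overwrite it (never append)."""
    store[run_date] = [transform(row) for row in raw_rows]


def buggy_transform(row: Row) -> Row:
    return {"trx": row["trx"] * 2}  # hypothetical bug: doubles every count


def fixed_transform(row: Row) -> Row:
    return {"trx": row["trx"]}


raw = [{"trx": 10}, {"trx": 20}]  # immutable raw input for one day (assumed)
store: Dict[date, List[Row]] = {}
day = date(2021, 6, 1)

rebuild_partition(store, day, raw, buggy_transform)
rebuild_partition(store, day, raw, buggy_transform)  # rerun: no duplicate rows
after_rerun = list(store[day])

rebuild_partition(store, day, raw, fixed_transform)  # repair by rerunning new code
after_repair = list(store[day])
```

The raw rows are never modified, so the corrected code can be replayed over any past date to replace a bad partition.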
  • 46. Functional Data Systems Are Easier to Test & Deploy. A safe, controlled process. Development Team (Test Data, Data Domain): “Yeah! All my tests & monitors are passing! I did not break any code! I can safely push to production!” Production Team (Production Data, Data Domain). Just flip the DNS entry for the production URL!
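The "flip the DNS entry" step can be thought of as a test-gated pointer switch. The sketch below models the DNS entry as a plain dict key; `promote_if_green`, the `routing` dict, and the `domain-v1`/`domain-v2` names are assumptions made for illustration, not DataKitchen's mechanism.

```python
from typing import Callable, Dict, List


def promote_if_green(
    routing: Dict[str, str],
    candidate: str,
    checks: List[Callable[[], bool]],
) -> bool:
    """Point 'production' at candidate only when at least one check exists and all pass."""
    if checks and all(check() for check in checks):
        routing["production"] = candidate  # the "flip"
        return True
    return False


routing = {"production": "domain-v1"}

# One failing test blocks the promotion; production keeps serving v1.
blocked = promote_if_green(routing, "domain-v2", [lambda: True, lambda: False])
still_v1 = routing["production"]

# All tests green: production now points at v2.
promoted = promote_if_green(routing, "domain-v2", [lambda: True, lambda: True])
```

Because the previous build is left untouched, rolling back is the same operation with the old name as the candidate.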
  • 47. Agenda Four Data Architecture Mega-Patterns for Agility 1. DataOps 2. Data Fabric 3. Data Mesh 4. Functional Data Engineering An Example that Combines all Four Patterns Why DataKitchen supports these four patterns easily!
  • 48. Domain Layers Processing Relationships. How do we update the data? • Each domain layer has its own update processing • Each layer has its own toolchain (e.g., SQL, Python, Informatica, etc.) • Each layer has a series of sub-steps (i.e., a ‘DAG’) • Each layer wants to know if the build is completed, the tests applied & if the data is correct. What causes the update of each domain? • Time / Schedule • Order of operations: a meta-orchestrated coupling of the domains, where one part may need to be done before or after another • Event-orchestrated coupling: when new data arrives, kick off a change. You Need a ‘Master DAG’ to run them all
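A 'Master DAG' over domains can be sketched with the standard-library `graphlib` module (Python 3.9+). The domain names come from the earlier slides, but the dependency edges between them and the `run_domain` callback are assumptions for illustration; a real callback would trigger each domain's own DAG and return whether its build and tests passed.

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, List, Set

# Each domain maps to the set of domains that must finish first (assumed edges).
domain_dependencies: Dict[str, Set[str]] = {
    "physician_mdm": set(),
    "target_lists": set(),
    "brand_team_reporting": {"physician_mdm", "target_lists"},
    "field_sales_reporting": {"physician_mdm"},
}


def run_master_dag(
    dependencies: Dict[str, Set[str]],
    run_domain: Callable[[str], bool],
) -> List[str]:
    """Run every domain after its upstream domains; stop at the first failure."""
    completed: List[str] = []
    for domain in TopologicalSorter(dependencies).static_order():
        if not run_domain(domain):
            raise RuntimeError(f"domain {domain!r} failed; downstream domains not run")
        completed.append(domain)
    return completed


# Stand-in callback that reports success for every domain.
order = run_master_dag(domain_dependencies, lambda domain: True)
```

This covers only the order-of-operations coupling; schedule and event triggers would decide when `run_master_dag` itself is called.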
  • 49. Inter-Domain Communication Links (example: Field Sales Reporting Domain { … }). Inter-Domain Communication: Question / Steps Asked • Domain Query: “When was the last time you were updated? Successful or failure? Warnings?” • Domain Query: “Is the data or artifacts in your domain good? Can you prove it with some test results?” • Process Linkage: “Ok, you start. I am done.” • Process Linkage: “Ok, you start. I am done & here are a bunch of parameters you need to keep going.” • Event Linkage: “Here is an event: e.g., processing completed, error, warnings, etc.” • Data Linkage: “We share a common table (e.g., a dimension table) in our domain.” • Development Linkage: “Can I re-create your domain in development?” “Can I see the code you used to create it?” “Can I modify that code in development?” “Is there a path to production?”
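The two Domain Query questions and the parameterized Process Linkage above can be sketched as a small interface. Everything here (`DomainStatus`, `Domain`, `hand_off`, and the parameter names) is hypothetical and written only to make the questions concrete; it does not describe DataKitchen's API.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class DomainStatus:
    last_updated: datetime          # "When was the last time you were updated?"
    succeeded: bool                 # "Successful or failure?"
    warnings: Tuple[str, ...] = ()  # "Warnings?"
    test_results: Dict[str, bool] = field(default_factory=dict)  # proof of quality


@dataclass
class Domain:
    name: str
    status: Optional[DomainStatus] = None
    received: Dict[str, str] = field(default_factory=dict)

    def is_good(self) -> bool:
        """Domain Query: good only if the last run succeeded and every test passed."""
        s = self.status
        return s is not None and s.succeeded and all(s.test_results.values())

    def start(self, parameters: Dict[str, str]) -> None:
        """Process Linkage: accept the hand-off parameters from upstream."""
        self.received = dict(parameters)


def hand_off(upstream: Domain, downstream: Domain, parameters: Dict[str, str]) -> bool:
    """'Ok, you start. I am done & here are the parameters', but only when upstream is good."""
    if not upstream.is_good():
        return False
    downstream.start(parameters)
    return True


mdm = Domain("physician_mdm", DomainStatus(datetime(2021, 6, 1), True, (), {"row_count": True}))
reporting = Domain("field_sales_reporting")
started = hand_off(mdm, reporting, {"run_date": "2021-06-01"})
```

Gating the hand-off on `is_good()` ties the process linkage to the test results, so a downstream domain never starts from data its upstream cannot vouch for.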
  • 50. DataKitchen Supported Inter-Domain Communication Links (example: Field Sales Reporting Domain { … }). Inter-Domain Communication: DataKitchen Support • Domain Query (last update & status): YES • Domain Query (test results): YES • Process Linkage (start next): YES • Process Linkage (start next, with parameters): YES • Event Linkage: YES • Data Linkage: NO • Development Linkage: YES
  • 51. Domain Development Process. The development process is essential. • Code changes or new data sets may affect downstream parts of the mesh. • DataKitchen encapsulates the development & production environments. Key Questions • How does a developer change one part & not break things? • How do you allow local change to a domain while keeping global governance & control? Diagram: Production Domains and Development of Domains each contain Mastering Domain: Physician MDM and Brand Team Reporting Domain; the developer asks, “How do I change this part & not break things?”
  • 52. DataKitchen Software's Role (Recipes). DataKitchen DataOps Capability: Intelligent, test-informed, system-wide production orchestration (meta-orchestration). What workflow tools like Airflow, Control-M, or Azure Data Factory do not have: • Integrated Production Testing & Monitoring • A set of connectors to the complex chain of data engineering, science, analytics, self-service, governance & database tools • DataKitchen Recipes: Meta-Orchestration or a ‘DAG of DAGs’. Diagram: a DataKitchen Recipe spanning Mastering Domain: Physician MDM and Brand Team Reporting Domain.
  • 53. DataKitchen Domain Interfaces As URLs. Data Domain. The How: DataKitchen Recipe https://cloud.datakitchen.io/#/recipes/dc/Production/agile-analytic-ops/variations/prod-env-DevSprint-build-now The When: DataKitchen OrderRun information https://cloud.datakitchen.io/#/orders/dc/Production/runs/60e82aa8-2518-11eb-8653-c2e92ba8ebec
  • 54. DataKitchen Ingredients Allow Composition • DataKitchen Ingredients are reusable components that can be incorporated into other processing • Each domain can change independently, with a centralized process to make sure the entire system is correct • While DataKitchen Kitchens let people work independently, Ingredients let people work dependently: • Recipes can reuse the data or artifacts that other Recipe Variations produce • Recipes need to incorporate other Recipe Variations when they run
  • 55. Conclusion. Data Fabric, Data Mesh, and Functional Data Engineering are exciting new paradigms. However, the DataOps part is of paramount importance! • The lineages & composition between domains are important • Managing central process control & governance with local domain independence is very important. DataKitchen Features (e.g., Recipes, Tests, Kitchens & Ingredients) can handle all the needs of the DataOps part of the mesh
  • 56. Accelerate These Patterns With DataKitchen Software. The DataKitchen DataOps Software Platform rapidly delivers new business insights by enabling the development and deployment of innovative, high-quality data analytic pipelines.
  • 57. Learn More! • Sign The DataOps Manifesto: http://dataopsmanifesto.org • Free DataOps Cookbook: https://datakitchen.io/the-dataops-cookbook/ • Free DataOps Transformation Book: https://datakitchen.io/recipes-for-dataops-success-guide-to-dataops-transformation/