SlideShare una empresa de Scribd logo
1 de 22
Descargar para leer sin conexión
Data Pipelines Observability
OpenLineage & Marquez
Julien Le Dem
CTO & Co-Founder Datakin
@J_
AGENDA
● The need for metadata
● OpenLineage: open standard for metadata and
lineage collection
● Marquez: a reference implementation
The need for Metadata
Building a healthy data ecosystem
Team A Team B
Team C
● What is the data source?
● What is the schema?
● Who is the owner?
● How often is it updated?
● Where is it coming from?
● Who is using the data?
● What has changed?
Today: Limited context
Maslow’s Data hierarchy of needs
New Business Opportunities
Business Optimization
Data Quality
Data Freshness
Data Availability
OpenLineage
OpenLineage contributors
Creators and contributors from major open source projects involved
Purpose
Define an Open standard for metadata and lineage
collection by instrumenting data pipelines as they are
running.
Purpose: EXIF for data pipelines
Problem
Before:
● Duplication of effort: Each project
has to instrument all jobs
● Integrations are external and can
break with new versions
● Effort of integration is shared
● Integration can be pushed in
each project: no need to play
catch up
With Open Lineage
Open Lineage scope Not in scope
Backend
Integrations
Metadata
and
lineage
collection
standard
Warehouse
Schedulers
...
Kafka
topic
Graph
db
HTTP
client
Consumers
Kafka
client
GraphDB
client
...
Core Model
● JSONSchema spec
● Consistent naming:
○ Jobs:
Example: scheduler.job.task
○ Datasets:
Example: instance.schema.table
Protocol
● Asynchronous events: unique run id for identifying a run and correlate events
○ Run Start event
■ source code version
■ run parameters
○ Run Complete event
■ input dataset
■ output dataset version and schema
● Configurable backend
○ Kafka
○ Http
○ ...
Facets
● Extensible:
Facets are atomic pieces of metadata identified by a unique name that can be
attached to the core entities.
● Decentralized:
Prefixes in facet names allow the definition of Custom facets that can be
promoted to the spec at a later point.
Facet examples
Dataset:
- Stats
- Schema
- Version
- Column level
lineage
Job:
- Source code
- Dependencies
- params
- Source control
- Query plan
- Query profile
Run:
- Schedule time
- Batch id
Metadata:
Ingest Storage Compute
Streaming
Batch/ML
● Data Platform
built around
Marquez
● Integrations
○ Ingest
○ Storage
○ Compute
Flink
Airflow
Kafka
Iceberg / S3
BI
OpenLineage
Marquez: Data model
Job
Dataset Job Version
Run
*
1
*
1
*
1
1
*
1
*
Source
1 *
●
●
●
●
●
●
●
●
●
●
●
Dataset Version
API
● Open Lineage and Marquez standardize
metadata collection
○ Job runs
○ Parameters
○ Version
○ Inputs / outputs
● Datakin enables
○ Understanding operational dependencies
○ Impact analysis
○ Troubleshooting: What has changed
since the last time it worked?
Datakin leverages Marquez metadata
Lineage analysis
Graph
Integrations
Join the conversation
OpenLineage:
Github: github.com/OpenLineage
Slack: OpenLineage.slack.com
Twitter: @OpenLineage
Email: groups.google.com/g/openlineage
Marquez:
Github: github.com/MarquezProject/marquez
Slack: MarquezProject.slack.com
Twitter: @MarquezProject
Thank You
*we’re hiring! jobs@datakin.com

Más contenido relacionado

La actualidad más candente

La actualidad más candente (20)

Introduction to Data Engineering
Introduction to Data EngineeringIntroduction to Data Engineering
Introduction to Data Engineering
 
Iceberg: a fast table format for S3
Iceberg: a fast table format for S3Iceberg: a fast table format for S3
Iceberg: a fast table format for S3
 
Evolution from EDA to Data Mesh: Data in Motion
Evolution from EDA to Data Mesh: Data in MotionEvolution from EDA to Data Mesh: Data in Motion
Evolution from EDA to Data Mesh: Data in Motion
 
Databricks: A Tool That Empowers You To Do More With Data
Databricks: A Tool That Empowers You To Do More With DataDatabricks: A Tool That Empowers You To Do More With Data
Databricks: A Tool That Empowers You To Do More With Data
 
Big data architectures and the data lake
Big data architectures and the data lakeBig data architectures and the data lake
Big data architectures and the data lake
 
Data Pipline Observability meetup
Data Pipline Observability meetup Data Pipline Observability meetup
Data Pipline Observability meetup
 
Azure Synapse 101 Webinar Presentation
Azure Synapse 101 Webinar PresentationAzure Synapse 101 Webinar Presentation
Azure Synapse 101 Webinar Presentation
 
Data and AI summit: data pipelines observability with open lineage
Data and AI summit: data pipelines observability with open lineageData and AI summit: data pipelines observability with open lineage
Data and AI summit: data pipelines observability with open lineage
 
Airbyte @ Airflow Summit - The new modern data stack
Airbyte @ Airflow Summit - The new modern data stackAirbyte @ Airflow Summit - The new modern data stack
Airbyte @ Airflow Summit - The new modern data stack
 
Presto Summit 2018 - 09 - Netflix Iceberg
Presto Summit 2018  - 09 - Netflix IcebergPresto Summit 2018  - 09 - Netflix Iceberg
Presto Summit 2018 - 09 - Netflix Iceberg
 
The Modern Data Team for the Modern Data Stack: dbt and the Role of the Analy...
The Modern Data Team for the Modern Data Stack: dbt and the Role of the Analy...The Modern Data Team for the Modern Data Stack: dbt and the Role of the Analy...
The Modern Data Team for the Modern Data Stack: dbt and the Role of the Analy...
 
The delta architecture
The delta architectureThe delta architecture
The delta architecture
 
Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch ...
Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch ...Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch ...
Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch ...
 
Modern Data Flow
Modern Data FlowModern Data Flow
Modern Data Flow
 
Modernizing to a Cloud Data Architecture
Modernizing to a Cloud Data ArchitectureModernizing to a Cloud Data Architecture
Modernizing to a Cloud Data Architecture
 
Marquez: A Metadata Service for Data Abstraction, Data Lineage, and Event-bas...
Marquez: A Metadata Service for Data Abstraction, Data Lineage, and Event-bas...Marquez: A Metadata Service for Data Abstraction, Data Lineage, and Event-bas...
Marquez: A Metadata Service for Data Abstraction, Data Lineage, and Event-bas...
 
Data Mesh
Data MeshData Mesh
Data Mesh
 
Apache Iceberg: An Architectural Look Under the Covers
Apache Iceberg: An Architectural Look Under the CoversApache Iceberg: An Architectural Look Under the Covers
Apache Iceberg: An Architectural Look Under the Covers
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and Spark
 
Real-time Analytics with Trino and Apache Pinot
Real-time Analytics with Trino and Apache PinotReal-time Analytics with Trino and Apache Pinot
Real-time Analytics with Trino and Apache Pinot
 

Similar a Data pipelines observability: OpenLineage & Marquez

A compute infrastructure for data scientists
A compute infrastructure for data scientistsA compute infrastructure for data scientists
A compute infrastructure for data scientists
Stitch Fix Algorithms
 

Similar a Data pipelines observability: OpenLineage & Marquez (20)

Open core summit: Observability for data pipelines with OpenLineage
Open core summit: Observability for data pipelines with OpenLineageOpen core summit: Observability for data pipelines with OpenLineage
Open core summit: Observability for data pipelines with OpenLineage
 
Structured Streaming in Spark
Structured Streaming in SparkStructured Streaming in Spark
Structured Streaming in Spark
 
Introduction to Structured Data Processing with Spark SQL
Introduction to Structured Data Processing with Spark SQLIntroduction to Structured Data Processing with Spark SQL
Introduction to Structured Data Processing with Spark SQL
 
Deploying Data Science Engines to Production
Deploying Data Science Engines to ProductionDeploying Data Science Engines to Production
Deploying Data Science Engines to Production
 
Machine learning and big data @ uber a tale of two systems
Machine learning and big data @ uber a tale of two systemsMachine learning and big data @ uber a tale of two systems
Machine learning and big data @ uber a tale of two systems
 
Anatomy of Data Frame API : A deep dive into Spark Data Frame API
Anatomy of Data Frame API :  A deep dive into Spark Data Frame APIAnatomy of Data Frame API :  A deep dive into Spark Data Frame API
Anatomy of Data Frame API : A deep dive into Spark Data Frame API
 
Deploying Python Machine Learning Models with Apache Spark with Brandon Hamri...
Deploying Python Machine Learning Models with Apache Spark with Brandon Hamri...Deploying Python Machine Learning Models with Apache Spark with Brandon Hamri...
Deploying Python Machine Learning Models with Apache Spark with Brandon Hamri...
 
WhereHows: Taming Metadata for 150K Datasets Over 9 Data Platforms
WhereHows: Taming Metadata for 150K Datasets Over 9 Data PlatformsWhereHows: Taming Metadata for 150K Datasets Over 9 Data Platforms
WhereHows: Taming Metadata for 150K Datasets Over 9 Data Platforms
 
Real time analytics on deep learning @ strata data 2019
Real time analytics on deep learning @ strata data 2019Real time analytics on deep learning @ strata data 2019
Real time analytics on deep learning @ strata data 2019
 
A compute infrastructure for data scientists
A compute infrastructure for data scientistsA compute infrastructure for data scientists
A compute infrastructure for data scientists
 
Enterprise guide to building a Data Mesh
Enterprise guide to building a Data MeshEnterprise guide to building a Data Mesh
Enterprise guide to building a Data Mesh
 
RESTful Machine Learning with Flask and TensorFlow Serving - Carlo Mazzaferro
RESTful Machine Learning with Flask and TensorFlow Serving - Carlo MazzaferroRESTful Machine Learning with Flask and TensorFlow Serving - Carlo Mazzaferro
RESTful Machine Learning with Flask and TensorFlow Serving - Carlo Mazzaferro
 
The Enterprise Guide to Building a Data Mesh - Introducing SpecMesh
The Enterprise Guide to Building a Data Mesh - Introducing SpecMeshThe Enterprise Guide to Building a Data Mesh - Introducing SpecMesh
The Enterprise Guide to Building a Data Mesh - Introducing SpecMesh
 
Data Discovery and Metadata
Data Discovery and MetadataData Discovery and Metadata
Data Discovery and Metadata
 
Gobblin @ NerdWallet (Nov 2015)
Gobblin @ NerdWallet (Nov 2015)Gobblin @ NerdWallet (Nov 2015)
Gobblin @ NerdWallet (Nov 2015)
 
C2_W1---.pdf
C2_W1---.pdfC2_W1---.pdf
C2_W1---.pdf
 
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB AtlasMongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas
 
Graph Data Science at Scale
Graph Data Science at ScaleGraph Data Science at Scale
Graph Data Science at Scale
 
Data Platform in the Cloud
Data Platform in the CloudData Platform in the Cloud
Data Platform in the Cloud
 
Conquering the Lambda architecture in LinkedIn metrics platform with Apache C...
Conquering the Lambda architecture in LinkedIn metrics platform with Apache C...Conquering the Lambda architecture in LinkedIn metrics platform with Apache C...
Conquering the Lambda architecture in LinkedIn metrics platform with Apache C...
 

Más de Julien Le Dem

Más de Julien Le Dem (20)

Data platform architecture principles - ieee infrastructure 2020
Data platform architecture principles - ieee infrastructure 2020Data platform architecture principles - ieee infrastructure 2020
Data platform architecture principles - ieee infrastructure 2020
 
Strata NY 2018: The deconstructed database
Strata NY 2018: The deconstructed databaseStrata NY 2018: The deconstructed database
Strata NY 2018: The deconstructed database
 
From flat files to deconstructed database
From flat files to deconstructed databaseFrom flat files to deconstructed database
From flat files to deconstructed database
 
Strata NY 2017 Parquet Arrow roadmap
Strata NY 2017 Parquet Arrow roadmapStrata NY 2017 Parquet Arrow roadmap
Strata NY 2017 Parquet Arrow roadmap
 
The columnar roadmap: Apache Parquet and Apache Arrow
The columnar roadmap: Apache Parquet and Apache ArrowThe columnar roadmap: Apache Parquet and Apache Arrow
The columnar roadmap: Apache Parquet and Apache Arrow
 
Improving Python and Spark Performance and Interoperability with Apache Arrow
Improving Python and Spark Performance and Interoperability with Apache ArrowImproving Python and Spark Performance and Interoperability with Apache Arrow
Improving Python and Spark Performance and Interoperability with Apache Arrow
 
Mule soft mar 2017 Parquet Arrow
Mule soft mar 2017 Parquet ArrowMule soft mar 2017 Parquet Arrow
Mule soft mar 2017 Parquet Arrow
 
Data Eng Conf NY Nov 2016 Parquet Arrow
Data Eng Conf NY Nov 2016 Parquet ArrowData Eng Conf NY Nov 2016 Parquet Arrow
Data Eng Conf NY Nov 2016 Parquet Arrow
 
Strata NY 2016: The future of column-oriented data processing with Arrow and ...
Strata NY 2016: The future of column-oriented data processing with Arrow and ...Strata NY 2016: The future of column-oriented data processing with Arrow and ...
Strata NY 2016: The future of column-oriented data processing with Arrow and ...
 
Strata London 2016: The future of column oriented data processing with Arrow ...
Strata London 2016: The future of column oriented data processing with Arrow ...Strata London 2016: The future of column oriented data processing with Arrow ...
Strata London 2016: The future of column oriented data processing with Arrow ...
 
Sql on everything with drill
Sql on everything with drillSql on everything with drill
Sql on everything with drill
 
If you have your own Columnar format, stop now and use Parquet 😛
If you have your own Columnar format,  stop now and use Parquet  😛If you have your own Columnar format,  stop now and use Parquet  😛
If you have your own Columnar format, stop now and use Parquet 😛
 
How to use Parquet as a basis for ETL and analytics
How to use Parquet as a basis for ETL and analyticsHow to use Parquet as a basis for ETL and analytics
How to use Parquet as a basis for ETL and analytics
 
Efficient Data Storage for Analytics with Parquet 2.0 - Hadoop Summit 2014
Efficient Data Storage for Analytics with Parquet 2.0 - Hadoop Summit 2014Efficient Data Storage for Analytics with Parquet 2.0 - Hadoop Summit 2014
Efficient Data Storage for Analytics with Parquet 2.0 - Hadoop Summit 2014
 
Parquet Strata/Hadoop World, New York 2013
Parquet Strata/Hadoop World, New York 2013Parquet Strata/Hadoop World, New York 2013
Parquet Strata/Hadoop World, New York 2013
 
Parquet Hadoop Summit 2013
Parquet Hadoop Summit 2013Parquet Hadoop Summit 2013
Parquet Hadoop Summit 2013
 
Parquet Twitter Seattle open house
Parquet Twitter Seattle open houseParquet Twitter Seattle open house
Parquet Twitter Seattle open house
 
Parquet overview
Parquet overviewParquet overview
Parquet overview
 
Poster Hadoop summit 2011: pig embedding in scripting languages
Poster Hadoop summit 2011: pig embedding in scripting languagesPoster Hadoop summit 2011: pig embedding in scripting languages
Poster Hadoop summit 2011: pig embedding in scripting languages
 
Embedding Pig in scripting languages
Embedding Pig in scripting languagesEmbedding Pig in scripting languages
Embedding Pig in scripting languages
 

Último

EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
Earley Information Science
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
Enterprise Knowledge
 

Último (20)

Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 

Data pipelines observability: OpenLineage & Marquez

  • 1. Data Pipelines Observability OpenLineage & Marquez Julien Le Dem CTO & Co-Founder Datakin @J_
  • 2. AGENDA ● The need for metadata ● OpenLineage: open standard for metadata and lineage collection ● Marquez: a reference implementation
  • 3. The need for Metadata
  • 4. Building a healthy data ecosystem Team A Team B Team C
  • 5. ● What is the data source? ● What is the schema? ● Who is the owner? ● How often is it updated? ● Where is it coming from? ● Who is using the data? ● What has changed? Today: Limited context
  • 6. Maslow’s Data hierarchy of needs New Business Opportunities Business Optimization Data Quality Data Freshness Data Availability
  • 8. OpenLineage contributors Creators and contributors from major open source projects involved
  • 9. Purpose Define an Open standard for metadata and lineage collection by instrumenting data pipelines as they are running.
  • 10. Purpose: EXIF for data pipelines
  • 11. Problem Before: ● Duplication of effort: Each project has to instrument all jobs ● Integrations are external and can break with new versions ● Effort of integration is shared ● Integration can be pushed in each project: no need to play catch up With Open Lineage
  • 12. Open Lineage scope Not in scope Backend Integrations Metadata and lineage collection standard Warehouse Schedulers ... Kafka topic Graph db HTTP client Consumers Kafka client GraphDB client ...
  • 13. Core Model ● JSONSchema spec ● Consistent naming: ○ Jobs: Example: scheduler.job.task ○ Datasets: Example: instance.schema.table
  • 14. Protocol ● Asynchronous events: unique run id for identifying a run and correlate events ○ Run Start event ■ source code version ■ run parameters ○ Run Complete event ■ input dataset ■ output dataset version and schema ● Configurable backend ○ Kafka ○ Http ○ ...
  • 15. Facets ● Extensible: Facets are atomic pieces of metadata identified by a unique name that can be attached to the core entities. ● Decentralized: Prefixes in facet names allow the definition of Custom facets that can be promoted to the spec at a later point.
  • 16. Facet examples Dataset: - Stats - Schema - Version - Column level lineage Job: - Source code - Dependencies - params - Source control - Query plan - Query profile Run: - Schedule time - Batch id
  • 17.
  • 18. Metadata: Ingest Storage Compute Streaming Batch/ML ● Data Platform built around Marquez ● Integrations ○ Ingest ○ Storage ○ Compute Flink Airflow Kafka Iceberg / S3 BI OpenLineage
  • 19. Marquez: Data model Job Dataset Job Version Run * 1 * 1 * 1 1 * 1 * Source 1 * ● ● ● ● ● ● ● ● ● ● ● Dataset Version
  • 20. API ● Open Lineage and Marquez standardize metadata collection ○ Job runs ○ Parameters ○ Version ○ Inputs / outputs ● Datakin enables ○ Understanding operational dependencies ○ Impact analysis ○ Troubleshooting: What has changed since the last time it worked? Datakin leverages Marquez metadata Lineage analysis Graph Integrations
  • 21. Join the conversation OpenLineage: Github: github.com/OpenLineage Slack: OpenLineage.slack.com Twitter: @OpenLineage Email: groups.google.com/g/openlineage Marquez: Github: github.com/MarquezProject/marquez Slack: MarquezProject.slack.com Twitter: @MarquezProject
  • 22. Thank You *we’re hiring! jobs@datakin.com