SlideShare una empresa de Scribd logo
1 de 19
Descargar para leer sin conexión
Jan Scherbaum & Marek Novotny
Barclays Africa Group Limited
SPLINE:
APACHE SPARK LINEAGE, NOT ONLY
FOR THE BANKING INDUSTRY
#EUent3
7.34
Stuff happens
under the hood…
#EUent3 2
Data type
changes
Adding new dataComplex calcs
7.34
#EUent3 3
Overview
• Barclays Africa
• Spline
• Live demo
• Future work
• The big picture
• Questions
4#EUent3
• Pan-African financial services provider
– Providing services to South Africa (ABSA) and 11 other
African countries (Barclays)
• Subject to strict regulatory compliance
– Basel Committee on Banking Supervision (BCBS)
• Accuracy
• Comprehensiveness
• Clarity
• Usefulness
5#EUent3
Barclays Africa Group Limited
• SPark LINEage
• Open source project
• Goals
– Satisfy initial interpretations of regulatory
requirements, specifically on “Clarity”
• Data lineage from Spark’s execution plans
• Visualize in an “explorable” user-friendly format
6#EUent3
Spline
Spline – How it works
7#EUent3
Spark job
Spark Job
Spark library
Spark Session
Action
Transformations
Generate execution plans
Spline – How it works
8#EUent3
Spark job
1 line initialization
Use SQLContext listeners
Generate execution plans
Spark Job & Spline
Spark library
Spark Session
Action
Spline
Transformations
Spline UI
Spline – How it works
9#EUent3
Spark job
1 line initialization
Use SQLContext listeners
Generate execution plans
Spark Job & Spline
Spark library
Spark Session
Action
Spline
Transformations
…
Demo use case
• Find the countries with the highest annual beer
consumption per person
– Correlation with GDP??
10#EUent3
Data
11#EUent3
Country 2011 2010 2009
Czech Republic 15,583,000 15,549,000 16,190,000
Ireland 4,721,000 4,814,000 4,832,000
Country Metric 2011 2010 2009
Czech Republic GDP $21,717 $19,764 $19,698
Czech Republic Population 10,496,088 10,474,410 10,443,936
Ireland GDP $52,567 $48,538 $51,983
Ireland Population 4,576,794 4,560,155 4,535,375
Beer consumption per country
Development indicators from the world bank
Analysis
• Marek’s job
– Data prep
– Analyze the correlation between beer consumption
and GDP growth
• Jan’s beer job
– Calculate the consumption of beer per country per
capita
12#EUent3
Dependencies
<dependency>
<groupId>za.co.absa</groupId>
<artifactId>spline-core</artifactId>
<version>${spline.version}</version>
</dependency>
<dependency>
<groupId>za.co.absa</groupId>
<artifactId>spline-persistence-mongo</artifactId>
<version>${spline.version}</version>
</dependency>
13#EUent3
Initialization
14#EUent3
// Initializing library to hook up to Apache Spark
import za.co.absa.spline.core.SparkLineageInitializer._
spark.enableLineageTracking()
15#EUent3
Next steps
• Enterprise features
– Authentication (Kerberos, SSO)
– Authorization
– User management
• Interoperability with other tools
– Cloudera Manager, Informatica, Apache Atlas
• Support other Spark data sources and actions
– Streaming, ML
16#EUent3
The bigger picture
• Develop open source conformance & ingestion
engine on Spark
– BCBS compliant (lineage, dataflow controls, error tracking)
– Transfer & transform data from different source systems
• To strongly typed datasets
• On Hadoop
• Conforming to enterprise level data dictionaries & data quality
• In development – stay tuned J
17#EUent3
We’re open source!
• Contributions are most welcome
• Released versions mirrored on
– https://github.com/AbsaOSS/spline
• Wiki and docs on
– https://absaoss.github.io/spline/
18#EUent3
Questions
19#EUent3
• Now is a good time
• Or feel free to contact us
– Jan Scherbaum
• jan.scherbaum@barclays.com
– Marek Novotny
• marek.x.novotny@barclays.com
– Oleksandr Vayda
• oleksandr.vayda@barclays.com
• Acknowledgements:
– Dennis Chu, Aaisha Bibi Osman, Adam Smyczek, Andrew Baker

Más contenido relacionado

La actualidad más candente

La actualidad más candente (20)

Achieving Lakehouse Models with Spark 3.0
Achieving Lakehouse Models with Spark 3.0Achieving Lakehouse Models with Spark 3.0
Achieving Lakehouse Models with Spark 3.0
 
DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
 
Webinar Data Mesh - Part 3
Webinar Data Mesh - Part 3Webinar Data Mesh - Part 3
Webinar Data Mesh - Part 3
 
Data Mesh using Microsoft Fabric
Data Mesh using Microsoft FabricData Mesh using Microsoft Fabric
Data Mesh using Microsoft Fabric
 
Azure Data Factory ETL Patterns in the Cloud
Azure Data Factory ETL Patterns in the CloudAzure Data Factory ETL Patterns in the Cloud
Azure Data Factory ETL Patterns in the Cloud
 
Modern Data architecture Design
Modern Data architecture DesignModern Data architecture Design
Modern Data architecture Design
 
Cloud-native Semantic Layer on Data Lake
Cloud-native Semantic Layer on Data LakeCloud-native Semantic Layer on Data Lake
Cloud-native Semantic Layer on Data Lake
 
Introducing Databricks Delta
Introducing Databricks DeltaIntroducing Databricks Delta
Introducing Databricks Delta
 
[DSC Europe 22] Overview of the Databricks Platform - Petar Zecevic
[DSC Europe 22] Overview of the Databricks Platform - Petar Zecevic[DSC Europe 22] Overview of the Databricks Platform - Petar Zecevic
[DSC Europe 22] Overview of the Databricks Platform - Petar Zecevic
 
DevOps for Databricks
DevOps for DatabricksDevOps for Databricks
DevOps for Databricks
 
Modernizing to a Cloud Data Architecture
Modernizing to a Cloud Data ArchitectureModernizing to a Cloud Data Architecture
Modernizing to a Cloud Data Architecture
 
Time to Talk about Data Mesh
Time to Talk about Data MeshTime to Talk about Data Mesh
Time to Talk about Data Mesh
 
Databricks Platform.pptx
Databricks Platform.pptxDatabricks Platform.pptx
Databricks Platform.pptx
 
Serverless Kafka and Spark in a Multi-Cloud Lakehouse Architecture
Serverless Kafka and Spark in a Multi-Cloud Lakehouse ArchitectureServerless Kafka and Spark in a Multi-Cloud Lakehouse Architecture
Serverless Kafka and Spark in a Multi-Cloud Lakehouse Architecture
 
Debugging PySpark: Spark Summit East talk by Holden Karau
Debugging PySpark: Spark Summit East talk by Holden KarauDebugging PySpark: Spark Summit East talk by Holden Karau
Debugging PySpark: Spark Summit East talk by Holden Karau
 
Informatica MDM Presentation
Informatica MDM PresentationInformatica MDM Presentation
Informatica MDM Presentation
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
Programming in Spark using PySpark
Programming in Spark using PySpark      Programming in Spark using PySpark
Programming in Spark using PySpark
 
Databricks Delta Lake and Its Benefits
Databricks Delta Lake and Its BenefitsDatabricks Delta Lake and Its Benefits
Databricks Delta Lake and Its Benefits
 
Azure Synapse Analytics Overview (r2)
Azure Synapse Analytics Overview (r2)Azure Synapse Analytics Overview (r2)
Azure Synapse Analytics Overview (r2)
 

Similar a Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Novotny Jan Scherbaum

An AI-Powered Chatbot to Simplify Apache Spark Performance Management
An AI-Powered Chatbot to Simplify Apache Spark Performance ManagementAn AI-Powered Chatbot to Simplify Apache Spark Performance Management
An AI-Powered Chatbot to Simplify Apache Spark Performance Management
Databricks
 

Similar a Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Novotny Jan Scherbaum (20)

An AI-Powered Chatbot to Simplify Apache Spark Performance Management
An AI-Powered Chatbot to Simplify Apache Spark Performance ManagementAn AI-Powered Chatbot to Simplify Apache Spark Performance Management
An AI-Powered Chatbot to Simplify Apache Spark Performance Management
 
Kafka Summit NYC 2017 - The Real-time Event Driven Bank: A Kafka Story
Kafka Summit NYC 2017 - The Real-time Event Driven Bank: A Kafka Story Kafka Summit NYC 2017 - The Real-time Event Driven Bank: A Kafka Story
Kafka Summit NYC 2017 - The Real-time Event Driven Bank: A Kafka Story
 
SANOG 38: RPKI Update
SANOG 38: RPKI UpdateSANOG 38: RPKI Update
SANOG 38: RPKI Update
 
Splunk at Scotiabank
Splunk at ScotiabankSplunk at Scotiabank
Splunk at Scotiabank
 
SplunkSummit 2015 - Update on Splunk Enterprise 6.3 & Hunk 6.3
SplunkSummit 2015 - Update on Splunk Enterprise 6.3 & Hunk 6.3SplunkSummit 2015 - Update on Splunk Enterprise 6.3 & Hunk 6.3
SplunkSummit 2015 - Update on Splunk Enterprise 6.3 & Hunk 6.3
 
Exascale Computing Project (ECP) Update
Exascale Computing Project (ECP) UpdateExascale Computing Project (ECP) Update
Exascale Computing Project (ECP) Update
 
.NET Fest 2019. Андрей Винда. Создание REST API с поддержкой высокой нагрузки
.NET Fest 2019. Андрей Винда. Создание REST API с поддержкой высокой нагрузки.NET Fest 2019. Андрей Винда. Создание REST API с поддержкой высокой нагрузки
.NET Fest 2019. Андрей Винда. Создание REST API с поддержкой высокой нагрузки
 
Introducing Blockchain With Java.pdf
Introducing Blockchain With Java.pdfIntroducing Blockchain With Java.pdf
Introducing Blockchain With Java.pdf
 
ION Krakow - BCOP Update
ION Krakow - BCOP UpdateION Krakow - BCOP Update
ION Krakow - BCOP Update
 
Introduce to PredictionIO
Introduce to PredictionIOIntroduce to PredictionIO
Introduce to PredictionIO
 
16Gb Fibre Channel Deployment Guide
16Gb Fibre Channel Deployment Guide16Gb Fibre Channel Deployment Guide
16Gb Fibre Channel Deployment Guide
 
Clean Slate Design - O Que é?
Clean Slate Design - O Que é?Clean Slate Design - O Que é?
Clean Slate Design - O Que é?
 
Portland Splunk User Group May 2020
Portland Splunk User Group May 2020 Portland Splunk User Group May 2020
Portland Splunk User Group May 2020
 
bmp.pptx
bmp.pptxbmp.pptx
bmp.pptx
 
Customer Presentation
Customer PresentationCustomer Presentation
Customer Presentation
 
Splunk live! Customer Presentation – Wellsfargo
Splunk live! Customer Presentation – WellsfargoSplunk live! Customer Presentation – Wellsfargo
Splunk live! Customer Presentation – Wellsfargo
 
Sql Server 2008 Portfolio
Sql Server 2008 PortfolioSql Server 2008 Portfolio
Sql Server 2008 Portfolio
 
IET programmable components
IET programmable componentsIET programmable components
IET programmable components
 
[Tel Aviv merge world tour] Introducing Perforce Insights
[Tel Aviv merge world tour] Introducing Perforce Insights[Tel Aviv merge world tour] Introducing Perforce Insights
[Tel Aviv merge world tour] Introducing Perforce Insights
 
Life is but a Stream
Life is but a StreamLife is but a Stream
Life is but a Stream
 

Más de Spark Summit

Apache Spark Structured Streaming Helps Smart Manufacturing with Xiaochang Wu
Apache Spark Structured Streaming Helps Smart Manufacturing with  Xiaochang WuApache Spark Structured Streaming Helps Smart Manufacturing with  Xiaochang Wu
Apache Spark Structured Streaming Helps Smart Manufacturing with Xiaochang Wu
Spark Summit
 
Improving Traffic Prediction Using Weather Data with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data  with Ramya RaghavendraImproving Traffic Prediction Using Weather Data  with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data with Ramya Raghavendra
Spark Summit
 
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
Spark Summit
 
Improving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Improving Traffic Prediction Using Weather Datawith Ramya RaghavendraImproving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Improving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Spark Summit
 
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Spark Summit
 
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
Spark Summit
 
Indicium: Interactive Querying at Scale Using Apache Spark, Zeppelin, and Spa...
Indicium: Interactive Querying at Scale Using Apache Spark, Zeppelin, and Spa...Indicium: Interactive Querying at Scale Using Apache Spark, Zeppelin, and Spa...
Indicium: Interactive Querying at Scale Using Apache Spark, Zeppelin, and Spa...
Spark Summit
 

Más de Spark Summit (20)

FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
 
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
 
Apache Spark Structured Streaming Helps Smart Manufacturing with Xiaochang Wu
Apache Spark Structured Streaming Helps Smart Manufacturing with  Xiaochang WuApache Spark Structured Streaming Helps Smart Manufacturing with  Xiaochang Wu
Apache Spark Structured Streaming Helps Smart Manufacturing with Xiaochang Wu
 
Improving Traffic Prediction Using Weather Data with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data  with Ramya RaghavendraImproving Traffic Prediction Using Weather Data  with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data with Ramya Raghavendra
 
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
 
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
 
Apache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingApache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim Dowling
 
Apache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingApache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim Dowling
 
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
 
Next CERN Accelerator Logging Service with Jakub Wozniak
Next CERN Accelerator Logging Service with Jakub WozniakNext CERN Accelerator Logging Service with Jakub Wozniak
Next CERN Accelerator Logging Service with Jakub Wozniak
 
Powering a Startup with Apache Spark with Kevin Kim
Powering a Startup with Apache Spark with Kevin KimPowering a Startup with Apache Spark with Kevin Kim
Powering a Startup with Apache Spark with Kevin Kim
 
Improving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Improving Traffic Prediction Using Weather Datawith Ramya RaghavendraImproving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Improving Traffic Prediction Using Weather Datawith Ramya Raghavendra
 
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
 
How Nielsen Utilized Databricks for Large-Scale Research and Development with...
How Nielsen Utilized Databricks for Large-Scale Research and Development with...How Nielsen Utilized Databricks for Large-Scale Research and Development with...
How Nielsen Utilized Databricks for Large-Scale Research and Development with...
 
Goal Based Data Production with Sim Simeonov
Goal Based Data Production with Sim SimeonovGoal Based Data Production with Sim Simeonov
Goal Based Data Production with Sim Simeonov
 
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
 
Getting Ready to Use Redis with Apache Spark with Dvir Volk
Getting Ready to Use Redis with Apache Spark with Dvir VolkGetting Ready to Use Redis with Apache Spark with Dvir Volk
Getting Ready to Use Redis with Apache Spark with Dvir Volk
 
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
 
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
 
Indicium: Interactive Querying at Scale Using Apache Spark, Zeppelin, and Spa...
Indicium: Interactive Querying at Scale Using Apache Spark, Zeppelin, and Spa...Indicium: Interactive Querying at Scale Using Apache Spark, Zeppelin, and Spa...
Indicium: Interactive Querying at Scale Using Apache Spark, Zeppelin, and Spa...
 

Último

Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
amitlee9823
 
Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...
Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...
Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...
gajnagarg
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
ZurliaSoop
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
amitlee9823
 
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
amitlee9823
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
amitlee9823
 
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
amitlee9823
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
amitlee9823
 
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
amitlee9823
 
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...
gajnagarg
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
amitlee9823
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
amitlee9823
 
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
amitlee9823
 
Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...
Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...
Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...
gajnagarg
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
amitlee9823
 

Último (20)

Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
 
Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...
Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...
Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Predicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science ProjectPredicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science Project
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
 
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
 
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
 
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
 
Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...
Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...
Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
 

Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Novotny Jan Scherbaum

  • 1. Jan Scherbaum & Marek Novotny Barclays Africa Group Limited SPLINE: APACHE SPARK LINEAGE, NOT ONLY FOR THE BANKING INDUSTRY #EUent3
  • 2. 7.34 Stuff happens under the hood… #EUent3 2
  • 3. Data type changes Adding new dataComplex calcs 7.34 #EUent3 3
  • 4. Overview • Barclays Africa • Spline • Live demo • Future work • The big picture • Questions 4#EUent3
  • 5. • Pan-African financial services provider – Providing services to South Africa (ABSA) and 11 other African countries (Barclays) • Subject to strict regulatory compliance – Basel Committee on Banking Supervision (BCBS) • Accuracy • Comprehensiveness • Clarity • Usefulness 5#EUent3 Barclays Africa Group Limited
  • 6. • SPark LINEage • Open source project • Goals – Satisfy initial interpretations of regulatory requirements, specifically on “Clarity” • Data lineage from Spark’s execution plans • Visualize in an “explorable” user-friendly format 6#EUent3 Spline
  • 7. Spline – How it works 7#EUent3 Spark job Spark Job Spark library Spark Session Action Transformations Generate execution plans
  • 8. Spline – How it works 8#EUent3 Spark job 1 line initialization Use SQLContext listeners Generate execution plans Spark Job & Spline Spark library Spark Session Action Spline Transformations
  • 9. Spline UI Spline – How it works 9#EUent3 Spark job 1 line initialization Use SQLContext listeners Generate execution plans Spark Job & Spline Spark library Spark Session Action Spline Transformations …
  • 10. Demo use case • Find the countries with the highest annual beer consumption per person – Correlation with GDP?? 10#EUent3
  • 11. Data 11#EUent3 Country 2011 2010 2009 Czech Republic 15,583,000 15,549,000 16,190,000 Ireland 4,721,000 4,814,000 4,832,000 Country Metric 2011 2010 2009 Czech Republic GDP $21,717 $19,764 $19,698 Czech Republic Population 10,496,088 10,474,410 10,443,936 Ireland GDP $52,567 $48,538 $51,983 Ireland Population 4,576,794 4,560,155 4,535,375 Beer consumption per country Development indicators from the world bank
  • 12. Analysis • Marek’s job – Data prep – Analyze the correlation between beer consumption and GDP growth • Jan’s beer job – Calculate the consumption of beer per country per capita 12#EUent3
  • 14. Initialization 14#EUent3 // Initializing library to hook up to Apache Spark import za.co.absa.spline.core.SparkLineageInitializer._ spark.enableLineageTracking()
  • 16. Next steps • Enterprise features – Authentication (Kerberos, SSO) – Authorization – User management • Interoperability with other tools – Cloudera Manager, Informatica, Apache Atlas • Support other Spark data sources and actions – Streaming, ML 16#EUent3
  • 17. The bigger picture • Develop open source conformance & ingestion engine on Spark – BCBS compliant (lineage, dataflow controls, error tracking) – Transfer & transform data from different source systems • To strongly typed datasets • On Hadoop • Conforming to enterprise level data dictionaries & data quality • In development – stay tuned J 17#EUent3
  • 18. We’re open source! • Contributions are most welcome • Released versions mirrored on – https://github.com/AbsaOSS/spline • Wiki and docs on – https://absaoss.github.io/spline/ 18#EUent3
  • 19. Questions 19#EUent3 • Now is a good time • Or feel free to contact us – Jan Scherbaum • jan.scherbaum@barclays.com – Marek Novotny • marek.x.novotny@barclays.com – Oleksandr Vayda • oleksandr.vayda@barclays.com • Acknowledgements: – Dennis Chu, Aaisha Bibi Osman, Adam Smyczek, Andrew Baker