Beyond the SchemaMapper
The Peak of Data Integration 2023
Presenters
Matthias Holemans, FME Consultant, Nordend
Michaël Ferré, GIS Analyst, District09
Agenda
1. Topic
2. Specific problem
3. Solution
4. Conclusion
Introduction
Topic
Using simple spreadsheets to extract specific data from a big data platform.
Advantages?
• User friendly: separate configuration from scripting
• Transparency: paying attention to functional design
• Future-proof: allowing the source and target platforms to scale
• Limited to no adaptations needed to the script
Problem
● New business processes shorten the information management cycle
● One source platform replaces x different sources
● Data is available through a message queue streaming JSON objects
● About 30k messages/day, highly variable in size
● 1 message contains 1-n objects
● ‘Last one wins’
● Automate ETL to standardise and shorten data updates
● Allow for efficient future scaling
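The ‘last one wins’ rule for messages that carry 1-n objects can be sketched in Python as follows. This is only an illustration, not the presenters' implementation: the message layout (an `objects` list whose items carry an `id`) and the function name `latest_objects` are assumptions made for the example.

```python
import json


def latest_objects(raw_messages):
    """Collapse a stream of JSON messages into the latest version of each object.

    Assumed (hypothetical) message layout: {"objects": [{"id": ..., ...}, ...]}.
    Messages are expected in arrival order, so a later object overwrites
    an earlier one with the same id ("last one wins").
    """
    latest = {}
    for raw in raw_messages:
        message = json.loads(raw)
        for obj in message.get("objects", []):
            latest[obj["id"]] = obj
    return latest


# Two messages: the second one updates object "a".
messages = [
    '{"objects": [{"id": "a", "capacity": 10}, {"id": "b", "capacity": 5}]}',
    '{"objects": [{"id": "a", "capacity": 12}]}',
]
result = latest_objects(messages)
```

Keying on an identifier keeps memory proportional to the number of distinct objects rather than to the number of messages, which matters when the message volume cannot be controlled.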
Difficulties
● How to set up a workflow fit for the job?
● Performant and robust in a non-controllable message stream volume
● How to make this generic & future-proof?
● Source and target platforms have different stakeholders and evolve independently
● How to do complex mapping?
● Finding the breakoff point of spreadsheet configuration
Difficulties
● How to deploy this on FME Form and different FME Flow environments?
● Creating a portable script on FME Form connecting to the source platform environments on FME Form
● Scalability?
● Accounting for future scaling in the source as well as target platforms
How we tackled the problem
2 workspaces connected with jobSubmitters
1. Filter the relevant information
2. Execute the mapping
Workspace 1: Client input - Excel sheet
Description | KoppelingTypeCode | KoppelingSubTypeCode | TrajectDetailCode | Target Table
Neighbourhood bicycle parking | 1 | 58 | 71 | Bicycle Parking
Bicycle Parking Gent | 1 | 58 | 72 | Bicycle Parking
Bicycle Parking Others | 1 | 58 | 83 | Bicycle Parking
Parking meters | 1 | 58 | 85 | Parking meters
Charging posts | 1 | 58 | 81 | Charging posts
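As a rough sketch of how such a sheet can drive the filtering step, the Python below routes an incoming object to a target table based on its three codes. The codes and target tables are taken from the sheet above. The assumption that each object exposes the codes as fields with the same names as the column headers, and the function name `route`, are hypothetical.

```python
# Rows of the client spreadsheet, keyed by
# (KoppelingTypeCode, KoppelingSubTypeCode, TrajectDetailCode).
CONFIG = {
    (1, 58, 71): "Bicycle Parking",
    (1, 58, 72): "Bicycle Parking",
    (1, 58, 83): "Bicycle Parking",
    (1, 58, 85): "Parking meters",
    (1, 58, 81): "Charging posts",
}


def route(obj, config=CONFIG):
    """Return the target table for an object, or None if no row matches."""
    key = (
        obj.get("KoppelingTypeCode"),
        obj.get("KoppelingSubTypeCode"),
        obj.get("TrajectDetailCode"),
    )
    return config.get(key)


objects = [
    {"KoppelingTypeCode": 1, "KoppelingSubTypeCode": 58, "TrajectDetailCode": 85},
    {"KoppelingTypeCode": 1, "KoppelingSubTypeCode": 58, "TrajectDetailCode": 99},
]
# Objects without a matching row get None and can be dropped.
targets = [route(o) for o in objects]
```

Adding a new object type then means adding a row to the sheet, not changing the script.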
Workspace 1: Theory
Workspace 1: Reality
Workspace 2: Input mapping spreadsheet
Source attribute | Target attribute | Prefix | Reference External Table | Join Attribute | Join Attribute Values | Attribute External Table | Parse Separator | Default Value
Properties.Capacity | Capacity | | | | | | |
Capacities.Surface | Surface | SU_ | | | | | |
ContourLabels | Owner | | alfa.TrajectDetailLabel | Code | (353,354,355) | Label | # | City Ghent
Simple one-to-one mapping
Input data
Attribute A | Properties.Capacity | Attribute ..
.. | 50 bikes | …
   | 15 bikes |
   | 39 bikes |
Desired output data
Attribute A | Capacity | Attribute …
.. | 50 bikes | …
   | 15 bikes |
   | 39 bikes |
Simple one-to-one mapping with prefix
Input data
Attribute A | Capacities.Surface | Attribute ..
.. | Concrete | …
   | Sand |
   | Wood |
Desired output data
Attribute A | Surface | Attribute …
.. | SU_Concrete | …
   | SU_Sand |
   | SU_Wood |
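The two simple cases (rename, and rename with prefix) can be sketched in Python as below. This illustrates the idea of a spreadsheet-driven mapping, not the workspace itself. It assumes the nested JSON has already been flattened to dotted attribute names, and the dictionary keys `source`, `target` and `prefix` as well as the function name `apply_mapping` are hypothetical.

```python
# The two simple rows of the mapping spreadsheet.
MAPPING = [
    {"source": "Properties.Capacity", "target": "Capacity", "prefix": ""},
    {"source": "Capacities.Surface", "target": "Surface", "prefix": "SU_"},
]


def apply_mapping(feature, mapping=MAPPING):
    """Rename source attributes to target attributes, prepending any prefix."""
    out = {}
    for row in mapping:
        if row["source"] in feature:
            out[row["target"]] = f'{row["prefix"]}{feature[row["source"]]}'
    return out


mapped = apply_mapping(
    {"Properties.Capacity": "50 bikes", "Capacities.Surface": "Concrete"}
)
```

An empty prefix keeps the plain rename and the prefixed rename on the same code path, so the script does not need a special case per row.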
Workspace 2: Theory in its simplest form
But what about more complex mappings?
Build your own solution with custom transformers
Complex mapping
Input data
Attribute A | ContourLabels | Attribute ..
.. | #314#354#312 | …
   | #310#320 |
   | #355#367#398 |
alfa.TrajectDetailLabel
Attribute A | Code | Label | Attribute ..
.. | 353 | Police | ..
   | 354 | Municipality |
   | 355 | Fire station |
   | … | … |
Desired output data
Attribute A | Owner | Attribute …
.. | Municipality | …
   | City Ghent |
   | Fire station |
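A minimal Python sketch of this complex row follows: split the source value on the parse separator, keep only codes listed in the join attribute values, look up the label in the external table, and fall back to the default value. The data comes from the slide. The function name `resolve_owner` and the choice to return the first matching code when several match are assumptions, since the slide does not say how multiple matches are handled.

```python
# alfa.TrajectDetailLabel reduced to a Code -> Label lookup.
LOOKUP = {"353": "Police", "354": "Municipality", "355": "Fire station"}


def resolve_owner(
    contour_labels,
    separator="#",
    join_values=("353", "354", "355"),
    default="City Ghent",
    lookup=LOOKUP,
):
    """Map a separator-delimited code string to an owner label."""
    # Splitting "#314#354" yields a leading empty string, so drop empties.
    codes = [code for code in contour_labels.split(separator) if code]
    for code in codes:
        if code in join_values and code in lookup:
            return lookup[code]  # assumption: the first matching code wins
    return default


owners = [resolve_owner(v) for v in ("#314#354#312", "#310#320", "#355#367#398")]
```

With the three input values from the slide, this yields the three owners listed under the desired output data.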
Workspace 2: Theory complex mapping
Handy tip: @Value(@Value(attribute))
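One way to read the nested `@Value` tip is as a double lookup: the inner call returns the value of an attribute that itself holds an attribute name, and the outer call returns the value of that named attribute. The Python below is only an analogy using a dictionary as a stand-in for a feature. It is not FME code, and the attribute name `source_attribute` is hypothetical.

```python
def value(feature, name):
    """Rough stand-in for @Value(name): return the value of attribute `name`."""
    return feature[name]


feature = {
    # Attribute name supplied by the mapping spreadsheet (hypothetical key).
    "source_attribute": "Properties.Capacity",
    # Attribute carried by the incoming data.
    "Properties.Capacity": "50 bikes",
}

# Inner call gives "Properties.Capacity", outer call gives its value.
resolved = value(feature, value(feature, "source_attribute"))
```

Because the attribute name comes from data rather than being hard-coded, the same flow can serve any row of the mapping spreadsheet.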
Workspace 2: Reality
Conclusion
Summary
● Always try to think as generically as possible
● FME is an ideal tool for data migration
● Always keep it simple for the client
ListExploding may cause the process to explode.
Call to Action
1. Think further than the standard transformers
2. Make your process future-proof by making it as generic as possible
3. SchemaMapper cannot be configured with scripted parameters
4. @Value(@Value(Attribute)) is a handy trick for generic flows
Thank you!
matthias@nordend.eu
michael.ferre@district09.gent
