CREATE ETL SOLUTIONS FASTER
WITH METADATA DRIVEN DEVELOPMENT

KOEN VERBEECK

SQL SERVER DAYS 2013
WHO AM I?
OUTLINE
• Introduction
• Hello World
• Read Flat File
• Read Flat Files While Looping
• Metadata driven development
• Conclusion
INTRODUCTION
• a large percentage of BI projects fail
• Gartner - http://www.gartner.com/newsroom/id/492112

• one of the reasons is underestimating the ETL development effort
• Kimball: 70% of the time spent building a DWH goes into ETL
http://www.informationweek.com/the-38-subsystems-of-etl/55300422
INTRODUCTION

“I choose a lazy person to do a hard
job. Because a lazy person will find
an easy way to do it”
Bill Gates
INTRODUCTION
• a lot of SSIS packages are very similar
• packages importing flat files
• packages writing change data to staging tables
• packages exporting data to Excel (for some reason)
• packages updating dimensions
• …

• … but they take a lot of time to create
INTRODUCTION
• solution?

• code reuse

o SSIS basically only supports copy-paste
o copy-paste has improved in SSIS 2012

• design patterns

o for example: incremental load package
o SQL Server 2012 Integration Services Design Patterns

• enable through templates

o build a template package
o save it to C:\Program Files (x86)\Microsoft Visual Studio 11.0\Common7\IDE\PrivateAssemblies\ProjectItems\DataTransformationProject\DataTransformationItems

• but still requires you to edit each package!
• (and what if you forget to edit a crucial piece?)
INTRODUCTION
• metadata driven development to the rescue!

• (aka code generating code)
• automate generation of common logic in SSIS packages

• first option is the “dynamic SSIS package”
1. reads metadata from tables
2. generates code
o usually outputs T-SQL or bcp commands
o uses T-SQL or C#
o for example: SELECT … INTO statements
3. loops over the generated code
4. executes each statement

• disadvantages
• complex project
• no parallelism
• difficult row based error handling
• difficult to incorporate “business logic”
INTRODUCTION
• second option: BIML
• started as a project at MS: http://vulcan.codeplex.com/
• developer left to found company Varigence http://www.varigence.com/
o took the idea (not the code) and developed BIML

• BIML is a markup language and compiler
o translates metadata into business intelligence solutions for SQL Server
o supports SSIS and SSAS
o Varigence made part of BIML available as open source
INTRODUCTION
• BIDS Helper has open source implementation of BIML
• it’s free!
• it’s already in the add-on you love!
• it is available for SSIS 2005, 2008, 2008R2, 2012 (and 2014?)

• BIML offers

• powerful code generation

o only some parts of the project deployment model are not supported

• reuse BI patterns and components

o create your pattern in BIML and generate all your packages with the same structure
o BIML files can reference each other

• .NET based script language

o C# code can be incorporated into BIML to generate objects based on metadata
o Intellisense (sometimes) available

• don’t like BIML?

o generated packages are just SSIS packages, you can edit them using BIDS/SSDT/SSDTBI
o no vendor lock-in
INTRODUCTION
• scenario for our demos
• import different flat files
o exports from ERP systems, other database vendors, 3rd party providers, …

• each type of flat file has a different structure
o no single SSIS package for all flat files

• the name of the flat files can change
o for example the name includes a timestamp

• this would normally require 1 SSIS package per flat file type
• couple of hours/days work?

• let’s solve it with BIML!
HELLO WORLD
• basic BIML script structure
[Diagram: the BIML root node contains Connections, FileFormats and Packages; a package contains Tasks, Containers and Precedence constraints; a Dataflow task contains Transformations]
You can also specify
• events
• log handlers
• variables
• parameters
• custom tasks
• script tasks/components
• …
HELLO WORLD
• let’s take a look at a simple BIML script
HELLO WORLD
• BIML root node

• add connections

• add packages
HELLO WORLD
• specify Tasks

• specify specific properties
HELLO WORLD
• check for errors & generate package

• result
DEMO
show Hello World BIML
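
The script shown on these slides did not survive as text, so the following is only a minimal sketch of what a Hello World BIML file could look like, not the script from the demo. The connection string, the object names and the SQL statement are assumptions made for illustration.

```xml
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
  <!-- Sketch only: server, database and object names are placeholders -->
  <!-- Connections are declared once, at the root level -->
  <Connections>
    <OleDbConnection Name="OLE_Staging"
                     ConnectionString="Provider=SQLNCLI11;Data Source=localhost;Initial Catalog=Staging;Integrated Security=SSPI;" />
  </Connections>
  <Packages>
    <!-- ConstraintMode="Linear" chains the tasks in the order they are listed -->
    <Package Name="HelloWorld" ConstraintMode="Linear">
      <Tasks>
        <!-- Task-specific properties, such as the statement, are child elements -->
        <ExecuteSQL Name="SQL Hello World" ConnectionName="OLE_Staging">
          <DirectInput>SELECT 'Hello World' AS Greeting;</DirectInput>
        </ExecuteSQL>
      </Tasks>
    </Package>
  </Packages>
</Biml>
```

With BIDS Helper installed, right-clicking the .biml file in the project offers Check Biml for Errors and Generate SSIS Packages; the second option should produce HelloWorld.dtsx.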
READ FLAT FILE
• specify FlatFileFormat
• columns: name, data type, size, delimiter (and optionally code page)
• what you’d normally specify in the flat file connection manager

• specify connection
READ FLAT FILE
• specify data flow with transformations
• if no input/output connectors are specified, transformations are connected in
the order specified in the BIML file

• result
DEMO
import flat file with BIML
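
As a rough sketch of the steps above (file format, connections, data flow), a two-column import could be written as follows. The file path, the column definitions, the connection string and the destination table dbo.Customers are all hypothetical; the destination table is assumed to exist already with matching column names.

```xml
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
  <!-- Sketch only: every name, path and column below is a placeholder -->
  <FileFormats>
    <!-- The same settings you would enter in the flat file connection manager -->
    <FlatFileFormat Name="FFF_Customers" CodePage="1252" RowDelimiter="CRLF"
                    ColumnNamesInFirstDataRow="true" IsUnicode="false" FlatFileType="Delimited">
      <Columns>
        <Column Name="CustomerID" DataType="Int32" Delimiter="Comma" />
        <!-- The last column uses the row delimiter -->
        <Column Name="CustomerName" DataType="AnsiString" Length="50" Delimiter="CRLF" />
      </Columns>
    </FlatFileFormat>
  </FileFormats>
  <Connections>
    <FlatFileConnection Name="FF_Customers" FilePath="C:\Import\Customers.csv" FileFormat="FFF_Customers" />
    <OleDbConnection Name="OLE_Staging"
                     ConnectionString="Provider=SQLNCLI11;Data Source=localhost;Initial Catalog=Staging;Integrated Security=SSPI;" />
  </Connections>
  <Packages>
    <Package Name="ImportCustomers" ConstraintMode="Linear">
      <Tasks>
        <Dataflow Name="DFT Import Customers">
          <Transformations>
            <!-- No input/output paths given: components are connected in listed order -->
            <FlatFileSource Name="FFS Customers" ConnectionName="FF_Customers" />
            <OleDbDestination Name="OLE_DST Customers" ConnectionName="OLE_Staging">
              <ExternalTableOutput Table="dbo.Customers" />
            </OleDbDestination>
          </Transformations>
        </Dataflow>
      </Tasks>
    </Package>
  </Packages>
</Biml>
```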
READ FLAT FILE IN LOOP
• now let’s loop over a bunch of flat files
• specify variables to hold path to current file and source folder

• add an expression on the flat file connection manager
READ FLAT FILE IN LOOP
• add a for each loop
• which has its own tasks child element
READ FLAT FILE IN LOOP
• result
DEMO
import flat file using for each loop with BIML
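
The looping package can be sketched along these lines. This is an illustration under assumptions, not the demo script: all names, paths and columns are placeholders, and the attribute that names the target property of an expression is written here as ExternalProperty, which may differ in older BIML builds.

```xml
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
  <!-- Sketch only: every name, path and column below is a placeholder -->
  <FileFormats>
    <FlatFileFormat Name="FFF_Customers" CodePage="1252" RowDelimiter="CRLF"
                    ColumnNamesInFirstDataRow="true" IsUnicode="false" FlatFileType="Delimited">
      <Columns>
        <Column Name="CustomerID" DataType="Int32" Delimiter="Comma" />
        <Column Name="CustomerName" DataType="AnsiString" Length="50" Delimiter="CRLF" />
      </Columns>
    </FlatFileFormat>
  </FileFormats>
  <Connections>
    <FlatFileConnection Name="FF_Customers" FilePath="C:\Import\Customers.csv" FileFormat="FFF_Customers">
      <!-- At run time the connection points at whatever file the loop is currently on -->
      <Expressions>
        <Expression ExternalProperty="ConnectionString">@[User::CurrentFile]</Expression>
      </Expressions>
    </FlatFileConnection>
    <OleDbConnection Name="OLE_Staging"
                     ConnectionString="Provider=SQLNCLI11;Data Source=localhost;Initial Catalog=Staging;Integrated Security=SSPI;" />
  </Connections>
  <Packages>
    <Package Name="ImportCustomersLoop" ConstraintMode="Linear">
      <Variables>
        <!-- SourceFolder is declared as on the slide; binding it to the loop is left out of this sketch -->
        <Variable Name="SourceFolder" DataType="String">C:\Import</Variable>
        <Variable Name="CurrentFile" DataType="String">C:\Import\Customers.csv</Variable>
      </Variables>
      <Tasks>
        <ForEachFileLoop Name="FELC Customer Files" Folder="C:\Import" FileSpecification="Customers*.csv"
                         ConstraintMode="Linear">
          <!-- Each file name found by the enumerator is written to User::CurrentFile -->
          <VariableMappings>
            <VariableMapping Name="0" VariableName="User.CurrentFile" />
          </VariableMappings>
          <!-- The loop container has its own Tasks child element -->
          <Tasks>
            <Dataflow Name="DFT Import Customers">
              <Transformations>
                <FlatFileSource Name="FFS Customers" ConnectionName="FF_Customers" />
                <OleDbDestination Name="OLE_DST Customers" ConnectionName="OLE_Staging">
                  <ExternalTableOutput Table="dbo.Customers" />
                </OleDbDestination>
              </Transformations>
            </Dataflow>
          </Tasks>
        </ForEachFileLoop>
      </Tasks>
    </Package>
  </Packages>
</Biml>
```

The wildcard in FileSpecification is what lets the package cope with file names that include a timestamp.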
METADATA DRIVEN DEVELOPMENT
• BIML is nice
• … but isn’t the GUI much faster for developing packages?
• time to enhance BIML with some C# goodness!
• called BIMLScript
• use C# to read metadata
• loop over metadata and create multiple objects
• entire website dedicated to it, with tutorials and code snippets
http://bimlscript.com/
• also has an online editor
METADATA DRIVEN DEVELOPMENT
• Add namespaces

• Declare variables
METADATA DRIVEN DEVELOPMENT
• Retrieve metadata (stored in a SQL Server table)

• Loop over metadata and create corresponding objects
METADATA DRIVEN DEVELOPMENT
• result
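
The four steps (namespaces, variables, metadata retrieval, loop) might look roughly like this in a single BIMLScript file. It is a sketch under assumptions: the metadata table dbo.FlatFileMetadata, its columns FileType and TargetTable, and both connection strings are hypothetical, and each generated package only truncates its target table to keep the example short.

```xml
<#@ template language="C#" hostspecific="true" #>
<#@ import namespace="System.Data" #>
<#@ import namespace="System.Data.SqlClient" #>
<#
    // Sketch only: the connection string, table and column names are assumptions.
    string metadataConnection = "Data Source=localhost;Initial Catalog=Metadata;Integrated Security=SSPI;";
    string metadataQuery = "SELECT FileType, TargetTable FROM dbo.FlatFileMetadata;";

    // Retrieve the metadata: one row per flat file type.
    DataTable fileTypes = new DataTable();
    using (SqlDataAdapter adapter = new SqlDataAdapter(metadataQuery, metadataConnection))
    {
        adapter.Fill(fileTypes);
    }
#>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
  <Connections>
    <OleDbConnection Name="OLE_Staging"
                     ConnectionString="Provider=SQLNCLI11;Data Source=localhost;Initial Catalog=Staging;Integrated Security=SSPI;" />
  </Connections>
  <!-- Packages appears once; only the Package elements inside it are generated in the loop -->
  <Packages>
    <# foreach (DataRow row in fileTypes.Rows) { #>
    <Package Name="Import_<#=row["FileType"]#>" ConstraintMode="Linear">
      <Tasks>
        <ExecuteSQL Name="SQL Truncate Target" ConnectionName="OLE_Staging">
          <DirectInput>TRUNCATE TABLE <#=row["TargetTable"]#>;</DirectInput>
        </ExecuteSQL>
      </Tasks>
    </Package>
    <# } #>
  </Packages>
</Biml>
```

Generating packages from this file should yield one package per row in the metadata table, so supporting a new flat file type becomes a matter of inserting a row and regenerating.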
METADATA DRIVEN DEVELOPMENT
• remarks
• make sure the code or the metadata doesn’t contain invalid XML characters
o <>“&

• using C# can mess with the Intellisense
o Visual Studio thinks it’s not valid XML anymore
o color coding can disappear > right click file and choose Open With…
o Intellisense can stop working in Visual Studio > use online editor

• beware of the protection levels
• some elements can only appear once
o do not put those in a loop
o e.g. Connections, Packages
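
To illustrate the remark about invalid XML characters: a statement containing < > or & has to be escaped with XML entities or wrapped in a CDATA section. The sketch below shows both forms of the same statement; the connection, the table and the statement itself are made up for the example.

```xml
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
  <!-- Sketch only: connection, table and statement are placeholders -->
  <Connections>
    <OleDbConnection Name="OLE_Staging"
                     ConnectionString="Provider=SQLNCLI11;Data Source=localhost;Initial Catalog=Staging;Integrated Security=SSPI;" />
  </Connections>
  <Packages>
    <Package Name="EscapeExample" ConstraintMode="Linear">
      <Tasks>
        <!-- Option 1: replace the special characters with XML entities -->
        <ExecuteSQL Name="SQL Escaped Entities" ConnectionName="OLE_Staging">
          <DirectInput>DELETE FROM dbo.Customers WHERE CustomerID &lt; 0 AND CustomerName &lt;&gt; 'R&amp;D';</DirectInput>
        </ExecuteSQL>
        <!-- Option 2: wrap the statement in CDATA and write it unescaped -->
        <ExecuteSQL Name="SQL CDATA Section" ConnectionName="OLE_Staging">
          <DirectInput><![CDATA[DELETE FROM dbo.Customers WHERE CustomerID < 0 AND CustomerName <> 'R&D';]]></DirectInput>
        </ExecuteSQL>
      </Tasks>
    </Package>
  </Packages>
</Biml>
```

The same care applies to values read from metadata tables, since they end up in the generated XML as well.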
DEMO
generate multiple packages using BIMLScript
CONCLUSION
• BIML can radically reduce SSIS development time
• for frequently used package patterns
• when combined with BIMLScript

• BIML supports all versions of SSIS
• but some project deployment functionality is missing

• bit of a learning curve
• good understanding of SSIS is necessary
• basic C# skills needed
• return on investment comes in subsequent projects
RESOURCES
• Official BIML

• Varigence BIML product page

http://www.varigence.com/Products/Biml/Capabilities

• BIMLScript resource hub
http://bimlscript.com/

• BIDS Helper on Codeplex
http://bidshelper.codeplex.com/

• Blogs

• Stairway to BIML by Andy Leonard

http://www.sqlservercentral.com/stairway/100550/

• BIML articles by Joost van Rossum

http://microsoft-ssis.blogspot.be/search/label/BIML

• BIML articles by Marco Schreuder
http://blog.in2bi.eu/tags/biml/

• BIML articles by John Welch

http://agilebi.com/jwelch/tag/biml/

• Introduction to BIML part I by Koen Verbeeck

http://www.mssqltips.com/sqlservertip/3094/introduction-to-business-intelligence-markup-languagebiml-for-ssis/
Q&A

THANKS FOR LISTENING!
koen.verbeeck@element61.be
@Ko_Ver
http://www.linkedin.com/in/kverbeeck
