- David Portnoy
http://LinkedIn.com/in/DavidPortnoy
312.970.9740
© Copyright 2012-2014 Datalytx, Inc.
Applying Agile Delivery to Business Intelligence
Topic: Data Integration & ETL
This group extends the TDWI community
online and is designed to foster peer
networking and discussion of key issues
relevant to business intelligence and data
warehousing managers.
TDWI (The Data Warehousing
Institute™) provides education, training,
certification, news, and research for
executives and information technology
(IT) professionals worldwide. Founded in
1995, TDWI is the premier educational
institute for business intelligence and
data warehousing. Our Web site is
www.tdwi.org.
Why this topic?
 There’s a lot of confusion and misconception about the
meaning of Agile, especially as it applies to BI
 Many in corporate IT still believe that Agile cannot easily be
applied to BI
 Posts on this topic in the TDWI forum in LinkedIn would benefit
from being organized and summarized
What we’ll cover
 Misconceptions about Agile BI
 Core techniques of Agile BI
 Review of ETL tool landscape and benefits
 Decision factors for choosing the ETL environment
 Mitigating aspects of ETL tools that make Agile harder
 How to implement an Agile BI development environment
Due to the prevailing confusion
and misconceptions, it’s easier to
start with what Agile BI is not
Misconceptions about Agile in the BI community
There’s a common misconception that Agile BI applies to practically any
methodology or tool that helps develop BI projects faster or in a more flexible way.
Some examples of misconceptions:
 Agile is primarily adding iterations to
typical projects
 Agile implies starting to code without
planning or design
 Agile involves particular data models,
such as Data Vault
 Agile involves rapid prototyping
techniques, as can be achieved by
certain metadata driven tools
 Agile involves self-serve reporting, such
as Tableau
 Agile involves moving ETL from a
separate code base into the reporting
layer, as made possible by in-memory
processing, such as with QlikView
 Agile involves building real-time or low-
latency DW, rather than traditional batch
 Agile operates in a hosted cloud
environment, especially PaaS (Platform
as a Service)
The culprits for the myths and misconceptions
#1 Vendors claim that their products are agile.
#2 The BI community as a whole does not have a long history or
substantial practice with agile development. Therefore it is more likely
to be swayed by vendor pitches.
The culprits for the myths and misconceptions (cont.)
 In the software development world, that’s equivalent to saying
that new frameworks, such as Ruby on Rails, are needed for
Agile development. (Few credible publications or developers
would make such a claim.)
 The implication that other BI tools can’t be used to achieve
Agile BI is simply not true. (Even general purpose
development platforms can be applied to BI.)
 In reality, team composition, proficiency with existing
technologies and management’s acceptance of agile have a
bigger impact than a specific type of BI tool.
Example source of misconceptions:
Forrester Research article “Agile Out of the Box”, 2010
What’s being said
“...Agile BI methodology differs from [agile software development] in that it
requires new and different technologies and architectures for support.
Metadata-generated BI applications are one such example...”
The article goes on to claim that these particular tools are needed in order to
achieve “development done faster”, “react[ing] more quickly to ... requirements”,
incremental product delivery, “rapid prototypes versus specifications”,
“reacting versus planning”, “personal interactions ... versus documentation”, etc.
What’s wrong with it
 This list is just buzzwords associated with agile without
substantial evidence of why other tools are insufficient.
 Rapid prototyping is confused with the role of end-to-end
working software.
 On the contrary, arguments can be made why the tools
identified could be detrimental to agile teams. (See TDWI
LinkedIn group discussion “The Role of ETL tools in Agile
BI”.)
The reality
Yes, many of the items misclassified as necessary for
Agile still help projects ramp up and complete faster.
Yes, many improve the flexibility of dealing with
changes in source data, business logic and reporting.
Yes, many provide additional visibility into complex
logic and functional changes across team members and
stakeholders.
 Data Vault model
 Rapid prototyping tools
 Metadata driven BI tools
 Self-serve reporting
 In-memory processing
 Hosted cloud (PaaS)
environment
But none of them are required
to have successful Agile BI projects
So what are the requirements for implementing Agile BI?
Productive Agile BI teams operate almost identically to Agile teams in
software development.
...With just the minimal tweaks to accommodate:
1. Integration of available ETL and reporting tools into the development
environment
2. Changes to regression testing due to the fact that databases have state
3. Challenges of managing large data sets in the deployment process
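Item 2 above can be illustrated with a minimal sketch. It assumes an in-memory SQLite database stands in for the warehouse, and that `load_daily_sales`, the table names and the seed rows are all hypothetical. Each test rebuilds the schema and a known seed data set first, so the result never depends on state left behind by an earlier test.

```python
import sqlite3

# Hypothetical schema and seed rows; a real project would keep both in version control.
SCHEMA = """
CREATE TABLE stg_sales (sale_date TEXT, amount REAL);
CREATE TABLE fact_daily_sales (sale_date TEXT PRIMARY KEY, total REAL);
"""
SEED = [("2014-01-01", 10.0), ("2014-01-01", 5.0), ("2014-01-02", 7.5)]


def fresh_db():
    """Return a database in a known starting state."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO stg_sales VALUES (?, ?)", SEED)
    conn.commit()
    return conn


def load_daily_sales(conn):
    """Hypothetical transformation under test: aggregate staging rows into the fact table."""
    conn.execute("DELETE FROM fact_daily_sales")
    conn.execute(
        "INSERT INTO fact_daily_sales "
        "SELECT sale_date, SUM(amount) FROM stg_sales GROUP BY sale_date"
    )
    conn.commit()


def test_totals():
    conn = fresh_db()
    load_daily_sales(conn)
    rows = conn.execute(
        "SELECT sale_date, total FROM fact_daily_sales ORDER BY sale_date"
    ).fetchall()
    assert rows == [("2014-01-01", 15.0), ("2014-01-02", 7.5)]


def test_rerun_does_not_double_count():
    conn = fresh_db()
    load_daily_sales(conn)
    load_daily_sales(conn)  # a second run starts from the state the first run left
    count = conn.execute("SELECT COUNT(*) FROM fact_daily_sales").fetchone()[0]
    assert count == 2


test_totals()
test_rerun_does_not_double_count()
```

With large data sets the same idea usually needs a small, curated seed set rather than a copy of production, which is where item 3 comes in.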
Techniques for implementing Agile in BI
 Timebox deliverables – of course
 Measure completion with working
software! (Prototypes using non-
production tools are OK. But need to get
end-to-end data flow working ASAP.)
 Highly efficient, daily team
synchronization in which the entire team
participates.
 Monitor completion of features (stories),
not time spent. Calculate team velocity to
improve planning.
 Hold sprint retrospectives to learn from
mistakes.
 Leverage techniques of Agile app dev:
 Manage everything in version control,
including data model and test data sets
 Assume refactoring of working code can
occur later to improve performance and
maintainability
 Use Test Driven Development (TDD), to
ensure understanding of requirements and
reduce rework
 Implement Continuous Integration to
automate build, tests, deployment
 Measure project success by delivery of
business value, not delivery of predefined
requirements on time and on budget
 Accept that it’s OK to fail, but fail early and
adapt. (Non agile projects don’t recognize
failure until time or budget runs out.)
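As a small illustration of "calculate team velocity to improve planning", here is one possible sketch. The function names, the three-sprint averaging window and the story-point numbers are assumptions made for the example, not something the slides prescribe.

```python
from math import ceil


def velocity(completed_points, window=3):
    """Average story points completed per sprint over the last `window` sprints."""
    recent = completed_points[-window:]
    if not recent:
        raise ValueError("need at least one completed sprint")
    return sum(recent) / len(recent)


def sprints_remaining(backlog_points, completed_points, window=3):
    """Forecast how many sprints the remaining backlog needs at the current velocity."""
    v = velocity(completed_points, window)
    if v <= 0:
        raise ValueError("velocity must be positive to forecast")
    return ceil(backlog_points / v)


# Hypothetical history: points for fully completed stories only, per sprint.
history = [18, 22, 20, 24]
print(velocity(history))               # average of the last three sprints
print(sprints_remaining(90, history))  # sprints needed for a 90-point backlog
```

Only finished stories count toward the history, which keeps the measure tied to completed features rather than time spent.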
What’s the reason for low adoption of Agile in BI?
Agile is widely adopted in application development
...but not in BI
Potential reasons might stem from differences between the two worlds
Development Environment
 Application Development: Custom app development using standard, general purpose languages well suited for automation
 Business Intelligence: Proprietary vendor architectures and DSLs (domain specific languages) not well suited for automation
Team skills
 Application Development: Have skills to write automation for continuous integration
 Business Intelligence: Rely on vendors to provide these features
Costs
 Application Development: Low up front investment by leveraging open source platforms
 Business Intelligence: High up front investment in vendor-specific tools: DW appliance, data modeling, ETL, OLAP, Reporting, etc.
Releases
 Application Development: Software is stateless and therefore easier to test and deploy with each build
 Business Intelligence: Databases have state, with each build needing to start with a certain data set. High data volumes may take hours to load a changed data model or roll back changes.
Now let’s get into
the specifics of
ETL in Agile BI
ETL tools have evolved over the years
Custom Code
 One-time solutions
 Built with focus on short-term delivery and minimal up front cost
Frameworks
 Origin: Reusable code compiled from a few similar projects
 Just change parameters to reuse for specific loading, logging, change data
capture, database connections, etc.
Code Generators
 Intuitive development UI enabling developers to manipulate ETL metadata
 From metadata, generate code in a general purpose (such as C or Java) or
domain specific (such as SQL or MDX) language
 Types: One-shot generators (that require switching to a native dev env) vs.
full development environments with managed version deployments
Engines
 Graphical development accomplishing ETL through parameterization and
configuration, rather than code generation
 Avoids complexities with code management and deployment
We can categorize the major ETL players
The vendors
 Traditional vendors: Informatica, SSIS, DataStage
 Open source: Talend, Pentaho Kettle
 Metadata driven, automated discovery, federated integration:
Kalido, BI Ready, Wherescape, Composite Software
The most common alternative
 SQL + shell scripts
 Native DB load utilities
ETL tools have lots of value
 Built-in commonly used features for transformation and job control
 Without ETL tools, we’re reinventing the wheel on many BI design patterns that
have been implemented countless times throughout history
 Abstracts complex logic into graphical components or a domain specific language
that leverages best practices and is often more maintainable over the potentially
long project life span
 Graphical representation of data model, data flow and job flow provides visibility
into business logic, especially useful for less technical team members
 Provides a degree of self-documentation without the need to update the graphical
representation of logic separately from source code
 Master Data Management (MDM)
 Data cleansing
 Change Data Capture (CDC)
 Data lineage and data dependency
functionality
 Processing of SCD (Slowly Changing
Dimensions)
 Parallelization of tasks that can be run
concurrently
 Advanced merging functionality
But many ETL tools are
not well suited to an
Agile BI environment
First, these tools may not be ideal for Agile in general...
Some ETL tools are...
 Not well suited for code refactoring, branching, and merging because
the code is not in text files that can be used with modern version control, such
as Git
 Not well suited for use with automation in Continuous Integration,
because they’re often standalone environments with no provisions for
external automation
 Not well suited for TDD (Test Driven Development), unless the vendors
explicitly made provisions for unit test automation
 Proprietary and have “black box” features that might make testing more
challenging or decrease portability of test cases
 Expensive, with high up-front license cost also putting more capital at
risk – unless open source ETL, of course
Second, they may negatively impact productivity of Agile teams
ETL tools may...
 Require a proprietary, vendor-specific skill set not present in the organization
 Cause work priority to be stove-piped and limited to skill set, rather than overall
business value
 Prevent the ability to leverage the full dev team, since they fall under a
separate development environment from the rest of apps
 Result in a productivity hit, since some professional developers are more
productive writing code in native languages than using GUI tools, even after
training
 Not provide compelling enough reasons for developers to learn any one ETL
tool, since the lack of industry standards decreases skill portability
Third, there are other challenges and considerations
There are challenges and limitations with ETL tools even outside of Agile
 Require allocation of additional resources to manage version upgrades of the
ETL tool, even if the code base hasn’t been changing
 When the type of processing needed is outside of core ETL tool features,
complexity can grow quickly
 Usefulness of visual representations for data models, data flows and job flows is
reduced as complexity increases
 Some find GUI development less efficient than traditional coding, especially for
complex or unique types of processing
 Often the sophisticated features are underutilized, resulting in expensive tools
being used just for job scheduling
Fourth, BI is increasingly involving Big Data
Big Data implementations often make ETL tools less compelling
 Large volumes make it more efficient to
 Manipulate data in place using ELT, rather than have multiple staging areas
 Use native methods (MapReduce/Java, SQL, Hive, etc.) that allow for more control
and performance optimization
 High velocity of data makes it harder to use ETL tools that have traditionally
been designed around batch-oriented processing.
 High variability of data makes ETL tools less attractive, since they expect a
fixed schema and don’t gracefully accommodate changes. Common examples
include unstructured web log data in flat files and logical objects from apps
stored in key-value pair format.
 MPP vendors, such as Teradata and Netezza, make a case for doing ELT (rather
than ETL) processing natively and provide built-in features to do so
 Currently ETL tools are rarely used with the Hadoop ecosystem for many of the
reasons stated, as well as licensing cost
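The ELT idea mentioned above (manipulate data in place rather than through multiple staging areas) can be sketched briefly. This is a toy illustration only: SQLite stands in for an MPP database, and the table names and web log rows are hypothetical. The raw rows are loaded untouched, and the transformation runs afterwards as set-based SQL inside the database engine.

```python
import sqlite3

# Hypothetical raw web log extract; every field arrives as text.
raw_rows = [
    ("2014-03-01T10:00:00", "/home", "200"),
    ("2014-03-01T10:00:05", "/cart", "500"),
    ("2014-03-01T10:01:00", "/home", "200"),
]

conn = sqlite3.connect(":memory:")

# Load: land the data as-is, with no transformation on the way in.
conn.execute("CREATE TABLE raw_log (ts TEXT, path TEXT, status TEXT)")
conn.executemany("INSERT INTO raw_log VALUES (?, ?, ?)", raw_rows)

# Transform: one set-based statement executed by the database itself.
conn.execute(
    """
    CREATE TABLE page_hits AS
    SELECT path,
           COUNT(*) AS hits,
           SUM(CASE WHEN CAST(status AS INTEGER) >= 500 THEN 1 ELSE 0 END) AS errors
    FROM raw_log
    GROUP BY path
    """
)

page_hits = conn.execute(
    "SELECT path, hits, errors FROM page_hits ORDER BY path"
).fetchall()
print(page_hits)
```

Because the transformation is plain SQL in a text file, it can be versioned and tested like any other code.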
That said, how do we
implement an
Agile BI environment?
First, use ETL tools when it makes sense
Pick the right ETL tool for the job...
 We covered the potential benefits and problems of using such ETL tools
for Agile BI. Look for situations where benefits outweigh the problems.
For example, a good situation to employ ETL tools might be: A use case
requiring sophisticated data cleansing transformations, complex job control
logic, and data volumes easily handled by traditional SMP database
architectures.
 Outside of such situations, consider using SQL, DB-specific native code, or
general purpose languages already in use elsewhere in the organization.
Is it OK to start with using an ETL tool as a job scheduler?
 Yes, assuming it’s an efficient way to handle much needed job control
logic, including failures, event triggers, and dependencies.
 Plus, you get the option to adopt other capabilities of the tool over time
with low project risk.
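To make "job control logic, including failures ... and dependencies" concrete, here is a rough sketch of what a scheduler handles, written with Python's standard `graphlib` module (Python 3.9+). The `run_jobs` function and the job names are hypothetical, and a real ETL tool's scheduler also covers event triggers, retries and notifications that this sketch leaves out.

```python
from graphlib import TopologicalSorter


def run_jobs(jobs, depends_on):
    """Run jobs in dependency order; skip a job if any upstream job did not succeed.

    jobs: dict mapping job name -> zero-argument callable
    depends_on: dict mapping job name -> set of upstream job names
    Returns a dict mapping job name -> "ok", "failed" or "skipped".
    """
    status = {}
    graph = {name: depends_on.get(name, set()) for name in jobs}
    # static_order() yields every job after all of its upstream jobs.
    for name in TopologicalSorter(graph).static_order():
        upstream = depends_on.get(name, set())
        if any(status[u] != "ok" for u in upstream):
            status[name] = "skipped"
            continue
        try:
            jobs[name]()
            status[name] = "ok"
        except Exception as exc:
            print(f"{name} failed: {exc}")
            status[name] = "failed"
    return status


def missing_source():
    raise RuntimeError("source file missing")


# Hypothetical nightly load: the customer extract fails.
jobs = {
    "extract_orders": lambda: None,
    "extract_customers": missing_source,
    "load_dim_customer": lambda: None,
    "load_fact_orders": lambda: None,
}
depends_on = {
    "load_dim_customer": {"extract_customers"},
    "load_fact_orders": {"extract_orders", "load_dim_customer"},
}
result = run_jobs(jobs, depends_on)
print(result)
```

The failed extract causes both downstream loads to be skipped instead of running against incomplete data, which is the behavior a scheduler is expected to provide out of the box.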
While traditional ETL tools
can simplify a complex task,
they can also overcomplicate
a simple task.
Second, when you do use ETL tools, look
for ways to mitigate the issues identified
So what’s the solution?
Issue: High up-front license cost
Approach: Use open source tools or less expensive licenses, as with SQL Server.
Negotiate aggressively with vendors, in light of lower cost alternatives.
Issue: Use with Continuous Integration
Approach: See following slides. Some vendors, like Microsoft, may make provisions for
automated builds within their environment. Otherwise look for opportunities to
simplify, partially automate, and notify team of build state.
Issue: Use with version control
Approach: Where possible, save ETL logic to XML, create dumps of repository, and
generate code from metadata. Then manage in common version control tool.
Issue: Decreased portability
Approach: Move code to general purpose development languages, including SQL and
MDX. Consider tools that generate generic code from GUI or metadata.
Issue: Vendor-specific skill set
Approach: Build cross-functional team by...
 Training existing developers
 Hiring well-rounded developers willing to learn ETL tools
Issue: Risk of introducing another development environment
Approach: Start using ETL tools now and “grow” into using the functionality
 Continue coding in what you know: native RDBMS code or even general
app dev languages
 Start using ETL as a glorified job scheduler to wrap native code
 When refactoring code, take the opportunity to push more logic into the
ETL tool
 Gradually start using other features such as MDM, data quality,
notifications, enterprise service bus, etc.
Continuous Integration: Methodology
 Each developer should have a sandbox:
1-to-1 app instance to DB instance (CI by Martin Fowler)
 Automate: Table deployment, usage stats, schema
verification, data migration verification, DB testing,
migration to prod
 Version control all DB assets, ideally using a
distributed tool like Git
 Use a tool like dbDeploy and link app build, DB version, and forward/reverse DDL & DML
scripts
 Generate a test data set with a dimension annotating what each record is testing; it becomes a
company asset that enables TDD of BI
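One way to read the annotated test data set described above: every test record carries a note saying which rule it exercises, so a failing check reports its purpose directly. The sketch below is a hypothetical illustration; the `tests` field, the email rule and the rows are assumptions made for the example.

```python
# Each record carries a `tests` annotation naming the rule it exercises (hypothetical data).
TEST_ROWS = [
    {"customer_id": 1, "email": "a@example.com", "tests": "valid email passes"},
    {"customer_id": 2, "email": "", "tests": "empty email rejected"},
    {"customer_id": 3, "email": "no-at-sign", "tests": "malformed email rejected"},
]
# Customer ids that the cleansing rule is expected to accept.
EXPECTED_VALID = {1}


def is_valid(row):
    """Hypothetical cleansing rule under test."""
    email = row["email"]
    return "@" in email and not email.startswith("@") and not email.endswith("@")


def run_checks(rows, expected_valid):
    """Return the annotations of all records whose outcome differs from the expectation."""
    failures = []
    for row in rows:
        expected = row["customer_id"] in expected_valid
        if is_valid(row) != expected:
            failures.append(row["tests"])
    return failures


failures = run_checks(TEST_ROWS, EXPECTED_VALID)
print(failures)  # an empty list means every annotated case behaved as expected
```

In a warehouse the annotation would typically live in a column or a small dimension table next to the test rows, so it travels with the data through version control.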
For cases where an application consumes data from the data warehouse:
 BI developers should learn software coding practices; application developers should learn
data modeling, SQL, DB tuning
 Consuming apps use two-phase builds:
Build 1: DB is stubbed out and the build runs within minutes
Build 2: includes the real DB for end-to-end testing, but might run for a while
 Bugs found in Build 2 trigger additions to the test data set; next time the same bug is caught
in Build 1
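The two builds can be sketched as the same application code running against two interchangeable data sources. This is an assumption-laden illustration: the class names, the `revenue_for` interface and the `sales` table are hypothetical, and SQLite stands in for the real warehouse.

```python
import sqlite3


class StubWarehouse:
    """Build 1: canned answers and no database, so the build stays fast."""

    def __init__(self, totals):
        self._totals = totals

    def revenue_for(self, region):
        return self._totals.get(region, 0.0)


class SqlWarehouse:
    """Build 2: same interface, backed by a real connection (SQLite as a stand-in)."""

    def __init__(self, conn):
        self._conn = conn

    def revenue_for(self, region):
        row = self._conn.execute(
            "SELECT COALESCE(SUM(amount), 0.0) FROM sales WHERE region = ?",
            (region,),
        ).fetchone()
        return row[0]


def revenue_report(warehouse, regions):
    """Application code under test; it only relies on the revenue_for interface."""
    return {region: warehouse.revenue_for(region) for region in regions}


# Build 1: stubbed data source.
stub = StubWarehouse({"east": 100.0})
build1 = revenue_report(stub, ["east", "west"])

# Build 2: end-to-end against a database loaded with the test data set.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", [("east", 60.0), ("east", 40.0)])
build2 = revenue_report(SqlWarehouse(conn), ["east", "west"])

assert build1 == build2 == {"east": 100.0, "west": 0.0}
```

When Build 2 exposes a bug, the offending case is added to the stub's canned data, so the fast build catches it from then on.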
[Diagram: Typical BI dev env with contention during development: Dev 1, Dev 2 and Dev 3 all work in one shared developer schema. Sandboxed dev env appropriate for agile development: Dev 1, Dev 2 and Dev 3 each work in their own schema (Schema Dev 1, Schema Dev 2, Schema Dev 3).]
Continuous Integration: Tools & Configuration
How dbDeploy works
dbDeploy is treated as a custom Ant task:
1. Logs & assigns version #s to changes in
SQL files
2. Saves changelog table since prior version
3. Generates DDL & DML scripts to apply to
DB in other envs
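The changelog idea behind these steps can be shown with a toy migration runner. This is not dbDeploy's implementation or its table layout; the `migrate` function, the `changelog` table and the numbered deltas are assumptions made for the sketch, with SQLite as the target database.

```python
import sqlite3

# Hypothetical numbered change scripts; in practice these are SQL files in version control.
DELTAS = {
    1: "CREATE TABLE dim_customer (id INTEGER PRIMARY KEY, name TEXT)",
    2: "ALTER TABLE dim_customer ADD COLUMN region TEXT",
    3: "CREATE INDEX ix_customer_region ON dim_customer (region)",
}


def migrate(conn, deltas):
    """Apply, in numeric order, every delta not yet recorded in the changelog table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS changelog (change_number INTEGER PRIMARY KEY)"
    )
    applied = {row[0] for row in conn.execute("SELECT change_number FROM changelog")}
    newly_applied = []
    for number in sorted(deltas):
        if number in applied:
            continue  # this database already has the change
        conn.execute(deltas[number])
        conn.execute("INSERT INTO changelog (change_number) VALUES (?)", (number,))
        newly_applied.append(number)
    conn.commit()
    return newly_applied


conn = sqlite3.connect(":memory:")
first = migrate(conn, {k: DELTAS[k] for k in (1, 2)})  # an older build knows two deltas
second = migrate(conn, DELTAS)                         # a newer build adds a third
print(first, second)
```

Because each database records what it has already received, the same set of scripts can bring a developer sandbox, a test server and production to the version a given build expects.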
Tool: Ant
Type: Build tool
Purpose: Automates steps to build & deploy software
Tool: Jenkins
Type: Continuous Integration
Purpose: Monitors source code repository (Git) for check-ins, automatically launching build-test cycles and publishing results.
Tool: Git
Type: Source control / repository
Purpose: Source code repository optimized for branching and merging, making it efficient for each developer to have their own sandbox environment. It triggers CI build-test cycles.
Tool: dbDeploy, dbMaintain, etc.
Type: Database refactoring manager
Purpose: Automates the process of establishing which database refactorings need to be run against a specific database in order to migrate it to a particular build.
Tool: DbUnit, DbFit, SQLUnit
Type: Unit test automation
Purpose: Common tool to aid TDDD (Test-driven DB development). Manage DB state between test runs, import/export test datasets, run unit tests and log exceptions. Regression testing of DDL, DML, stored procedures.
[Diagram: CI pipeline. Labels: Developer Env., Project Code, Check in, Check out, Repository (Git), CI Environment, Build Tool, Deploy & Test, Success / Fail Tag, Test server, Prod server]
Continuous Refactoring & Releases of Databases
Based on presentation by Pramod Sadalage
Environment: Dev Sandbox, Project Integration Sandbox, Test / QA Sandbox, Production
Characteristics: Highly iterative development, Project-Level Testing, System Integration Testing, Operations & Support
Deployment Frequency: Frequent, Infrequent, Controlled
Risk / impact of bug: Low impact, Medium impact, High impact
Testing: Test data set (Used for TDD), Test data set, Benchmark data, Production data
Continuous Integration:
Possible Configuration for Microsoft BI Stack
PowerDelivery
 Addresses TFS’s weakness in coordinating the promotion of builds
through multiple environments of the delivery pipeline: triggering build
on commit, promoting commit build to test, promoting test build to
prod
Windows PowerShell
 Task-based command-line shell & scripting language (built on .NET)
for task automation
Team Foundation Server
 Microsoft's application lifecycle management (ALM) solution.
Collaboration platform that supports agile delivery practices
 Build machine is configured for continuous integration, so latest
working version is refreshed and available to the entire distributed
team
SQL Server Data Tools
 Develop, debug, and execute database unit tests interactively
in Visual Studio.
 Puts database testing on an equal footing with application testing.
 Can then be run from command line or from a build machine
 Integrated with testing, bug tracking, and project management using
TFS
Más contenido relacionado

La actualidad más candente

ERP - Implementation is The Challenge
ERP - Implementation is The ChallengeERP - Implementation is The Challenge
ERP - Implementation is The Challengevinaya.hs
 
Is Agile Documentation An Oxymoron?
Is Agile Documentation An Oxymoron?Is Agile Documentation An Oxymoron?
Is Agile Documentation An Oxymoron?Kurt Solarte
 
Re-Architect Your Legacy Environment To Enable An Agile, Future-Ready Enterprise
Re-Architect Your Legacy Environment To Enable An Agile, Future-Ready EnterpriseRe-Architect Your Legacy Environment To Enable An Agile, Future-Ready Enterprise
Re-Architect Your Legacy Environment To Enable An Agile, Future-Ready EnterpriseDell World
 
Mapping Manager Product Overview
Mapping Manager Product OverviewMapping Manager Product Overview
Mapping Manager Product OverviewRakesh Kumar
 
70-461 Querying Microsoft SQL Server 2012
70-461 Querying Microsoft SQL Server 201270-461 Querying Microsoft SQL Server 2012
70-461 Querying Microsoft SQL Server 2012siphocha
 
Why ask why? Try agile BI!
Why ask why? Try agile BI!Why ask why? Try agile BI!
Why ask why? Try agile BI!Excella
 
Introduction to Business Intelligence
Introduction to Business IntelligenceIntroduction to Business Intelligence
Introduction to Business IntelligenceAlmog Ramrajkar
 
Validation and Business Considerations for Clinical Study Migrations
Validation and Business Considerations for Clinical Study MigrationsValidation and Business Considerations for Clinical Study Migrations
Validation and Business Considerations for Clinical Study MigrationsPerficient, Inc.
 
Data Centric Conference 2020
Data Centric Conference 2020Data Centric Conference 2020
Data Centric Conference 2020John O'Gorman
 
What's New with SAP BusinessObjects Business Intelligence 4.1?
What's New with SAP BusinessObjects Business Intelligence 4.1?What's New with SAP BusinessObjects Business Intelligence 4.1?
What's New with SAP BusinessObjects Business Intelligence 4.1?SAP Analytics
 
Bi Applications - Oracle
Bi Applications - OracleBi Applications - Oracle
Bi Applications - Oraclejamesgj2004
 
CoBIT 5 (A brief Description)
CoBIT 5 (A brief Description)CoBIT 5 (A brief Description)
CoBIT 5 (A brief Description)Sam Mandebvu
 
Workday Integration Cloud Connect Datasheet
Workday Integration Cloud Connect DatasheetWorkday Integration Cloud Connect Datasheet
Workday Integration Cloud Connect DatasheetWorkday
 
2011 Sharepoint Summit - Microsoft's vision and strategy for the future of bu...
2011 Sharepoint Summit - Microsoft's vision and strategy for the future of bu...2011 Sharepoint Summit - Microsoft's vision and strategy for the future of bu...
2011 Sharepoint Summit - Microsoft's vision and strategy for the future of bu...MSHOWTO Bilisim Toplulugu
 

La actualidad más candente (20)

ERP - Implementation is The Challenge
ERP - Implementation is The ChallengeERP - Implementation is The Challenge
ERP - Implementation is The Challenge
 
Is Agile Documentation An Oxymoron?
Is Agile Documentation An Oxymoron?Is Agile Documentation An Oxymoron?
Is Agile Documentation An Oxymoron?
 
Sanjay Lakhanpal 2015
Sanjay Lakhanpal 2015Sanjay Lakhanpal 2015
Sanjay Lakhanpal 2015
 
Re-Architect Your Legacy Environment To Enable An Agile, Future-Ready Enterprise
Re-Architect Your Legacy Environment To Enable An Agile, Future-Ready EnterpriseRe-Architect Your Legacy Environment To Enable An Agile, Future-Ready Enterprise
Re-Architect Your Legacy Environment To Enable An Agile, Future-Ready Enterprise
 
Mapping Manager Product Overview
Mapping Manager Product OverviewMapping Manager Product Overview
Mapping Manager Product Overview
 
70-461 Querying Microsoft SQL Server 2012
70-461 Querying Microsoft SQL Server 201270-461 Querying Microsoft SQL Server 2012
70-461 Querying Microsoft SQL Server 2012
 
Why ask why? Try agile BI!
Why ask why? Try agile BI!Why ask why? Try agile BI!
Why ask why? Try agile BI!
 
Introduction to Business Intelligence
Introduction to Business IntelligenceIntroduction to Business Intelligence
Introduction to Business Intelligence
 
Validation and Business Considerations for Clinical Study Migrations
Validation and Business Considerations for Clinical Study MigrationsValidation and Business Considerations for Clinical Study Migrations
Validation and Business Considerations for Clinical Study Migrations
 
Resume
ResumeResume
Resume
 
Data Centric Conference 2020
Data Centric Conference 2020Data Centric Conference 2020
Data Centric Conference 2020
 
What's New with SAP BusinessObjects Business Intelligence 4.1?
What's New with SAP BusinessObjects Business Intelligence 4.1?What's New with SAP BusinessObjects Business Intelligence 4.1?
What's New with SAP BusinessObjects Business Intelligence 4.1?
 
Bi Applications - Oracle
Bi Applications - OracleBi Applications - Oracle
Bi Applications - Oracle
 
CoBIT 5 (A brief Description)
CoBIT 5 (A brief Description)CoBIT 5 (A brief Description)
CoBIT 5 (A brief Description)
 
SUBRA0114E
SUBRA0114ESUBRA0114E
SUBRA0114E
 
SoniaP_Resume
SoniaP_ResumeSoniaP_Resume
SoniaP_Resume
 
Workday Integration Cloud Connect Datasheet
Workday Integration Cloud Connect DatasheetWorkday Integration Cloud Connect Datasheet
Workday Integration Cloud Connect Datasheet
 
Bala_Kalimuthu
Bala_KalimuthuBala_Kalimuthu
Bala_Kalimuthu
 
2011 Sharepoint Summit - Microsoft's vision and strategy for the future of bu...
2011 Sharepoint Summit - Microsoft's vision and strategy for the future of bu...2011 Sharepoint Summit - Microsoft's vision and strategy for the future of bu...
2011 Sharepoint Summit - Microsoft's vision and strategy for the future of bu...
 
Resume
ResumeResume
Resume
 

Destacado

Agile Business Intelligence
Agile Business IntelligenceAgile Business Intelligence
Agile Business IntelligenceDon Jackson
 
BI the Agile Way
BI the Agile WayBI the Agile Way
BI the Agile Waynvvrajesh
 
Comparison of MPP Data Warehouse Platforms
Comparison of MPP Data Warehouse PlatformsComparison of MPP Data Warehouse Platforms
Comparison of MPP Data Warehouse PlatformsDavid Portnoy
 
Hybrid Data Warehouse Hadoop Implementations
Hybrid Data Warehouse Hadoop ImplementationsHybrid Data Warehouse Hadoop Implementations
Hybrid Data Warehouse Hadoop ImplementationsDavid Portnoy
 
Business Intelligence with Microsoft SQL 2014 - Presented by Atidan
Business Intelligence with Microsoft SQL 2014 - Presented by AtidanBusiness Intelligence with Microsoft SQL 2014 - Presented by Atidan
Business Intelligence with Microsoft SQL 2014 - Presented by AtidanDavid J Rosenthal
 
Agile Business Intelligence (or how to give management what they need when th...
Agile Business Intelligence (or how to give management what they need when th...Agile Business Intelligence (or how to give management what they need when th...
Agile Business Intelligence (or how to give management what they need when th...Evan Leybourn
 
Agile Process in a Nutshell
Agile Process in a NutshellAgile Process in a Nutshell
Agile Process in a Nutshellnvvrajesh
 
Agile BI Demystified
Agile BI DemystifiedAgile BI Demystified
Agile BI DemystifiedSenturus
 
Netezza vs Teradata vs Exadata
Netezza vs Teradata vs ExadataNetezza vs Teradata vs Exadata
Netezza vs Teradata vs ExadataAsis Mohanty
 
Agile San Diego: Testing as Exploration (Continuous Delivery w/o Automation)
Agile San Diego: Testing as Exploration (Continuous Delivery w/o Automation)Agile San Diego: Testing as Exploration (Continuous Delivery w/o Automation)
Agile San Diego: Testing as Exploration (Continuous Delivery w/o Automation)Maaret Pyhäjärvi
 
Ruben Timmerman @ Emerce Insight Banking 2007
Ruben Timmerman @ Emerce Insight Banking 2007Ruben Timmerman @ Emerce Insight Banking 2007
Ruben Timmerman @ Emerce Insight Banking 2007RubZie
 
Building a heterogeneous Hadoop Olap system with Microsoft BI stack. PABLO DO...
Building a heterogeneous Hadoop Olap system with Microsoft BI stack. PABLO DO...Building a heterogeneous Hadoop Olap system with Microsoft BI stack. PABLO DO...
Building a heterogeneous Hadoop Olap system with Microsoft BI stack. PABLO DO...Big Data Spain
 
Stockage des données dans les sgbd
Stockage des données dans les sgbdStockage des données dans les sgbd
Stockage des données dans les sgbdMarc Akoley
 
MicroStrategy 9 vs Qlikview 11
MicroStrategy 9 vs Qlikview 11MicroStrategy 9 vs Qlikview 11
MicroStrategy 9 vs Qlikview 11BiBoard.Org
 
Shorter time to insight more adaptable less costly bi with end to end modelst...
Shorter time to insight more adaptable less costly bi with end to end modelst...Shorter time to insight more adaptable less costly bi with end to end modelst...
Shorter time to insight more adaptable less costly bi with end to end modelst...Daniel Upton
 
ETIS11 - Agile Business Intelligence - Presentation
ETIS11 -  Agile Business Intelligence - PresentationETIS11 -  Agile Business Intelligence - Presentation
ETIS11 - Agile Business Intelligence - PresentationDavid Walker
 
Building & Scaling Data Teams
Building & Scaling Data TeamsBuilding & Scaling Data Teams
Building & Scaling Data TeamsOutreach Digital
 
Magic quadrant for data warehouse database management systems
Magic quadrant for data warehouse database management systems Magic quadrant for data warehouse database management systems
Magic quadrant for data warehouse database management systems divjeev
 

Destacado (20)

Agile Business Intelligence
Agile Business IntelligenceAgile Business Intelligence
Agile Business Intelligence
 
BI the Agile Way
BI the Agile WayBI the Agile Way
BI the Agile Way
 
Comparison of MPP Data Warehouse Platforms
Comparison of MPP Data Warehouse PlatformsComparison of MPP Data Warehouse Platforms
Comparison of MPP Data Warehouse Platforms
 
Hybrid Data Warehouse Hadoop Implementations
Hybrid Data Warehouse Hadoop ImplementationsHybrid Data Warehouse Hadoop Implementations
Hybrid Data Warehouse Hadoop Implementations
 
Business Intelligence with Microsoft SQL 2014 - Presented by Atidan
Business Intelligence with Microsoft SQL 2014 - Presented by AtidanBusiness Intelligence with Microsoft SQL 2014 - Presented by Atidan
Business Intelligence with Microsoft SQL 2014 - Presented by Atidan
 
Agile Business Intelligence (or how to give management what they need when th...
Agile Business Intelligence (or how to give management what they need when th...Agile Business Intelligence (or how to give management what they need when th...
Agile Business Intelligence (or how to give management what they need when th...
 
Agile Process in a Nutshell
Agile Process in a NutshellAgile Process in a Nutshell
Agile Process in a Nutshell
 
Agile BI Demystified
Agile BI DemystifiedAgile BI Demystified
Agile BI Demystified
 
Netezza vs Teradata vs Exadata
Agile Business Intelligence

  • 1. Applying Agile Delivery to Business Intelligence. Topic: Data Integration & ETL. David Portnoy, http://LinkedIn.com/in/DavidPortnoy, 312.970.9740. © Copyright 2012-2014 Datalytx, Inc.
  • 2. This group extends the TDWI community online and is designed to foster peer networking and discussion of key issues relevant to business intelligence and data warehousing managers. TDWI (The Data Warehousing Institute™) provides education, training, certification, news, and research for executives and information technology (IT) professionals worldwide. Founded in 1995, TDWI is the premier educational institute for business intelligence and data warehousing. Our Web site is www.tdwi.org.
      Why this topic?
      - There’s a lot of confusion and misconception about the meaning of Agile, especially as it applies to BI
      - Many in corporate IT still believe that Agile cannot easily be applied to BI
      - Posts on this topic in the TDWI forum on LinkedIn would benefit from being organized and summarized
  • 3. What we’ll cover
      - Misconceptions about Agile BI
      - Core techniques of Agile BI
      - Review of ETL tool landscape and benefits
      - Decision factors for choosing the ETL environment
      - Mitigating aspects of ETL tools that make Agile harder
      - How to implement an Agile BI development environment
      Due to the prevailing confusion and misconceptions, it’s easier to start with what Agile BI is not.
  • 4. Misconceptions about Agile in the BI community
      There’s a common misconception that Agile BI applies to practically any methodology or tool that helps develop BI projects faster or in a more flexible way. Some examples of misconceptions:
      - Agile is primarily adding iterations to typical projects
      - Agile implies starting to code without planning or design
      - Agile involves particular data models, such as Data Vault
      - Agile involves rapid prototyping techniques, as can be achieved by certain metadata driven tools
      - Agile involves self-serve reporting, such as Tableau
      - Agile involves moving ETL from a separate code base into the reporting layer, as made possible by in-memory processing, such as with QlikView
      - Agile involves building real-time or low-latency DW, rather than traditional batch
      - Agile operates in a hosted cloud environment, especially PaaS (Platform as a Service)
  • 5. The culprits for the myths and misconceptions #1 Vendors claim that their products are agile. #2 The BI community as a whole does not have a long history or substantial practice with agile development. Therefore they are more likely to be swayed by vendor pitches.
  • 6. The culprits for the myths and misconceptions (cont.)
      Example source of misconceptions: Forrester Research article “Agile Out of the Box”, 2010.
      What’s being said: “...Agile BI methodology differs from [agile software development] in that it requires new and different technologies and architectures for support. Metadata-generated BI applications are one such example...”
      What’s wrong with it:
      - In the software development world, that’s equivalent to saying that new frameworks, such as Ruby on Rails, are needed for Agile development. (Few credible publications or developers would make such a claim.)
      - The implication that other BI tools can’t be used to achieve Agile BI is simply not true. (Even general purpose development platforms can be applied to BI.)
      - In reality, team composition, proficiency with existing technologies and management’s acceptance of agile have a bigger impact than a specific type of BI tool.
      What’s being said: The article goes on to claim that these particular tools are needed in order to achieve “development done faster”, “react[ing] more quickly to ... requirements“, incremental product delivery, “rapid prototypes versus specifications”, “reacting versus planning”, “personal interactions ... versus documentation”, etc.
      What’s wrong with it:
      - This list is just buzzwords associated with agile, without substantial evidence of why other tools are insufficient.
      - Rapid prototyping is confused with the role of end-to-end working software.
      - On the contrary, arguments can be made why the tools identified could be detrimental to agile teams. (See TDWI LinkedIn group discussion “The Role of ETL tools in Agile BI”.)
  • 7. The reality
      Items often misclassified as necessary for Agile:
      - Data Vault model
      - Rapid prototyping tools
      - Metadata driven BI tools
      - Self-serve reporting
      - In-memory processing
      - Hosted cloud (PaaS) environment
      Yes, many of the items misclassified as necessary for Agile still help projects ramp up and complete faster.
      Yes, many improve the flexibility of dealing with changes in source data, business logic and reporting.
      Yes, many provide additional visibility into complex logic and functional changes across team members and stakeholders.
      But none of them are required to have successful Agile BI projects.
  • 8. So what are the requirements for implementing Agile BI?
      Productive Agile BI teams operate almost identically to teams that use Agile methodology for software development, with just minimal tweaks to accommodate:
      1. Integration of available ETL and reporting tools into the development environment
      2. Changes to regression testing due to the fact that databases have state
      3. Challenges of managing large data sets in the deployment process
  • 9. Techniques for implementing Agile in BI
      - Timebox deliverables, of course.
      - Measure completion with working software! (Prototypes using non-production tools are OK. But need to get end-to-end data flow working ASAP.)
      - Hold highly efficient, daily team synchronization in which the entire team participates.
      - Monitor completion of features (stories), not time spent. Calculate team velocity to improve planning.
      - Hold sprint retrospectives to learn from mistakes.
      - Leverage techniques of Agile app dev:
          - Manage everything in version control, including data model and test data sets
          - Assume refactoring of working code can occur later to improve performance and maintainability
          - Use Test Driven Development (TDD) to ensure understanding of requirements and reduce rework
          - Implement Continuous Integration to automate build, tests, deployment
      - Measure project success by delivery of business value, not delivery of predefined requirements on time and on budget.
      - Accept that it’s OK to fail, but fail early and adapt. (Non-agile projects don’t recognize failure until time or budget runs out.)
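The TDD and version-controlled test data points can be illustrated with a small sketch. It assumes an in-memory SQLite database standing in for the warehouse; the table names (`stg_orders`, `fact_daily_sales`), the sample rows and the function names are hypothetical and do not come from the slides. The test is written against a tiny test data set in which each row notes what it is testing, and the transformation is only considered done when the test passes.

```python
import sqlite3

# Hypothetical test data set, kept in version control next to the code.
# The last column notes what each row is meant to test.
TEST_ORDERS = [
    (1, "2014-01-05", 100.0, "normal order"),
    (2, "2014-01-05", 50.0, "second order on the same day"),
    (3, "2014-01-06", -20.0, "refund must reduce the daily total"),
]


def build_test_db():
    """Create a throwaway database loaded with the test data set."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE stg_orders "
        "(order_id INTEGER, order_date TEXT, amount REAL, test_note TEXT)"
    )
    conn.executemany("INSERT INTO stg_orders VALUES (?, ?, ?, ?)", TEST_ORDERS)
    return conn


def load_daily_sales(conn):
    """Transformation under test: aggregate staged orders per day."""
    conn.execute("DROP TABLE IF EXISTS fact_daily_sales")
    conn.execute(
        "CREATE TABLE fact_daily_sales AS "
        "SELECT order_date, SUM(amount) AS total_amount, COUNT(*) AS order_count "
        "FROM stg_orders GROUP BY order_date"
    )


def test_daily_sales():
    """Written first in TDD; it fails until load_daily_sales is correct."""
    conn = build_test_db()
    load_daily_sales(conn)
    rows = conn.execute(
        "SELECT order_date, total_amount, order_count "
        "FROM fact_daily_sales ORDER BY order_date"
    ).fetchall()
    assert rows == [("2014-01-05", 150.0, 2), ("2014-01-06", -20.0, 1)]


if __name__ == "__main__":
    test_daily_sales()
    print("daily sales test passed")
```

Because the database is rebuilt from the test data set on every run, the test always starts from a known state, which is the main adjustment needed when applying TDD to stateful databases.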
  • 10. What’s the reason for low adoption of Agile in BI?
      Agile is widely adopted in application development, but not in BI. Potential reasons might stem from differences between the two worlds:
      Development environment
        - Application development: Custom app development using standard, general purpose languages well suited for automation
        - Business intelligence: Proprietary vendor architectures and DSLs (domain specific languages) not well suited for automation
      Team skills
        - Application development: Have skills to write automation for continuous integration
        - Business intelligence: Rely on vendors to provide these features
      Costs
        - Application development: Low up-front investment by leveraging open source platforms
        - Business intelligence: High up-front investment in vendor-specific tools: DW appliance, data modeling, ETL, OLAP, reporting, etc.
      Releases
        - Application development: Software is stateless and therefore easier to test and deploy with each build
        - Business intelligence: Databases have state, with each build needing to start with a certain data set. High data volumes may take hours to load a changed data model or roll back changes.
  • 11. Now let’s get into the specifics of ETL in Agile BI
  • 12. ETL tools have evolved over the years
      Custom Code
        - One-time solutions
        - Built with focus on short-term delivery and minimal up-front cost
      Frameworks
        - Origin: Reusable code compiled from a few similar projects
        - Just change parameters to reuse for specific loading, logging, change data capture, database connections, etc.
      Code Generators
        - Intuitive development UI enabling developers to manipulate ETL metadata
        - From metadata, generate code in a general purpose (such as C or Java) or domain specific (such as SQL or MDX) language
        - Types: One-shot generators (that require switching to a native dev env) vs. full development environments with managed version deployments
      Engines
        - Graphical development accomplishing ETL through parameterization and configuration, rather than code generation
        - Avoids complexities with code management and deployment
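To make the code generator category concrete, here is a minimal sketch of generating SQL from ETL metadata. The metadata layout, the table and column names, and the `generate_load_sql` function are all assumptions made for illustration; commercial generators use their own metadata repositories and produce far more than a single statement.

```python
# Hypothetical mapping metadata; names are illustrative only.
MAPPING = {
    "source": "stg_customer",
    "target": "dim_customer",
    "columns": [
        {"target": "customer_id", "expr": "id"},
        {"target": "full_name", "expr": "TRIM(first_name) || ' ' || TRIM(last_name)"},
        {"target": "country", "expr": "UPPER(country_code)"},
    ],
}


def generate_load_sql(mapping):
    """Turn one source-to-target mapping into an INSERT ... SELECT statement."""
    targets = ", ".join(col["target"] for col in mapping["columns"])
    exprs = ",\n       ".join(
        f'{col["expr"]} AS {col["target"]}' for col in mapping["columns"]
    )
    return (
        f'INSERT INTO {mapping["target"]} ({targets})\n'
        f"SELECT {exprs}\n"
        f'FROM {mapping["source"]};'
    )


if __name__ == "__main__":
    print(generate_load_sql(MAPPING))
```

Since the generated output is plain SQL text, it can be committed to the same version control tool as the rest of the code base, which matters for the Agile concerns raised later in the deck.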
  • 13. We can categorize the major ETL players
      The vendors
        - Traditional vendors: Informatica, SSIS, DataStage
        - Open source: Talend, Pentaho Kettle
        - Metadata driven, automated discovery, federated integration: Kalido, BI Ready, Wherescape, Composite Software
      The most common alternative
        - SQL + shell scripts
        - Native DB load utilities
  • 14. ETL tools have lots of value
      - Built-in commonly used features for transformation and job control:
          - Master Data Management (MDM)
          - Data cleansing
          - Change Data Capture (CDC)
          - Data lineage and data dependency functionality
          - Processing of SCD (Slowly Changing Dimensions)
          - Parallelization of tasks that can be run concurrently
          - Advanced merging functionality
      - Without ETL tools, we’re reinventing the wheel on many BI design patterns that have been implemented countless times throughout history
      - They abstract complex logic into graphical components or a domain specific language that leverages best practices and is often more maintainable over the potentially long project life span
      - Graphical representation of data model, data flow and job flow provides visibility into business logic, which is especially useful for less technical team members
      - They provide a degree of self-documentation without the need to update the graphical representation of logic separately from source code
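As an illustration of what "reinventing the wheel" means for one of these patterns, the sketch below hand-codes a Type 2 slowly changing dimension load. It is only one common way to write it and is not taken from the slides; the schema (`dim_customer`, `stg_customer`, a single tracked `city` attribute) and the function names are assumptions, and SQLite stands in for the warehouse.

```python
import sqlite3


def make_demo_db():
    """Build a small demo database (hypothetical schema and rows)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dim_customer (
            customer_id INTEGER, city TEXT,
            valid_from TEXT, valid_to TEXT, is_current INTEGER);
        CREATE TABLE stg_customer (customer_id INTEGER, city TEXT);
        INSERT INTO dim_customer VALUES (1, 'Chicago', '2013-01-01', NULL, 1);
        INSERT INTO dim_customer VALUES (3, 'Austin', '2013-01-01', NULL, 1);
        INSERT INTO stg_customer VALUES (1, 'Boston'), (2, 'Denver'), (3, 'Austin');
        """
    )
    return conn


def apply_scd2(conn, load_date):
    """Apply staged customers to the dimension, keeping history (Type 2)."""
    # Step 1: close the current row of every customer whose city changed.
    conn.execute(
        """
        UPDATE dim_customer
           SET valid_to = ?, is_current = 0
         WHERE is_current = 1
           AND EXISTS (SELECT 1 FROM stg_customer s
                        WHERE s.customer_id = dim_customer.customer_id
                          AND s.city <> dim_customer.city)
        """,
        (load_date,),
    )
    # Step 2: insert a current row for new customers and for those just closed.
    conn.execute(
        """
        INSERT INTO dim_customer (customer_id, city, valid_from, valid_to, is_current)
        SELECT s.customer_id, s.city, ?, NULL, 1
          FROM stg_customer s
         WHERE NOT EXISTS (SELECT 1 FROM dim_customer d
                            WHERE d.customer_id = s.customer_id
                              AND d.is_current = 1)
        """,
        (load_date,),
    )
    conn.commit()


if __name__ == "__main__":
    conn = make_demo_db()
    apply_scd2(conn, "2014-02-01")
    for row in conn.execute("SELECT * FROM dim_customer ORDER BY customer_id, valid_from"):
        print(row)
```

Even this reduced version needs care around ordering, NULL handling and reruns, which is the kind of detail an ETL tool's built-in SCD component is meant to take off the team's hands.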
  • 15. But many ETL tools are not well suited to an Agile BI environment
  • 16. First, these tools may not be ideal for Agile in general...
      Some ETL tools are...
      - Not well suited for code refactoring, branching, and merging, because the code is not in text files that can be used with modern version control, such as Git
      - Not well suited for use with automation in Continuous Integration, because they’re often standalone environments with no provisions for external automation
      - Not well suited for TDD (Test Driven Development), unless the vendors explicitly made provisions for unit test automation
      - Proprietary and have “black box” features that might make testing more challenging or decrease portability of test cases
      - Expensive, with high up-front license cost also putting more capital at risk (unless open source ETL, of course)
  • 17. Second, they may negatively impact productivity of Agile teams
      ETL tools may...
      - Require a proprietary, vendor-specific skill set not present in the organization
      - Cause work priority to be stove-piped and limited to skill set, rather than overall business value
      - Prevent the ability to leverage the full dev team, since they fall under a separate development environment from the rest of apps
      - Result in a productivity hit, since some professional developers are more productive writing code in native languages than using GUI tools, even after training
      - Not provide compelling enough reasons for developers to learn any one ETL tool, since the lack of industry standards decreases skill portability
  • 18. Third, there are other challenges and considerations
      There are challenges and limitations with ETL tools even outside of Agile:
      - They require allocation of additional resources to manage version upgrades of the ETL tool, even if the code base hasn’t been changing
      - When the type of processing needed is outside of core ETL tool features, complexity can grow quickly
      - Usefulness of visual representations for data models, data flows and job flows is reduced as complexity increases
      - Some find GUI development less efficient than traditional coding, especially for complex or unique types of processing
      - Often the sophisticated features are underutilized, resulting in expensive tools being used just for job scheduling
  • 19. Fourth, BI is increasingly involving Big Data
      Big Data implementations often make ETL tools less compelling:
      - Large volumes make it more efficient to
          - Manipulate data in place using ELT, rather than have multiple staging areas
          - Use native methods (MapReduce/Java, SQL, Hive, etc.) that allow for more control and performance optimization
      - High velocity of data makes it harder to use ETL tools that have traditionally been designed around batch-oriented processing.
      - High variability of data makes ETL tools less attractive, since they expect a fixed schema and don’t gracefully accommodate changes. Common examples include unstructured web log data in flat files and logical objects from apps stored in key-value pair format.
      - MPP vendors, such as Teradata and Netezza, make a case for doing ELT (rather than ETL) processing natively and provide built-in features to do so
      - Currently ETL tools are rarely used with the Hadoop ecosystem for many of the reasons stated, as well as licensing cost
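The ELT idea and the key-value variability point can be sketched together. In this hedged example, SQLite stands in for an MPP or Hadoop SQL engine, and the table names, keys and sample events are invented for illustration. Raw key-value pairs are loaded untouched, then reshaped inside the database with SQL, so an unexpected key such as `referrer` does not break the load.

```python
import sqlite3

# Hypothetical app events stored as (event_id, key, value) pairs.
RAW_EVENTS = [
    (1, "page", "/home"),
    (1, "user", "alice"),
    (2, "page", "/pricing"),
    (2, "user", "bob"),
    (2, "referrer", "newsletter"),  # a key the target schema does not know yet
]


def load_raw(conn, rows):
    """E and L: land the data as-is, with no transformation on the way in."""
    conn.execute("CREATE TABLE raw_event_kv (event_id INTEGER, key TEXT, value TEXT)")
    conn.executemany("INSERT INTO raw_event_kv VALUES (?, ?, ?)", rows)


def transform_in_place(conn):
    """T: pivot the pairs into columns using SQL inside the database."""
    conn.execute(
        """
        CREATE TABLE event AS
        SELECT event_id,
               MAX(CASE WHEN key = 'page' THEN value END) AS page,
               MAX(CASE WHEN key = 'user' THEN value END) AS user_name
          FROM raw_event_kv
         GROUP BY event_id
        """
    )


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load_raw(conn, RAW_EVENTS)
    transform_in_place(conn)
    print(conn.execute("SELECT * FROM event ORDER BY event_id").fetchall())
```

The raw table keeps every key that arrived, so a new attribute can be added to the pivot later without reloading the source data.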
  • 20. That said, how do we implement an Agile BI environment?
  • 21. First, use ETL tools when it makes sense
      While traditional ETL tools can simplify a complex task, they can also overcomplicate a simple task.
      Pick the right ETL tool for the job...
      - We covered the potential benefits and problems of using such ETL tools for Agile BI. Look for situations where benefits outweigh the problems. For example, a good situation to employ ETL tools might be: a use case requiring sophisticated data cleansing transformations, complex job control logic, and data volumes easily handled by traditional SMP database architectures.
      - Outside of such situations, consider using SQL, DB-specific native code, or general purpose languages already in use elsewhere in the organization.
      Is it OK to start with using an ETL tool as a job scheduler?
      - Yes, assuming it’s an efficient way to handle much needed job control logic, including failures, event triggers, and dependencies.
      - Plus, you get the option to adopt other capabilities of the tool over time with low project risk.
  • 22. Second, when you do use ETL tools, look for ways to mitigate the issues identified
    So what's the solution?
    Issue: High up-front license cost
    Approach: Use open source tools or less expensive licenses, such as those for SQL Server. Negotiate aggressively with vendors in light of lower-cost alternatives.
    Issue: Use with Continuous Integration
    Approach: See following slides. Some vendors, like Microsoft, may make provisions for automated builds within their environment. Otherwise, look for opportunities to simplify, partially automate, and notify the team of build state.
    Issue: Use with version control
    Approach: Where possible, save ETL logic to XML, create dumps of the repository, and generate code from metadata. Then manage these in a common version control tool.
    Issue: Decreased portability
    Approach: Move code to general-purpose development languages, including SQL and MDX. Consider tools that generate generic code from a GUI or metadata.
    Issue: Vendor-specific skill set
    Approach: Build a cross-functional team by...
    - Training existing developers
    - Hiring well-rounded developers willing to learn ETL tools
    Issue: Risk of introducing another development environment
    Approach: Start using ETL tools now and "grow" into using the functionality
    - Continue coding in what you know: native RDBMS code or even general app dev languages
    - Start using ETL as a glorified job scheduler to wrap native code
    - When refactoring code, take the opportunity to push more logic into the ETL tool
    - Gradually start using other features such as MDM, data quality, notifications, enterprise service bus, etc.
  • 23. Continuous Integration: Methodology
    - Each developer should have a sandbox: 1-to-1 app instance to DB instance (CI by Martin Fowler)
    - Automate: table deployment, usage stats, schema verification, data migration verification, DB testing, migration to prod
    - Version control all DB assets, ideally using a distributed tool like Git
    - Use a tool like dbDeploy and link app build, DB version, and forward/reverse DDL & DML scripts
    - Generate a test data set with a dimension annotating what each record is testing; it becomes a company asset that enables TDD of BI
    For cases where an application consumes data from the data warehouse:
    - BI developers should learn software coding practices; application developers should learn data modeling, SQL, and DB tuning
    - Consuming apps use two-phased builds:
        - Build 1: DB is stubbed out, and the build runs within minutes
        - Build 2: includes the real DB for end-to-end testing, but might run for a while
    - Bugs found in Build 2 trigger additions to the test data set; next time, the same bug is caught in Build 1
    [Diagram: Typical BI dev env with contention during development (Dev 1, Dev 2, Dev 3 on a shared developer schema) vs. sandboxed dev env appropriate for agile development (Dev 1, Dev 2, Dev 3 each with their own schema)]
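The annotated test data set and per-developer sandbox described above could look roughly like the following Python sketch. Everything here is an illustrative assumption: the orders table, the tests_what annotation column, and the net_revenue query are hypothetical, and an in-memory SQLite database stands in for a developer's private schema.

```python
import sqlite3

# Each test record carries an annotation saying which case it exercises.
TEST_ORDERS = [
    # (order_id, amount, tests_what)
    (1, 100.0, "normal order"),
    (2, 0.0, "zero-amount order"),
    (3, -25.0, "refund recorded as a negative amount"),
    (4, None, "missing amount must not break the total"),
]


def new_sandbox():
    """Create a private, throwaway database loaded with the test data set."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (order_id INTEGER, amount REAL, tests_what TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", TEST_ORDERS)
    return conn


def net_revenue(conn):
    """BI logic under test: total order amount, ignoring missing values."""
    row = conn.execute("SELECT COALESCE(SUM(amount), 0) FROM orders").fetchone()
    return row[0]


def test_net_revenue():
    # Fast check against the small data set, in the spirit of Build 1.
    assert net_revenue(new_sandbox()) == 75.0
```

When a slower end-to-end build exposes a bug, adding a record (with its annotation) to TEST_ORDERS lets the fast build catch the same bug next time.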
  • 24. Continuous Integration: Tools & Configuration
    How dbDeploy works
    dbDeploy is treated as a custom Ant task:
    1. Logs & assigns version #s to changes in SQL files
    2. Saves changelog table since prior version
    3. Generates DDL & DML scripts to apply to DB in other envs
    Tool | Type | Purpose
    Ant | Build tool | Automates steps to build & deploy software.
    Jenkins | Continuous Integration | Monitors source code repository (Git) for checkins, automatically launching build-test cycles and publishing results.
    Git | Source control / repository | Source code repository optimized for branching and merging, making it efficient for each developer to have their own sandbox environment. Checkins trigger CI build-test cycles.
    dbDeploy, dbMaintain, etc. | Database refactoring manager | Automates the process of establishing which database refactorings need to be run against a specific database in order to migrate it to a particular build.
    DbUnit, DbFit, SQLUnit | Unit test automation | Common tools to aid TDDD (test-driven DB development). Manage DB state between test runs, import/export test datasets, run unit tests, and log exceptions. Regression testing of DDL, DML, and stored procedures.
    [Diagram labels: Developer Env., Repository (Git), CI Environment, Build Tool, Test server, Prod server; flows: Check out, Check in, Project Code, Deploy & Test, Success / Fail, Tag]
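The changelog approach above can be sketched in a few lines of Python. This is not dbDeploy's implementation: the function apply_pending, the changelog table layout, and the use of SQLite are assumptions made for illustration. The idea shown is only that numbered change scripts are applied in order, and each applied change is recorded so a later run applies just the new ones.

```python
import sqlite3


def apply_pending(conn, scripts):
    """Apply numbered change scripts not yet recorded in the changelog.

    scripts: dict mapping change number -> SQL text
    Returns the list of change numbers applied during this run.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS changelog ("
        "change_number INTEGER PRIMARY KEY, description TEXT)"
    )
    applied = {
        row[0] for row in conn.execute("SELECT change_number FROM changelog")
    }
    newly_applied = []
    for number in sorted(scripts):
        if number in applied:
            continue  # already run against this database
        conn.executescript(scripts[number])
        conn.execute(
            "INSERT INTO changelog (change_number, description) VALUES (?, ?)",
            (number, f"change {number}"),
        )
        newly_applied.append(number)
    conn.commit()
    return newly_applied
```

Running the same set of scripts against a dev sandbox, the integration environment, and production brings each database to the same version, which is what makes the build repeatable.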
  • 25. Continuous Refactoring & Releases of Databases
    Environment (left to right): Dev Sandbox; Project Integration Sandbox; Test / QA Sandbox; Production
    Characteristics: Highly iterative development; Project-Level Testing; System Integration Testing; Operations & Support
    Deployment Frequency: Frequent; Infrequent; Controlled
    Risk / impact of bug: Low impact; Medium impact; High impact
    Testing: Test data set (used for TDD); Test data set; Benchmark data; Production data
    Based on presentation by Pramod Sadalage
  • 26. Continuous Integration: Possible Configuration for Microsoft BI Stack
    PowerDelivery
    - Addresses TFS's weakness in coordinating the promotion of builds through multiple environments of the delivery pipeline: triggering build on commit, promoting commit build to test, promoting test build to prod
    Windows PowerShell
    - Task-based command-line shell & scripting language (built on .NET) for task automation
    Team Foundation Server
    - Microsoft's application lifecycle management (ALM) solution. Collaboration platform that supports agile delivery practices
    - Build machine is configured for continuous integration, so the latest working version is refreshed and available to the entire distributed team
    SQL Server Data Tools
    - Develop, debug, and execute database unit tests interactively in Visual Studio
    - Puts database testing on an equal footing with application testing
    - Tests can then be run from the command line or from a build machine
    - Integrated with testing, bug tracking, and project management using TFS