Ab Initio Overview

Ab Initio is a suite of applications made up of several components, but when people say
“Ab Initio” they generally mean the Ab Initio Co>Operating System, which is primarily a GUI-based ETL
application. It lets the user drag and drop components and connect them, much like
drawing a diagram.

The strength of Ab Initio as an ETL tool is massively parallel processing, which lets it handle
large volumes of data.

Let’s componentise Ab Initio:

    1. Co>Operating System
    2. EME (Enterprise Meta>Environment)
    3. Additional Tools
          a. Data profiler
          b. Plan-IT etc

The Co>Operating System is the ETL application; it comes packaged with the EME (described
below). It is a GUI-based application. The design is quite simple thanks to the drag-and-drop features, and most
of the features are basic, so the basics are quick to learn. It comes in two
flavours, or sub-classes:

    1. Batch Mode
    2. Continuous Flow

Both do broadly the same things but differ in their mode of processing, as the
names suggest. Batch mode is what most customers use; it is suited to
moving bulk data (daily or several times a day).

Continuous mode is more “click/trigger” driven: for example, when you click on a web page the data flow
starts. Some very large web-based applications run on Ab Initio servers using continuous flow.
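
The difference between the two modes can be sketched in plain Python. This is only an illustration of the two processing styles, not Ab Initio code; the `transform` function, the record layout and the function names are hypothetical.

```python
from typing import Dict, Iterable, Iterator, List


def transform(record: Dict) -> Dict:
    """Hypothetical transformation: add a total to an order record."""
    return {**record, "total": record["qty"] * record["price"]}


def run_batch(records: List[Dict]) -> List[Dict]:
    """Batch style: the whole input exists up front and is processed in one run."""
    return [transform(r) for r in records]


def run_continuous(events: Iterable[Dict]) -> Iterator[Dict]:
    """Continuous style: each record is processed as soon as it arrives."""
    for event in events:
        yield transform(event)


orders = [
    {"id": 1, "qty": 2, "price": 5.0},
    {"id": 2, "qty": 1, "price": 7.5},
]

# Batch: one complete result set at the end of the run.
batch_result = run_batch(orders)

# Continuous: results are produced one at a time, on demand.
stream_result = run_continuous(iter(orders))
first = next(stream_result)
```

In the batch call nothing is available until the whole list has been processed, whereas the continuous version hands back each record as soon as it has been transformed.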

The EME is essentially source control for Ab Initio, but it has many additional features, such as:

    1. Metadata management
          a. Business metadata management
          b. Process metadata management
    2. Impact analysis
    3. Documentation tools
    4. Run history tracking
    5. And, of course, check-in and check-out

Ab Initio has also released other applications to complement the ETL suite. I will not
cover these in detail, just one line each.

Data profiler – A data profiling tool with features for data quality analysis.

Plan-IT – Primarily a scheduler built by Ab Initio to run Ab Initio jobs. It can be integrated with
Ab Initio jobs.

I categorise the strengths and weaknesses of Ab Initio using the following criteria,
giving each a score from 1 to 10 (10 being best) based on my own experience and some
references from the web.

    1.   Cost to purchase (4)
    2.   Total Cost of Ownership (4)
    3.   Platform (OS and DBMS) (8)
    4.   Ease of use (wizards, drag & drop, etc) (8)
    5.   Learning curve (6)
    6.   Performance (9)
    7.   Available expertise (7)
    8.   Ab Initio Support and Other Resources (5)


Cost of Purchase – It is one of the costliest ETL tools on the market, with costs ranging from 500k to
5M depending on the number of servers Ab Initio is installed on, the number of developer licences and
the type of licence. Batch flow is comparatively cheaper than continuous flow.

Compared with other major ETL tools with similar functionality, such as Informatica, the pricing
difference is clearly evident.

Total Cost of Ownership – The cost of ownership comes in three parts:

    a) Annual maintenance charges
    b) Cost of employing/training Ab Initio resources
    c) Development cost

Annual Maintenance charges – These are generally a percentage of the initial cost, and they are significant because
the initial cost is high. The exact figure may differ based on the NDA and the initial investment. Even a rough 10%
maintenance charge is a significant outflow.
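
As a rough worked example of how maintenance accumulates, the sketch below applies the 10% figure to the purchase range quoted earlier. The five-year horizon and the function name are assumptions made for illustration, and staffing and development costs are left out.

```python
def licence_plus_maintenance(initial_cost: float, rate: float, years: int) -> float:
    """Initial purchase cost plus flat annual maintenance, ignoring staff costs."""
    return initial_cost + initial_cost * rate * years


RATE = 0.10   # rough maintenance rate mentioned in the text
YEARS = 5     # assumed horizon, not taken from the source

low = licence_plus_maintenance(500_000, RATE, YEARS)
high = licence_plus_maintenance(5_000_000, RATE, YEARS)
print(f"{low:,.0f} to {high:,.0f}")
```

Under these assumptions maintenance alone adds half of the purchase price again over five years, before any cost of people is counted.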

Development cost – covered under training and resources

Available Expertise/Ab Initio Resources (training/employing) –

It is a high-end tool, so the developer community is not as large as that of many open source applications, and
employing these resources comes at a premium price.

Additionally, Ab Initio is a very closed community, so if you are an ETL developer and want to
explore or learn Ab Initio you will generally hit a wall. As far as I recall there are really only two options:

    1. Work for an organisation that owns Ab Initio
    2. Pay a premium to join the club: only a handful of organisations offer Ab Initio
       training

Platform – Like most other ETL tools, it runs on various platforms.
On the database front, it can connect to all the major databases available on the market, so there is
little to choose between this tool and the others. It allows connections to a database either through an
ODBC client or in native mode; I believe some other ETL tools may not support native mode.

Ease of use – Being GUI based, it is easy to use: simple components, drag and drop, and various
indicators when connections are not completely made. In comparison to other tools, there is not
much to choose between them on that front.

Creating custom components and re-using them is one feature I really liked in Ab Initio.

A certain set of components is difficult to use and may require a bit of scripting
experience, but that is acceptable.

Learning Curve – It is quick to get started; the difficult components can take some time.
Ab Initio has designed certain components very cleverly, and it takes a bit of experience and time to use them
optimally. I would guess that about 15 man-days of learning gives a programmer with
about 2-3 years of experience enough fluency to design applications.

Part of the learning curve is covered in the next section.

Ab Initio Support and other resources –

Covering both topics in one go: Ab Initio treats its application like a fort or sacred book, with
little information and literature available on the market. As a consequence, there is not enough
reference material on the web.

    1. Not enough resources on the web
    2. Team productivity suffers when stuck on a technical or design issue
    3. The lack of training material hampers the training of new employees

As with any application or scripting language, there is a risk of not using the application optimally. I
really believe the lack of proper training and open discussion has hit Ab Initio developers hard;
they are missing out and still groping in the dark, with the following problems:

    1. No access to a standard set of best practices. Organisations tend to have their own best
       practices, if any, and under tough external review these generally fall well short of best
    2. No input from other communities with parallel functionality, and the smaller developer
       pool restricts better ideas and input

Ab Initio does provide training for users (customers), but it does not cover every aspect
of Ab Initio, and advanced training comes at a cost.

Ab Initio provides support to its customers. It is of decent quality, but turnaround times are generally
long.

Performance –

I left this topic until the end, as it is the reason Ab Initio is used.
For massive data processing, where time is of the essence and performance and throughput are critical,
Ab Initio stands head and shoulders above the others.

    1. Massively parallel architecture
           a. Available data can be split and processed in parallel, giving it a huge processing
               advantage.
           b. Theoretically, it is possible to design a system on the Ab Initio architecture where
               additional processing power is obtained by adding resources in
               parallel, which makes scaling up straightforward.
    2. Innovative components
           a. Ab Initio components such as compressed indexed files give Ab Initio an
               edge when dealing with huge datasets. The concept is not new,
               but Ab Initio implemented it successfully.
           b. Newer scripting features known as PDL (Parameter Definition Language) in Ab
               Initio allow flexibility that is well received by Ab Initio developers and not
               easily available in other ETL tools.
           c. Personal perception: Ab Initio has put some effort into component design, taking care
               of small issues like memory management and memory footprint. These are not
               usually critical, but in a time-critical system they provide an edge.
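
The data parallelism described in point 1 can be illustrated with a small Python sketch: split the records by key, process each partition independently, then merge the partial results. This is not Ab Initio code. The partition count, the `customer` and `amount` fields and the function names are all hypothetical, and a thread pool stands in for the separate processes or servers a real deployment would use.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def partition_by_key(records, n_partitions):
    """Route each record to a partition based on its integer customer key."""
    partitions = [[] for _ in range(n_partitions)]
    for rec in records:
        partitions[rec["customer"] % n_partitions].append(rec)
    return partitions


def sum_by_customer(partition):
    """Aggregate one partition on its own, with no shared state."""
    totals = defaultdict(float)
    for rec in partition:
        totals[rec["customer"]] += rec["amount"]
    return dict(totals)


def parallel_totals(records, n_partitions=4):
    """Partition the input, aggregate the partitions concurrently, merge."""
    partitions = partition_by_key(records, n_partitions)
    with ThreadPoolExecutor(max_workers=n_partitions) as pool:
        partials = list(pool.map(sum_by_customer, partitions))
    # A given key lives in exactly one partition, so merging is a plain union.
    merged = {}
    for partial in partials:
        merged.update(partial)
    return merged


records = [
    {"customer": 1, "amount": 10.0},
    {"customer": 2, "amount": 5.0},
    {"customer": 1, "amount": 2.5},
    {"customer": 6, "amount": 1.0},
]
totals = parallel_totals(records)
```

Changing `n_partitions` alters only how the work is divided, not the result, which is the property point 1b relies on when it says capacity can be added by adding resources in parallel.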
