2. More than ever, businesses today need to understand their operations, customers, suppliers, partners, employees, and stockholders. They need to know what is happening with the business, analyze their operations, reach to market conditions, make the right decisions to drive revenue growth, increase profits and improve productivity and efficiency.
3. CIOs are responding to their organizations’ strategic needs by developing IT initiatives that align corporate data with business objectives. These initiatives include: Business intelligence Master data management Business transformation Infrastructure rationalization Risk and compliance
4.
5.
6.
7.
8.
9.
10.
11. QualityStage Investigate Standardize Match Survive The Standardize stage allows you to reformat data from multiple systems to ensure that each data type has the correct and consistent content and format.
12.
13. QualityStage Investigate Standardize Match Survive Data matching is used to find records in a single data source or independent data sources Data matching is used to find records in a single data source or independent data sources that refer to the same entity Data matching is used to find records in a single data source or independent data sources that refer to the same entity (such as a person, organization, location, product, or material) regardless of the availability of a predetermined key.
14.
15.
16. QualityStage Investigate Standardize Match Survive Survivorship is used to create a ‘best record’ from all available information about an entity (such as a person, location, material, etc.). Survivorship and formatting ensure that the best available data survives and is correctly prepared for the target destination. Using the rules setup screen, it implements business and mapping rules, creating the necessary output structures for the target application and identifying fields that do not conform to load standards.
19. DataStage In its simplest form, DataStage performs data transformation and movement from source systems to target systems in batch and in real time. The data sources may include indexed files, sequential files, relational databases, archives, external data sources, enterprise applications and message queues.
23. DataStage Administrator Manager Designer Director DataStage Manager is the primary interface to the DataStage repository. In addition to table and file layouts, it displays the routines, transforms, and jobs that are defines in the project. It also allows us to move or copy ETL jobs from one project to another.
24.
25. DataStage Administrator Manager Designer Director Use DataStage Director to run, schedule, and monitor your DataStage jobs. You can also gather statistics as the job runs. Also used for looking at logs for debugging purposes.
30. DataStage Designer Transformer Stage The Transformer stage performs any data conversion required before the data is output to another stage in the job design. After you are done, compile and run the job.
Master data management – Reliably create and maintain consistent, complete, contextual and accurate business information about entities such as customers and products across multiple systems Business intelligence – Take the guesswork out of important decisions by gathering, storing, analyzing, and providing access to diverse enterprise information. Business transformation – Isolate users and applications from the underlying information completely to enable On Demand Business. Infrastructure rationalization – Quickly and accurately streamline corporate information by repurposing and reconciling data whenever it is required Risk and compliance - Deliver a dependable information management foundation to any quality control, corporate reporting visibility and data audit infrastructure.
DS Administrator is used for administration tasks such as setting up users, logging, creating and moving projects and setting up purging criteria
Permissions - Assign user categories to operating system user groups or enable operators to view all the details of an event in a job log file. Tracing – Enable or disable tracing on the server. Schedule – Set up a user name and password to use for running scheduled DataStage jobs. Mainframe – Set mainframe job properties and the default platform type. Turntables – Configure cache settings for Hashed File stages. Parallel – Set parallel job properties and defaults for date/time and number formats. Sequence – Set compilation defaults for job sequences. Remote – If you have specified that parallel jobs in the project are to be deployed on a USS system, this page allows you to specify deployment mode and USS machine details.
DataStage Designer – used to create DataStage applications (known as jobs). Each job specifies the data sources, the transformations required, and the destination of the data. Jobs are compiled to create executables that are scheduled by the Director and run on the server.
DataStage Director – used to validate, schedule, run, and monitor DataStage job sequences.
Constraint - Prevents data from getting into the processing piece of the ETL job (reject) Derivation - Logic at the field level (example: is it “open”? (“click through”))