Recipes of Data Warehouse and Business Intelligence
Massimo Cenci Data Warehouse Blog
http://massimocenci.blogspot.it
A messaging system for Oracle Data Warehouse (part 1)
Micro ETL Foundation
Introduction (1)
• One goal of what I call the Micro ETL Foundation (MEF) is to provide simple, immediate solutions to the needs of a
Data Warehouse. What could be simpler (and more necessary) than a log message? It may seem strange to devote an entire article to
describing how to report the message "Hello world", famous among programmers everywhere. But we will find
that what appears simple is actually a bit more complicated.
• If we take the first chapter of any programming book (e.g. Java), the solution is expressed in a single line:
System.out.println("Hello World!");
• If we use another programming language (e.g. Oracle PL/SQL), the solution is again expressed in a single line:
dbms_output.put_line('Hello World');
• What I want to emphasize is the incompleteness of the message received. The complexity lies not in the message itself: in
the end it is a simple function call. The complexity is the lack of context. The complexity is the metadata.
• Suppose that several loading processes are running at this moment. Receiving a warning or alert message is not enough.
What I want to know is when the message was sent, which job was running, how much time has elapsed since the previous
message, and which procedure was executing at the time of the message.
Introduction (2)
• We find that answering all these questions is no longer trivial. When we leave the theoretical examples printed in books
and enter the reality of daily work, everything becomes more complicated.
• We will see the theory and the practice, then the code, that answer the contextual needs described above. And the
solution will (surprise!) again be expressed in a single line.
• The programming language used is Oracle PL/SQL (simply because it is widely used in the Data Warehouse world, so I
reach a wider audience), but the techniques shown are easily reproduced on any other RDBMS. I almost forgot: we will also see how to
ensure that the message is sent via e-mail.
• This, however, is not just for programmers. It is an article that shows all the hard work behind a simple log message. Just imagine the
work behind the complete loading process of a Data Warehouse.
Definitions
Naming Convention
I have spent many words on the importance of always using a naming convention in Data Warehouse projects. I summarize
some concepts here. In general, the naming convention is the method by which you assign a name to the various entities of the
system being designed. The method should produce a name that immediately conveys the semantic nature of the
entity. It does not matter whether the entity belongs to a Data Warehouse system or an operational system.
ETL Process
The ETL process is the set of programs that load data from external systems into the Data Warehouse tables. Since this set can be very
complex and detailed, it is extremely important to have a messaging system that gives as much information as possible about the
process.
Job
A job is a unit of logic that describes a very specific task, such as loading an analysis dimension, a fact table, or both. A set of related
jobs is in turn a job that is part of a schedule, i.e. it is activated at a preset time.
In turn, a job consists of simple, sequential processing units that we call units.
Unit
The unit is the elementary processing unit (so it is code), which in turn can call other procedures and functions that we can call,
generically, modules.
Execution
A job usually runs at night, but it could also run several times a day. Each execution of a job is a run that has to be identified by a
sequential number, which we call the exec counter.
For the time being, these very short definitions will do. Later we will analyze them in more detail.
Requirements
• The requirements are very simple: a basic messaging system that reports everything useful for monitoring the
execution of the ETL process.
• It is up to the designer to decide the content of these messages, which may be simply informative, such as the number of rows
processed, or may demand attention, such as the identification of errors or anomalies. In the latter case it should be possible to send
the message via e-mail.
• The system should not merely store all the messages generated, but must also record the context information.
• The context is described in the paragraph that defines the ETL process. The more context information we insert, the
better we can control the system and the faster and more efficiently we can resolve problems.
• I assume a minimum knowledge of Oracle and of the SQL and PL/SQL languages.
Design
• The messaging component of the Micro ETL Foundation (MEF) consists of 3 tables, 2 sequences and 1 package.
• We can see them as the basic components: the minimum necessary for the other MEF components that we will use in the future.
• The naming convention used is a reduced version of the one to be applied to all objects of a Data Warehouse [see
http://www.slideshare.net/jackbim/recipes-6-of-data-warehouse-naming-convention-techniques].
• It would be unreasonable to apply to MEF the same logic that is used to organize the hundreds of tables typical of a large Data
Warehouse. Moreover, MEF can be used in any type of project. We will use this simplified structure:
<entity name> = <area code>_<type name>_<logical code>
• The basic package, being common to all, is simplified further by dropping the <logical code>.
• The area code is "MEF".
• The table MEF_CFT is the table of system configuration.
• The table MEF_EMAIL_CFT is the specific configuration table for e-mail addresses.
• The table MEF_MSG_LOT is the one that stores the text of the messages.
• The sequence MEF_MSG_SEQ gives a sequence number to each message.
• The sequence MEF_RUN_SEQ is used to numerically identify each job execution.
• The Oracle package MEF is a library of programs that manages those objects.
• All the scripts for creating the objects can be downloaded by following the SlideShare presentation, which shows the system in a pragmatic and fun way
[http://www.slideshare.net/jackbim/recipe-7-of-data-warehouse-a-messaging-system-for-oracle-dwh-2]. The scripts are
minimal, with only the main structures; completing the accessory structures, such as indexes and
constraints, is left to the reader.
• The tables will be created in the default tablespace, but you should always create an ad-hoc tablespace.
The MEF_CFT table
• This table contains general information about the MEF and the Data Warehouse. Other columns will be added in the future (or as
you see fit).
prj_cod: Code of the Data Warehouse project.
user_cod: Name of the Oracle user for ETL.
email_srv_cod: E-mail server. Every company has a server for managing e-mail; indicate that server here.
mef_root_txt: Path of the folder containing the MEF scripts.
mef_dir: Oracle directory pointing to mef_root_txt.
The MEF_MSG_LOT table
• This table stores all the log messages sent by the loading process.
seq_num: Sequential number of the message. It is obtained from an Oracle sequence.
day_cod: Day on which the message was inserted, in the format YYYYMMDD.
sched_cod: Identifier of the schedule to which the job belongs.
job_cod: Job identifier. It is a logical entity, in the sense that we think of it as the launch of a list of processing units.
unit_cod: Identifier of the processing unit within the job. We can think of it as a procedure or a function of an Oracle package.
module_cod: Module identifier. A unit, if complex, can in turn call subroutines or subfunctions, i.e. modules.
In this case it is interesting to know this detail.
rows_num: Number of rows processed. Typically this field is not set, but if we want to report the number of rows, for example those
inserted into a table, we can store that information here.
line_txt: Message text.
cline_txt: Message text as a CLOB.
ss_num: Number of seconds that have elapsed since the previous message. This value, together with the next two,
is summable: if we wanted to know how long all the statements of a certain kind took, we would be able to
calculate it.
mi_num: Number of minutes that have elapsed since the previous message.
hh_num: Number of hours that have elapsed since the previous message.
elapsed_txt: Time elapsed since the previous message, in the format HH24:MI:SS.
stamp_dts: Time stamp of the insertion of the message.
exec_cnt: Identifier of the execution of the job. Every run of a job should be identified by a number
that is, in turn, extracted from an Oracle sequence.
user_cod: Oracle user who posted the message. It is set automatically from a session variable. It can be useful in cases
where multiple users contribute to the loading process (not recommended).
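• As an illustration of the "summable" columns, the query below totals the elapsed time per unit for one run. It is only a sketch: it assumes that ss_num holds the whole elapsed time in seconds (not just the seconds component), and the execution number 10 is an arbitrary example value.

```sql
-- Sketch: total elapsed seconds and number of messages per unit for one run.
-- Column names are those of MEF_MSG_LOT described above.
-- Assumption: ss_num is the full elapsed time in seconds; 10 is an example exec_cnt.
select job_cod,
       unit_cod,
       count(*)    as msg_cnt,
       sum(ss_num) as total_ss_num
from   mef_msg_lot
where  exec_cnt = 10
group  by job_cod, unit_cod
order  by total_ss_num desc;
```

Grouping by module_cod as well would give the same totals at the finer level of the single module.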
The MEF_EMAIL_CFT table
• This table configures the e-mail addresses.
email_cod: Code that identifies a group of recipients.
from_txt: Sender of the message. This name will appear as the sender of the e-mail message. Do not use special characters or
blanks between words. E.g. do not set "Administrator ETL" but "Administrator_ETL", otherwise you get a run-time error message
like:
ORA-29279: SMTP permanent error: 501
5.5.4 Invalid arguments
to_txt: Identifier of the message recipient.
cc_txt: E-mail address of the recipient in copy (cc).
subj_txt: Default subject of the message.
status_cod: Status (1 = active, 0 = inactive) of the recipient.
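• A recipient could be configured with an insert like the following. This is a hedged sketch: the column names come from the list above, while the addresses, the subject and the numeric type of status_cod are assumptions made for illustration.

```sql
-- Sketch: register one active recipient for the group 'MEF'.
-- The e-mail address and subject are hypothetical values;
-- status_cod is assumed to be numeric (1 = active).
insert into mef_email_cft
  (email_cod, from_txt, to_txt, cc_txt, subj_txt, status_cod)
values
  ('MEF', 'ETL_administrator', 'dwh.team@example.com', null,
   'Message from the ETL process', 1);

commit;
```

Note that the sender name contains an underscore rather than a blank, in line with the warning about ORA-29279.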
The MEF_MSG_SEQ sequence
• The sequence is an Oracle object. In practice it is a global counter that increments each time you request a value. Each message line
must have its own sequential number, which is more reliable than the time stamp for sorting the table.
• Because messages are sometimes only a fraction of a second apart, the time stamp might not be sufficiently
discriminating.
The MEF_RUN_SEQ sequence
• Sequence that unambiguously identifies a run of a job.
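• A minimal sketch of how the two sequences could be created and used. The creation options are assumptions (the downloadable scripts may differ), and the create statements fail if the sequences already exist.

```sql
-- Sketch: create the two sequences with default options (an assumption).
create sequence mef_msg_seq;
create sequence mef_run_seq;

-- Each reference to NEXTVAL returns a new, unique number.
select mef_msg_seq.nextval from dual;

-- Reading the messages back in insertion order:
-- seq_num still discriminates when two time stamps are identical.
select seq_num, stamp_dts, line_txt
from   mef_msg_lot
order  by seq_num;
```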
The MEF package (1)
• This is the basic package. I suggest developing all the code inside PL/SQL packages (they are basically libraries), which allow better
management and use of the code. Here is a short description of the units contained in the package.
• f_str: Utility function that generates a string after replacing the input variables.
• p_ins_msg_lot: This procedure inserts the message into the MEF_MSG_LOT table, receiving a variable of row type
as input parameter. The procedure is declared with "pragma autonomous_transaction". This is very important and deserves a thorough
description: without it, this messaging system would not be possible. The pragma
autonomous_transaction lets us commit (i.e. make permanent in the database) only and exclusively the DML statements of the
unit that contains this compiler directive. This is crucial because it lets us commit the message
without affecting the logic of the loading process. Let's clarify with an example. Typically, before loading the daily data
into a table, you first delete the data of that day and then load or reload the new data (forget partition
manipulation for a moment). You commit at the end, when the whole load (delete and insert) has completed successfully. If I ran a commit after
the delete and the insert then had a problem, I could lose the data of the day. Since the messaging needs to commit the
message to its table, logging the innocent message "I have done the delete" would also commit the delete itself.
This side effect is not acceptable. The autonomous transaction solves the problem: the Oracle PL/SQL engine creates a
"child" transaction that lives an autonomous life and commits the data in the MEF_MSG_LOT table without affecting the
parent transaction.
• p_rae: Exception handling is standardized as follows. When any Oracle error occurs, the "when
others" handler calls the p_rae procedure, which enriches the error with other useful information,
such as where the exception occurred. The output is always the standard error pv_error. The "when pv_error" handler
ensures that the original error is kept.
• p_init: Another private procedure. It initializes the line_row variable used to perform the insert into the table.
• delta_time: Procedure that, from input parameters such as the date of the last message and the current date, calculates
all the delta-time information: how many seconds, minutes and hours have passed, and the delta time in 'HH24:MI:SS' format. There are
many ways to calculate the delta time; the one used here is just one of them.
• f_get_seq_val: Generalized function that dynamically extracts the next number of the Oracle sequence whose name is given
as parameter.
• f_get_exec_cnt: Function that extracts the execution number of the job. Before calling the generic function f_get_seq_val, it checks
whether the number has already been set as a global variable; if so, it uses the number of the current execution.
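• The effect of the autonomous transaction described for p_ins_msg_lot can be reproduced in isolation. The sketch below is not the MEF code: the tables demo_log and demo_data and the procedure demo_log_msg are hypothetical objects created only for this demonstration.

```sql
-- Hypothetical demo objects, not part of MEF.
create table demo_log  (msg_txt varchar2(200));
create table demo_data (id number);

create or replace procedure demo_log_msg (p_msg in varchar2) is
  pragma autonomous_transaction;  -- this unit runs in its own transaction
begin
  insert into demo_log (msg_txt) values (p_msg);
  commit;                         -- commits only the insert above
end demo_log_msg;
/

begin
  insert into demo_data (id) values (1);  -- work of the parent transaction
  demo_log_msg('Inserted one row');       -- logged and committed independently
  rollback;                               -- undoes only the insert into demo_data
end;
/

-- On a first run, demo_data is empty while demo_log keeps its message.
select count(*) from demo_data;
select count(*) from demo_log;
```

Without the pragma, the commit inside demo_log_msg would also have made the insert into demo_data permanent, which is exactly the side effect that the messaging system must avoid.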
The MEF package (2)
• f_get_cft: Function that extracts the current configuration from the MEF_CFT table.
• p_esend: Procedure that sends the e-mail via the UTL_MAIL package.
• p_send: This is the procedure that sends the message.
• p_mail: Procedure for sending the e-mail. Using the e-mail code given as input, it looks up all the recipients in the MEF_EMAIL_CFT table
and calls p_esend, passing all the required parameters.
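• The logic described for p_mail and p_esend could look roughly like the block below. It is a sketch under assumptions, not the author's implementation: it calls the standard UTL_MAIL.SEND procedure directly, assumes that to_txt and cc_txt hold e-mail addresses and that status_cod is numeric, and uses an example message text.

```sql
-- Sketch: send one e-mail to every active recipient of the group 'MEF'.
-- Requires UTL_MAIL to be installed and smtp_out_server to be set.
begin
  for r in (select from_txt, to_txt, cc_txt, subj_txt
            from   mef_email_cft
            where  email_cod  = 'MEF'
            and    status_cod = 1)
  loop
    utl_mail.send(
      sender     => r.from_txt,
      recipients => r.to_txt,
      cc         => r.cc_txt,
      subject    => r.subj_txt,
      message    => 'Sales table loaded'  -- example text
    );
  end loop;
end;
/
```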
System configuration (1)
• Before you can create all the structures described above, you must perform some environment checks.
• First of all, it is necessary to verify that the Oracle RDBMS is set up for sending e-mails.
• Then you have to make sure that the Oracle user that sends the messages has all the grants necessary for its operations. Let's see in detail.
SMTP check
• We need to verify that the Oracle RDBMS is set up for sending e-mails. The sending is performed by calling a procedure that
is part of a package of the RDBMS. To verify this, connect to SQL*Plus as the SYS user and check the package UTL_MAIL (from
Oracle 11):
sqlplus / as sysdba
SQL> descr utl_mail
ERROR:
ORA-04043: object utl_mail does not exist
• If you get this error message, you need to install the system package UTL_MAIL and grant the execute permission to the user.
• Still as user SYS, run the scripts that install the package from the rdbms/admin folder of the Oracle home, and grant the execute
permission to the ETL user.
• Then you have to check the SMTP server in the Oracle initialization parameters; in the following example it is not set, and I
show how to set it. Here is the sequence of instructions for making these checks (remember to replace the path of the
utlmail.sql file with the path of your Oracle installation):
System configuration (2)
D:\>sqlplus / as sysdba
SQL*Plus: Release 11.2.0.1.0 Production on Wed Jun 25 15:35:49 2014
Copyright (c) 1982, 2010, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Partitioning option
SQL> @...\RDBMS\ADMIN\utlmail.sql
SQL> @...\RDBMS\ADMIN\prvtmail.plb
Package created.
Synonym created.
SQL> grant execute on UTL_MAIL to <ETL user>;
Grant succeeded.
SQL> sho parameters smtp
NAME TYPE VALUE
------------------------------------ ----------- -------
smtp_out_server string
SQL> alter system set smtp_out_server = '<email server>' scope=both;
System altered.
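• The same two checks can also be made with queries on the data dictionary, which is convenient inside a script. This is a sketch to be run as a privileged user such as SYS; it uses only the standard views dba_objects and v$parameter.

```sql
-- Is the UTL_MAIL package installed and valid?
-- No rows returned means that the package still has to be installed.
select owner, object_type, status
from   dba_objects
where  object_name = 'UTL_MAIL';

-- Which SMTP server is currently configured? An empty value means "not set".
select name, value
from   v$parameter
where  name = 'smtp_out_server';
```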
System configuration (3)
• A company almost always has a mail server; the value should include the domain, for example exch.dev.com.
• The option "scope=both" makes the change effective immediately and permanently.
• We must also create and configure an ACL (from Oracle 11) using the system package dbms_network_acl_admin. The ACL (Access
Control List) is a way to define resources external to the RDBMS (such as e-mail servers) and allow users to access them.
• Now open the SlideShare presentation [http://www.slideshare.net/jackbim/recipe-7-of-data-warehouse-a-messaging-system-for-oracle-dwh-2],
where you will find all the instructions on how to download and run the installation script of this messaging system. The script will
apply all the necessary settings for you.
• All this will take no more than 5 minutes of work.
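• For readers who want to see what the ACL configuration involves, here is a minimal sketch for Oracle 11g using dbms_network_acl_admin. The ACL file name, the user ETL_USER and port 25 are assumptions for illustration; the installation script may do this differently.

```sql
-- Sketch (run as SYS): let the ETL user connect to the mail server.
-- 'mef_mail.xml', 'ETL_USER' and port 25 are hypothetical values.
begin
  dbms_network_acl_admin.create_acl(
    acl         => 'mef_mail.xml',
    description => 'Allow the ETL user to reach the mail server',
    principal   => 'ETL_USER',
    is_grant    => true,
    privilege   => 'connect'
  );
  dbms_network_acl_admin.assign_acl(
    acl        => 'mef_mail.xml',
    host       => 'exch.dev.com',
    lower_port => 25,
    upper_port => 25
  );
  commit;
end;
/
```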
Test
• In addition to the tests shown in the SlideShare slides, we now perform another, more exhaustive test, which clearly
shows the functionality by simulating a piece of an ETL load.
• We start by creating a test table, initializing it from a system table. Then we run an anonymous block of code (an anonymous
block is a set of statements enclosed between a begin and an end) that simulates a real load of a table with
a delete and an insert of data. The sequence of steps is quite simple: first you initialize the global variables of the package, to better identify the various
steps in the messaging table, then you send a message before and after each operation.
create table SALES as select * from tabs;
begin
mef.pv_sched_cod := 'Daily';
mef.pv_job_cod := 'Staging tables';
mef.pv_unit_cod := 'Load sales table';
mef.pv_exec_cnt := 10;
mef.p_send('proc_prova','Load of SALES table');
mef.p_send('proc_prova','Deleting...');
delete from sales;
mef.p_send('proc_prova','Deleted');
mef.p_send('proc_prova','Loading...');
insert into sales select * from tabs;
mef.p_send('proc_prova','Loaded');
mef.p_mail('MEF','ETL_administrator','Sales table loaded');
mef.p_send('proc_prova','Load ended');
end;
/
• The final result, which can be seen in the table MEF_MSG_LOT, is very interesting and gives you the wealth of
contextual information discussed at the beginning.
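• To inspect that result, a query along these lines can be used. It is a sketch that relies only on the columns of MEF_MSG_LOT described earlier and on the execution number 10 set in the test block.

```sql
-- Sketch: list the messages of the test run in the order they were sent.
select seq_num,
       sched_cod,
       job_cod,
       unit_cod,
       module_cod,
       line_txt,
       elapsed_txt,
       stamp_dts
from   mef_msg_lot
where  exec_cnt = 10
order  by seq_num;
```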
Conclusion
• We have seen in detail the steps required to build a messaging system that is simple but very useful for everyone working on Data
Warehouse projects.
• This implementation, which is the basis of my Micro ETL Foundation, works on any Oracle PL/SQL project, is non-invasive,
and can be applied at any time by inserting simple procedure calls into an existing ETL process.

Note di Data Warehouse e Business Intelligence - Tecniche di Naming Conventio...Note di Data Warehouse e Business Intelligence - Tecniche di Naming Conventio...
Note di Data Warehouse e Business Intelligence - Tecniche di Naming Conventio...
 
Data Warehouse e Business Intelligence in ambiente Oracle - Il sistema di mes...
Data Warehouse e Business Intelligence in ambiente Oracle - Il sistema di mes...Data Warehouse e Business Intelligence in ambiente Oracle - Il sistema di mes...
Data Warehouse e Business Intelligence in ambiente Oracle - Il sistema di mes...
 
Data Warehouse e Business Intelligence in ambiente Oracle - Il sistema di mes...
Data Warehouse e Business Intelligence in ambiente Oracle - Il sistema di mes...Data Warehouse e Business Intelligence in ambiente Oracle - Il sistema di mes...
Data Warehouse e Business Intelligence in ambiente Oracle - Il sistema di mes...
 
Oracle All-in-One - how to send mail with attach using oracle pl/sql
Oracle All-in-One - how to send mail with attach using oracle pl/sqlOracle All-in-One - how to send mail with attach using oracle pl/sql
Oracle All-in-One - how to send mail with attach using oracle pl/sql
 
Note di Data Warehouse e Business Intelligence - Le Dimensioni di analisi (pa...
Note di Data Warehouse e Business Intelligence - Le Dimensioni di analisi (pa...Note di Data Warehouse e Business Intelligence - Le Dimensioni di analisi (pa...
Note di Data Warehouse e Business Intelligence - Le Dimensioni di analisi (pa...
 
Note di Data Warehouse e Business Intelligence - Le Dimensioni di analisi
Note di Data Warehouse e Business Intelligence - Le Dimensioni di analisiNote di Data Warehouse e Business Intelligence - Le Dimensioni di analisi
Note di Data Warehouse e Business Intelligence - Le Dimensioni di analisi
 

Último

The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfSeasiaInfotech2
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 

Último (20)

The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 

Data Warehouse and Business Intelligence - Recipe 7 - A messaging system for Oracle Data Warehouse (part 1)

  • 1. Recipes of Data Warehouse and Business Intelligence
    A messaging system for Oracle Data Warehouse (part 1)
    Micro ETL Foundation
    Massimo Cenci, Data Warehouse Blog, http://massimocenci.blogspot.it
  • 2. Introduction (1)
    • A goal of what I have called the Micro ETL Foundation (MEF) is to provide simple and immediate solutions to the needs of a Data Warehouse. What could be simpler (and more necessary) than a log message? It may seem strange to devote an entire article to reporting the message "Hello world", famous among programmers everywhere. But we will find that what appears simple is actually a bit more complicated.
    • If we take the first chapter of any programming book (e.g. Java), the solution is expressed in a single line:
      System.out.println("Hello World!");
    • If we use another programming language (e.g. Oracle PL/SQL), the solution is again expressed in a single line:
      dbms_output.put_line('Hello World');
    • What I want to emphasize is the incompleteness of the message received. The complexity lies not in the message itself, which in the end is a simple function call. The complexity is the lack of context. The complexity is the metadata.
    • Suppose that several loading processes are running at the same time. Receiving a warning or alert message is not enough. What I want to know is when the message was sent, which job was running, how much time has elapsed since the previous message, and which procedure was current at the time of the message.
  • 3. Introduction (2)
    • We find that answering all these questions is no longer trivial. When we leave the theoretical examples printed in books and enter the reality of daily work, everything becomes more complicated.
    • We will see the theory and the practice, then the code, that answer the contextual needs described above. And the solution will (surprise!) again be expressed in a single line.
    • The programming language used is Oracle PL/SQL (simply because it is widely used in the Data Warehouse world, so I reach a wider audience), but the techniques shown are easily reproduced on any other RDBMS. I almost forgot: we will also see how to send the message via e-mail.
    • This, however, is not just for programmers. It is an article that shows all the hard work behind a simple log message. Just imagine the work behind the complete loading process of a Data Warehouse.
  • 4. Definitions
    Naming Convention: I have written a great deal about the importance of always using a naming convention in Data Warehouse projects. I summarize some concepts here. In general, the naming convention is the method by which you assign a name to the various entities of the system being designed. The method should produce a name that immediately conveys the semantic nature of the entity, regardless of whether it belongs to a Data Warehouse or to an operational system.
    ETL Process: The ETL process is the set of programs that load data from external systems into the Data Warehouse tables. Since this set can be very complex and detailed, it is extremely important to have a messaging system that gives as much information as possible about the process.
    Job: A job is a logical unit that describes a very specific task, such as loading a dimension of analysis, a fact table, or both. A set of related jobs is in turn a job that is part of a schedule, i.e. it is activated at a preset time. A job consists of simple, sequential processing units that we call units.
    Unit: The unit is the elementary processing unit (so it is code), which in turn can call other procedures and functions that we can generically call modules.
    Execution: A job usually runs at night, but it could also run several times a day. Each execution of a job is a run, identified by a sequential number that we call the exec counter.
    For the time being, these very short definitions will suffice. Later we will analyze them in more detail.
  • 6. Requirements
    • The requirements are very simple: a basic messaging system that reports everything useful for monitoring the execution of the ETL process.
    • It is the designer's task to decide the content of these messages. They may be simply informative, such as the number of rows processed, or they may call for particular attention, such as the identification of errors or anomalies. In the latter cases it should be possible to send the message via e-mail.
    • The system should not merely store all the messages generated; it must also store the context information.
    • The context is described in the definition of the ETL process above. The more context information we record, the better we can control the system and the faster we can resolve problems.
    • I assume a minimum knowledge of Oracle and of the SQL and PL/SQL languages.
  • 7. Design
    • The messaging component of the Micro ETL Foundation (MEF) consists of 3 tables, 2 sequences and 1 package.
    • We can see them as the basic components: the minimum necessary for the other MEF components that we will use in the future.
    • The naming convention used is a reduced version of the one to be applied to all objects of a Data Warehouse. [See http://www.slideshare.net/jackbim/recipes-6-of-data-warehouse-naming-convention-techniques].
    • It would be unreasonable to apply to MEF the same logic used to organize the hundreds of tables typical of a large Data Warehouse. Moreover, we can use MEF in any type of project. We will use this simplified structure:
      <entity name> = <area code>_<type name>_<logical code>
    • The basic package, being common to all, will be further simplified by eliminating the <logical code>.
    • The area code is "MEF".
    • The table MEF_CFT is the system configuration table.
    • The table MEF_EMAIL_CFT is the specific configuration table for e-mail addresses; the table MEF_MSG_LOT is the one that keeps the text of the messages.
    • The sequence MEF_MSG_SEQ gives a sequence number to each message.
    • The sequence MEF_RUN_SEQ is used to numerically identify each job execution.
    • The Oracle package MEF is a library of programs that manages those objects.
    • All the scripts for creating the objects can be downloaded from the SlideShare presentation that shows the system in a pragmatic and fun way [http://www.slideshare.net/jackbim/recipe-7-of-data-warehouse-a-messaging-system-for-oracle-dwh-2]. The scripts are minimal, with only the main structures; completing the accessory structures, such as indexes and constraints, is left to the reader.
    • The tables will be created in the default tablespace, but you should always create an ad-hoc tablespace.
  • 8. The MEF_CFT table
    • This table contains general information about the MEF and the Data Warehouse. Other columns will be added in the future (or as you choose).
      prj_cod: Code of the Data Warehouse project.
      user_cod: Name of the Oracle user for ETL.
      email_srv_cod: E-mail server. Every company has a server for managing e-mail; indicate that server here.
      mef_root_txt: Path of the folder of the MEF scripts.
      mef_dir: Oracle directory pointing to mef_root_txt.
  • 9. The MEF_MSG_LOT table
    • This table stores all the log messages sent by the loading process.
      seq_num: Sequential number of the message. It is obtained from an Oracle sequence.
      day_cod: Day on which the message was inserted, in the format YYYYMMDD.
      sched_cod: Identifier of the schedule to which the job belongs.
      job_cod: Job identifier. It is a logical entity, in the sense that we think of it as the launch of a list of processing units.
      unit_cod: Identifier of the processing unit within the job. We can think of it as a procedure or a function of an Oracle package.
      module_cod: Module identifier. A unit, however complex, can in turn call subroutines or subfunctions, i.e. modules. In that case it is useful to know this detail.
      rows_num: Number of rows processed. Typically this field is not set, but if we want to report the number of rows inserted into a table, for example, the information can be stored here.
      line_txt: Message text.
      cline_txt: Message text as a CLOB.
      ss_num: Number of seconds elapsed since the previous message. This value, together with the next two, is summable: if we wanted to know how long all the statements of a certain kind took, we would be able to calculate it.
      mi_num: Number of minutes elapsed since the previous message.
      hh_num: Number of hours elapsed since the previous message.
      elapsed_txt: Time elapsed since the previous message, in the format HH24:MI:SS.
      stamp_dts: Time stamp of the message insertion.
      exec_cnt: Identifier of the execution of the job. Every run of a job should be identified by a number, in turn extracted from an Oracle sequence.
      user_cod: Oracle user who posted the message. It is set automatically from a session variable. It can be useful when multiple users contribute to the loading process (not recommended).
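The summable elapsed-time columns described above lend themselves to a simple aggregation. The query below is only a sketch: it uses the column names listed on this slide, assumes that ss_num holds the whole elapsed time in seconds (rather than only the seconds part of HH24:MI:SS), and filters on a run number chosen purely for illustration.

```sql
-- Sketch: total elapsed seconds per unit within a single job run.
-- Assumption: ss_num is the full elapsed time in seconds since the previous message.
-- The run number 10 is a hypothetical value used only for this example.
select job_cod,
       unit_cod,
       count(*)    as msg_count,
       sum(ss_num) as total_seconds
from   mef_msg_lot
where  exec_cnt = 10
group  by job_cod, unit_cod
order  by total_seconds desc;
```

Grouping by module_cod instead of unit_cod would give the same breakdown at the finer level of the called sub-procedures.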
  • 10. The MEF_EMAIL_CFT table
    • This table configures the e-mail addresses.
      email_cod: Code that identifies a group of recipients.
      from_txt: Sender of the message. This name will appear as the sender of the e-mail. Do not use special characters or blanks between words. For example, do not set "Administrator ETL" but "Administrator_ETL"; otherwise you get a run-time error like:
        ORA-29279: SMTP permanent error: 501 5.5.4 Invalid arguments
      to_txt: E-mail address of the message recipient.
      cc_txt: E-mail address of the recipient in copy (cc).
      subj_txt: Default subject of the message.
      status_cod: Status of the recipient (1 = active, 0 = inactive).
  • 11. The MEF_MSG_SEQ sequence
    • A sequence is an Oracle object. In practice it is a global counter that increments each time you request a value from it. Each message line must have its own sequential number, which is more reliable than the time stamp for sorting the table.
    • Because messages are sometimes only a fraction of a second apart, the time stamp might not be sufficiently discriminating.
    The MEF_RUN_SEQ sequence
    • Sequence that unambiguously identifies a run of a job.
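As a minimal sketch of the two sequences described above (the author's downloadable scripts are the reference and may use different options; default options are assumed here):

```sql
-- Sketch only: default sequence options are an assumption.
-- Skip the create statements if the MEF installation script has already been run.
create sequence mef_msg_seq;
create sequence mef_run_seq;

-- Every request returns a new value, so two messages logged within the
-- same fraction of a second still get distinct, ordered numbers.
select mef_msg_seq.nextval from dual;
select mef_msg_seq.nextval from dual;
```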
  • 12. The MEF package (1)
    • This is the basic package. I suggest developing all the code inside PL/SQL packages (they are basically libraries), which allow better management and use of the code. Here is a short description of the units contained in the package.
    • f_str: Utility function that generates a string after replacing the input variables.
    • p_ins_msg_lot: This procedure inserts the message into the MEF_MSG_LOT table, receiving a variable of row type as input parameter. The procedure has the "pragma autonomous_transaction". This is very important and requires a thorough description: it may seem incredible, but without it this messaging system would not be possible.
      The pragma autonomous_transaction allows us to commit (i.e. to make permanent in the database) only and exclusively the DML statements of the unit that contains this compiler directive. This is crucial because it lets us commit the message without affecting the logic of the loading process.
      Let's clarify with an example. Typically, before loading the daily data into a table, you first delete the data of that day and then load or reload the new data (forget partition manipulation for a moment). You commit at the end, once the whole load (delete and insert) has completed successfully. If I committed after the delete and the insert then had a problem, I could lose the data of the day.
      Since the messaging system needs to commit the message in its table, logging the innocent message "I have done the delete" would also commit the delete itself. This side effect is not acceptable. The autonomous transaction solves the problem: the Oracle PL/SQL engine produces a "daughter" transaction that lives an autonomous life and commits the data in the MEF_MSG_LOT table without affecting the logic of the parent transaction.
    • p_rae: Exception handling is standardized as follows. When any Oracle error happens, the "when others" handler calls the p_rae procedure, which enriches the content of the error with other useful information, such as where the exception occurred. The output is always the standard error pv_error. The "when pv_error" handler is used to make sure that the original error is kept.
    • p_init: Another private procedure. It initializes the line_row variable used to perform the insert into the table.
    • delta_time: Procedure that, based on input parameters such as the date of the last message and the current date, calculates all the delta-time information: how many seconds, minutes and hours have passed, and the delta time in the 'HH24:MI:SS' format. There are many ways to calculate the delta time; the one used here is just one of them.
    • f_get_seq_val: Generic function that dynamically extracts the next number of the Oracle sequence whose name is given as a parameter.
    • f_get_exec_cnt: Function that extracts the execution number of the job. Before calling the generic function f_get_seq_val, it verifies whether the number has already been set as a global variable; in that case, it uses the number of the current execution.
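The effect of the autonomous transaction described above can be seen in a small self-contained experiment. This is a sketch, not the MEF code: the demo_log table, the demo_log_seq sequence, the demo_log_msg procedure and the demo_sales table are all hypothetical names introduced only for this illustration.

```sql
-- Hypothetical demo objects, not part of MEF.
create table demo_log (
  seq_num   number,
  line_txt  varchar2(4000),
  stamp_dts timestamp default systimestamp
);
create sequence demo_log_seq;
create table demo_sales (id number);

create or replace procedure demo_log_msg (p_line_txt in varchar2) is
  pragma autonomous_transaction;
begin
  insert into demo_log (seq_num, line_txt)
  values (demo_log_seq.nextval, p_line_txt);
  commit;  -- commits only this insert; the caller's transaction stays open
end demo_log_msg;
/

insert into demo_sales values (1);
commit;

begin
  delete from demo_sales;
  demo_log_msg('I have done the delete');
  rollback;  -- simulates a failed load: the delete is undone
end;
/

-- The sales row survives the rollback, while the log message remains stored.
select count(*) from demo_sales;
select line_txt from demo_log;
```

Without the pragma, the commit inside the logging procedure would have made the delete permanent, and the rollback would have had nothing left to undo.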
  • 13. The MEF package (2)
    • f_get_cft: Function that extracts the current configuration from the MEF_CFT table.
    • p_esend: A function that sends the e-mail via the UTL_MAIL package.
    • p_send: This is the procedure that sends the message.
    • p_mail: Procedure for sending the e-mail. Using the e-mail code received as input, it looks up all the recipients in the MEF_EMAIL_CFT table and calls p_esend, passing all the required parameters.
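The lookup-then-send flow described for p_mail and p_esend might look roughly like the anonymous block below. It is a sketch under stated assumptions, not the package source: the e-mail code 'MEF' and the message text are illustrative values, status_cod is assumed to be comparable with the number 1, and the block presumes that smtp_out_server is set and that the user has execute permission on UTL_MAIL.

```sql
-- Sketch: send one message to every active recipient of an e-mail code.
-- Assumptions: 'MEF' is an existing email_cod; status_cod = 1 means active.
declare
  v_email_cod mef_email_cft.email_cod%type := 'MEF';  -- hypothetical code
begin
  for r in (select from_txt, to_txt, cc_txt, subj_txt
            from   mef_email_cft
            where  email_cod  = v_email_cod
            and    status_cod = 1)
  loop
    utl_mail.send(
      sender     => r.from_txt,
      recipients => r.to_txt,
      cc         => r.cc_txt,
      subject    => r.subj_txt,
      message    => 'Sales table loaded');
  end loop;
end;
/
```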
  • 14. System configuration (1)
    • Before you can create all the structures described above, you must perform some environment checks.
    • First of all, it is necessary to verify that the Oracle RDBMS is configured for sending e-mails.
    • Then you have to make sure that the Oracle user that sends the messages has all the grants necessary for its operations. Let's see this in detail.
    SMTP check
    • We need to verify that the Oracle RDBMS is configured for sending e-mails. The sending is performed by calling a procedure that is part of a package of the RDBMS. To verify this, connect to SQL*Plus as the SYS user and check the package UTL_MAIL (from Oracle 11):
      sqlplus / as sysdba
      SQL> descr utl_mail
      ERROR:
      ORA-04043: object utl_mail does not exist
    • If you get this error message, you need to install the system package UTL_MAIL and give the execute permission to the user.
    • Still as user SYS, run the script that installs the package from the rdbms/admin folder of the Oracle home, and give the execute permission to the ETL user.
    • Then you have to check the SMTP server in the Oracle initialization parameters; in the following example it is not set, and I show you how to set it. Here is the sequence of instructions for these checks (remember to replace the path of the utlmail.sql file with the path of your Oracle installation):
  • 15. System configuration (2)
      D:\>sqlplus / as sysdba
      SQL*Plus: Release 11.2.0.1.0 Production on Wed Jun 25 15:35:49 2014
      Copyright (c) 1982, 2010, Oracle. All rights reserved.
      Connected to:
      Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
      With the Partitioning option

      SQL> @...\RDBMS\ADMIN\utlmail.sql
      SQL> @...\RDBMS\ADMIN\prvtmail.plb
      Package created.
      Synonym created.

      SQL> grant execute on UTL_MAIL to <ETL user>;
      Grant succeeded.

      SQL> sho parameters smtp
      NAME                                 TYPE        VALUE
      ------------------------------------ ----------- -------
      smtp_out_server                      string

      SQL> alter system set smtp_out_server = <email server> scope=both;
      System altered.
  • 16. System configuration (3)
    • Almost every company has a mail server. The value should include the domain, for example exch.dev.com.
    • The option "scope = both" makes the change effective immediately and permanently.
    • We must also create and configure an ACL (from Oracle 11) using the system package dbms_network_acl_admin. The ACL (Access Control List) is simply a way to define resources external to the RDBMS (such as e-mail servers) and to allow users to access them.
    • Now open SlideShare [http://www.slideshare.net/jackbim/recipe-7-of-data-warehouse-a-messaging-system-for-oracle-dwh-2], where you will find all the instructions on how to download and run the installation script of this messaging system. The script will execute all the necessary settings for you.
    • All this will take no more than 5 minutes of work.
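For readers who prefer to see what the ACL step involves before running the installation script, here is a minimal sketch using the Oracle 11g procedures of dbms_network_acl_admin, to be run as a privileged user. The ACL file name mef_mail.xml, the user ETL_USER and port 25 are assumptions made for this example; the host is the sample server name used above. The author's script may configure this differently.

```sql
-- Sketch for Oracle 11g, run as a privileged user (e.g. SYS).
-- 'mef_mail.xml', 'ETL_USER' and port 25 are hypothetical values.
begin
  dbms_network_acl_admin.create_acl(
    acl         => 'mef_mail.xml',
    description => 'SMTP access for the ETL user',
    principal   => 'ETL_USER',   -- must be an existing user, in upper case
    is_grant    => true,
    privilege   => 'connect');

  dbms_network_acl_admin.assign_acl(
    acl        => 'mef_mail.xml',
    host       => 'exch.dev.com',
    lower_port => 25,
    upper_port => 25);

  commit;
end;
/
```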
  • 17. Test
    • In addition to the tests shown in the SlideShare slides, we now perform another, more exhaustive test, which clearly shows the functionality by simulating a piece of an ETL load.
    • We start by creating a test table, initializing it from a system table. Then we run an anonymous block of code (an anonymous block is a set of statements enclosed between a begin and an end) that simulates a real load of a table, with delete and insert of data. The sequence of steps is quite simple: first you initialize the global variables of the package, to better identify the various steps in the messaging table.
      create table SALES as select * from tabs;

      begin
        mef.pv_sched_cod := 'Daily';
        mef.pv_job_cod := 'Staging tables';
        mef.pv_unit_cod := 'Load sales table';
        mef.pv_exec_cnt := 10;
        mef.p_send('proc_prova','Load of SALES table');
        mef.p_send('proc_prova','Deleting...');
        delete from sales;
        mef.p_send('proc_prova','Deleted');
        mef.p_send('proc_prova','Loading...');
        insert into sales select * from tabs;
        mef.p_send('proc_prova','Loaded');
        mef.p_mail('MEF','ETL_administrator','Sales table loaded');
        mef.p_send('proc_prova','Load ended');
      end;
      /
    • The final result, which can be seen in the table MEF_MSG_LOT, is very interesting and provides the wealth of contextual information discussed at the beginning.
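To inspect the outcome of the test, a query along these lines can be run against the messaging table. It is a sketch that relies only on the column names described earlier and on the execution number 10 set in the test block.

```sql
-- Sketch: list the messages of the test run in the order they were sent.
select seq_num,
       sched_cod,
       job_cod,
       unit_cod,
       module_cod,
       line_txt,
       elapsed_txt,
       stamp_dts
from   mef_msg_lot
where  exec_cnt = 10
order  by seq_num;
```

Sorting by seq_num rather than by stamp_dts keeps the messages in sending order even when several of them share the same time stamp.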
  • 18. Conclusion
    • We have seen in detail the steps required to build a messaging system that is simple but very useful for everyone working on Data Warehouse projects.
    • This implementation, which is the basis of my Micro ETL Foundation, works on any Oracle PL/SQL project, is non-invasive, and can be applied at any time by inserting simple procedure calls into an existing ETL process.