Loading Data with APEX_DATA_PARSER

@ViscosityNA www.viscosityna.com
Loading data from any
environment into your Oracle Database
using APEX_DATA_PARSER
• Rodolfo Rodriguez

Agenda
• Data Load - Methods
• Using SQL Workshop – Data Workshop
• APEX Data Loader Wizard
• Demo 1
• APEX_DATA_PARSER
• Detailed Review of the API
• Demo 2

About me:
• 25+ years of experience with Oracle technologies
• 11+ years using, learning, and teaching APEX
• Oracle APEX evangelist
• Co-founder of TIACA (APEX pioneers in Venezuela since 2008)
• Software Developer on the APEX team at Viscosity NA since 2018
Rodolfo Rodriguez
@rrd71

Viscosity is a team of experts specializing in
Oracle Database, Oracle Cloud,
Oracle APEX, and Engineered Systems.
Email hello@viscosityna.com to request my slides

Viscosity Pillars and Service Models
Pillars: Database, Infrastructure, Apps, Data
• Data (IAAS / PAAS): Data Integration, Data Warehousing, Analytics, GoldenGate
• Apps (SAAS / PAAS): CX, ERP, SCM, HCM, Middleware, Web Applications, Mobility, EBS / JDE / PS
• Infrastructure (IAAS): Public Cloud, Bare Metal Cloud, AWS, Azure, Bluemix, Engineered Systems, Exa-, ZFS, @Customer, X86 / Sparc
Service models: Workshops, Assessments, Proof of Concepts, Training, Turnkey Projects, Managed Services

Preamble: Why load data?
• Migrations
• Configuration data for new implementations
• ETL processes
• DW data feeds
• Moving to the Cloud
• APIs
• And everything else …

Data Loading Methods
• SQL*Loader
• IMP / Data Pump
• External Tables
• Other Tools
• Custom APIs

SQL*Loader
express mode
Since Oracle 12.1, the sqlldr command-line utility
supports “Express Mode”
• By default, it reads a CSV file
• Loads the data into the specified table
• The table must have the same structure as the
CSV file.
$ sqlldr userid=scott/tiger table=emp

APEX
SQL Workshop
• A simple, fast way to load data
• Supports several file formats
• You can create new tables from
Excel files
• You can create “Lookup Tables” based on
the loaded data

APEX
Data Loader
Wizard
• Available since APEX 5.0
• “Out of the box” data loader for
our applications
• Creates a set of pages for the
data loading process
• Creates the necessary
“shared components”
• Easy to use

DEMO 1
SQL WORKSHOP – IMPORT
DATA FROM EXCEL AND CREATE A
TABLE FROM IT

Justification
• Limitations of the Wizard
• Need to handle other formats
• Business rules
• Need to handle data dynamically
…
APEX_DATA_PARSER

APEX_DATA_PARSER API
• The APEX_DATA_PARSER PL/SQL package is a handy interface
for loading data.
• APEX_DATA_PARSER supports the following file
formats: delimited text (CSV, tab-delimited), JSON, XML, and XLSX.
• The parser is implemented as a table function, so
you can use its result like a SQL table:
select * from table( apex_data_parser.parse( ….
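
As a minimal sketch of the table-function pattern, the query below parses a file uploaded through an APEX File Browse item. It assumes the upload lands in APEX_APPLICATION_TEMP_FILES and that the page item is called P10_FILE (a hypothetical name); only three of the generic COLnnn columns are selected.

```sql
-- Assumption: P10_FILE is a File Browse item whose upload is stored in
-- APEX_APPLICATION_TEMP_FILES (item name is illustrative only).
select p.line_number,
       p.col001,
       p.col002,
       p.col003
  from apex_application_temp_files f,
       table( apex_data_parser.parse(
                  p_content   => f.blob_content,   -- file content as a BLOB
                  p_file_name => f.filename ) ) p  -- extension determines the file type
 where f.name = :P10_FILE;
```

Because the result behaves like any other row source, it can be filtered, joined, or used in an INSERT ... SELECT.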

APEX_DATA_PARSER – PARSE
• This is the main function.
• Parses XML, XLSX, CSV, or JSON files.
• Returns a table with the following structure:
LINE_NUMBER COL001 COL002 COL003 COL004 ... COL300
• All values are returned as VARCHAR2.
• Handles a maximum of 300 columns.
• The maximum length of each column is 4000 bytes.

APEX_DATA_PARSER – PARSE
But how do I get the file?
Possible sources:
• BLOB column in a table
• APEX File Browser item
• DBMS_LOB (BFILE)
• Function returning a BLOB
• Others
Flow: BLOB → PARSE → table result set
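
One of the sources listed above, a server-side file read through DBMS_LOB, could be wrapped in a function returning a BLOB along these lines. This is a sketch, not part of the original deck: the function name file_to_blob, the directory object DATA_DIR, and the file emp.csv are all hypothetical, and the schema needs read access on the directory object.

```sql
-- Hypothetical helper: loads a file from an Oracle directory object into a BLOB.
create or replace function file_to_blob (
    p_directory in varchar2,
    p_file_name in varchar2 )
    return blob
is
    l_bfile       bfile := bfilename( p_directory, p_file_name );
    l_blob        blob;
    l_dest_offset integer := 1;
    l_src_offset  integer := 1;
begin
    dbms_lob.createtemporary( l_blob, false );
    dbms_lob.fileopen( l_bfile, dbms_lob.file_readonly );
    -- LOBMAXSIZE as the amount means "read until the end of the file"
    dbms_lob.loadblobfromfile(
        dest_lob    => l_blob,
        src_bfile   => l_bfile,
        amount      => dbms_lob.lobmaxsize,
        dest_offset => l_dest_offset,
        src_offset  => l_src_offset );
    dbms_lob.fileclose( l_bfile );
    return l_blob;
end file_to_blob;
/

-- Usage: DATA_DIR and emp.csv are placeholders for your own directory and file.
select line_number, col001, col002, col003
  from table( apex_data_parser.parse(
                  p_content   => file_to_blob( 'DATA_DIR', 'emp.csv' ),
                  p_file_name => 'emp.csv' ) );
```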

APEX_DATA_PARSER
TYPES
This column is a table
of type: t_file_column

APEX_DATA_PARSER
PARSE
Parameter: Description
P_CONTENT: The file content to be parsed, as a BLOB.
P_FILE_NAME: The name of the file; only used to derive the file type. Either P_FILE_NAME, P_FILE_TYPE or P_FILE_PROFILE must be passed in.
P_FILE_TYPE: The type of the file to be parsed. Use this to explicitly pass the file type in. Either P_FILE_NAME, P_FILE_TYPE or P_FILE_PROFILE must be passed in.
P_FILE_PROFILE: File profile to be used for parsing. The file profile might have been computed in a previous PARSE() invocation. If passed in again, the function skips some profile detection logic and uses the passed-in profile, which improves performance.
P_DETECT_DATA_TYPES: Whether to detect data types (NUMBER, DATE, TIMESTAMP) during parsing. If set to 'Y', the function computes the file profile and also adds data type information to it. If set to 'N', no data types are detected and all columns are VARCHAR2. Default is 'Y'.
P_DECIMAL_CHAR: Use this decimal character when trying to detect NUMBER data types. If not specified, the function auto-detects the decimal character.
P_XLSX_SHEET_NAME: For XLSX workbooks. The name of the worksheet to parse. If omitted, the function uses the first worksheet found.
P_ROW_SELECTOR: For JSON and XML files. Pointer to the array / list of rows within the JSON or XML file. If omitted, the function will:
• For XML files: Use "/*/*" (first tag under the root tag) as the row selector.
• For JSON files: Look for a JSON array and use the first array found.
P_CSV_ROW_DELIMITER: Override the default row delimiter for CSV parsing.
P_CSV_COL_DELIMITER: Use a specific CSV column delimiter. If omitted, the function detects the column delimiter based on the first row contents.
P_CSV_ENCLOSED: Override the default enclosure character for CSV parsing.
P_SKIP_ROWS: Skip the first N rows when parsing.
P_ADD_HEADERS_ROW: For XML, JSON: Emit the column headers (tag, attr names) as the first row.
P_FILE_CHARSET: File encoding, if not UTF-8 (AL32UTF8).
P_MAX_ROWS: Stop parsing after P_MAX_ROWS rows have been returned.
P_RETURN_ROWS: Number of rows to return. This is useful when the parser needs to parse more rows (for data type detection) than it is supposed to return. When the specified number of rows has been emitted, the function continues parsing (and refining the detected data types) until P_MAX_ROWS has been reached, or until the ROWNUM < x clause of the SQL query kicks in and stops execution.
P_STORE_PROFILE_TO_COLLECTION: Store the file profile computed during parsing into a collection. The collection is cleared if it exists. Only used for computed profiles.
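
To illustrate how a few of these parameters combine, here is a sketch that parses a semicolon-delimited file, skips its header row, and stops after 100 rows. The page item P10_FILE and the choice of delimiter are assumptions for the example.

```sql
-- Assumption: the uploaded file is semicolon-delimited with one header row.
select p.line_number,
       p.col001,
       p.col002,
       p.col003
  from apex_application_temp_files f,
       table( apex_data_parser.parse(
                  p_content           => f.blob_content,
                  p_file_name         => f.filename,
                  p_csv_col_delimiter => ';',   -- do not auto-detect the delimiter
                  p_skip_rows         => 1,     -- skip the header row
                  p_max_rows          => 100 ) ) p
 where f.name = :P10_FILE;
```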

APEX_DATA_PARSER –PARSE
We need the
file content
as a BLOB

APEX_DATA_PARSER – DISCOVER
• A function that lets us
discover the “profile”
of a file
• Returns the file
“profile” in JSON format
as a CLOB.

APEX_DATA_PARSER – DISCOVER
P_CONTENT: The file content to be parsed, as a BLOB.
P_FILE_NAME: The name of the file, used to derive the file type.
P_DECIMAL_CHAR: Use this decimal character when trying to detect NUMBER data types. If not specified, the function auto-detects the decimal character.
P_XLSX_SHEET_NAME: For XLSX workbooks. The name of the worksheet to parse. If omitted, the function uses the first worksheet found.
P_ROW_SELECTOR: For JSON and XML files. Pointer to the array / list of rows within the JSON or XML file. If omitted, the function will:
• For XML files: Use "/*/*" (first tag under the root tag) as the row selector.
• For JSON files: Look for a JSON array and use the first array found.
P_CSV_COL_DELIMITER: Use a specific CSV column delimiter. If omitted, the function detects the column delimiter based on the first row contents.
P_CSV_ENCLOSED: Override the default enclosure character for CSV parsing.
P_FILE_CHARSET: File encoding, if not UTF-8 (AL32UTF8).
P_MAX_ROWS: Stop discovery after P_MAX_ROWS rows have been processed.
GET_COLUMNS
• This function returns the columns of a previously parsed “profile” so that
they can be consumed by APEX components.
• The file “profile” must have been computed beforehand using the
PARSE() or DISCOVER() functions.
• Returns a table of type: APEX_T_PARSER_COLUMNS
P_FILE_PROFILE: File profile to be used for parsing. The file profile might have been
computed in a previous PARSE() or DISCOVER() invocation.

select * from table(
    apex_data_parser.get_columns(
        apex_data_parser.discover(
            p_content   => {BLOB containing XLSX file},
            p_file_name => 'large.xlsx' )));
Reading from the inside out: data in BLOB format → apex_data_parser.discover → apex_data_parser.get_columns
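
The same nesting can be written against a real BLOB source. The sketch below assumes an upload in APEX_APPLICATION_TEMP_FILES referenced by a hypothetical P10_FILE item, and selects the attributes exposed by APEX_T_PARSER_COLUMNS.

```sql
-- One row per column detected in the uploaded file.
select c.column_position,
       c.column_name,
       c.data_type,
       c.format_mask
  from apex_application_temp_files f,
       table( apex_data_parser.get_columns(
                  apex_data_parser.discover(
                      p_content   => f.blob_content,
                      p_file_name => f.filename ) ) ) c
 where f.name = :P10_FILE
 order by c.column_position;
```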

GET_FILE_PROFILE
• This function returns the “profile” of the processed file in JSON format.
• The file “profile” contains metadata about the parsed file, for
example, the delimiter for CSV files, the worksheet names for
Excel (XLSX) files, and the columns found during the parsing phase.
• Parameters: None.
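
Since GET_FILE_PROFILE takes no arguments, it relies on a PARSE() call having already run in the same session. A sketch of that sequence, assuming the hypothetical P10_FILE upload item:

```sql
-- Step 1: parse the file (count(*) simply forces the parser to run).
select count(*) as parsed_rows
  from apex_application_temp_files f,
       table( apex_data_parser.parse(
                  p_content   => f.blob_content,
                  p_file_name => f.filename ) ) p
 where f.name = :P10_FILE;

-- Step 2: in the same session, fetch the profile computed by that parse.
select apex_data_parser.get_file_profile as profile_json
  from dual;
```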

GET_FILE_TYPE
• This function returns the file type derived from the file name
extension.
p_file_name: File name to get the file type.
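
A short sketch of GET_FILE_TYPE in PL/SQL, comparing the result against the package's file type constants. The file name test.xlsx is just an example value.

```sql
declare
    l_file_type apex_data_parser.t_file_type;
begin
    -- The type is derived from the extension only; the file is not read.
    l_file_type := apex_data_parser.get_file_type( p_file_name => 'test.xlsx' );

    if l_file_type = apex_data_parser.c_file_type_xlsx then
        dbms_output.put_line( 'XLSX workbook' );
    elsif l_file_type = apex_data_parser.c_file_type_csv then
        dbms_output.put_line( 'Delimited text file' );
    else
        dbms_output.put_line( 'XML or JSON file' );
    end if;
end;
/
```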

APEX_DATA_PARSER – GET_XLSX_WORKSHEETS
• This function returns information about the worksheets contained in an
Excel (XLSX) file as a list of apex_t_parser_worksheet instances.
p_content: XLSX file as a BLOB
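
For example, the worksheets of an uploaded workbook could be listed as sketched below (the P10_FILE item is a hypothetical name). The SHEET_FILE_NAME value is the one to pass to PARSE as P_XLSX_SHEET_NAME.

```sql
-- One row per worksheet in the uploaded XLSX file.
select w.sheet_sequence,
       w.sheet_display_name,
       w.sheet_file_name
  from apex_application_temp_files f,
       table( apex_data_parser.get_xlsx_worksheets(
                  p_content => f.blob_content ) ) w
 where f.name = :P10_FILE
 order by w.sheet_sequence;
```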

DEMO 2
• Parse Method
• Discover Method
• Excel with Multiple Worksheets
• Web service + Data Parser
• Upload File via Web Service - Parse Data (IR + PACKAGE)

Use Case 1
• We receive several files in
different formats (Excel files,
CSV files, text files), so we
need to explore them and search for
a specific piece of information.

Use Case 2
• We have identified the right file; now we need to
compare the structure of the file with the structure of the table.
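
One possible way to sketch this comparison is to join the columns discovered in the file against the data dictionary. The target table EMP and the P10_FILE item are assumptions; only column names are compared here, not data types.

```sql
-- Columns found in the uploaded file (names upper-cased for comparison).
with file_cols as (
    select upper( c.column_name ) as column_name
      from apex_application_temp_files f,
           table( apex_data_parser.get_columns(
                      apex_data_parser.discover(
                          p_content   => f.blob_content,
                          p_file_name => f.filename ) ) ) c
     where f.name = :P10_FILE ),
-- Columns of the hypothetical target table.
table_cols as (
    select column_name
      from user_tab_columns
     where table_name = 'EMP' )
select coalesce( fc.column_name, tc.column_name ) as column_name,
       case
           when tc.column_name is null then 'Only in file'
           when fc.column_name is null then 'Only in table'
           else 'In both'
       end as status
  from file_cols fc
  full outer join table_cols tc
    on tc.column_name = fc.column_name
 order by 1;
```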

Use Case 3
• We have an Excel file that contains multiple worksheets, so
we want to get a list of them, select one, and then
display the data it contains.
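
Once the user has picked a worksheet, its data could be displayed with a query like the following sketch. It assumes two hypothetical page items: P10_FILE for the upload and P10_SHEET holding the SHEET_FILE_NAME returned by GET_XLSX_WORKSHEETS.

```sql
-- Parses only the selected worksheet of the uploaded workbook.
select p.line_number,
       p.col001,
       p.col002,
       p.col003
  from apex_application_temp_files f,
       table( apex_data_parser.parse(
                  p_content         => f.blob_content,
                  p_file_name       => f.filename,
                  p_xlsx_sheet_name => :P10_SHEET ) ) p
 where f.name = :P10_FILE;
```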

Use Case 4
• We want to fetch some data from a REST endpoint, which
returns its response in JSON format, and we need to parse it
using apex_data_parser.
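
A sketch of this use case: APEX_WEB_SERVICE.MAKE_REST_REQUEST_B returns the response as a BLOB, which can be handed straight to the parser. The URL and the row selector 'items' are placeholders, and the database needs network access (ACL, wallet for HTTPS) to reach the endpoint.

```sql
-- Assumption: the endpoint returns { "items": [ {...}, {...} ] }.
select p.line_number,
       p.col001,
       p.col002,
       p.col003
  from table( apex_data_parser.parse(
                  p_content      => apex_web_service.make_rest_request_b(
                                        p_url         => 'https://example.com/api/items',
                                        p_http_method => 'GET' ),
                  p_file_name    => 'response.json',  -- only used to derive the JSON type
                  p_row_selector => 'items' ) ) p;
```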

Use Case 5
• Suppose we receive credit card data through an
ORDS web service, so we will implement a solution in our
backend to process the file.
Flow: ORDS web service → APEX_DATA_PARSER
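
The backend side might look like the PL/SQL source of an ORDS POST handler sketched below. It is an illustration only: the staging table CARD_UPLOAD_STAGE and its columns are hypothetical, the payload is assumed to be a CSV file with a header row, and :body and :status_code are ORDS implicit parameters whose availability depends on the ORDS version.

```sql
-- Source of a hypothetical ORDS POST handler (PL/SQL source type).
declare
    l_body blob := :body;  -- read the request body once into a local variable
begin
    -- Stage the raw values; validation and business rules would follow.
    insert into card_upload_stage ( line_number, col001, col002, col003 )
    select p.line_number, p.col001, p.col002, p.col003
      from table( apex_data_parser.parse(
                      p_content   => l_body,
                      p_file_name => 'upload.csv',  -- assumed payload type
                      p_skip_rows => 1 ) ) p;

    :status_code := 201;
end;
```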

APEX_DATA_PARSER is INCREDIBLE!!!

References:
• https://docs.oracle.com/en/database/oracle/application-express/19.1/aeapi/APEX_DATA_PARSER.html#GUID-07E9397C-DF26-40F7-AC73-F46961E1088A
• https://blogs.oracle.com/apex/super-easy-csv-xlsx-json-or-xml-parsing-about-the-apex_data_parser-package
• https://nuijten.blogspot.com/2019/10/ogb-appreciation-day-apexdataparser.html
• https://oracle-base.com/articles/misc/apex_data_parser

Rodolfo Rodriguez
rodolfo.rodriguez@viscosityna.com
THANK YOU!
