2. Agenda
• XML in DB2
• Saving hordes of lines of code and precious time too
• SQL/XML
• XQuery
• RSS Generator
• Workflow example
• XML in oracle
• DB2 x Oracle (xquery performance)
3. XML in DB2
(Native XML Database)
• DB2 code name was Viper
• Native XML support
– Native Storage for XML
– Stores parsed XML
– Big documents get divided into regions
• Indexes on XML column
– Internal indexes
– User Created indexes
• XML Schema
• Xquery
– XPATH
– FLOWR
4. Native XML support
• Storing it hierarchical
• Keep XML as XML
• DB2 will store XML in parsed hierarchical format (similar to the DOM representation)
• “Native” = the best-suited on-disk representation of XML
CREATE TABLE dept ( deptID char(8), … , doc xml);
• Relational columns are stored in relational format
• XML columns are stored natively
• All XML data is stored in XML-typed columns
5. Native XML Storage
•1 String table per database
•Database wide dictionary for
all tags in all xml columns
7. XML in DB2
DB2 Client/Application
SQL/XML XQuery
Relational Interface XML Interface
DB2 Engine
SQL/XML Parser XQuery Parser
Hybrid SQL/XQuery Compiler
Query Evaluation and Runtime XML Navigation
8. XML in DB2
• CREATE TABLE msg ( item XML)
• INSERT INTO msg VALUES (
XMLPARSE(DOCUMENT '<?xml version="1.0"?><root>…</root>'
PRESERVE WHITESPACE)
)
• REGISTER XMLSCHEMA 'http://sample/po'
FROM 'file:item.xsd'
AS xscma COMPLETE
• INSERT INTO msg VALUES (
XMLVALIDATE(XMLPARSE(DOCUMENT '<?xml version="1.0"?><root>…</root>'
PRESERVE WHITESPACE) ACCORDING TO XMLSCHEMA ID xscma)
)
• CREATE INDEX xind_newsgroup
ON msg(item)
GENERATE KEY USING XMLPATTERN '//@newsgroup„
AS SQL VARCHAR(50)
9. Saving hordes of lines of code
• Web applications use databases
• What they get from database is relational data
• Relational data need to be used to form xml in the end and this
involves DOM/SAX operations
• But what if they get the required xml formed direct from database
by firing a single xquery?
• With DB2 XML, you
– Don't involve so many relational tables
– Don't keep fetching relational records out
– Don't need external DOM/SAX operations
– Just need a single Xquery and required xml doc is ready in one fetch
– Save a lot of execution time and also hordes of lines of code
10. SQL/XML
• A standardized mechanism for using SQL and XML together
• Retrieve data as XML from relational objects
• A set of functions
xmlelement() Creates an XML element, allowing the name to be specified
xmlattributes() Creates XML attributes from columns, using the name of each column as the name of the corresponding attribute
xmlroot() Creates the root node of an XML document
xmlcomment() Creates an XML comment
xmlpi() Creates an XML processing instruction
xmlparse() Parses a string as XML and returns the resulting XML structure
xmlforest() Creates XML elements from columns, using the name of each column as the name of the corresponding element
xmlconcat() Combines a list of individual XML values to create a single value containing an XML forest
xmlagg() Combines a collection of rows, each containing a single XML value, to create a single value containing an XML forest.
11. SQL/XML
SELECT XMLELEMENT(NAME, “Department”,
XMLATTRIBUTES( e.department AS “name”),
XMLAGG( XMLELEMENT(NAME “emp”, e.firstname))
) AS “department_list”
FROM employee e
WHERE . . .
GROUP BY e.department
department_list
firstname lastname department <Department name=“A00”>
<emp>CHRISTINE</emp>
SEAN LEE A00 <emp>VINCENZO</emp>
<emp>SEAN</emp>
MICHAEL JOHNSON B01 </Department>
VINCENZO BARELLI A00
<Department name=“B01”>
CHRISTINE SMITH A00 <emp>MICHAEL</emp>
</Department>
12. SQL/XML
Traditional way SQL/XML
db2 select empno, firstnme, lastname from employee SELECT XMLSerialize(
XMLELEMENT(NAME "TABLE",
EMPNO FIRSTNME LASTNAME -- XMLATTRIBUTES(‟80%‟ AS “width”)
------ ------------ --------------- XMLAGG( XMLELEMENT(NAME "TR",
000010 CHRISTINE HAAS XMLELEMENT(NAME "TD", empno),
000020 MICHAEL THOMPSON XMLELEMENT(NAME "TD", firstnme),
. XMLELEMENT(NAME "TD", lastname))))
. AS varchar(4000)) FROM employee
000030 SALLY KWAN
200340 ROY ALONZO
42 record(s) selected.
// fetching relational data // single fetch and html(xml) is ready
//construct html table <% rs.next(); %>
<table …. <%=(rs.getString(1))%>
<!—setting table attributes -> // job done
<%While(rs.next())// 42 fetches {%>
// construct table rows
<tr…>
<!—setting row attributes ->
//construct table columns
<!—setting column attributes ->
<td…><%=(rs.getString(“EMPNO”))%>
<td…><%=(rs.getString(“FIRSTNME”))%>
<td…><%=(rs.getString(“LASTNAME”))%>
</tr>
<%}%>
13. XQuery
New kid on the block
• A language for running queries against XML-tagged documents in files and “databases”
• Provides XPath compatibility
• Supports conditional expressions, element constructors
• FLOWR expressions the syntax for retrieving, filtering, and transforming operators, functions, path
• Result of an XQuery is an instance of XML Query Data Model
• Uses XML Schema types, offers static typing at compile time and dynamic typing at run time,
supports primitive and derived types
• could evaluate to simple node values (such as elements and attributes) or atomic values (such as
strings and numbers). XQueries can also evaluate to sequences of both nodes and simple values.
• XQuery update is planned
14. FLWOR Expression
• FOR: iterates through a sequence, bind variables to items
• LET: binds a variable to a sequence
• WHERE: eliminates items of the iteration
• ORDER: reorders items of the iteration
• RETURN: constructs query results
FOR $movie in db2-fn:xmlcolumn(„MOVIE.DOC‟)
LET $actors :=$movie/actor
WHERE $movie/duration > 90
ORDER by $movie/@year
RETURN <movie>
<title>Chicago</title>
<movie> <actor>Renne Zellweger</actor>
{$movie/title, $actors} <actor>Richard Gere</actor>
<actor>Catherine Zita-Jones</actor>
</movie> </movie>
16. XQuery
examples
• Getting the list of messages where the description contains a particular string
(“uninitialized” in this case)
xquery
for $a in db2-fn:xmlcolumn('MSG.ITEM')/msg
where contains($a/item/description,"uninitialized")
return $a
• Getting the first 3 messages sent by an author to the news group
xquery
let $a :=
(
for $b in db2-fn:xmlcolumn('MSG.ITEM')/msg
where contains($b/item/author,"Shridhar")
return $b
)
return $a [position() < 4]
17. XQuery
examples
• Getting the last 5 messages sent by an author to the news group
xquery
let $a := for $b in db2-fn:xmlcolumn('MSG.ITEM')/msg
where contains($b/item/author,"Shridhar")
return $b
let $c := count($a)
let $d := $c - 5
return $a [position() > $d]
• Returns the list of authors and the number of messages they have sent to the group
xquery
let $a := db2-fn:xmlcolumn('MSG2.ITEM')/msg/item/author
let $b := distinct-values($a)
for $e in ($b)
let $d := count(for $c in db2-fn:xmlcolumn('MSG2.ITEM')/msg/item
where $c/author = $e
return $c )
return
<result>
<author>{$e}</author>
<message-count>{$d}</message-count>
</result>
18. RSS Generator
• Really Simple Syndication (lightweight XML format designed for sharing data)
• A web application to generate RSS and ATOM feeds
• Source: data (messages) from news servers
• Uploading messages from news server to xmldb2 in xml document format
• Used XML Schema definition support for validation at database level
• Used xml indexes as necessary based on XQueries
• Need just a single xquery fetch to generate RSS/ATOM feeds
20. RSS Generator
(Administration)
news
message xml record
News Updater
NXD
Uploading newsgroup messages to NXD
News Server Database
<msg id=„12‟ newsserver=„news.software.ibm.com‟ newsgroup=„ibm.software.unicode‟>
<item>
<title>Gold Mobile</title>
<link><d1nl7v$4lug$5@news.boulder.ibm.com></link>
<author>Nadine <Nadine.grantham@gmail.com></author>
<pubDate>Tue, 22 Mar 2005, 04:58:39 +0530</pubDate>
<description>some description here…</description>
</item>
</msg>
xml record
21. RSS Generator
NXD
Xquery
<xml>
response
request Generator
Web Browser Web Browser
22. RSS Generator
• One Xquery and the job is done
• Result of XQuery is a single record which is a RSS document
• No DOM/SAX stuff
• Not even 2nd fetch
xquery
for $a in ( 1 to 1 )
return
<rss version="2.0">
<channel>
<title> newsServer:newsGroup </title>
<link>http://newsServer</link>
<description>The latest content from newsServer:newsGroup</description>
<lastBuildDate>Thu, 13 Apr 2006, 17:58:13 +0530</lastBuildDate>
{
let $e :=
( for $b in
db2-fn:xmlcolumn('MSG.ITEM')/msg[@newsserver="newsServer"][@newsgroup="newsGroup"]
where $b/item[fn:contains(title,"subject")]
and/or $b/item[fn:contains(author,"author")]
and/or $b/item[fn:contains(description,"description")]
order by fn:number($b/@id) descending
return $b
)
for $i in ( 1 to n)
return $e[$i]/item
}
</channel>
</rss>
24. Workflow Example
A Document Approval System ( One simple, Content Management Use Case)
• A Web Application
• Uses Native XML features
• Just a single xquery fetch and html (xml) is ready
• Simple and easy to use
• Facilitates document review process
• Uses NXD to store document state related info
• Facilitates easy querying of requests based on assignee, reviewer, request states etc
26. XML in OracleStorage
• XMLType Storage( Gone Relational and not Native)
– CLOB
• Whole Document Stored in one column
• Requires DOM operations
• Text Indexing
• Inefficient update
– Object Relational
• Document Shredded across tables, rows and columns
• Requires XML Schema
• Insert/retrieval requires (de) composition
27. XML in Oracle Index
XML
• CTXXPATH( Gone Relational and not Native)
– When you need to speed up existsNode() queries on an XMLType
column.
– e.g.
• CREATE INDEX [schema.]index on
[schema.]table(XMLType column) INDEXTYPE IS
ctxsys.CTXXPATH [PARAMETERS ('[storage
storage_pref] [memory memsize]')];
– Looks a bit complicated
– No XML specific index support
28. XML in OracleXquery
• No full support for Xquery
e.g.
SELECT XMLQuery(„Xquery for $a in
ora:view(„MSG‟)/ROW/ITEM/msg[@newsgroup=“pspl.misc”]
Return <root>{$a/item/title}</root>‟) from MSG
This xquery will return some “null” values where newsgroup condition doesn‟t match
The xquery will need to be modified to suppress the „null‟ values and so to get the proper result
SELECT XMLQuery(„Xquery for $a in
ora:view(„MSG‟)/ROW/ITEM/msg[@newsgroup=„pspl.misc‟]
Return <root>{$a/item/title}</root>‟) from MSG
WHERE ExistsNode(ITEM,‟/msg[@newsgroup=“pspl.misc”]‟)=1
More the conditions, bigger the query with more number of ExistsNode calls
29. XML in OracleXquery
• No full support for Xquery
another example
SELECT XMLQuery(„Xquery for $a in ora:view(„MSG‟)/ROW/ITEM/msg
Where contains($a/item/title,”join”)
Return <root>{$a/item/title}</root>‟) from MSG
This xquery will return some “null” values where contains return false
– Now there is no workaround for this. One can not modify this query to give proper result as
one can not specify “contains” function within ExistsNode. So possible workaround is to add
some code at application level to suppress „null‟ values
30. XML x Oracle
(Sample Database Design) Xquery performance
Table1: msg
Item (xml)
DB2 xml index:
<msg id=„12‟ newsserver=„news.persistent.co.in‟ newsgroup=„comp.lang.c‟>
create index xind_newsserver
<item>
on msg(item)
<title>Re: SIGPIPE - Finding the thread</title>
generate key using
<link><e1lqu9$2ee$1@news.intranet.pspl.co.in></link>
xmlpattern '//@newsserver'
<author>sushrut bidwai <sushrut_bidwai@persistent.co.in></author>
as sql varchar(50);
<pubDate>Thu, 13 Apr 2006, 09:49:39 +0530</pubDate>
<description>some description here…</description>
</item>
</msg>
<msg id=„12‟ newsserver=„news.software.ibm.com‟ newsgroup=„ibm.software.unicode‟>
<item>
<title>Gold Mobile</title>
<link><d1nl7v$4lug$5@news.boulder.ibm.com></link>
<author>Nadine <Nadine.grantham@gmail.com></author>
<pubDate>Tue, 22 Mar 2005, 04:58:39 +0530</pubDate>
<description>some description here…</description>
</item>
</msg>
Oracle CTXXPATH index:
CREATE INDEX on MSG (ITEM) around 4, 50 000 xml records
INDEXTYPE IS ctxsys.CTXXPATH on both side DB2 and oracle
31. XML x OracleXquery performance
Db2_1.sql
xquery
for $a in db2-fn:xmlcolumn('MSG.ITEM')/msg
where contains($a/item/description,“sample")
return $a
Execution time in milliseconds: 187525
ora_1.sql
select
xmlquery ('for $a in /msg
where contains($a/item/description,"sample")
return $a'
passing item returning content) result
from msg
ORA-04030: out of process memory
32. XML x OracleXquery performance
Db2_2.sql
xquery
for $a in db2-fn:xmlcolumn('MSG.ITEM')/msg
where contains($a/item/title,"Lint")
return $a
Execution time in milliseconds: 198474
ora_2.sql
select
xmlquery ('for $a in /msg
where contains($a/item/title,“Lint")
return $a'
passing item returning content) result
from msg
ORA-04030: out of process memory
33. XML x OracleXquery performance
Db2_3.sql
xquery
for $a in db2-fn:xmlcolumn('MSG.ITEM')/msg
where $a/@newsgroup = "control.cancel"
return $a
Execution time in milliseconds: 126858
ora_3.sql
select
xmlquery ('for $a in /msg
return $a'
passing item returning content) result
from msg
Where existsNode(ITEM,'/msg[@newsgroup="control.cancel"]')=1
Took more than an hour to fetch all records
34. XML x OracleXquery performance
Db2_4.sql
xquery
let $a := (
for $b in db2-fn:xmlcolumn('MSG.ITEM')/msg
where contains($b/item/author,"Shantanu Gadgil")
return $b
)
return $a [position() < 10]
Execution time in milliseconds: 173419
ora_4.sql
select * from
(
select xmlquery ('for $a in /msg
where contains($a/item/author,"Shantanu Gadgil ")
order by $a[@id]
return $a'
passing item returning content) result
from msg
)
where rownum <= 10
ORA-04030: out of process memory