2. Objectives
What is XPath?
An introduction to the XPath 1.0 language
XML refresher
XPath basics
What else can you do with XPath 1.0?
Where to go for more information
3. XPath: XML Path Language
Path notation with slashes
  newsItem/rightsInfo/copyrightHolder
  recipe/ingredientList/ingredient
Like UNIX directory paths or URLs
4. What is XPath?
A syntax for addressing parts of an XML document
  Locate elements or attributes
  Perform operations over data
XPath contains a library of standard functions
  Numeric, string, boolean
A major part of several XML standards
  XSLT, XQuery, XML Schema, Schematron
5. XPath Introduction: XML Refresher
XML documents contain one or more elements, delimited by start and end tags
  <foo> </foo>
Elements can be nested to any depth
  <foo> <bar></bar> </foo>
6. XML Attributes and Text Content
Elements can have attributes
  <foo lang="fr"> <bar id="theOne" lang="en"></bar> </foo>
Elements can have text content
  <foo lang="fr"> <bar lang="en">theOne</bar> </foo>
Empty elements have no children or text
  <foo></foo>
A shorthand for writing empty elements
  <foo />
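To make the refresher concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree module (an assumption; the slides do not name a parser). It merges the slide's two <foo> examples into one document so that <bar> has both attributes and text content.

```python
import xml.etree.ElementTree as ET

# The slide's examples merged: <bar> carries attributes and text content.
doc = '<foo lang="fr"><bar id="theOne" lang="en">theOne</bar></foo>'
foo = ET.fromstring(doc)
bar = foo[0]  # first child element of <foo>

foo_lang = foo.get("lang")    # attribute on the outer element
bar_attrs = dict(bar.attrib)  # all attributes of <bar> as a plain dict
bar_text = bar.text           # text content of <bar>

# <foo></foo> and <foo /> parse to the same empty element.
long_form = ET.fromstring("<foo></foo>")
short_form = ET.fromstring("<foo />")
same_empty = (
    len(long_form) == len(short_form) == 0
    and long_form.text is None
    and short_form.text is None
)
```

Both spellings of the empty element end up with no children and no text, which is why the shorthand is safe to use.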
7. XML Namespaces
Elements can be defined in different namespaces
Namespaces look like URLs
You can use xmlns to declare a default namespace
  <newsItem xmlns='http://iptc.org/std/nar/2006-10-01/'> <itemMeta> <title>Pope Blesses Astronauts</title> </itemMeta> </newsItem>
newsItem is in the http://iptc.org/std/nar/2006-10-01/ namespace
itemMeta and title are also in the http://iptc.org/std/nar/2006-10-01/ namespace
Unprefixed child elements inherit the default namespace from their parents
8. XML Namespace Prefixes
You can use xmlns:prefix to declare a namespace and bind it to a prefix
  <nar:newsItem xmlns:nar='http://iptc.org/std/nar/2006-10-01/'> <nar:itemMeta> <nar:title>Pope Blesses Astronauts</nar:title> </nar:itemMeta> </nar:newsItem>
newsItem is in the http://iptc.org/std/nar/2006-10-01/ namespace
itemMeta and title are also in the http://iptc.org/std/nar/2006-10-01/ namespace
To a namespace-aware XML parser, this document and the previous one are equivalent
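The claim that the default-namespace and prefixed spellings are interchangeable can be checked with a short sketch. It assumes Python's standard xml.etree.ElementTree as the parser, which reports each element name as {namespace-URI}localname; the helper qualified_names is a hypothetical name introduced for illustration.

```python
import xml.etree.ElementTree as ET

NS = "http://iptc.org/std/nar/2006-10-01/"

# Same content, once with a default namespace and once with a prefix.
default_ns_doc = (
    "<newsItem xmlns='http://iptc.org/std/nar/2006-10-01/'>"
    "<itemMeta><title>Pope Blesses Astronauts</title></itemMeta>"
    "</newsItem>"
)
prefixed_doc = (
    "<nar:newsItem xmlns:nar='http://iptc.org/std/nar/2006-10-01/'>"
    "<nar:itemMeta><nar:title>Pope Blesses Astronauts</nar:title></nar:itemMeta>"
    "</nar:newsItem>"
)

def qualified_names(xml_text):
    """Return the namespace-qualified tag of every element, in document order."""
    return [el.tag for el in ET.fromstring(xml_text).iter()]

names_default = qualified_names(default_ns_doc)
names_prefixed = qualified_names(prefixed_doc)
```

Both lists contain the same three qualified names, so the prefix itself carries no meaning; only the namespace URI it is bound to does.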
9. XPath Crash Course - The Basics: Selecting Elements
The simplest XPath form: one or more tag names, separated by slashes (/)
  newsItem/itemMeta/title  <- title under itemMeta
Use * instead of a tag name to match any element
  newsItem/*/title  <- title grandchildren of newsItem
An empty step (a double slash, //) searches all levels of the tree
  //title  <- every title element in the document
  newsItem//title  <- every title anywhere under newsItem
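The three selection forms can be tried out with a small sketch. It assumes Python's standard xml.etree.ElementTree, which implements only a subset of XPath 1.0 and evaluates paths relative to an element, so the leading newsItem step is implied by starting at the root. The document and the helper texts are hypothetical, and the document is namespace-free so that plain tag names match.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace-free document with title elements at several depths.
doc = """
<newsItem>
  <itemMeta><title>Pope Blesses Astronauts</title></itemMeta>
  <contentMeta><title>Vatican</title></contentMeta>
  <contentSet><inlineXML><title>Nested</title></inlineXML></contentSet>
</newsItem>
"""
root = ET.fromstring(doc)  # root is the newsItem element

def texts(path):
    """Evaluate a path relative to newsItem and return the matched text."""
    return [el.text for el in root.findall(path)]

under_item_meta = texts("itemMeta/title")  # newsItem/itemMeta/title
grandchildren = texts("*/title")           # newsItem/*/title
all_titles = texts(".//title")             # newsItem//title
```

Here under_item_meta holds one title, grandchildren holds the two titles exactly two levels down, and all_titles also picks up the more deeply nested one.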
10. XPath: Using Attributes
Attributes are indicated by @
  @rel  <- the rel attribute of the current element
Elements and attributes are joined by /@
  link/@rel  <- the rel attribute of the link element
Use [] for conditional selections
  link[@rel]  <- link elements that have a rel attribute
  link[@rel = "parent"]
  link[@size < "1000"]  <- XPath 1.0 compares both sides as numbers
  link[not(@href)]
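A rough sketch of these attribute tests, assuming Python's standard xml.etree.ElementTree and a hypothetical document. That module supports the [@rel] and [@rel='parent'] predicates directly, but it returns elements only and has no not() or < in predicates, so the last three slide expressions are emulated in plain Python here rather than run as XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: three link elements with varying attributes.
doc = """
<item>
  <link rel="parent" href="a.xml" size="500"/>
  <link rel="child" href="b.xml" size="2000"/>
  <link size="10"/>
</item>
"""
item = ET.fromstring(doc)
links = item.findall("link")

# Supported directly by ElementTree's XPath subset.
has_rel = item.findall("link[@rel]")
parents = item.findall("link[@rel='parent']")

# link/@rel: read the attribute from each matched element with get().
rel_values = [link.get("rel") for link in has_rel]

# link[@size < "1000"], emulated: a missing size becomes NaN and never matches.
small = [link for link in links if float(link.get("size", "nan")) < 1000]

# link[not(@href)], emulated.
no_href = [link for link in links if link.get("href") is None]
```

A full XPath 1.0 engine would accept all five expressions as written; the emulation only mirrors their meaning.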
11. XPath and Namespaces
XPath supports namespaces
  nitf:p  <- the p element from the nitf namespace
  xhtml:p  <- the p element from the xhtml namespace
  nar:*  <- any element from the nar namespace
  @atom:*  <- any attribute from the atom namespace
Protip: if you can’t figure out why your XPath expression isn’t matching, check the namespace
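The protip can be illustrated with a short sketch, assuming Python's standard xml.etree.ElementTree. An unprefixed name in the path matches only elements in no namespace, so it finds nothing in a document that declares a default namespace. The prefix nar in the mapping is bound by the query, not by the document, and any prefix would do.

```python
import xml.etree.ElementTree as ET

doc = (
    "<newsItem xmlns='http://iptc.org/std/nar/2006-10-01/'>"
    "<itemMeta><title>Pope Blesses Astronauts</title></itemMeta>"
    "</newsItem>"
)
root = ET.fromstring(doc)

# Prefix-to-URI bindings used by the query; the prefix name is our choice.
ns = {"nar": "http://iptc.org/std/nar/2006-10-01/"}

# No prefix: looks for title in no namespace and matches nothing.
unprefixed = root.findall(".//title")

# Prefixed steps resolve to the namespace URI and match.
prefixed = root.findall("nar:itemMeta/nar:title", ns)
prefixed_text = [el.text for el in prefixed]
```

The document never mentions the nar prefix, yet the prefixed query matches, because matching is done on the namespace URI.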
12. What Else Can XPath Do? Numeric, String, Boolean Functions
  Publication/FilingMetadata[1]
  Publication/FilingMetadata[last()]
  Publication/FilingMetadata[last() - 1]
  FilingMetadata[position() mod 2 = 0]
  FilingMetadata[Category = "q" or Category = "j"]
  not(contains(SlugLine, "advisory"))
  starts-with(FilingOnlineCode, "1")
And XPath 2.0 adds even more functions, including regular expressions
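To show what these expressions select, here is a sketch over hypothetical data, assuming Python's standard xml.etree.ElementTree. Its XPath subset handles [1], [last()] and [last()-1] (written without spaces around the minus sign), while position(), or, contains() and starts-with() are not available there and are emulated in Python. The helper slugs is a hypothetical name.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: four FilingMetadata entries under Publication.
doc = """
<Publication>
  <FilingMetadata><Category>q</Category><SlugLine>Pope-Astronauts</SlugLine><FilingOnlineCode>1110</FilingOnlineCode></FilingMetadata>
  <FilingMetadata><Category>a</Category><SlugLine>News advisory</SlugLine><FilingOnlineCode>2200</FilingOnlineCode></FilingMetadata>
  <FilingMetadata><Category>j</Category><SlugLine>Vatican-Space</SlugLine><FilingOnlineCode>1300</FilingOnlineCode></FilingMetadata>
  <FilingMetadata><Category>s</Category><SlugLine>Sports-Roundup</SlugLine><FilingOnlineCode>3100</FilingOnlineCode></FilingMetadata>
</Publication>
"""
pub = ET.fromstring(doc)  # pub is the Publication element
filings = pub.findall("FilingMetadata")

def slugs(items):
    """Identify each matched entry by its SlugLine text."""
    return [f.findtext("SlugLine") for f in items]

# Positional predicates supported by ElementTree (positions start at 1).
first = pub.findall("FilingMetadata[1]")
last = pub.findall("FilingMetadata[last()]")
second_last = pub.findall("FilingMetadata[last()-1]")

# FilingMetadata[position() mod 2 = 0], emulated.
even = [f for i, f in enumerate(filings, start=1) if i % 2 == 0]

# FilingMetadata[Category = "q" or Category = "j"], emulated.
q_or_j = [f for f in filings if f.findtext("Category") in ("q", "j")]

# not(contains(SlugLine, "advisory")), emulated as a filter.
not_advisory = [f for f in filings if "advisory" not in f.findtext("SlugLine", "")]

# starts-with(FilingOnlineCode, "1"), emulated as a filter.
code_starts_1 = [f for f in filings if f.findtext("FilingOnlineCode", "").startswith("1")]
```

With a full XPath 1.0 engine the slide's expressions could be evaluated as written; the Python filters are only meant to mirror what each one selects.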
13. More XPath Information
List of examples: http://msdn.microsoft.com/en-us/library/ms256086.aspx
Introductory, interactive tutorial: http://www.zvon.org/comp/r/tut-XPath_1.html
More advanced tutorial: http://www.ibm.com/developerworks/xml/tutorials/x-xpath/section2.html
XPath chapter from XML in a Nutshell: http://oreilly.com/catalog/xmlnut/chapter/ch09.html