Introduction
                                        XML APIs in Java
                 Capabilities and performance comparison
CASE STUDY: Parsing Really Simple Syndication (RSS) doc
     What next? Alternatives to APIs, Java SE 7.0 features
                                                 Summary
                                         Further reading...




            eXtensible Markup Language APIs in Java 1.6
                Simple and efficient XML parsing using the Java language


                                            Wojciech Podgórski
                                              http://podgorski.wordpress.com




                                                   April 8, 2008



       Wojciech Podgórski http://podgorski.wordpress.com          eXtensible Markup Language APIs in Java 1.6


Presentation outline
      1    Introduction
              What is parsing
              Different ways of parsing documents
      2    XML APIs in Java
              SAX
              DOM
              StAX
      3    Capabilities and performance comparison
      4    CASE STUDY: Parsing Really Simple Syndication (RSS) doc
      5    What next? Alternatives to APIs, Java SE 7.0 features
      6    Summary
      7    Further reading...




      Parsing definition
      Parsing, more formally called syntactic analysis, is the process of
      analyzing a sequence of tokens to determine grammatical structure
      with respect to a given formal grammar.
                                                                      Source: http://en.wikipedia.org/wiki/Parsing








      We can distinguish three main models of parsing XML documents.
      They differ in how they traverse between nodes and in how they
      process XML data.
      These models are:

              SAX - Simple API for XML
              DOM - Document Object Model
              StAX - Streaming API for XML








      That’s not all! There are other approaches, which won’t be
      described in this presentation.





              JAXB - Java Architecture for XML Binding
              A technology that marshals Java objects into XML and,
              in reverse, unmarshals XML elements back into Java
              objects. It works on top of another parser (usually
              a streaming parser).








              Javolution
              A library providing a real-time, StAX-like implementation that
              does not force object creation and has a smaller effect on
              memory footprint and garbage collection, using e.g. lookup
              tables for retrieving and reusing data.





              VTD-XML - Virtual Token Descriptor for XML
              A collection of efficient processing technologies centered
              around a non-extractive, "document-centric" parsing
              technique called VTD. Supports random access and XPath.





SAX as a processing model


      SAX should be considered first of all as a specific processing
      mechanism rather than as a simple API. SAX represents an
      event-driven architecture: the parser performs an operation
      each time a particular event occurs.

      To handle these events, the user defines a number of callback
      methods, which the parser calls whenever it encounters the
      corresponding part of the document (for example, the start or
      end of an element).
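
      The callback idea can be sketched as follows. This is only a minimal
      illustration, not code from the presentation: the class names
      SaxEventDemo and RecordingHandler and the sample document are
      assumptions made for the example. The handler records each event in
      the order in which the parser reports it.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class (not part of the original slides).
class SaxEventDemo {

    // DefaultHandler provides empty callbacks; only the needed ones are overridden.
    static class RecordingHandler extends DefaultHandler {
        final List<String> events = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            events.add("start:" + qName);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // A parser may deliver one run of text in several chunks;
            // for this short sample a single call is assumed.
            String text = new String(ch, start, length).trim();
            if (!text.isEmpty()) {
                events.add("text:" + text);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            events.add("end:" + qName);
        }
    }

    // Parses the given XML text and returns the events in the order they were reported.
    static List<String> collectEvents(String xml) {
        RecordingHandler handler = new RecordingHandler();
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse the sample document", e);
        }
        return handler.events;
    }

    public static void main(String[] args) {
        for (String event : collectEvents("<report><title>Q1</title></report>")) {
            System.out.println(event);
        }
    }
}
```

      For the sample document in main, the handler is notified of
      start:report, start:title, text:Q1, end:title and end:report, in
      that order. Nothing is kept in memory unless the handler stores it.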








                                 Figure: Top-down parsing in SAX API






      In Java, the SAX API is a collection of classes and interfaces,
      some of which have to be implemented when building an XML
      processing application. The package containing this collection is:




                                      org.xml.sax.*






                            Figure: org.xml.sax.* package class diagram




Basic class structure

      import org.xml.sax.InputSource;
      import org.xml.sax.XMLReader;
      import org.xml.sax.helpers.XMLReaderFactory;

      // Declare document URI
      String xmlURI = "http://example.com/report.xml";

      // Create reader instance
      XMLReader reader = XMLReaderFactory.createXMLReader();

      // Set implementation class of ContentHandler
      // (MyContentHandler is the user's own class)
      reader.setContentHandler(new MyContentHandler());

      // Resolve document source
      InputSource inputSource = new InputSource(xmlURI);

      // Parse document
      reader.parse(inputSource);



Different SAX implementations


      // Xerces implementation
      XMLReader xercesReader =
              new org.apache.xerces.parsers.SAXParser();

      // JAXP implementation: the factory is instantiated first, and
      // the XMLReader is obtained from the parser
      SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
      XMLReader jaxpReader = parser.getXMLReader();

      // Piccolo implementation
      XMLReader piccoloReader = new com.bluecast.xml.Piccolo();






Other SAX features

      SAX provides a number of interfaces for correct data handling. Some
      of them process not only the content of a document, but also its
      structure.

      Interfaces such as:
              ErrorHandler
              EntityResolver
              DTDHandler

      also analyze the structure of the document: they report errors,
      resolve external entities, and receive the notation and unparsed
      entity declarations found in the DTD.
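
      As an illustration of one of these interfaces, here is a small sketch
      of an ErrorHandler. The names SaxErrorDemo and CollectingErrorHandler
      are hypothetical and chosen only for this example. The handler
      collects the problems reported by the parser instead of printing them.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;

// Hypothetical demo class (not part of the original slides).
class SaxErrorDemo {

    // Collects every problem reported by the parser instead of printing it.
    static class CollectingErrorHandler implements ErrorHandler {
        final List<String> problems = new ArrayList<>();

        @Override
        public void warning(SAXParseException e) {
            problems.add("warning, line " + e.getLineNumber());
        }

        @Override
        public void error(SAXParseException e) {
            problems.add("error, line " + e.getLineNumber());
        }

        @Override
        public void fatalError(SAXParseException e) throws SAXParseException {
            problems.add("fatal, line " + e.getLineNumber());
            throw e; // a document that is not well-formed cannot be processed further
        }
    }

    // Parses the XML text and returns the list of reported problems (empty if none).
    static List<String> check(String xml) {
        CollectingErrorHandler handler = new CollectingErrorHandler();
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setErrorHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (SAXParseException e) {
            // already recorded by fatalError()
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("parser could not be set up or read its input", e);
        }
        return handler.problems;
    }

    public static void main(String[] args) {
        System.out.println(check("<a><b/></a>"));  // well-formed document
        System.out.println(check("<a><b></a>"));   // <b> is never closed
    }
}
```

      The three callbacks correspond to the three severities defined by
      SAX: warnings, recoverable errors (such as validity errors) and
      fatal errors (violations of well-formedness).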



Advanced SAX features I


      The SAX API is considered a very flexible solution, mainly because
      it can be configured through properties and features.
      void setProperty(String propertyID, Object value);
      void setFeature(String featureID, boolean state);

      Properties and features modify the parser's behaviour while it
      processes a document. For example, beyond the well-formedness
      check that every XML parser performs, we can ask the parser to
      validate the document against the DTD or schema related to it.
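
      A short sketch of switching a feature on and off is given below. The
      feature identifier http://xml.org/sax/features/validation is the
      standard SAX2 name for DTD validation; the class SaxFeatureDemo and
      the sample document are assumptions made for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class (not part of the original slides).
class SaxFeatureDemo {

    // Standard SAX2 feature identifier for DTD validation.
    static final String VALIDATION = "http://xml.org/sax/features/validation";

    // Hypothetical sample: the DTD allows only <to> inside <note>,
    // but the document contains <from>, so it is well-formed yet invalid.
    static final String SAMPLE =
            "<!DOCTYPE note [<!ELEMENT note (to)><!ELEMENT to (#PCDATA)>]>"
            + "<note><from>x</from></note>";

    // Parses the XML text with validation on or off and counts the reported validity errors.
    static int countValidityErrors(String xml, boolean validate) {
        final int[] errors = {0};
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setFeature(VALIDATION, validate);
            reader.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    errors[0]++; // validity errors are recoverable, so parsing continues
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse the sample document", e);
        }
        return errors[0];
    }

    public static void main(String[] args) {
        System.out.println("validation on:  " + countValidityErrors(SAMPLE, true));
        System.out.println("validation off: " + countValidityErrors(SAMPLE, false));
    }
}
```

      With the feature switched off the same document is accepted without
      complaints, because it is still well-formed.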






Advanced SAX features II
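      Among many other interesting SAX features, one is very important
      and radically extends SAX capabilities. The XMLFilter interface
      makes it possible to create a cascade of filters, each performing
      a different processing operation. Because all of them work during
      a single pass over the document, processing as a whole can be
      much faster than parsing the document once per operation.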




                     Figure: Cascade processing using XMLFilter interface
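
      A cascade like the one in the figure could be sketched with the
      helper class org.xml.sax.helpers.XMLFilterImpl, which implements
      XMLFilter and forwards every event unchanged by default. The two
      filters below (DropDraftFilter and UpperCaseFilter) and the element
      name "draft" are hypothetical and serve only as an illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical demo class (not part of the original slides).
class SaxFilterDemo {

    // Stage 1: drops every <draft> element together with its content.
    static class DropDraftFilter extends XMLFilterImpl {
        private int skipDepth = 0; // greater than 0 while inside a <draft> element

        DropDraftFilter(XMLReader parent) {
            super(parent);
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts)
                throws SAXException {
            if (skipDepth > 0 || qName.equals("draft")) {
                skipDepth++;
                return;
            }
            super.startElement(uri, localName, qName, atts);
        }

        @Override
        public void endElement(String uri, String localName, String qName) throws SAXException {
            if (skipDepth > 0) {
                skipDepth--;
                return;
            }
            super.endElement(uri, localName, qName);
        }

        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            if (skipDepth == 0) {
                super.characters(ch, start, length);
            }
        }
    }

    // Stage 2: converts element names to upper case.
    static class UpperCaseFilter extends XMLFilterImpl {
        UpperCaseFilter(XMLReader parent) {
            super(parent);
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts)
                throws SAXException {
            super.startElement(uri, localName.toUpperCase(), qName.toUpperCase(), atts);
        }

        @Override
        public void endElement(String uri, String localName, String qName) throws SAXException {
            super.endElement(uri, localName.toUpperCase(), qName.toUpperCase());
        }
    }

    // Runs the document through both filters and returns the element names seen at the end.
    static List<String> elementNames(String xml) {
        final List<String> names = new ArrayList<>();
        try {
            XMLReader parser = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            // parser -> DropDraftFilter -> UpperCaseFilter -> final handler
            XMLReader pipeline = new UpperCaseFilter(new DropDraftFilter(parser));
            pipeline.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName, String qName, Attributes atts) {
                    names.add(qName);
                }
            });
            pipeline.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse the sample document", e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(elementNames("<doc><draft><p>x</p></draft><final/></doc>"));
    }
}
```

      Each filter is itself an XMLReader, so the application talks only to
      the last stage of the cascade, and the document is read just once.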




What SAX cannot do... I




      Q:       Why do we need other mechanisms, if SAX is so good?





      A:  SAX has some serious limitations due to its sequential data
      access.






What SAX cannot do... II


      SAX parses data from beginning to end and does not allow going
      back. It also has some other drawbacks:

              it is unable to modify content or structure of document
              it cannot access specific or random elements
              it cannot access sibling elements
              it is not serializable





      So it may seem that SAX is useless. THAT’S NOT TRUE! (see the
      comparison section). Every issue mentioned above can be resolved
      by the complement of SAX...



DOM as a processing model



      The Document Object Model is based on a completely different idea.
      Instead of parsing the document and reacting to specific events
      (though it is able to do that), it builds a tree based on the
      document's structure and stores it in memory as an object.
      Thanks to this, every node in the tree is always available and can
      be accessed later, as many times as needed. Moreover, the structure
      stored in memory can easily be transformed in many ways.
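
      The following sketch illustrates this under a few assumptions: the
      class DomEditDemo, the element name "item" and the sample document
      are hypothetical, and the tree is written back with the identity
      transformer from javax.xml.transform, which is only one of several
      possible ways to serialize a DOM tree.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo class (not part of the original slides).
class DomEditDemo {

    // Parses the XML text, appends a new <item> to the root element
    // and returns the modified document as text.
    static String addItem(String xml, String text) {
        try {
            // Build the in-memory tree
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            // Modify the structure: something a SAX handler cannot do
            Element item = doc.createElement("item");
            item.setTextContent(text);
            doc.getDocumentElement().appendChild(item);

            // Serialize the tree back to text with the identity transformer
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | SAXException | IOException
                | TransformerException e) {
            throw new IllegalStateException("could not process the sample document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(addItem("<list><item>a</item></list>", "b"));
    }
}
```

      The price of this convenience is memory: the whole document has to
      fit into the tree before the first node can be touched.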






DOM architecture I

      DOM, in contrast to SAX, is a standard developed by the W3C¹.
      Due to this standardization it has a strict architecture divided
      into levels, each containing required and optional modules.

      To claim to support a level, an application must implement all the
      requirements of the claimed level and the levels below it. There are
      3 levels; the newest (DOM 3) was published in 2004 and is
      the current release of the DOM specification.

      Every level has its core, which is the root for the other modules
      (figure).

           ¹ The standard itself can be found on the W3C website.
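
      Whether a given Java DOM implementation claims support for a module
      at a given level can be checked at run time, as in the sketch below.
      The class DomLevelDemo is hypothetical; hasFeature is the method of
      org.w3c.dom.DOMImplementation, and the answers depend on the parser
      that is installed.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMImplementation;

// Hypothetical demo class (not part of the original slides).
class DomLevelDemo {

    // Asks the default DOM implementation whether it claims to support
    // the given module (e.g. "Core") in the given version (e.g. "3.0").
    static boolean supports(String feature, String version) {
        try {
            DOMImplementation impl = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .getDOMImplementation();
            return impl.hasFeature(feature, version);
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("no DOM implementation available", e);
        }
    }

    public static void main(String[] args) {
        for (String version : new String[] {"2.0", "3.0"}) {
            System.out.println("Core " + version + ": " + supports("Core", version));
        }
    }
}
```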




      Figure: Document Object Model architecture                        (Adapted from original W3C specification)








      In Java, DOM is structured differently from SAX. Almost every
      interface representing the Document Object Model extends the
      org.w3c.dom.Node interface.

      Such a framework allows very simple data manipulation and
      traversal between the nodes of the tree. It is essential to
      understand how elements are stored in the tree (figure).

      For example, if we want to read the text data of element A, we
      should get its child node containing the text, not try to read
      the value of element A itself.
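
      The difference can be seen in a short sketch. The class
      DomTextNodeDemo and the one-element sample document are assumptions
      made for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo class (not part of the original slides).
class DomTextNodeDemo {

    // Returns { value of the root element node itself, value of its first child node }.
    static String[] elementValueAndChildText(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Element a = doc.getDocumentElement();
            Node child = a.getFirstChild(); // the Text node holding the character data
            return new String[] {a.getNodeValue(), child.getNodeValue()};
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse the sample document", e);
        }
    }

    public static void main(String[] args) {
        String[] values = elementValueAndChildText("<A>hello</A>");
        System.out.println("element value: " + values[0]); // element nodes have no value
        System.out.println("child value:   " + values[1]);
    }
}
```

      For element nodes getNodeValue() returns null; the text lives in
      a separate child node. The DOM Level 3 method getTextContent() is
      a shortcut that concatenates the text of all descendants.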







                      Figure: org.w3c.dom.* package class diagram                       From [1]






Basic class structure using Java implementation

      import javax.xml.parsers.DocumentBuilder;
      import javax.xml.parsers.DocumentBuilderFactory;
      import org.w3c.dom.Document;

      String docURI = "http://example.org/nutrition.xml";
      // get new DocumentBuilderFactory
      DocumentBuilderFactory docBuilderFactory =
              DocumentBuilderFactory.newInstance();
      // get new DocumentBuilder
      DocumentBuilder docBuilder =
              docBuilderFactory.newDocumentBuilder();
      // parse document
      Document doc = docBuilder.parse(docURI);
      // extract root element and
      // normalize whole tree (optional)
      doc.getDocumentElement().normalize();



Accessing elements

      // get all "food" elements
      NodeList elements = doc.getElementsByTagName("food");
      for (int i = 0; i < elements.getLength(); i++)
      {
          Node food = elements.item(i);
          // getNodeName() would only return "food", so search
          // the text content for "Avocado Dip"
          if (food.getTextContent().contains("Avocado Dip"))
          {
              NodeList l = food.getChildNodes();
              for (int j = 0; j < l.getLength(); j++)
                  // print out calories
                  if (l.item(j).getNodeName().equals("calories"))
                      System.out.println(l.item(j).getTextContent());
          }
      }
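A self-contained variant of this lookup might look as follows. It is only a sketch: the CaloriesLookup class, its method names and the SAMPLE document are hypothetical, and the sample assumes that each food element carries name and calories child elements with plain text content, as the listing above suggests.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; the document layout below is an assumption.
class CaloriesLookup {
    static final String SAMPLE =
            "<nutrition>"
          + "<food><name>Avocado Dip</name><calories>110</calories></food>"
          + "<food><name>Bagels</name><calories>300</calories></food>"
          + "</nutrition>";

    // Returns the text of the first direct child element with the given tag.
    static String childText(Element parent, String tag) {
        NodeList children = parent.getChildNodes();
        for (int j = 0; j < children.getLength(); j++) {
            Node child = children.item(j);
            if (child.getNodeType() == Node.ELEMENT_NODE
                    && child.getNodeName().equals(tag)) {
                return child.getTextContent();
            }
        }
        return null;
    }

    // Finds the <food> with the given <name> and returns its <calories> text.
    static String caloriesOf(String xml, String foodName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList foods = doc.getElementsByTagName("food");
            for (int i = 0; i < foods.getLength(); i++) {
                Element food = (Element) foods.item(i);
                if (foodName.equals(childText(food, "name"))) {
                    return childText(food, "calories");
                }
            }
            return null; // no such food
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(caloriesOf(SAMPLE, "Avocado Dip")); // 110
    }
}
```

Comparing the name child exactly, instead of searching the whole text content, avoids false matches when another field happens to contain the same words.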



Modifying elements

      ...
      if (l.item(j).getNodeName().equals("calories"))
      {
          // text content is a String, so it has to be parsed
          int cal = Integer.parseInt(l.item(j).getTextContent().trim());
          // if the avocado dip has more than 300 cal.
          if (cal > 300)
          {
              Node avocadoDip = l.item(j).getParentNode();
              // replace it with low-fat food
              Element newFood = doc.createElement("LowFatFood");
              // replaceChild() is called on the parent of the replaced node
              avocadoDip.getParentNode().replaceChild(newFood, avocadoDip);
          }
      }
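The replacement step can be tried in isolation with a sketch like the one below. The ReplaceFoodDemo class, its method names and the sample data used in main are assumptions for illustration. The list returned by getElementsByTagName is live, so the sketch copies the matching elements before it starts replacing them.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class, not taken from the original presentation.
class ReplaceFoodDemo {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Replaces every <food> whose <calories> exceeds the limit with an
    // empty <LowFatFood> element and returns how many were replaced.
    static int replaceHighCalorieFoods(Document doc, int limit) {
        NodeList foods = doc.getElementsByTagName("food");
        // Copy first: the NodeList is live and changes as nodes are replaced.
        List<Element> snapshot = new ArrayList<Element>();
        for (int i = 0; i < foods.getLength(); i++) {
            snapshot.add((Element) foods.item(i));
        }
        int replaced = 0;
        for (Element food : snapshot) {
            NodeList cal = food.getElementsByTagName("calories");
            if (cal.getLength() == 0) {
                continue; // nothing to compare
            }
            int calories = Integer.parseInt(cal.item(0).getTextContent().trim());
            if (calories > limit) {
                Element lowFat = doc.createElement("LowFatFood");
                // replaceChild() must be called on the parent of the old node
                food.getParentNode().replaceChild(lowFat, food);
                replaced++;
            }
        }
        return replaced;
    }

    public static void main(String[] args) {
        Document doc = parse("<nutrition>"
                + "<food><name>Avocado Dip</name><calories>310</calories></food>"
                + "<food><name>Salad</name><calories>90</calories></food>"
                + "</nutrition>");
        System.out.println(replaceHighCalorieFoods(doc, 300)); // 1
    }
}
```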



Different DOM implementations

      // Xerces DOM implementation
      DOMParser p = new org.apache.xerces.parsers.DOMParser();
      p.parse(new InputSource(xmlURI));
      org.w3c.dom.Document doc = p.getDocument();

      // JDOM: DOMBuilder converts an existing org.w3c.dom.Document
      DOMBuilder builder = new org.jdom.input.DOMBuilder();
      org.jdom.Document d = builder.build(doc);
      // it's org.jdom.Document, not org.w3c.dom.Document!

      // dom4j DOM implementation
      SAXReader reader = new org.dom4j.io.SAXReader();
      org.dom4j.Document document = reader.read(xmlURI);
      // it's org.dom4j.Document, not org.w3c.dom.Document!



Advanced DOM features I


      DOM provides many advanced features through modules specified
      in the standard (mainly Level 3 modules). Some of them:

              The MutationEvents module provides methods for listening
              to changes
              The LS and LS-Async modules provide methods for various
              kinds of serialization
              The Validation module provides methods for real-time validation





Advanced DOM features II



      When using a particular API, it is important to check which
      modules are implemented, and in which version. To do this, we
      can use the following method of org.w3c.dom.DOMImplementation:

      boolean hasFeature(String feature, String version);
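As a rough sketch of such a check, the snippet below asks the DOM implementation behind the default DocumentBuilder which features it reports. The DomFeatureCheck class and its supports helper are hypothetical names; the answers depend on the parser installed, so no particular output is promised for the Level 3 modules.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMImplementation;

// Hypothetical helper class for illustration only.
class DomFeatureCheck {
    // Asks the default DOM implementation whether it supports a feature.
    static boolean supports(String feature, String version) {
        try {
            DOMImplementation impl = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .getDOMImplementation();
            return impl.hasFeature(feature, version);
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("no DOM parser available", e);
        }
    }

    public static void main(String[] args) {
        String[][] features = {
            {"Core", "2.0"}, {"MutationEvents", "2.0"}, {"LS", "3.0"}
        };
        for (String[] f : features) {
            // Results vary between DOM implementations.
            System.out.println(f[0] + " " + f[1] + ": " + supports(f[0], f[1]));
        }
    }
}
```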






Streaming API for XML - different approach



      The third approach to processing XML data treats the incoming
      information about parsing events as a stream.

      The Streaming API for XML uses a technique called pull parsing,
      which provides sequential access to the document by adapting the
      iterator design pattern. The association with java.util.Iterator
      is not accidental: part of the API (XMLEventReader) extends this
      interface.






StAX architecture

      StAX in Java is divided into two (theoretically) separate APIs:

              the cursor API, represented by the XMLStreamReader and
              XMLStreamWriter interfaces. Regarded as the fastest and
              most efficient solution.
              the event API, represented by the XMLEventReader and
              XMLEventWriter interfaces. Regarded as a simple and
              flexible solution.

      Both are specified in JSR 173 and contained in javax.xml.stream.*




Difference from the SAX event-driven architecture


      The common view that the StAX API is similar to SAX is wrong.

      The SAX architecture provides a number of handler interfaces
      that the parser calls as events arrive (push parsing). The StAX
      event API instead provides methods for iterating through the
      event stream, so the application pulls each event and handles
      the specific occurrences itself.

      Moreover, StAX is a symmetric read/write API, which also allows
      elements to be modified and stored.






Basic class structure


      /* Creating readers... */

      // creating input factory
      String xmlURI = "http://example.org/nutrition.xml";
      XMLInputFactory xif = XMLInputFactory.newInstance();

      // cursor API reader (each reader needs its own input stream)
      XMLStreamReader cur =
              xif.createXMLStreamReader(new URL(xmlURI).openStream());
      // event API reader
      XMLEventReader event =
              xif.createXMLEventReader(new URL(xmlURI).openStream());





Identifying events I

      The main issue when using StAX is how to identify the event that
      has just occurred. There are several ways to do that; the
      simplest is to check the integer constant associated with the
      event (cursor API). The constants are declared in the
      XMLStreamConstants interface².
      For example:
                1 - START_ELEMENT
                2 - END_ELEMENT
                3 - PROCESSING_INSTRUCTION
      And so on...

           ² https://java.sun.com/webservices/docs/1.5/api/javax/xml/stream/XMLStreamConstants.html


Accessing elements by iterator I (cursor API)

      // while there is a next event
      while (cur.hasNext())
      {
          // catch the event type
          int eventType = cur.next();
          System.out.println(eventType);
          // if the event type is START_ELEMENT, print the element name
          if (eventType == XMLStreamConstants.START_ELEMENT)
              System.out.println(cur.getLocalName());
          // print text content as it arrives
          else if (eventType == XMLStreamConstants.CHARACTERS)
              System.out.println(cur.getText());
      }
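A runnable sketch of cursor-style iteration is given below. The CursorApiDemo class and the in-memory sample are assumptions for illustration. Note that getElementText() is only valid on a start tag of an element containing text alone, which is why the sketch calls it for the name elements and not for their parents.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class, not taken from the original presentation.
class CursorApiDemo {
    // Collects the text of every <name> element using the cursor API.
    static List<String> foodNames(String xml) {
        List<String> names = new ArrayList<String>();
        try {
            XMLStreamReader cur = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (cur.hasNext()) {
                int eventType = cur.next();
                if (eventType == XMLStreamConstants.START_ELEMENT
                        && "name".equals(cur.getLocalName())) {
                    // Reads the text and leaves the cursor on the end tag.
                    names.add(cur.getElementText());
                }
            }
            cur.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not read XML", e);
        }
        return names;
    }

    public static void main(String[] args) {
        String xml = "<nutrition>"
                + "<food><name>Avocado Dip</name></food>"
                + "<food><name>Bagels</name></food>"
                + "</nutrition>";
        System.out.println(foodNames(xml)); // [Avocado Dip, Bagels]
    }
}
```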




Identifying events II

      In the event API, identifying events is a bit different.
      XMLEventReader provides the methods:
      XMLEvent nextEvent();
      boolean hasNext();

      So, to identify the caught event, we must analyse the XMLEvent
      object returned by the first method. Once again there are a few
      ways to do that. The event type can be obtained by calling:
      int getEventType();

      Or we can test whether the event is of a certain type with one of
      the "is" methods, such as isStartElement() or isCharacters().



Accessing elements by iterator II (event API)

      // while there is a next event
      while (event.hasNext())
      {
          XMLEvent e = event.nextEvent();
          // identify the event by its type, then cast
          if (e instanceof StartElement)
          {
              // cast the event to the specific element
              StartElement se = (StartElement) e;
              QName name = se.getName();
              // print the element name
              System.out.println(name.getLocalPart());
          }
      }
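The same loop can also be written with the "is" and "as" helper methods of XMLEvent instead of instanceof and casts. The following is a minimal sketch; the EventApiDemo class and its output labels are made up for illustration, and coalescing is switched on so that adjacent text arrives as a single event.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical demo class, not taken from the original presentation.
class EventApiDemo {
    // Turns the event stream into a list of short descriptions.
    static List<String> describe(String xml) {
        List<String> out = new ArrayList<String>();
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            // Deliver adjacent character data as one Characters event.
            factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
            XMLEventReader reader =
                    factory.createXMLEventReader(new StringReader(xml));
            while (reader.hasNext()) {
                XMLEvent e = reader.nextEvent();
                if (e.isStartElement()) {
                    out.add("start:" + e.asStartElement().getName().getLocalPart());
                } else if (e.isCharacters() && !e.asCharacters().isWhiteSpace()) {
                    out.add("text:" + e.asCharacters().getData());
                } else if (e.getEventType() == XMLStreamConstants.END_ELEMENT) {
                    out.add("end:" + e.asEndElement().getName().getLocalPart());
                }
            }
            reader.close();
        } catch (XMLStreamException ex) {
            throw new IllegalStateException("could not read XML", ex);
        }
        return out;
    }

    public static void main(String[] args) {
        for (String line : describe("<food><name>Avocado Dip</name></food>")) {
            System.out.println(line);
        }
    }
}
```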



Advanced iteration methods

      Both StAX APIs provide more complex iteration methods.
      // in XMLEventReader; XMLStreamReader.nextTag() returns int
      XMLEvent nextTag();
      // only in XMLEventReader
      XMLEvent peek();
      // only in XMLStreamReader
      void require(int type, String nsURI, String localN);

      The first method moves the cursor, skipping insignificant events
      such as whitespace and comments, to the next start or end tag.
      The second returns the next event without moving the cursor. The
      third checks that the current event has the expected type and
      name, and throws an exception otherwise.
      All these methods are well documented and should be reviewed by
      the reader.
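The three methods can be exercised with a short sketch. The AdvancedIterationDemo class, its method names and the sample input in main are hypothetical; passing null as the namespace URI to require() means that the namespace is not checked.

```java
import java.io.StringReader;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.events.XMLEvent;

// Hypothetical demo class, not taken from the original presentation.
class AdvancedIterationDemo {
    // Cursor API: jump to the root tag, verify it, then jump to the next tag.
    static String firstChildOf(String xml, String expectedRoot) {
        try {
            XMLStreamReader cur = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            cur.nextTag(); // from the document start to the root start tag
            // Throws XMLStreamException if the current event does not match.
            cur.require(XMLStreamConstants.START_ELEMENT, null, expectedRoot);
            cur.nextTag(); // skips whitespace up to the next start or end tag
            String name = cur.getLocalName();
            cur.close();
            return name;
        } catch (XMLStreamException e) {
            throw new IllegalStateException("unexpected document", e);
        }
    }

    // Event API: peek() shows the next event without consuming it.
    static boolean peekMatchesNext(String xml) {
        try {
            XMLEventReader reader = XMLInputFactory.newInstance()
                    .createXMLEventReader(new StringReader(xml));
            reader.nextEvent();                 // consume StartDocument
            XMLEvent peeked = reader.peek();    // root start tag, not consumed
            XMLEvent next = reader.nextEvent(); // the same event, now consumed
            reader.close();
            return peeked.isStartElement() && next.isStartElement()
                    && peeked.asStartElement().getName()
                            .equals(next.asStartElement().getName());
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not read XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<nutrition>\n  <food><name>Avocado Dip</name></food>\n</nutrition>";
        System.out.println(firstChildOf(xml, "nutrition")); // food
        System.out.println(peekMatchesNext(xml));           // true
    }
}
```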




EventFilters and StreamFilters I

      The StAX API allows us to create filtered readers. There is no
      need to write complex stream handlers to process specific
      events. All that has to be done is to implement one (or both)
      of two interfaces, each containing a single method.
      Interfaces:
      EventFilter   (for XMLEventReader)
      StreamFilter  (for XMLStreamReader)

      Methods:
      public boolean accept(XMLEvent event)          // EventFilter
      public boolean accept(XMLStreamReader reader)  // StreamFilter





EventFilters and StreamFilters II

      Implementing a filter is simple:
      public class CharFilter implements EventFilter
      {
          public boolean accept(XMLEvent event)
          {
              return (event.getEventType() ==
                      XMLStreamConstants.CHARACTERS);
          }
      }

      The filter above accepts only character events.
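To put such a filter to work, it is passed to XMLInputFactory.createFilteredReader together with an ordinary reader. The sketch below assumes a hypothetical FilterDemo class and an in-memory sample; unlike the filter on the slide, it also drops whitespace-only text.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.EventFilter;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical demo class, not taken from the original presentation.
class FilterDemo {
    // Returns the non-blank text of a document using a filtered event reader.
    static List<String> textOnly(String xml) {
        List<String> out = new ArrayList<String>();
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            // Deliver adjacent character data as one Characters event.
            factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
            XMLEventReader base =
                    factory.createXMLEventReader(new StringReader(xml));
            EventFilter charsOnly = new EventFilter() {
                public boolean accept(XMLEvent event) {
                    return event.isCharacters()
                            && !event.asCharacters().isWhiteSpace();
                }
            };
            // The filtered reader only hands out events the filter accepts.
            XMLEventReader filtered =
                    factory.createFilteredReader(base, charsOnly);
            while (filtered.hasNext()) {
                out.add(filtered.nextEvent().asCharacters().getData());
            }
            filtered.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not read XML", e);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(textOnly(
                "<food><name>Avocado Dip</name><calories>110</calories></food>"));
    }
}
```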





Writing elements I



      StAX, as a symmetric API that handles both input and output, is
      able to write XML data. It provides two interfaces for that:
      XMLEventWriter (extends XMLEventConsumer)
      XMLStreamWriter

      The basic difference between them is that XMLEventWriter offers
      less functionality.






Writing elements II

      // using XMLStreamWriter
      OutputStream console = System.out;
      XMLOutputFactory of = XMLOutputFactory.newInstance();
      XMLStreamWriter sw = of.createXMLStreamWriter(console);
      sw.writeStartDocument("1.0");
      // create a document with one meal
      sw.writeStartElement("nutrition");
      sw.writeStartElement("food");
      sw.writeStartElement("name");
      sw.writeCharacters("Chocolate ice cream");
      sw.writeEndElement();
      sw.writeEndElement();
      sw.writeEndElement();
      sw.writeEndDocument();
      // make sure buffered output reaches the console
      sw.flush();
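For experimenting, the same document can be written to a string instead of the console. This is a sketch under the assumption of a hypothetical StreamWriterDemo class; the exact form of the XML declaration in the result depends on the StAX implementation, so only the element markup is relied upon here.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical demo class, not taken from the original presentation.
class StreamWriterDemo {
    // Writes a one-meal nutrition document and returns it as a string.
    static String write(String foodName) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter sw =
                    XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            sw.writeStartDocument("1.0");
            sw.writeStartElement("nutrition");
            sw.writeStartElement("food");
            sw.writeStartElement("name");
            sw.writeCharacters(foodName); // special characters are escaped
            sw.writeEndElement();         // </name>
            sw.writeEndElement();         // </food>
            sw.writeEndElement();         // </nutrition>
            sw.writeEndDocument();
            sw.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not write XML", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(write("Chocolate ice cream"));
    }
}
```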



Writing elements III

      // the same using XMLEventWriter
      OutputStream console = System.out;
      XMLEventFactory xef = XMLEventFactory.newInstance();
      XMLOutputFactory of = XMLOutputFactory.newInstance();
      XMLEventWriter ew = of.createXMLEventWriter(console);
      ew.add(xef.createStartDocument("UTF-8", "1.0"));
      // prefix and namespace URI are empty strings, not null
      ew.add(xef.createStartElement("", "", "nutrition"));
      ew.add(xef.createStartElement("", "", "food"));
      ew.add(xef.createStartElement("", "", "name"));
      ew.add(xef.createCharacters("Chocolate ice cream"));
      ew.add(xef.createEndElement("", "", "name"));
      ew.add(xef.createEndElement("", "", "food"));
      ew.add(xef.createEndElement("", "", "nutrition"));
      ew.add(xef.createEndDocument());
      ew.flush();



XmlPull


      XmlPull is the ancestor of StAX. Although StAX is a popular
      standard for parsing XML data, XmlPull has not been retired.
      Because it is so lightweight (its JAR file is only 9 kB), XmlPull
      is well suited to devices with limited memory. It is often used
      in developing mobile applications.


                                   http://www.xmlpull.org/





Comparing capabilities I


      Developing an application that processes XML data always
      involves choosing a parser.

      Selecting the proper API is essential to the success of the
      project, but the choice is not an easy one. Before making a
      decision, ask yourself a few questions:
              What needs to be done (using the parser)?
              Is the application platform-dependent? If so, what is the platform?
              Is it a distributed system?





Comparing capabilities II






Benchmarks I




                                  Figures: From http://piccolo.sourceforge.net/bench.html






Benchmarks II




                                  Figures: From http://piccolo.sourceforge.net/bench.html






Benchmarks III




                                      Figures: From http://www.xml.com/lpt/a/1702






Benchmarks IV




                             Figure:      From: http://www.ximpleware.com/benchmark1.html








                                    CASE STUDY
             Parsing Really Simple Syndication documents








      RSS definition
      RSS is a family of Web feed formats used to publish frequently
      updated content. An RSS document (which is called a "feed",
      "web feed" or "channel") contains either a summary of content
      from an associated web site or the full text, stored as XML. RSS
      makes it possible for people to keep up with web sites in an
      automated manner that can be piped into applications or filtered
      displays.
                                                                         Source: http://en.wikipedia.org/wiki/RSS








      The initials "RSS" are used to refer to the following formats:
              Really Simple Syndication (RSS 2.0)
              RDF Site Summary (RSS 1.0 and RSS 0.90)
              Rich Site Summary (RSS 0.91)

      When creating a solution for reading or writing RSS documents, we
      must remember that RSS is not a formal standard and has no
      XML Schema document (or DTD) describing its structure! The only
      reference can be found at:

                           http://www.rssboard.org/rss-specification
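
      Because there is no schema to validate against, a reader has to
      tolerate missing elements. The sketch below is one possible way to
      do that with the DOM API; it is not the jNivo plugin code. The class
      name LenientRssReader, the helper textOf and the sample feed are
      assumptions made for illustration, and the code is kept Java 6
      compatible.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of any library.
class LenientRssReader {

    /** Text of the first descendant with the given tag, or "" if it is absent. */
    static String textOf(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent().trim();
    }

    /** Returns "title | link" for every <item>; missing children become "". */
    static List<String> readItems(String rssXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Nothing to validate against, so validation stays switched off.
            factory.setValidating(false);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(rssXml)));

            List<String> result = new ArrayList<String>();
            NodeList items = doc.getElementsByTagName("item");
            for (int i = 0; i < items.getLength(); i++) {
                Element item = (Element) items.item(i);
                result.add(textOf(item, "title") + " | " + textOf(item, "link"));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse feed", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample feed; the second item deliberately has no <link>.
        String feed =
              "<rss version=\"2.0\"><channel><title>Demo</title>"
            + "<item><title>First</title><link>http://example.org/1</link></item>"
            + "<item><title>Second</title></item>"
            + "</channel></rss>";
        for (String line : readItems(feed)) {
            System.out.println(line);
        }
    }
}
```

      Treating every child element as optional keeps the reader working
      across the RSS variants listed above, at the cost of silently
      accepting incomplete items.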







                                               The Code
                       Presenting jNivo RSS Exterior Plugin v.0.1








      Every API presented so far can be seen as difficult to learn
      and use. That is partly true: the XML APIs in Java have rather
      difficult syntax and hundreds of classes and interfaces that must
      be handled to process XML data.

      Another issue is that there are several standards:
              javax.xml.stream.* (StAX, JSR-173)
              org.w3c.dom.* (DOM standard)
              org.xml.sax.* (SAX standard)
              JAXP
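
      To show how these packages overlap in practice, here is a minimal
      sketch in which JAXP (javax.xml.parsers) supplies the parser and
      org.xml.sax supplies the callback types. The class name
      SaxElementCounter and the sample input are assumptions made for
      illustration; the code is kept Java 6 compatible.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of any library.
class SaxElementCounter {

    /** Counts start tags whose qualified name equals elementName. */
    static int count(String xml, final String elementName) {
        // A one-element array lets the anonymous handler update the counter.
        final int[] counter = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                if (elementName.equals(qName)) {
                    counter[0]++;
                }
            }
        };
        try {
            // JAXP factory call; the parser behind it is the SAX implementation.
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse document", e);
        }
        return counter[0];
    }

    public static void main(String[] args) {
        String xml = "<channel><item/><item/><title>Demo</title></channel>";
        System.out.println(count(xml, "item"));
    }
}
```

      Even this small task touches two package families, which is the
      learning-curve problem described above.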







      Mark Reinhold (3) suggested a different way of expressing XML in
      the Java language (4).

      Built in:                                   java.lang.String "foo"

      New type:                                   java.lang.XML <foo> (syntax!)
      New package:                                java.lang.xml.* (XML literals!)


      (3) Chief Engineer for the Java Platform, Standard Edition, at Sun Microsystems.
      (4) JavaOne Technical Session 3441 (TS-3441)


Proposed syntax I




                                                   Figure:    From [3]






Proposed syntax II




                                                   Figure:    From [3]






Much more...

      The new syntax is not just syntactic sugar: it helps enforce the
      proper structure of the document and prevents instructions from
      being issued in the wrong order.
      Mark Reinhold also proposed:
              datatype coders
              collections
              hybrid event/tree API
              accessing by XPath
      And more! His blog:

                                        http://blogs.sun.com/mr/




      Three different approaches to XML parsing
              SAX - keywords: event-based, callback model, fast, cannot
              modify structure, interface-based API
              DOM - keywords: builds tree in memory, divided into
              modules, rather slow, can generate and modify documents
              StAX - keywords: pull parsing, events pulled from the stream,
              consistent code!, can be used on mobile devices (XmlPull)

      RSS parsing? It is difficult to decide on a parsing model; the
      most efficient choice is an already implemented API, for example ROME

                                         http://rome.dev.java.net
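
      For readers who still want to hand-roll a parser, the pull model
      summarised above might look roughly like this with the StAX API from
      javax.xml.stream. This is a sketch only, not ROME and not the jNivo
      plugin; the class name StaxItemTitles and the sample feed are
      assumptions, and the code is kept Java 6 compatible.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, not part of any library.
class StaxItemTitles {

    /** Collects the <title> of every <item>, skipping the channel title. */
    static List<String> itemTitles(String rssXml) {
        List<String> titles = new ArrayList<String>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(rssXml));
            boolean insideItem = false;
            // The application pulls events; nothing is pushed via callbacks.
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = reader.getLocalName();
                    if ("item".equals(name)) {
                        insideItem = true;
                    } else if (insideItem && "title".equals(name)) {
                        // Reads the text and moves to the matching end tag.
                        titles.add(reader.getElementText().trim());
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && "item".equals(reader.getLocalName())) {
                    insideItem = false;
                }
            }
            reader.close();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse feed", e);
        }
        return titles;
    }

    public static void main(String[] args) {
        // Made-up sample feed.
        String feed =
              "<rss version=\"2.0\"><channel><title>Demo</title>"
            + "<item><title>First</title></item>"
            + "<item><title>Second</title></item>"
            + "</channel></rss>";
        for (String title : itemTitles(feed)) {
            System.out.println(title);
        }
    }
}
```

      The state flag insideItem is the price of streaming: the parser
      keeps no tree, so the application has to track context itself.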






             [1] Brett McLaughlin, Justin Edelson
                 Java & XML
                 O’Reilly Media, 3rd edition, 1 December 2006

             [2] Cay S. Horstmann, Gary Cornell
                 Core Java, Volume II - Advanced Features
                 Prentice Hall PTR, 8th edition, 7 April 2008

             [3] Mark Reinhold
                 Integrating XML into the Java Programming Language (TS-3441)
                 http://developers.sun.com/learning/javaoneonline/sessions/2006/TS-3441/index.htm







             [4] Jurgen Salecker
                 Hybrid Parser Architectural Pattern
                 http://developerlife.com/tutorials/?p=53
             [5] Various APIs documentation
                 For starters, it is good to search Wikipedia...
                 Xerces 2 Java Parser http://xerces.apache.org/xerces2-j/
                 JAXP reference implementation https://jaxp.dev.java.net/
                 XOM - XML Object Model http://www.xom.nu/
                 JDOM - Java Document Object Model http://www.jdom.org/
                 StAX - Streaming API for XML http://stax.codehaus.org/
                 VTD-XML - a new way of processing XML
                 http://vtd-xml.sourceforge.net/

                 AND OTHERS...







      Why?...


                                         Questions ?
                                                                                               What if?...








                                     THANK YOU



       Wojciech Podg´rski http://podgorski.wordpress.com
                    o                                         eXtensible Markup Language APIs in Java 1.6

Más contenido relacionado

Similar a eXtensible Markup Language APIs in Java 1.6 - Simple and efficient XML parsing using Java lanaguage

Java JSON Parser Comparison
Java JSON Parser ComparisonJava JSON Parser Comparison
Java JSON Parser ComparisonAllan Huang
 
Assist software awesome scala
Assist software   awesome scalaAssist software   awesome scala
Assist software awesome scalaAssistSoftware
 
Single API for library services (poster)
Single API for library services (poster)Single API for library services (poster)
Single API for library services (poster)Milan Janíček
 
Rollin onj Rubyv3
Rollin onj Rubyv3Rollin onj Rubyv3
Rollin onj Rubyv3Oracle
 
Comparative Study That Aims Rdf Processing For The Java Platform
Comparative Study That Aims Rdf Processing For The Java PlatformComparative Study That Aims Rdf Processing For The Java Platform
Comparative Study That Aims Rdf Processing For The Java PlatformComputer Science
 
Java 7 Dolphin manjula kollipara
Java 7 Dolphin manjula kolliparaJava 7 Dolphin manjula kollipara
Java 7 Dolphin manjula kolliparaManjula Kollipara
 
Facilitating Busines Interoperability from the Semantic Web
Facilitating Busines Interoperability from the Semantic WebFacilitating Busines Interoperability from the Semantic Web
Facilitating Busines Interoperability from the Semantic WebRoberto García
 
ICON UK '13 - Apache Software: The FREE Java toolbox you didn't know you had !!
ICON UK '13 - Apache Software: The FREE Java toolbox you didn't know you had !!ICON UK '13 - Apache Software: The FREE Java toolbox you didn't know you had !!
ICON UK '13 - Apache Software: The FREE Java toolbox you didn't know you had !!panagenda
 
XML Technologies for RESTful Services Development
XML Technologies for RESTful Services DevelopmentXML Technologies for RESTful Services Development
XML Technologies for RESTful Services Developmentruyalarcon
 
Java New Evolution
Java New EvolutionJava New Evolution
Java New EvolutionAllan Huang
 
JSR-222 Java Architecture for XML Binding
JSR-222 Java Architecture for XML BindingJSR-222 Java Architecture for XML Binding
JSR-222 Java Architecture for XML BindingHeiko Scherrer
 
Parsing XML & JSON in Apex
Parsing XML & JSON in ApexParsing XML & JSON in Apex
Parsing XML & JSON in ApexAbhinav Gupta
 
Introduction Java Web Framework and Web Server.
Introduction Java Web Framework and Web Server.Introduction Java Web Framework and Web Server.
Introduction Java Web Framework and Web Server.suranisaunak
 
Iasi code camp 12 october 2013 jax-rs-jee-ecosystem - catalin mihalache
Iasi code camp 12 october 2013   jax-rs-jee-ecosystem - catalin mihalacheIasi code camp 12 october 2013   jax-rs-jee-ecosystem - catalin mihalache
Iasi code camp 12 october 2013 jax-rs-jee-ecosystem - catalin mihalacheCodecamp Romania
 

Similar a eXtensible Markup Language APIs in Java 1.6 - Simple and efficient XML parsing using Java lanaguage (20)

Java JSON Parser Comparison
Java JSON Parser ComparisonJava JSON Parser Comparison
Java JSON Parser Comparison
 
Assist software awesome scala
Assist software   awesome scalaAssist software   awesome scala
Assist software awesome scala
 
eureka09
eureka09eureka09
eureka09
 
eureka09
eureka09eureka09
eureka09
 
Web Spa
Web SpaWeb Spa
Web Spa
 
Single API for library services (poster)
Single API for library services (poster)Single API for library services (poster)
Single API for library services (poster)
 
Rollin onj Rubyv3
Rollin onj Rubyv3Rollin onj Rubyv3
Rollin onj Rubyv3
 
Comparative Study That Aims Rdf Processing For The Java Platform
Comparative Study That Aims Rdf Processing For The Java PlatformComparative Study That Aims Rdf Processing For The Java Platform
Comparative Study That Aims Rdf Processing For The Java Platform
 
Java 7 Dolphin manjula kollipara
Java 7 Dolphin manjula kolliparaJava 7 Dolphin manjula kollipara
Java 7 Dolphin manjula kollipara
 
Facilitating Busines Interoperability from the Semantic Web
Facilitating Busines Interoperability from the Semantic WebFacilitating Busines Interoperability from the Semantic Web
Facilitating Busines Interoperability from the Semantic Web
 
ICON UK '13 - Apache Software: The FREE Java toolbox you didn't know you had !!
ICON UK '13 - Apache Software: The FREE Java toolbox you didn't know you had !!ICON UK '13 - Apache Software: The FREE Java toolbox you didn't know you had !!
ICON UK '13 - Apache Software: The FREE Java toolbox you didn't know you had !!
 
XML Technologies for RESTful Services Development
XML Technologies for RESTful Services DevelopmentXML Technologies for RESTful Services Development
XML Technologies for RESTful Services Development
 
Java New Evolution
Java New EvolutionJava New Evolution
Java New Evolution
 
Json
JsonJson
Json
 
JSR-222 Java Architecture for XML Binding
JSR-222 Java Architecture for XML BindingJSR-222 Java Architecture for XML Binding
JSR-222 Java Architecture for XML Binding
 
Scala a case4
Scala a case4Scala a case4
Scala a case4
 
Parsing XML & JSON in Apex
Parsing XML & JSON in ApexParsing XML & JSON in Apex
Parsing XML & JSON in Apex
 
E05412327
E05412327E05412327
E05412327
 
Introduction Java Web Framework and Web Server.
Introduction Java Web Framework and Web Server.Introduction Java Web Framework and Web Server.
Introduction Java Web Framework and Web Server.
 
Iasi code camp 12 october 2013 jax-rs-jee-ecosystem - catalin mihalache
Iasi code camp 12 october 2013   jax-rs-jee-ecosystem - catalin mihalacheIasi code camp 12 october 2013   jax-rs-jee-ecosystem - catalin mihalache
Iasi code camp 12 october 2013 jax-rs-jee-ecosystem - catalin mihalache
 

Más de Wojciech Podgórski

[PL] Złożone przetwarzanie zdarzeń w SZSBD StreamBase
[PL] Złożone przetwarzanie zdarzeń w SZSBD StreamBase[PL] Złożone przetwarzanie zdarzeń w SZSBD StreamBase
[PL] Złożone przetwarzanie zdarzeń w SZSBD StreamBaseWojciech Podgórski
 
[PL] Krótkozasięgowe systemy telemetryczne i identyfikacyjne
[PL] Krótkozasięgowe systemy telemetryczne i identyfikacyjne[PL] Krótkozasięgowe systemy telemetryczne i identyfikacyjne
[PL] Krótkozasięgowe systemy telemetryczne i identyfikacyjneWojciech Podgórski
 
[PL] Mechanizmy bezpieczeństwa w sieciach z rodziny 802.11x
[PL] Mechanizmy bezpieczeństwa w sieciach z rodziny 802.11x[PL] Mechanizmy bezpieczeństwa w sieciach z rodziny 802.11x
[PL] Mechanizmy bezpieczeństwa w sieciach z rodziny 802.11xWojciech Podgórski
 
Rola projektowania architektonicznego w inżynierii oprogramowania zorientowan...
Rola projektowania architektonicznego w inżynierii oprogramowania zorientowan...Rola projektowania architektonicznego w inżynierii oprogramowania zorientowan...
Rola projektowania architektonicznego w inżynierii oprogramowania zorientowan...Wojciech Podgórski
 
Metryki obiektowe i ich interpretacja
Metryki obiektowe i ich interpretacjaMetryki obiektowe i ich interpretacja
Metryki obiektowe i ich interpretacjaWojciech Podgórski
 
[PL] XPrince: balance between agility and discipline
[PL] XPrince: balance between agility and discipline[PL] XPrince: balance between agility and discipline
[PL] XPrince: balance between agility and disciplineWojciech Podgórski
 
Artificial Intelligence Methods in Virus Detection & Recognition - Introducti...
Artificial Intelligence Methods in Virus Detection & Recognition - Introducti...Artificial Intelligence Methods in Virus Detection & Recognition - Introducti...
Artificial Intelligence Methods in Virus Detection & Recognition - Introducti...Wojciech Podgórski
 

Más de Wojciech Podgórski (7)

[PL] Złożone przetwarzanie zdarzeń w SZSBD StreamBase
[PL] Złożone przetwarzanie zdarzeń w SZSBD StreamBase[PL] Złożone przetwarzanie zdarzeń w SZSBD StreamBase
[PL] Złożone przetwarzanie zdarzeń w SZSBD StreamBase
 
[PL] Krótkozasięgowe systemy telemetryczne i identyfikacyjne
[PL] Krótkozasięgowe systemy telemetryczne i identyfikacyjne[PL] Krótkozasięgowe systemy telemetryczne i identyfikacyjne
[PL] Krótkozasięgowe systemy telemetryczne i identyfikacyjne
 
[PL] Mechanizmy bezpieczeństwa w sieciach z rodziny 802.11x
[PL] Mechanizmy bezpieczeństwa w sieciach z rodziny 802.11x[PL] Mechanizmy bezpieczeństwa w sieciach z rodziny 802.11x
[PL] Mechanizmy bezpieczeństwa w sieciach z rodziny 802.11x
 
Rola projektowania architektonicznego w inżynierii oprogramowania zorientowan...
Rola projektowania architektonicznego w inżynierii oprogramowania zorientowan...Rola projektowania architektonicznego w inżynierii oprogramowania zorientowan...
Rola projektowania architektonicznego w inżynierii oprogramowania zorientowan...
 
Metryki obiektowe i ich interpretacja
Metryki obiektowe i ich interpretacjaMetryki obiektowe i ich interpretacja
Metryki obiektowe i ich interpretacja
 
[PL] XPrince: balance between agility and discipline
[PL] XPrince: balance between agility and discipline[PL] XPrince: balance between agility and discipline
[PL] XPrince: balance between agility and discipline
 
Artificial Intelligence Methods in Virus Detection & Recognition - Introducti...
Artificial Intelligence Methods in Virus Detection & Recognition - Introducti...Artificial Intelligence Methods in Virus Detection & Recognition - Introducti...
Artificial Intelligence Methods in Virus Detection & Recognition - Introducti...
 

Último

2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusZilliz
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024The Digital Insurer
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native ApplicationsWSO2
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 

Último (20)

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source Milvus
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 

eXtensible Markup Language APIs in Java 1.6 - Simple and efficient XML parsing using Java lanaguage

  • 1. Introduction XML API’s in Java Capabilities and performance comparison CASE STUDY: Parsing Really Simple Syndication (RSS) doc What next? Alternatives to API’s, Java SE 7.0 features Summary Further reading... eXtensible Markup Language APIs in Java 1.6 Simple and efficient XML parsing using Java lanaguage Wojciech Podg´rski o http://podgorski.wordpress.com April 8, 2008 Wojciech Podg´rski http://podgorski.wordpress.com o eXtensible Markup Language APIs in Java 1.6
  • 2. Introduction XML API’s in Java Capabilities and performance comparison CASE STUDY: Parsing Really Simple Syndication (RSS) doc What next? Alternatives to API’s, Java SE 7.0 features Summary Further reading... Presentation outline 1 Introduction What is parsing Diffrent ways of parsing documents 2 XML API’s in Java SAX DOM StAX 3 Capabilities and performance comparison 4 CASE STUDY: Parsing Really Simple Syndication (RSS) doc 5 What next? Alternatives to API’s, Java SE 7.0 features 6 Summary 7 Further reading... Wojciech Podg´rski http://podgorski.wordpress.com o eXtensible Markup Language APIs in Java 1.6
  • 3. Introduction XML API’s in Java Capabilities and performance comparison What is parsing CASE STUDY: Parsing Really Simple Syndication (RSS) doc Diffrent ways of parsing documents What next? Alternatives to API’s, Java SE 7.0 features Summary Further reading... Parsing definition Parsing, more formally called syntactic analysis is the process of analyzing a sequence of tokens to determine grammatical structure with respect to a given formal grammar. Source: http://en.wikipedia.org/wiki/Parsing Wojciech Podg´rski http://podgorski.wordpress.com o eXtensible Markup Language APIs in Java 1.6
  • 4. Introduction XML API’s in Java Capabilities and performance comparison What is parsing CASE STUDY: Parsing Really Simple Syndication (RSS) doc Diffrent ways of parsing documents What next? Alternatives to API’s, Java SE 7.0 features Summary Further reading... We can distinguish three main models of parsing XML documents. Each one of them differs with mechanism of traversing between the nodes and idea of processing XML data. Those models are: SAX - Simple API for XML DOM - Document Object Model StAX - Streaming API for XML Wojciech Podg´rski http://podgorski.wordpress.com o eXtensible Markup Language APIs in Java 1.6
  • 9. That's not all! There are other approaches, which are not covered in detail in this presentation.
      - JAXB (Java Architecture for XML Binding): a technology that marshals Java objects into XML and, in reverse, unmarshals XML elements back into Java objects. It works on top of another parser (mostly streaming parsers).
  • 11. Other approaches, continued:
      - Javolution: a library providing a real-time, StAX-like implementation that does not force object creation and has a smaller effect on memory footprint and garbage collection, using e.g. lookup tables for retrieving and reusing data.
      - VTD-XML (Virtual Token Descriptor for XML): a collection of efficient processing technologies centered around a non-extractive, "document-centric" parsing technique called VTD. It supports random access and XPath.
  • 12. SAX as a processing model. SAX should be understood first as a specific processing mechanism rather than as a simple API. It represents an event-driven architecture: the parser performs an operation each time a particular event occurs. To handle these events, the user defines a number of callback methods, which the parser calls whenever it encounters the corresponding part of the document.
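The callback mechanism described above can be sketched as follows. This is a minimal illustration, not code from the presentation: the class name SaxEventLog and the sample document are assumptions, and the parser is whatever JAXP implementation ships with the JDK.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: records which SAX callbacks fire, in document order.
final class SaxEventLog {

    static List<String> logEvents(String xml) {
        final List<String> log = new ArrayList<String>();

        // DefaultHandler provides empty callbacks; we override the ones we need.
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                log.add("start:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Note: a parser may deliver one text run in several chunks.
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    log.add("text:" + text);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                log.add("end:" + qName);
            }
        };

        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parsing failed", e);
        }
        return log;
    }

    public static void main(String[] args) {
        for (String entry : logEvents("<food><name>Avocado Dip</name></food>")) {
            System.out.println(entry);
        }
    }
}
```

The application never asks for the next element; it only reacts when the parser calls back, which is what makes SAX a push model.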
  • 13. Figure: Top-down parsing in SAX API
  • 14. In Java, the SAX API is a collection of classes and interfaces that are implemented when building an XML parser and its handlers. The package containing this collection is org.xml.sax.*
  • 15. Figure: org.xml.sax.* package class diagram
  • 16. Basic class structure
      import org.xml.sax.InputSource;
      import org.xml.sax.XMLReader;
      import org.xml.sax.helpers.XMLReaderFactory;

      // Declare document URI
      String xmlURI = "http://example.com/report.xml";

      // Create reader instance
      XMLReader reader = XMLReaderFactory.createXMLReader();

      // Set the ContentHandler implementation (MyContentHandler is the user's own class)
      reader.setContentHandler(new MyContentHandler());

      // Resolve document source
      InputSource inputSource = new InputSource(xmlURI);

      // Parse document
      reader.parse(inputSource);
  • 17. Different SAX implementations
      // Xerces implementation
      XMLReader xercesReader =
          new org.apache.xerces.parsers.SAXParser();

      // JAXP implementation: the factory must be instantiated first,
      // and the XMLReader is obtained from the SAXParser
      SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
      XMLReader jaxpReader = parser.getXMLReader();

      // Piccolo implementation
      XMLReader piccoloReader = new com.bluecast.xml.Piccolo();
  • 18. Other SAX features. SAX provides a number of interfaces for correct data handling. Some of them process not only the content of the document but also its structure. Interfaces such as:
      - ErrorHandler
      - EntityResolver
      - DTDHandler
      also analyze the structure of the document: possible errors, entity links, and declarations describing other elements.
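As an illustration of one of these interfaces, the sketch below registers an ErrorHandler that collects problems instead of letting the parser print them. The class name CollectingErrorHandler and the method check are assumptions made for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;

// Hypothetical handler: records every problem reported by the parser.
final class CollectingErrorHandler implements ErrorHandler {

    final List<String> problems = new ArrayList<String>();

    @Override
    public void warning(SAXParseException e) {
        problems.add("warning: " + e.getMessage());
    }

    @Override
    public void error(SAXParseException e) {
        problems.add("error: " + e.getMessage());
    }

    @Override
    public void fatalError(SAXParseException e) throws SAXException {
        problems.add("fatal: " + e.getMessage());
        throw e; // a document that is not well-formed cannot be processed further
    }

    // Parses the text and returns the problems found (empty if none).
    static List<String> check(String xml) {
        CollectingErrorHandler handler = new CollectingErrorHandler();
        try {
            XMLReader reader =
                    SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setErrorHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            // Already recorded by fatalError; nothing more to do here.
        } catch (Exception e) {
            throw new IllegalStateException("unexpected failure", e);
        }
        return handler.problems;
    }
}
```

For instance, check("<a><b></a>") should report one fatal problem, because the b element is never closed, while check("<a/>") should return an empty list.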
  • 19. Advanced SAX features I. The SAX API is considered a very flexible solution, mainly because it can be configured by properties and features.
      void setProperty(String propertyID, Object value);
      void setFeature(String featureID, boolean state);
      Properties and features modify the parser's behaviour while it processes a document. For example, a feature can turn on validation of the document against the DTD or schema related to it (well-formedness is always checked).
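A minimal sketch of setting and reading features follows. The two feature identifiers are the standard SAX2 ones; the class name SaxFeatureDemo and its methods are assumptions for illustration, and which additional features are recognized depends on the parser in use.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.XMLReader;

// Hypothetical demo of configuring an XMLReader through SAX2 features.
final class SaxFeatureDemo {

    // Standard SAX2 feature identifiers.
    static final String NAMESPACES = "http://xml.org/sax/features/namespaces";
    static final String VALIDATION = "http://xml.org/sax/features/validation";

    // Returns a reader with namespace processing on and validation off.
    static XMLReader configuredReader() {
        try {
            XMLReader reader =
                    SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setFeature(NAMESPACES, true);
            reader.setFeature(VALIDATION, false);
            return reader;
        } catch (Exception e) {
            throw new IllegalStateException("could not configure reader", e);
        }
    }

    // Reads a feature's state; unknown or unsupported features count as off.
    static boolean isEnabled(XMLReader reader, String featureId) {
        try {
            return reader.getFeature(featureId);
        } catch (SAXNotRecognizedException | SAXNotSupportedException e) {
            return false;
        }
    }
}
```

Features must be set before parse() is called; most parsers refuse to change them in the middle of a parse.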
  • 20. Advanced SAX features II. Among many other interesting SAX features, one is very important and radically extends SAX capabilities. The XMLFilter interface allows us to create a cascade of parsers, each performing a different processing operation. Because the whole cascade works in a single pass over the document, the parsing is carried out as one piece. Figure: Cascade processing using XMLFilter interface
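One possible link in such a cascade is sketched below, built on the helper class org.xml.sax.helpers.XMLFilterImpl. The filter, its name SkipElementFilter and the element names in the usage note are assumptions for illustration: the filter hides one kind of element, with everything nested in it, from the handler further down the chain.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLFilter;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical filter: drops every element with the given name and its subtree.
final class SkipElementFilter extends XMLFilterImpl {

    private final String skipped;
    private int depth = 0; // greater than 0 while inside a skipped element

    SkipElementFilter(XMLReader parent, String skipped) {
        super(parent);
        this.skipped = skipped;
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) throws SAXException {
        if (depth > 0 || qName.equals(skipped)) {
            depth++;
            return; // swallow the event
        }
        super.startElement(uri, localName, qName, atts); // pass it on
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        if (depth > 0) {
            depth--;
            return;
        }
        super.endElement(uri, localName, qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (depth == 0) {
            super.characters(ch, start, length);
        }
    }

    // Returns the element names that reach the final handler.
    static List<String> elementNames(String xml, String skipped) {
        final List<String> names = new ArrayList<String>();
        try {
            XMLReader parser =
                    SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            XMLFilter filter = new SkipElementFilter(parser, skipped);
            filter.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes atts) {
                    names.add(qName);
                }
            });
            filter.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("parsing failed", e);
        }
        return names;
    }
}
```

Since an XMLFilter is itself an XMLReader, filters can be stacked: a second filter would simply take the first one as its parent.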
  • 22. What SAX cannot do... I
      Q: Why do we need other mechanisms, if SAX is so good?
      A: SAX has some serious limitations due to its sequential data access.
  • 28. What SAX cannot do... II. SAX parses data from beginning to end and does not allow going back. It also has some other drawbacks:
      - it is unable to modify the content or structure of the document
      - it cannot access specific or random elements
      - it cannot access sibling elements
      - it is not serializable
      So it seems that it is useless. THAT'S NOT TRUE! (see the comparison section). Every issue mentioned above can be resolved by SAX's complement...
  • 29. DOM as a processing model. The Document Object Model is based on an entirely different idea. Instead of parsing the document and reacting to specific events (though it is able to do so), it builds a tree based on the document's structure and stores it in memory as an object. As a result, every node in the tree remains available and can be accessed later, as many times as needed. Moreover, the structure stored in memory can easily be transformed in many ways.
  • 30. DOM architecture I. DOM, in contrast to SAX, is a standard developed by the W3C (1). Because it is standardized, it has a strict architecture divided into levels, each containing required and optional modules. To claim support for a level, an application must implement all the requirements of that level and of the levels below it. There are 3 levels; the newest (DOM Level 3) was published in 2004 and is the current release of the DOM specification. Every level has its Core, which is the root for the other modules (see figure).
      (1) A reference to the standard can be found on the W3C site.
  • 31. Figure: Document Object Model architecture (adapted from the original W3C specification)
  • 32. In Java, DOM has a different structure than SAX. Almost every interface representing the Document Object Model extends the org.w3c.dom.Node interface. This design allows very simple data manipulation and traversal between the nodes of the tree. It is essential to understand how elements are stored in the tree (see figure). For example, to read the text of element A, we should get its child node containing the text, rather than extract the value of element A itself.
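The point about text being stored in a child node can be checked with a small sketch. The class name DomTextNodes and the one-element sample document are assumptions for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo: where DOM keeps the text of an element.
final class DomTextNodes {

    // Parses XML held in a string and returns its root element.
    static Element root(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    // An element node itself carries no value, so this returns null.
    static String valueOfElement(String xml) {
        return root(xml).getNodeValue();
    }

    // The text lives in a child node of type TEXT_NODE.
    static String valueOfFirstChild(String xml) {
        Node child = root(xml).getFirstChild();
        if (child != null && child.getNodeType() == Node.TEXT_NODE) {
            return child.getNodeValue();
        }
        return null;
    }

    // Convenience method from DOM Level 3: concatenates all descendant text.
    static String textContent(String xml) {
        return root(xml).getTextContent();
    }
}
```

For the document "<A>hello</A>", valueOfElement returns null, while valueOfFirstChild and textContent both return "hello".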
  • 33. Figure: org.w3c.dom.* package class diagram. From [1]
  • 34. Basic class structure using Java implementation
      import javax.xml.parsers.DocumentBuilder;
      import javax.xml.parsers.DocumentBuilderFactory;
      import org.w3c.dom.Document;

      String docURI = "http://example.org/nutrition.xml";
      // get new DocumentBuilderFactory
      DocumentBuilderFactory docBuilderFactory =
          DocumentBuilderFactory.newInstance();
      // get new DocumentBuilder
      DocumentBuilder docBuilder =
          docBuilderFactory.newDocumentBuilder();
      // parse document
      Document doc = docBuilder.parse(docURI);
      // extract root element and
      // normalize whole tree (optional)
      doc.getDocumentElement().normalize();
  • 35. Accessing elements
      // get "food" elements
      NodeList elements = doc.getElementsByTagName("food");
      for (int i = 0; i < elements.getLength(); i++)
      {
          // find "Avocado Dip": getNodeName() would only return "food",
          // so look at the text stored inside the element instead
          String foodText = elements.item(i).getTextContent();
          if (foodText.contains("Avocado Dip"))
          {
              NodeList l = elements.item(i).getChildNodes();
              for (int j = 0; j < l.getLength(); j++)
                  // print out calories
                  if (l.item(j).getNodeName().equals("calories"))
                      System.out.println(l.item(j).getTextContent());
          }
      }
  • 36. Modifying elements
      ...
      if (l.item(j).getNodeName().equals("calories"))
      {
          // text content is a String, so it must be parsed, not cast
          int cal = Integer.parseInt(l.item(j).getTextContent().trim());
          // if food avocado dip has more than 300 cal.
          if (cal > 300)
          {
              Node avocadodip = l.item(j).getParentNode();
              // replace it with low fat food
              Element newfood = doc.createElement("LowFatFood");
              // replaceChild must be called on the node's own parent
              avocadodip.getParentNode().replaceChild(newfood, avocadodip);
          }
      }
  • 37. Different DOM implementations
      // Xerces DOM implementation
      DOMParser p = new org.apache.xerces.parsers.DOMParser();
      p.parse(new InputSource(xmlURI));
      org.w3c.dom.Document doc = p.getDocument();

      // JDOM (its own tree model); SAXBuilder is used here instead of the
      // slide's DOMBuilder, which in JDOM 1.x converts an existing W3C DOM tree
      SAXBuilder builder = new org.jdom.input.SAXBuilder();
      org.jdom.Document d = builder.build(xmlURI);
      // it's org.jdom.Document, not org.w3c.dom.Document!

      // dom4j (its own tree model)
      SAXReader reader = new org.dom4j.io.SAXReader();
      org.dom4j.Document document = reader.read(xmlURI);
      // it's org.dom4j.Document, not org.w3c.dom.Document!
  • 38. Advanced DOM features I. DOM provides many advanced functions through the modules specified in the standard (mainly Level 3 modules). Some of them:
      - the MutationEvents module provides methods for listening for changes
      - the LS and LS-Async modules (Load and Save) provide methods for various kinds of serialization
      - the Validation module provides methods for real-time validation
  • 39. Advanced DOM features II. When using a specific API, it is important to check which modules it implements and in which version. To do this, we can use:
      boolean hasFeature(String feature, String version);
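In Java this method is available on org.w3c.dom.DOMImplementation. A minimal sketch, assuming the DOM implementation bundled with the JDK; the class name DomFeatureCheck and the module names used in the usage note are illustrative only.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMImplementation;

// Hypothetical helper: asks the default DOM implementation about a module.
final class DomFeatureCheck {

    static boolean supports(String feature, String version) {
        try {
            DOMImplementation impl = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .getDOMImplementation();
            return impl.hasFeature(feature, version);
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("no DOM implementation available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Core 2.0: " + supports("Core", "2.0"));
        System.out.println("LS 3.0:   " + supports("LS", "3.0"));
    }
}
```

The answer depends on the implementation found on the classpath, so the check is best done at start-up rather than assumed.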
  • 40. Streaming API for XML: a different approach. The third approach to processing XML data treats the incoming information about events as a stream. The Streaming API for XML uses a technique called pull parsing, which provides sequential access to the document by adapting the iterator design pattern. The association with java.util.Iterator is not accidental, because part of the API extends this interface.
  • 41. StAX architecture. StAX in Java is divided into two (theoretically) separate APIs:
      - the cursor API, represented by the XMLStreamReader and XMLStreamWriter interfaces, regarded as the fastest and most efficient solution
      - the event API, represented by the XMLEventReader and XMLEventWriter interfaces, regarded as a simple and flexible solution
      Both are specified in JSR 173 and contained in javax.xml.stream.*
  • 42. Difference from the SAX event-driven architecture. The common view that the StAX API is similar to SAX is wrong. The SAX architecture provides a number of interfaces to handle incoming events, which the parser pushes to the application. The StAX event API provides methods for iterating through the event stream, so the application pulls each event and handles it itself. Moreover, StAX is a symmetric read/write API, which also allows elements to be modified and stored.
  • 43. Basic class structure
      import java.io.InputStream;
      import java.net.URL;
      import javax.xml.stream.*;

      /* Creating readers... */

      // creating input factory ("if" is a reserved word, so the variable is renamed)
      String xmlURI = "http://example.org/nutrition.xml";
      XMLInputFactory factory = XMLInputFactory.newInstance();

      // cursor API reader; the document is opened as a stream,
      // because a StringReader would treat the URI text itself as XML
      InputStream in = new URL(xmlURI).openStream();
      XMLStreamReader cur = factory.createXMLStreamReader(in);

      // event API reader (each reader needs its own stream)
      InputStream in2 = new URL(xmlURI).openStream();
      XMLEventReader event = factory.createXMLEventReader(in2);
  • 44. Identifying events I. The main issue when using StAX is how to identify the event that has just occurred. There are many ways to do that; the simplest is to check the constant associated with the event (cursor API). The constants are declared in the XMLStreamConstants interface (2). For example:
      - 1: START_ELEMENT
      - 2: END_ELEMENT
      - 3: PROCESSING_INSTRUCTION
      And so on...
      (2) https://java.sun.com/webservices/docs/1.5/api/javax/xml/stream/XMLStreamConstants.html
  • 45. Accessing elements by iterator I (cursor API)
      int startElem = XMLStreamConstants.START_ELEMENT;
      // while there is next event
      while (cur.hasNext())
      {
          // catch event type
          int eventType = cur.next();
          System.out.println(eventType);
          // if event type is START_ELEMENT print the element's name;
          // getElementText() may only be called on text-only elements
          // and throws an exception if the element has child elements
          if (eventType == startElem)
              System.out.println(cur.getLocalName());
      }
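A self-contained variant of the cursor loop is sketched below. It reads from a string rather than a URL and only calls getElementText() on elements expected to hold plain text. The class name StaxCursorDemo and the sample nutrition document in the usage note are assumptions for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: collects the text of every element with a given name.
final class StaxCursorDemo {

    static List<String> textOf(String xml, String elementName) {
        List<String> result = new ArrayList<String>();
        try {
            XMLStreamReader cur = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (cur.hasNext()) {
                int eventType = cur.next(); // the application pulls the next event
                if (eventType == XMLStreamConstants.START_ELEMENT
                        && cur.getLocalName().equals(elementName)) {
                    // Safe only for text-only elements; afterwards the cursor
                    // is positioned on the matching END_ELEMENT.
                    result.add(cur.getElementText());
                }
            }
            cur.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("parsing failed", e);
        }
        return result;
    }
}
```

Called with a document holding two food entries named "Avocado Dip" and "Bagels" and the element name "name", the method should return those two strings in document order.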
  • 46. Identifying events II. In the event API, identifying events is a bit different. XMLEventReader provides the methods:
      XMLEvent nextEvent();
      boolean hasNext();
      So, to identify the event we have caught, we must analyse the XMLEvent object returned by the first method. Once again there are several ways to do that. We can call the method that returns the event type:
      int getEventType();
      Or we can test whether the event is of a certain type with one of the "is" methods.
  • 47. Accessing elements by iterator II (event API)
      // while there is next event
      while (event.hasNext())
      {
          XMLEvent e = event.nextEvent();
          // identify event by its type!
          if (e instanceof StartElement)
          {
              // cast event to specific element
              StartElement se = (StartElement) e;
              QName name = se.getName();
              // print element name
              System.out.println(name.getLocalPart());
          }
      }
  • 48. Advanced iteration methods. Both StAX APIs provide more complex iteration methods.
      // in XMLEventReader (XMLStreamReader has int nextTag())
      XMLEvent nextTag();
      // only in XMLEventReader
      XMLEvent peek();
      // only in XMLStreamReader
      void require(int type, String nsURI, String localN);
      The first method moves the cursor, skipping insignificant events, until the start or end of an element. The second allows us to check the next event before moving the cursor. The third checks that the current event has the expected type, namespace and local name. All the methods are well documented and should be reviewed by the reader.
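The three methods can be tried out with the sketch below. The class StaxNavigationDemo and its two helper methods are assumptions made for illustration; passing null to require() for the namespace and local name means those parts are not checked.

```java
import java.io.StringReader;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.events.XMLEvent;

// Hypothetical demo of nextTag(), require() and peek().
final class StaxNavigationDemo {

    // Cursor API: jump straight to the root element and return its name.
    static String rootName(String xml) {
        try {
            XMLStreamReader cur = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            cur.nextTag(); // skips comments, processing instructions, whitespace
            // Throws XMLStreamException if the cursor is not on a start tag.
            cur.require(XMLStreamConstants.START_ELEMENT, null, null);
            return cur.getLocalName();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("parsing failed", e);
        }
    }

    // Event API: peek() shows the next event without consuming it.
    static boolean peekDoesNotConsume(String xml) {
        try {
            XMLEventReader events = XMLInputFactory.newInstance()
                    .createXMLEventReader(new StringReader(xml));
            XMLEvent peeked = events.peek();
            XMLEvent next = events.nextEvent();
            return peeked.getEventType() == next.getEventType();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("parsing failed", e);
        }
    }
}
```

For a document that starts with an XML declaration and a comment, rootName still returns the name of the first element, because nextTag() steps over everything that is not a tag.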
  • 49. EventFilters and StreamFilters I. The StAX API allows us to create filtered readers. It is not necessary to create complex stream handlers to process specific events. The only thing that has to be done is to implement one (or both) of two interfaces, each containing a single method. Interfaces (in javax.xml.stream; they are not related to the SAX XMLFilter):
      EventFilter
      StreamFilter
      Methods:
      public boolean accept(XMLEvent event)           // EventFilter
      public boolean accept(XMLStreamReader reader)   // StreamFilter
  • 50. EventFilters and StreamFilters II. Implementing a filter is simple:
      public class CharFilter implements EventFilter
      {
          public boolean accept(XMLEvent event)
          {
              return (event.getEventType() ==
                      XMLStreamConstants.CHARACTERS);
          }
      }
      The filter above accepts only character events.
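A filter takes effect once it is attached to a reader with XMLInputFactory.createFilteredReader. A minimal sketch, assuming the StAX implementation bundled with a current JDK; the class name CharacterEventsDemo and the sample data in the usage note are illustrative.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.EventFilter;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical demo: reads only the character data of a document.
final class CharacterEventsDemo {

    static List<String> characterData(String xml) {
        List<String> chunks = new ArrayList<String>();
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            XMLEventReader all =
                    factory.createXMLEventReader(new StringReader(xml));

            // Same idea as a characters-only filter, written inline.
            EventFilter charactersOnly = new EventFilter() {
                @Override
                public boolean accept(XMLEvent event) {
                    return event.isCharacters();
                }
            };

            // The filtered reader silently skips every rejected event.
            XMLEventReader filtered =
                    factory.createFilteredReader(all, charactersOnly);
            while (filtered.hasNext()) {
                chunks.add(filtered.nextEvent().asCharacters().getData());
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("parsing failed", e);
        }
        return chunks;
    }
}
```

For a food entry with the name "Avocado Dip" and 110 calories, the returned chunks, joined together, contain just the text "Avocado Dip110"; start and end tags never reach the loop.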
  • 51. Writing elements I. StAX, as a symmetric API that handles both input and output, is able to write XML data. It provides two interfaces for that:
      XMLEventWriter (extends XMLEventConsumer)
      XMLStreamWriter
      The basic difference between them is that XMLEventWriter offers fewer functions.
  • 52. Writing elements II:
      // using XMLStreamWriter
      OutputStream console = System.out;
      XMLOutputFactory of = XMLOutputFactory.newInstance();
      XMLStreamWriter sw = of.createXMLStreamWriter(console);
      sw.writeStartDocument("1.0");
      // create a document with one meal
      sw.writeStartElement("nutrition");
      sw.writeStartElement("food");
      sw.writeStartElement("name");
      sw.writeCharacters("Chocolate ice cream");
      sw.writeEndElement(); // </name>
      sw.writeEndElement(); // </food>
      sw.writeEndElement(); // </nutrition>
      sw.writeEndDocument();
      sw.flush(); // push buffered output to the console
  • 53. Writing elements III:
      // the same using XMLEventWriter
      OutputStream console = System.out;
      XMLEventFactory xef = XMLEventFactory.newInstance();
      XMLOutputFactory of = XMLOutputFactory.newInstance();
      XMLEventWriter ew = of.createXMLEventWriter(console);
      ew.add(xef.createStartDocument("UTF-8", "1.0"));
      // arguments are (prefix, namespaceUri, localName); empty strings mean "no namespace"
      ew.add(xef.createStartElement("", "", "nutrition"));
      ew.add(xef.createStartElement("", "", "food"));
      ew.add(xef.createStartElement("", "", "name"));
      ew.add(xef.createCharacters("Chocolate ice cream"));
      // createEndElement takes the same three arguments as createStartElement
      ew.add(xef.createEndElement("", "", "name"));
      ew.add(xef.createEndElement("", "", "food"));
      ew.add(xef.createEndElement("", "", "nutrition"));
      ew.add(xef.createEndDocument());
      ew.flush(); // push buffered output to the console
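The two writer snippets can be put side by side in one runnable sketch. This version writes to a `StringWriter` instead of `System.out` so the result can be inspected. The class name `WriterSketch` and both method names are hypothetical; the `javax.xml.stream` calls are the standard ones used on the slides.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

public class WriterSketch {

    /** Cursor-style writing: one method call per piece of markup. */
    static String withStreamWriter() {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter sw = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            sw.writeStartDocument("1.0");
            sw.writeStartElement("nutrition");
            sw.writeStartElement("food");
            sw.writeStartElement("name");
            sw.writeCharacters("Chocolate ice cream");
            sw.writeEndElement(); // </name>
            sw.writeEndElement(); // </food>
            sw.writeEndElement(); // </nutrition>
            sw.writeEndDocument();
            sw.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not write XML", e);
        }
        return out.toString();
    }

    /** Event-style writing: build event objects, then add them to the writer. */
    static String withEventWriter() {
        StringWriter out = new StringWriter();
        try {
            XMLEventFactory xef = XMLEventFactory.newInstance();
            XMLEventWriter ew = XMLOutputFactory.newInstance().createXMLEventWriter(out);
            ew.add(xef.createStartDocument("UTF-8", "1.0"));
            // Arguments are (prefix, namespaceUri, localName).
            ew.add(xef.createStartElement("", "", "nutrition"));
            ew.add(xef.createStartElement("", "", "food"));
            ew.add(xef.createStartElement("", "", "name"));
            ew.add(xef.createCharacters("Chocolate ice cream"));
            ew.add(xef.createEndElement("", "", "name"));
            ew.add(xef.createEndElement("", "", "food"));
            ew.add(xef.createEndElement("", "", "nutrition"));
            ew.add(xef.createEndDocument());
            ew.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not write XML", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(withStreamWriter());
        System.out.println(withEventWriter());
    }
}
```

The event-based variant is more verbose, but the events are ordinary objects, so they could also come straight from an `XMLEventReader` when copying or transforming a document.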
  • 54. XmlPull: XmlPull is a predecessor of StAX. Although StAX has become a popular standard for parsing XML data, XmlPull has not been retired. Thanks to its small footprint (the JAR file is only 9 kB), XmlPull has found use on devices with limited memory, and it is often used in mobile application development. http://www.xmlpull.org/
  • 55. Comparing capabilities I: Developing applications that process XML data always involves choosing a parser. Selecting the proper API is essential to the success of the project, but the choice is not easy. Before deciding, ask yourself a few questions: What needs to be done (using the parser)? Is the application platform-dependent? If so, what is the platform? Is it a distributed system?
  • 59. Comparing capabilities II
  • 60. Benchmarks I. Figures from http://piccolo.sourceforge.net/bench.html
  • 61. Benchmarks II. Figures from http://piccolo.sourceforge.net/bench.html
  • 62. Benchmarks III. Figures from http://www.xml.com/lpt/a/1702
  • 63. Benchmarks IV. Figure from http://www.ximpleware.com/benchmark1.html
  • 64. CASE STUDY: Parsing Really Simple Syndication documents
  • 65. RSS definition: RSS is a family of Web feed formats used to publish frequently updated content. An RSS document (which is called a "feed", "web feed" or "channel") contains either a summary of content from an associated web site or the full text, stored as XML. RSS makes it possible for people to keep up with web sites in an automated manner that can be piped into applications or filtered displays. Source: http://en.wikipedia.org/wiki/RSS
  • 66. The initials "RSS" are used to refer to the following formats: Really Simple Syndication (RSS 2.0); RDF Site Summary (RSS 1.0 and RSS 0.90); Rich Site Summary (RSS 0.91). When creating a solution for reading and writing RSS documents, we must remember that RSS is not a standard and has no XML Schema document (or DTD) describing its structure! The only reference can be found at: http://www.rssboard.org/rss-specification
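Because there is no schema to validate against, a reader has to rely on the element names given in the specification. The following is a minimal sketch of that idea using StAX. It is not the jNivo plugin code; the class name `RssTitlesSketch` and the helper `itemTitles` are hypothetical, and the sketch assumes an RSS 2.0 layout in which each `item` element has a `title` child containing plain text.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class RssTitlesSketch {

    /** Hypothetical helper: collects the <title> of every <item> in an RSS 2.0 feed. */
    static List<String> itemTitles(String rss) {
        List<String> titles = new ArrayList<String>();
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(rss));
            boolean insideItem = false;
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = reader.getLocalName();
                    if ("item".equals(name)) {
                        insideItem = true;
                    } else if (insideItem && "title".equals(name)) {
                        // getElementText() reads up to the matching end tag.
                        titles.add(reader.getElementText());
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && "item".equals(reader.getLocalName())) {
                    insideItem = false;
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not parse feed", e);
        }
        return titles;
    }

    public static void main(String[] args) {
        String feed = "<rss version=\"2.0\"><channel><title>Feed</title>"
                + "<item><title>A</title></item><item><title>B</title></item>"
                + "</channel></rss>";
        System.out.println(itemTitles(feed)); // the channel title is skipped
    }
}
```

The `insideItem` flag is what separates item titles from the channel title; without a schema, this kind of structural knowledge has to live in the application code.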
  • 67. The Code: Presenting jNivo RSS Exterior Plugin v.0.1
  • 68. Every API presented so far can be seen as difficult to learn and use. That is partly true: the XML APIs in Java have rather verbose syntax and hundreds of classes and interfaces that must be handled to process XML data. On top of that, there are several standards: javax.xml.stream.* (StAX, JSR-173); org.w3c.dom.* (DOM standard); org.xml.sax.* (SAX standard); JAXP
  • 69. Mark Reinhold (Chief Engineer for the Java Platform, Standard Edition, at Sun Microsystems) suggested a different way of expressing XML in the Java language (Java Technical Session 3441, TS-3441). Built in: java.lang.String "foo". New type: java.lang.XML <foo> (syntax!). New package: java.lang.xml.* (XML literals!)
  • 70. Proposed syntax I. Figure from [3]
  • 71. Proposed syntax II. Figure from [3]
  • 72. Much more... Obviously, the new syntax is not just syntactic sugar: it helps enforce a proper document structure and prevents instructions from being issued in the wrong order. Mark Reinhold also proposed: datatype coders; collections; a hybrid event/tree API; access by XPath; and more! His blog: http://blogs.sun.com/mr/
  • 73. Summary: Three different approaches to XML parsing. SAX (keywords: event-based, callback model, fast, cannot modify structure, interface-based API). DOM (keywords: builds a tree in memory, divided into modules, rather slow, can generate and modify documents). StAX (keywords: pull parsing, events pulled from a stream, consistent code!, can be used on mobile devices (XmlPull)). RSS parsing? It is difficult to decide on a parsing model; the most efficient option is to use an already implemented API, for example ROME http://rome.dev.java.net
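The three models can be compared on one small task, counting elements with a given name. This is only an illustrative sketch: the class name `ThreeApproachesSketch` and its method names are hypothetical, and the factories are used with their default, non-namespace-aware settings.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class ThreeApproachesSketch {

    /** SAX: the parser pushes events into a callback handler. */
    static int countWithSax(String xml, final String name) {
        final int[] count = {0};
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new InputSource(new StringReader(xml)),
                    new DefaultHandler() {
                        @Override
                        public void startElement(String uri, String localName,
                                                 String qName, Attributes attributes) {
                            if (name.equals(qName)) {
                                count[0]++;
                            }
                        }
                    });
        } catch (Exception e) {
            throw new IllegalStateException("SAX parsing failed", e);
        }
        return count[0];
    }

    /** DOM: the whole document is loaded into a tree, then queried. */
    static int countWithDom(String xml, String name) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName(name).getLength();
        } catch (Exception e) {
            throw new IllegalStateException("DOM parsing failed", e);
        }
    }

    /** StAX: the application pulls events from the stream in its own loop. */
    static int countWithStax(String xml, String name) {
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            int count = 0;
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && name.equals(reader.getLocalName())) {
                    count++;
                }
            }
            reader.close();
            return count;
        } catch (Exception e) {
            throw new IllegalStateException("StAX parsing failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<nutrition><food/><food/><drink/></nutrition>";
        System.out.println(countWithSax(xml, "food"));
        System.out.println(countWithDom(xml, "food"));
        System.out.println(countWithStax(xml, "food"));
    }
}
```

The SAX version needs a mutable holder because the counting happens inside a callback, the DOM version is the shortest but materializes the whole tree, and the StAX version keeps control flow in a plain loop.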
  • 74. Further reading: [1] Brett McLaughlin, Justin Edelson. Java & XML. O’Reilly Media, 3rd edition, 1 December 2006. [2] Cay S. Horstmann, Gary Cornell. Core Java, Volume II: Advanced Features. Prentice Hall PTR, 8th edition, 7 April 2008. [3] Mark Reinhold. Integrating XML into the Java Programming Language, TS-3441. http://developers.sun.com/learning/javaoneonline/sessions/2006/TS-3441/index.htm
  • 75. Further reading: [4] Jurgen Salecker. Hybrid Parser Architectural Pattern. http://developerlife.com/tutorials/?p=53. Various API documentation (for starters, it is good to search Wikipedia): Xerces 2 Java Parser http://xerces.apache.org/xerces2-j/; JAXP reference implementation https://jaxp.dev.java.net/; XOM, XML Object Model http://www.xom.nu/; JDOM, Java Document Object Model http://www.jdom.org/; StAX, Streaming API for XML http://stax.codehaus.org/; VTD-XML, a new way of processing XML http://vtd-xml.sourceforge.net/; and others...
  • 76. Questions? Why?... What if?...
  • 77. THANK YOU