XML Processing Md. Asfak Mahamud KAZ Software Ltd.
XML and Other Markup Languages SGML (1973) HTML (1989) XML (1996) “XML has several favorable attributes that distinguish it from other competing technologies. Programmers find XML easy to learn because it is human-readable. The downside, however, is that an XML document needs to be parsed for it to become machine-readable.” Ref: “XML on a Chip?”, a specially prepared document for Sun Microsystems by XimpleWare [6/9/2003]
Regular Language Ref: http://www.cs.nott.ac.uk/~txa/g51mal/notes-3x.pdf
XML is not regular. This can be proved with the pumping lemma. A proof: http://welbog.homeip.net/glue/53/XML-is-not-regular Ref: A Parallel Approach to XML Parsing, Wei Lu, Kenneth Chiu, Yinfei Pan
Typical XML Processing
Input XML → Parsing → Access → Modification → Serialization → Output XML
Semantic Analysis
Diagram annotations: Performance Bottleneck; Performance affected by parsing models
Ref: XML Document Parsing: Operational and Performance Characteristics, Tak Cheung Lam and Jianxun Jason Ding (Cisco Systems), Jyh-Charn Liu (Texas A&M University)
Steps in Parsing
1. Character Conversion: bit sequence (3C 61 3E) → character sequence ('<' 'a' '>')
2. Lexical Analysis (FSM): character sequence → token sequence ('<a>' 'X' '</a>')
3. Syntactic Analysis (PDA): token sequence → data representation (tree, event, integer array)
Steps 1 and 2 are invariant among different parsing models; step 3 is parsing model dependent.
Ref: XML Document Parsing: Operational and Performance Characteristics, Tak Cheung Lam and Jianxun Jason Ding (Cisco Systems), Jyh-Charn Liu (Texas A&M University)
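The three parsing steps can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the tokenizer handles start tags, end tags and text only (no attributes, comments or entities), and the names `convert_characters`, `tokenize` and `check_nesting` are hypothetical, not taken from the cited paper.

```python
import re

# Simplified token pattern (an assumption): end tag, start tag, or text run.
TOKEN_RE = re.compile(r"</[^<>]+>|<[^<>/]+>|[^<>]+")


def convert_characters(raw: bytes, encoding: str = "utf-8") -> str:
    """Step 1: character conversion (bit sequence -> character sequence)."""
    return raw.decode(encoding)


def tokenize(text: str) -> list[str]:
    """Step 2: lexical analysis; the regular expression stands in for the FSM."""
    return TOKEN_RE.findall(text)


def check_nesting(tokens: list[str]) -> bool:
    """Step 3: syntactic analysis; the explicit stack stands in for the PDA."""
    stack = []
    for token in tokens:
        if token.startswith("</"):
            # An end tag must match the most recently opened element.
            if not stack or stack.pop() != token[2:-1]:
                return False
        elif token.startswith("<"):
            stack.append(token[1:-1])
        # Text tokens do not affect nesting.
    return not stack


raw = bytes([0x3C, 0x61, 0x3E]) + b"X</a>"  # 0x3C 0x61 0x3E is "<a>"
text = convert_characters(raw)
tokens = tokenize(text)
well_formed = check_nesting(tokens)
```

The stack in `check_nesting` is the part a finite state machine cannot provide, which is why matching nested tags needs more than a regular language.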
XML Processing: DOM & SAX or StAX Ref: XML Document Parsing: Operational and Performance Characteristics, Tak Cheung Lam and Jianxun Jason Ding (Cisco Systems), Jyh-Charn Liu (Texas A&M University)
Why is DOM memory intensive? Ref: “XML on a Chip?”, a specially prepared document for Sun Microsystems by XimpleWare [6/9/2003]
Efficiency Problems of DOM and SAX/StAX Parsing Models Ref: VTD-XML-based Design and Implementation of GML Parsing Project, Lan Xiaoji, Su Jianqiang, Cai Jinbao
Efficiency Problems of DOM and SAX/StAX Parsing Models (contd.) “ Even a small change does the  DOM model make on the XML document;  it must decode the entire document first, and then  build the structure. It is a virtually overhead.” Ref: VTD-XML-based Design and Implementation of GML Parsing Project, Lan Xiaoji, Su Jianqiang, Cai Jinbao
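To make the difference between the two models concrete, here is a small sketch using Python's standard library (`xml.dom.minidom` for DOM and `xml.sax` for SAX). The sample document and the helper names are assumptions made for illustration. The DOM version builds the whole tree before any value is read; the SAX version sees events one at a time and keeps only what the handler chooses to store.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical sample document.
DOC = b"<shapes><color>red</color><color>blue</color></shapes>"


def colors_with_dom(data: bytes) -> list[str]:
    """DOM: the entire document becomes an in-memory tree first."""
    document = minidom.parseString(data)
    return [node.firstChild.data for node in document.getElementsByTagName("color")]


class ColorHandler(xml.sax.ContentHandler):
    """SAX: events arrive in document order; the handler keeps its own state."""

    def __init__(self):
        super().__init__()
        self.colors = []
        self._in_color = False
        self._parts = []

    def startElement(self, name, attrs):
        if name == "color":
            self._in_color = True
            self._parts = []

    def characters(self, content):
        # Text may be delivered in several chunks, so collect the pieces.
        if self._in_color:
            self._parts.append(content)

    def endElement(self, name):
        if name == "color":
            self.colors.append("".join(self._parts))
            self._in_color = False


def colors_with_sax(data: bytes) -> list[str]:
    handler = ColorHandler()
    xml.sax.parseString(data, handler)
    return handler.colors
```

Both functions return the same values for this input; the trade-off is that the DOM tree allows random access and modification afterwards, while the SAX pass is forward-only.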
XML Processing: VTD (Virtual Token Descriptor)
VTD-XML Ref: http://en.wikipedia.org/wiki/VTD-XML
VTD: inside VTD record Ref:  XML Document Parsing: Operational and Performance Characteristics  Tak Cheung Lam and Jianxun Jason Ding (Cisco Systems) Jyh-Charn Liu, Texas A&M University
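A VTD record can be thought of as a fixed-width integer that stores where a token sits in the original document instead of copying the token's text. The sketch below packs such a record into 64 bits. The field widths (4-bit token type, 8-bit depth, 20-bit length, 30-bit offset, 2 reserved bits) follow the layout commonly described for VTD-XML, but they are not stated on this slide and should be read as assumptions, as should the function names and the token type value.

```python
# Assumed bit layout, most significant bits first:
# [63..60] token type, [59..52] depth, [51..32] length, [31..30] reserved, [29..0] offset
TYPE_SHIFT, DEPTH_SHIFT, LENGTH_SHIFT = 60, 52, 32
TYPE_MASK, DEPTH_MASK, LENGTH_MASK, OFFSET_MASK = 0xF, 0xFF, 0xFFFFF, 0x3FFFFFFF


def pack_record(token_type: int, depth: int, length: int, offset: int) -> int:
    """Pack the four fields into one 64-bit integer."""
    if token_type > TYPE_MASK or depth > DEPTH_MASK:
        raise ValueError("token type or depth out of range")
    if length > LENGTH_MASK or offset > OFFSET_MASK:
        raise ValueError("length or offset out of range")
    return (
        (token_type << TYPE_SHIFT)
        | (depth << DEPTH_SHIFT)
        | (length << LENGTH_SHIFT)
        | offset
    )


def unpack_record(record: int) -> tuple[int, int, int, int]:
    """Recover (token_type, depth, length, offset) from a packed record."""
    return (
        (record >> TYPE_SHIFT) & TYPE_MASK,
        (record >> DEPTH_SHIFT) & DEPTH_MASK,
        (record >> LENGTH_SHIFT) & LENGTH_MASK,
        record & OFFSET_MASK,
    )


document = b"<a>X</a>"
START_TAG = 0  # hypothetical token type value
record = pack_record(START_TAG, depth=0, length=1, offset=1)  # element name "a"
token_type, depth, length, offset = unpack_record(record)
name = document[offset:offset + length]
```

Because the record only points into `document`, the original bytes must stay in memory for the token text to be recoverable.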
XML Processing: VTD Ref: XML Document Parsing: Operational and Performance Characteristics, Tak Cheung Lam and Jianxun Jason Ding (Cisco Systems), Jyh-Charn Liu (Texas A&M University)
VTD-XML Parsed Representation of XML.   Image:  http://vtd-xml.sourceforge.net/technical/2.html
VTD-XML Resolving child elements using Location Cache.   Image:  http://vtd-xml.sourceforge.net/technical/2.html
James Clark (in 2002) Ref: Keeping pace with James Clark https://www.ibm.com/developerworks/xml/library/x-jclark.html?dwzone=xml http://www.jclark.com/bio.htm
VTD-XML has both DOM-like and SAX-like features. Ref: http://vtd-xml.sourceforge.net/technical/3.html
VTD
n1 = total number of tokens (including ending tags)
n2 = number of tokens for starting tags
s = document size in bytes
Total size of VTD records in bytes (without ending tags): 8 x (n1 - n2)
Total size of location caches (LCs) in bytes, fully indexed (i.e. one LC entry per element): 8 x n2
Memory usage in bytes: s + 8 x (n1 - n2) + 8 x n2 = s + 8 x n1
Ref: http://vtd-xml.sourceforge.net/benchmark4.html http://vtd-xml.sourceforge.net/technical/2.html
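The memory estimate is simple enough to check with a small helper. The function name and the sample figures below are made up for illustration; only the formula comes from the slide.

```python
def vtd_memory_bytes(s: int, n1: int, n2: int) -> int:
    """Estimated memory use: document bytes + VTD records + location cache entries."""
    vtd_records = 8 * (n1 - n2)   # 8 bytes per VTD record
    location_cache = 8 * n2       # 8 bytes per LC entry, one per element
    return s + vtd_records + location_cache


# Hypothetical document: 1,000,000 bytes, 60,000 tokens, 20,000 starting tags.
estimate = vtd_memory_bytes(s=1_000_000, n1=60_000, n2=20_000)
```

Since the n2 terms cancel, the estimate depends only on the document size and the total token count.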
VTD Ref: http://vtd-xml.sourceforge.net/benchmark4.html
VTD Ref: http://vtd-xml.sourceforge.net
Incremental Update (do not touch content that does not need to change)
<color> red </color>
DOM Approach: 1. Build the DOM tree 2. Navigate to and then update the text node 3. Write the updated structure back into XML
“if we humans can edit XML like this, why can't XML parsers” - Jimmy Zhang, JavaWorld.com, 07/24/06
Ref: http://www.javaworld.com/javaworld/jw-07-2006/jw-0724-vtdxml.html
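The contrast between the two update styles can be sketched as follows. This is not the VTD-XML API: `update_in_place` is a hypothetical helper that splices bytes at a known offset and length, which is the kind of information a VTD record would supply, and the offset here is found with a plain byte search as a stand-in. The DOM version uses Python's `xml.dom.minidom`.

```python
from xml.dom import minidom

# Hypothetical sample document.
SOURCE = b"<shape><color> red </color><size>10</size></shape>"


def update_with_dom(data: bytes, new_text: str) -> bytes:
    """DOM approach: build the tree, update the text node, serialize everything."""
    document = minidom.parseString(data)
    text_node = document.getElementsByTagName("color")[0].firstChild
    text_node.data = new_text
    return document.documentElement.toxml().encode("utf-8")


def update_in_place(data: bytes, offset: int, length: int, new_bytes: bytes) -> bytes:
    """Incremental approach: replace one byte range, copy the rest untouched."""
    return data[:offset] + new_bytes + data[offset + length:]


old_text = b" red "
offset = SOURCE.index(old_text)  # stand-in for an offset taken from a token record
patched = update_in_place(SOURCE, offset, len(old_text), b" blue ")
rebuilt = update_with_dom(SOURCE, " blue ")
```

Both routes produce the same document here, but the incremental one never decodes or re-serializes the parts that did not change.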
VTD on Android Platform Ref: Analyzing XML Parsers Performance for Android Platform    M V Uttam Tej ,Dhanaraj Cheelu, M.Rajasekhara Babu, P Venkata Krishna SCSE, VIT University, Vellore, Tamil Nadu
Ref:  XML Document Parsing: Operational and Performance Characteristics  Tak Cheung Lam and Jianxun Jason Ding (Cisco Systems) Jyh-Charn Liu, Texas A&M University
Comparisons  (contd.) Ref:  XML Document Parsing: Operational and Performance Characteristics  Tak Cheung Lam and Jianxun Jason Ding (Cisco Systems) Jyh-Charn Liu, Texas A&M University
Comparisons   (contd.) Ref:  XML Document Parsing: Operational and Performance Characteristics  Tak Cheung Lam and Jianxun Jason Ding (Cisco Systems) Jyh-Charn Liu, Texas A&M University
VTD-XML’s Limitations Ref: http://en.wikipedia.org/wiki/VTD-XML
Parallel Approach to XML Parsing A Parallel Approach to XML Parsing Wei Lu, Kenneth Chiu, Yinfei Pan
Parallel Approach to XML Parsing (cont.) A Parallel Approach to XML Parsing Wei Lu, Kenneth Chiu, Yinfei Pan
Limitations of PXP “First, the skeleton requires extra memory that is proportional to the number of node in the DOM tree. Further, the partitioning scheme based on subtrees can cause load imbalance on processing cores for XML documents with irregular or deep tree structures (e.g., TREEBANK with parts-of-speech tagging [29]). This scheme severely limits the granularity of parallelism that can be achieved, and thus cannot scale with increasing core count.” Ref: Section 2.2, Prior Work on Parallel XML Parsing, in “A Data Parallel Algorithm for XML DOM Parsing”, Bhavik Shah (1), Praveen R. Rao (1), Bongki Moon (2), and Mohan Rajagopalan (3); (1) University of Missouri-Kansas City, (2) University of Arizona, (3) Intel Research Labs
ParDOM Ref:  “ A Data Parallel Algorithm for XML DOM Parsing” Bhavik Shah 1 , Praveen R. Rao 1 , and Bongki Moon 2  and Mohan Rajagopalan 3 1 University of Missouri-Kansas City 2 University of Arizona 3 Intel Research Labs
ParDOM (contd) Ref:  “ A Data Parallel Algorithm for XML DOM Parsing” Bhavik Shah 1 , Praveen R. Rao 1 , and Bongki Moon 2  and Mohan Rajagopalan 3 1 University of Missouri-Kansas City 2 University of Arizona 3 Intel Research Labs
ParDOM (contd) Ref:  “ A Data Parallel Algorithm for XML DOM Parsing” Bhavik Shah 1 , Praveen R. Rao 1 , and Bongki Moon 2  and Mohan Rajagopalan 3 1 University of Missouri-Kansas City 2 University of Arizona 3 Intel Research Labs