Trinity College, Dublin: 8 – 11 June 2010
An Introduction to Open XML
CRAIG MURPHY

Housekeeping
• Mobile ‘phones
• Fire Exits
• Toilets

Session Overview
• This session will provide an explanation and demonstration of how we can programmatically create and use WordML and ExcelML documents
• I will be using the Open XML SDK to make life easier
  - No manual creation and management of .zip files / containers
  - Let System.IO.Packaging, etc. take care of that
  - Avoids a discussion about code bloat, XML bloat and performance (which is actually very good)
• It won’t be a political view of the “document wars” debate
  - There will be no XPS vs PDF vs Open XML vs ODF / OpenDocument content!

If you learn one thing from my session…
• On this day…June 8th…
  - 1978: Woman takes world sailing record. Yachtswoman Naomi James breaks the solo round-the-world sailing record by two days.

Office 2010 – First Run

Disclaimer
• This session includes some content from Microsoft slide decks
• Not going to be an in-depth look at the Open XML API
  - Code and demonstrations to get you started
  - Simplified version of the methods I use to generate custom reports in a non-production version of a production application!
• I’m a developer, not a designer!
  - No flashy graphics or fancy documents
• Let’s ignore the i4i injunction a Judge in Texas imposed on Microsoft Word!

About Me
• 60+ presentations delivered:
  - IMTC 2008, epicenter 2009
  - NRW06, NRW07
  - DeveloperDeveloperDeveloper (UK / Ireland Community Events)
  - Scottish Developers
  - Agile Scotland
  - British Computer Society (BCS)
  - UK Borland User Group (DDG)
  - Visual Basic User Group (VBUG)
  - VBUG .net Winter 2001 conference
  - XML One 2001
• 60+ articles/book reviews published:
  - The Delphi Magazine
  - developers’ magazine (Dotnet Developers’ Group - DDG)
  - ASPToday.com (now Wiley, previously Wrox)
  - ASP.NET Pro, International Developer
  - CSharpCorner, DeveloperFusion
Open XML
XML
XSLT
XQuery
XML Schema
SOAP
WML
IntraWeb
Web Services
C# InterOp with Delphi
RUP
UML
TDD in C#, VB.net and Delphi 8
Scrum

Agenda
• Motivation
• The Tools
• What: Open XML SDK 2, API Design
• How: Demos, Code Generation, Injection, Content Controls
• Why: Summary
• Resources

Motivation
• There are times when we are too focused on application development
  - New/useful tools and techniques are passed by
• 60-90 minute sessions like these, personally, help me save time by:
  - Identifying new/useful tools and techniques
  - Demonstrating new/useful tools and techniques
• Your takeaway: is Open XML something you should be investigating further, or not, as the case may be?
• I have been using Excel automation (COM type libraries) for report creation…since 1999
  - Gone through the “macro dilemma” – to use macros or not?
  - For Win32 Borland Delphi applications
  - For Win32 .net C# applications

The Tools
• Visual Studio 2010 Professional
• Open XML SDK 2 RTM (March 2010)
  - Sits inside the .NET 3.5 SP1 space (more about this later on); SDK makes use of LINQ
• Office 2010 Standard
  - Only required for viewing documents
  - Unlike COM-based automation, an Office client is not required
  - A boon if you are preparing reports server-side
• Previously used
  - Visual Studio 2008 Professional
  - Office 2007
  - Open XML SDK CTPs

Agenda
• Motivation
• The Tools
• What: Open XML SDK 2, API Design
• How: Demos - Manual, Code Generation, Injection
• Why: Summary
• Resources

Open XML SDK 2
• Productivity Tool
  - DocumentReflector for code generation
  - OpenXMLClassExplorer to explore the Open XML markup and the ECMA-376 specification
  - OpenXMLDiff to graphically compare Open XML files
  - OpenXMLValidator to validate entire documents or “document parts” against Office 2007 or Office 2010 file formats

What is Open XML?
• …an open standard for word-processing documents, presentations, and spreadsheets that can be freely implemented by multiple applications on different platforms
• …faithful representation of existing word-processing documents, presentations, and spreadsheets that are encoded in binary formats defined by Microsoft® Office applications, i.e. tightly coupled
• …purpose of the Open XML standard is to de-couple documents created by Microsoft Office applications so that they can be manipulated by other applications independent of proprietary formats and without the loss of data
https://connect.microsoft.com/content/content.aspx?ContentID=9521&SiteID=589&wa=wsignin1.0

Before…Open XML SDK V2
• Namespaces, element names and attributes were irksome to remember and to get right
  - Generally, constants were used to make managing namespaces, etc. that bit easier
• Lack of strong typing
  - Code would compile
  - May produce incorrect results at run-time

<w:document xmlns:w='http://schemas.openxmlformats.org/wordprocessingml/2006/main'>
  <w:body>
    <w:p><w:r><w:t>some text</w:t></w:r></w:p>
  </w:body>
</w:document>
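
The "code would compile, may produce incorrect results at run-time" point can be reproduced in any language that addresses XML nodes by string. The following is a minimal sketch in Python's standard library, not the C# SDK used in this session; the function name `paragraph_texts` and the deliberately misspelled tag are illustrative assumptions. It parses the markup shown above and shows that a typo in an element name raises no error and simply matches nothing.

```python
import xml.etree.ElementTree as ET

# Namespace URI copied from the sample markup on the slide.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

SAMPLE = (
    f"<w:document xmlns:w='{W_NS}'>"
    "<w:body><w:p><w:r><w:t>some text</w:t></w:r></w:p></w:body>"
    "</w:document>"
)


def paragraph_texts(xml_text, paragraph_tag="p"):
    """Return the text of each paragraph; element names are plain strings."""
    root = ET.fromstring(xml_text)
    texts = []
    for para in root.iter(f"{{{W_NS}}}{paragraph_tag}"):
        # Concatenate every w:t text node found under this paragraph.
        texts.append("".join(t.text or "" for t in para.iter(f"{{{W_NS}}}t")))
    return texts


print(paragraph_texts(SAMPLE))        # correct element name
print(paragraph_texts(SAMPLE, "pp"))  # misspelled element name: no error, no matches
```

The second call is the failure mode that a strongly typed object model removes: the mistake is only visible as an empty result at run-time.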

Now…Open XML SDK V2
• Strongly Typed Object Model
  - Node identification using strings is a thing of the past
  - Loosely typed System.Xml.Linq.XElement usage can be replaced
  - e.g. DocumentFormat.OpenXml.Wordprocessing.Paragraph
  - Spelling mistakes are caught by compile-time type checking
  - Obviously strong typing is preferable

BEFORE
// w is assumed to be the WordprocessingML XNamespace
var paragraphs = doc.MainDocumentPart
    .GetXDocument()
    .Element(w + "document")
    .Element(w + "body")
    .Elements(w + "p")
    .Select(...);  // projection not shown on the slide

AFTER
var paragraphs = doc.MainDocumentPart
    .Document.Body.Elements<Paragraph>()
    .Select(...);  // projection not shown on the slide

API Design

API Design
System Support
• .NET Framework 3.5 – The Open XML SDK leverages the advanced technology provided by .NET Framework 3.5, especially LINQ to XML, which makes manipulating XML much easier and more intuitive
• System.IO.Packaging – The Open XML SDK needs to be able to add/remove parts contained within the Open XML Format packages. Included as part of .NET Framework 3.0 was a set of generic packaging APIs capable of adding and removing parts of OPC (Open Packaging Conventions) conforming packages. Given that Open XML Formats are based on OPC, the SDK uses System.IO.Packaging APIs to open, edit and save Open XML Packages
• Open XML Schemas – The Open XML SDK is based on Open XML Formats, which are represented and described as schemas. These schemas make up the foundation of the Open XML SDK, since the SDK enables Open XML developers to build solutions on top of Open XML Formats

API Design
Open XML File Format Base Level
• Stream Reading/Writing
  - Includes stream reader and writer interfaces targeting Open XML elements and attributes
  - Similar to XmlReader/XmlWriter, but easier to use as the interfaces are Open XML aware
• Open XML Low Level DOM
  - Manipulate the Open XML tree directly by working with strongly typed objects and classes instead of traditional XML nodes
  - Awareness of namespaces as well as element/attribute names is reduced
  - IntelliSense for properties, etc.
  - Leverages LINQ
• Open XML Packaging API
  - Sits above System.IO.Packaging (.NET 3.0)
  - Allows developers to manipulate Open XML parts with strongly typed classes and objects
  - Shipped in Open XML SDK v1.0

API Design
Validation & Helpers
• Validation Layer
  - Open XML base layer does not guarantee creation of valid Open XML documents!
  - Our reliance on XML Schema, XSD files, is reduced if not removed
  - The SDK takes care of it on our behalf
• Helper Functions
  - Work directly on the XML elements and are functionally limited by the file format standard
  - e.g. deletion of a WordML paragraph – a helper function may ensure that all additional steps are taken to leave the document in a valid state…

The Importance of Validation
• http://blogs.msdn.com/brian_jones/archive/2009/04/08/announcing-the-release-of-the-open-xml-sdk-version-2-april-2009-ctp.aspx

Valid: the text element (w:t) sits inside a run (w:r)
<w:body>
  <w:p>
    <w:r>
      <w:t>hello world</w:t>
    </w:r>
  </w:p>
  ...
</w:body>

Invalid: the text element is a direct child of the paragraph (w:p)
<w:body>
  <w:p>
    <w:t>hello world</w:t>
  </w:p>
  ...
</w:body>
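
To make the difference between the two fragments concrete, here is a small sketch of one hand-picked structural rule, written in Python's standard library. It is not the SDK's validation layer and checks far less: the function name `misplaced_text_elements` is hypothetical, and the single rule it applies (a `w:t` must have a `w:r` parent) is a simplification chosen to match the slide.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = f"{{{W_NS}}}"  # ElementTree spells qualified names as "{namespace}local"

VALID = (
    f"<w:body xmlns:w='{W_NS}'>"
    "<w:p><w:r><w:t>hello world</w:t></w:r></w:p>"
    "</w:body>"
)
INVALID = (
    f"<w:body xmlns:w='{W_NS}'>"
    "<w:p><w:t>hello world</w:t></w:p>"
    "</w:body>"
)


def misplaced_text_elements(xml_text):
    """Count w:t elements whose direct parent is not a run (w:r)."""
    root = ET.fromstring(xml_text)
    count = 0
    for parent in root.iter():
        if parent.tag == W + "r":
            continue  # text inside a run is where it belongs
        count += sum(1 for child in parent if child.tag == W + "t")
    return count


print(misplaced_text_elements(VALID))    # 0
print(misplaced_text_elements(INVALID))  # 1
```

Both fragments are well-formed XML, which is why a plain XML parser accepts them; only a check that knows the document format can tell them apart.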

Agenda
• Motivation
• The Tools
• What: Open XML SDK 2, API Design
• How: Demos, Code Generation, Injection, Content Controls
• Why: Summary
• Resources

WordML
Document Structure
• Take a .docx, an .xlsx or a .pptx file, rename it as a .zip file
• Open using Compressed Folders or your favourite zip utility
• Very readable, but without the SDK, difficult to manage, especially in code
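
The rename-to-.zip exercise can also be done in code. The sketch below uses Python's standard `zipfile` module rather than the SDK; it builds a small WordprocessingML package in memory (the three parts used here are assumed to be the minimum) and then lists what is inside. `build_minimal_docx` and `list_parts` are illustrative names, and the result has not been opened in Word as part of this sketch.

```python
import io
import zipfile

CONTENT_TYPES = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '<Override PartName="/word/document.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>'
    "</Types>"
)
PACKAGE_RELS = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument" Target="word/document.xml"/>'
    "</Relationships>"
)
DOCUMENT = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">'
    "<w:body><w:p><w:r><w:t>some text</w:t></w:r></w:p></w:body>"
    "</w:document>"
)


def build_minimal_docx():
    """Return the bytes of a small .docx package built by hand."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("[Content_Types].xml", CONTENT_TYPES)
        package.writestr("_rels/.rels", PACKAGE_RELS)
        package.writestr("word/document.xml", DOCUMENT)
    return buffer.getvalue()


def list_parts(docx_bytes):
    """List the entries of a package, as a zip utility would show them."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as package:
        return package.namelist()


for name in list_parts(build_minimal_docx()):
    print(name)
```

Even at this size the bookkeeping (content types, relationships, exact part names) is all manual, which is the management burden the SDK is meant to remove.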

Document Parts
• A document part is…
  - analogous to a file on the file system
  - stored inside the package in a specific location reachable via a URI
  - stored with a specific content type
  - mainly XML but other native types as well
  - Images, sounds, video, OLE objects
• Content type is enforced
  - Example: cannot tag JPEG part as GIF
• [Open Excel - sample file – look for the image]
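
How a part gets its content type can be sketched from the package's `[Content_Types].xml` stream. The example below is a simplified Python illustration, not SDK code: the sample XML, the part names and the function `content_type_for` are assumptions made for the example. It applies the lookup order used by the packaging conventions, where an `Override` for the exact part name wins over a `Default` for the file extension.

```python
import xml.etree.ElementTree as ET

CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"

# Hypothetical content-types stream for a workbook that embeds one JPEG.
TYPES_XML = (
    f'<Types xmlns="{CT_NS}">'
    '<Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '<Default Extension="jpeg" ContentType="image/jpeg"/>'
    '<Override PartName="/xl/workbook.xml" ContentType="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet.main+xml"/>'
    "</Types>"
)


def content_type_for(part_name, types_xml=TYPES_XML):
    """Resolve a part's content type: Override by name, then Default by extension."""
    root = ET.fromstring(types_xml)
    for override in root.findall(f"{{{CT_NS}}}Override"):
        if override.get("PartName") == part_name:
            return override.get("ContentType")
    extension = part_name.rsplit(".", 1)[-1].lower()
    for default in root.findall(f"{{{CT_NS}}}Default"):
        if default.get("Extension").lower() == extension:
            return default.get("ContentType")
    return None  # no declared type for this part


print(content_type_for("/xl/workbook.xml"))
print(content_type_for("/xl/media/image1.jpeg"))
print(content_type_for("/xl/media/image2.gif"))  # None: .gif is not declared
```

The comparison here is an exact string match on the part name, which is a simplification of the real matching rules.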

ExcelML
Document Structure
• Relationships are stored in XML streams in the package
• Ties elements inside the package to each other
• Allows navigation of document without parsing parts
• Package relationships stream URI: /_rels/.rels
• Part relationships stream URI: _rels/[partname].rels
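
Navigating by relationships rather than by parsing parts can be sketched as follows. This is Python standard library code, not the SDK; the in-memory package, the stub workbook content and the helper names `main_part_name` and `rels_path_for` are assumptions for illustration. The first helper reads only `_rels/.rels` to find the main part, and the second applies the `_rels/[partname].rels` naming rule from the slide.

```python
import io
import posixpath
import zipfile
import xml.etree.ElementTree as ET

REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
OFFICE_DOCUMENT = (
    "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument"
)


def main_part_name(package):
    """Find the main part by reading only the package relationships stream."""
    root = ET.fromstring(package.read("_rels/.rels"))
    for rel in root.findall(f"{{{REL_NS}}}Relationship"):
        if rel.get("Type") == OFFICE_DOCUMENT:
            return rel.get("Target")
    return None


def rels_path_for(part_name):
    """Map a part name to the zip entry holding that part's own relationships."""
    folder, name = posixpath.split(part_name.lstrip("/"))
    return posixpath.join(folder, "_rels", name + ".rels")


# Build a tiny stand-in package so the sketch is self-contained.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as z:
    z.writestr(
        "_rels/.rels",
        f'<Relationships xmlns="{REL_NS}">'
        f'<Relationship Id="rId1" Type="{OFFICE_DOCUMENT}" Target="xl/workbook.xml"/>'
        "</Relationships>",
    )
    z.writestr("xl/workbook.xml", "<workbook/>")  # stub, never parsed below

with zipfile.ZipFile(io.BytesIO(buffer.getvalue())) as package:
    main_part = main_part_name(package)

print(main_part)                 # xl/workbook.xml
print(rels_path_for(main_part))  # xl/_rels/workbook.xml.rels
```

Targets in real packages may be relative or start with a slash; this sketch handles only the simple relative form it writes itself.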

Demo: WordML and ExcelML

Content Controls
• New in Word 2007
• Manageable via the Word Content Control Toolkit
• Programmatic access to specific “fields” within a document
• “Bindable”
  - Can be bound to XML nodes
  - Makes use of the customXML folder

Enabling the Developer ribbon – Word 2007

Enabling the Developer ribbon – Word 2010

Why Use Content Controls?
• In situations where small amounts of information are collected from many users:
  - How often have you seen a spreadsheet being e-mailed to hundreds of users, asking them to fill in “some” cells?
  - Give them a Word document with Content Controls
  - Use a custom-written .NET application that aggregates the information in the Content Controls into an Excel spreadsheet
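
The aggregation idea can be sketched without Word or the SDK. The example below is a Python stand-in for the custom-written .NET application mentioned above, and it writes CSV rather than a spreadsheet to stay within the standard library. The tag names `Name` and `DaysRequested`, the sample values and all function names are hypothetical. It reads each content control (`w:sdt`) from a document body, keyed by its `w:tag` value, and collects one row per returned document.

```python
import csv
import io
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = f"{{{W_NS}}}"


def sample_document(name, days):
    """Build a stand-in document.xml holding two tagged content controls."""
    def control(tag, value):
        return (
            f'<w:sdt><w:sdtPr><w:tag w:val="{tag}"/></w:sdtPr>'
            f"<w:sdtContent><w:p><w:r><w:t>{value}</w:t></w:r></w:p></w:sdtContent>"
            "</w:sdt>"
        )
    return (
        f'<w:document xmlns:w="{W_NS}"><w:body>'
        + control("Name", name)
        + control("DaysRequested", days)
        + "</w:body></w:document>"
    )


def read_content_controls(document_xml):
    """Return {tag: displayed text} for every tagged content control."""
    values = {}
    for sdt in ET.fromstring(document_xml).iter(W + "sdt"):
        tag = sdt.find(f"{W}sdtPr/{W}tag")
        content = sdt.find(W + "sdtContent")
        if tag is None or content is None:
            continue  # untagged or empty control: nothing to key on
        text = "".join(t.text or "" for t in content.iter(W + "t"))
        values[tag.get(W + "val")] = text
    return values


def aggregate_to_csv(documents, columns):
    """Collect one CSV row per document, one column per content control tag."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for document_xml in documents:
        values = read_content_controls(document_xml)
        writer.writerow({column: values.get(column, "") for column in columns})
    return out.getvalue()


replies = [sample_document("Alice", "3"), sample_document("Bob", "5")]
print(aggregate_to_csv(replies, ["Name", "DaysRequested"]))
```

This reads the text displayed inside each control. Controls bound to XML nodes also keep their data in the customXML folder, which this sketch does not touch.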

Demo: Content Controls and CustomXML in Word 2007 / Word 2010

Deployment
• All that you need to deploy are:
  - Your OpenXML-enabled application
  - DocumentFormat.OpenXml.dll
  - WindowsBase.dll
  - .NET (VPC test…)
• c:\Program Files\Reference Assemblies\Microsoft\Framework\v3.0\WindowsBase.dll
• http://blogs.msdn.com/dmahugh/archive/2006/12/14/finding-windowsbase-dll.aspx

Agenda
• Motivation
• The Tools
• What: Open XML SDK 2, API Design
• How: Demos - Manual, Code Generation, Injection
• Why: Summary
• Resources

Summary
• Open XML is little more than a moderately complex XML document
• XML is readily accessible
  - in the .NET framework
  - in VB6
  - in Java
  - in Python, etc.
• An Office installation is not required
  - Office client not required on the server
  - Enables Office document creation from non-Microsoft platforms
• “…it’s just zip, it’s just XML…” - Doug Mahugh
  - http://channel9.msdn.com/posts/AdamKinney/Open-XML-File-Formats
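
As an illustration of "it's just zip, it's just XML" on a platform with no Office installation and no SDK, here is a sketch that writes a small report workbook using only Python's standard library. The part layout follows the usual SpreadsheetML package structure; the sheet name, the sample rows and the helper names are assumptions, columns are limited to A through Z for brevity, and the output has not been opened in Excel as part of this sketch.

```python
import io
import zipfile
from xml.sax.saxutils import escape

SSML = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
REL = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
PKG_REL = "http://schemas.openxmlformats.org/package/2006/relationships"

STATIC_PARTS = {
    "[Content_Types].xml": (
        '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
        '<Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
        '<Default Extension="xml" ContentType="application/xml"/>'
        '<Override PartName="/xl/workbook.xml" ContentType="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet.main+xml"/>'
        '<Override PartName="/xl/worksheets/sheet1.xml" ContentType="application/vnd.openxmlformats-officedocument.spreadsheetml.worksheet+xml"/>'
        "</Types>"
    ),
    "_rels/.rels": (
        f'<Relationships xmlns="{PKG_REL}">'
        f'<Relationship Id="rId1" Type="{REL}/officeDocument" Target="xl/workbook.xml"/>'
        "</Relationships>"
    ),
    "xl/workbook.xml": (
        f'<workbook xmlns="{SSML}" xmlns:r="{REL}">'
        '<sheets><sheet name="Report" sheetId="1" r:id="rId1"/></sheets>'
        "</workbook>"
    ),
    "xl/_rels/workbook.xml.rels": (
        f'<Relationships xmlns="{PKG_REL}">'
        f'<Relationship Id="rId1" Type="{REL}/worksheet" Target="worksheets/sheet1.xml"/>'
        "</Relationships>"
    ),
}


def cell_xml(column, row_number, value):
    """Render one cell; numbers become values, everything else inline strings."""
    ref = f"{chr(ord('A') + column)}{row_number}"  # columns A..Z only
    if isinstance(value, (int, float)):
        return f'<c r="{ref}"><v>{value}</v></c>'
    return f'<c r="{ref}" t="inlineStr"><is><t>{escape(str(value))}</t></is></c>'


def sheet_xml(rows):
    """Render the worksheet part for a list of rows."""
    body = ""
    for row_number, row in enumerate(rows, start=1):
        cells = "".join(cell_xml(i, row_number, v) for i, v in enumerate(row))
        body += f'<row r="{row_number}">{cells}</row>'
    return f'<worksheet xmlns="{SSML}"><sheetData>{body}</sheetData></worksheet>'


def build_xlsx(rows):
    """Return the bytes of a one-sheet workbook package."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        for name, xml in STATIC_PARTS.items():
            package.writestr(name, xml)
        package.writestr("xl/worksheets/sheet1.xml", sheet_xml(rows))
    return buffer.getvalue()


report = build_xlsx([["Product", "Units"], ["Widgets", 12], ["R&D kits", 3]])
print(len(report), "bytes")
```

Writing `report` to a file with an .xlsx extension gives a package to inspect with the rename-to-.zip trick. For anything beyond a toy report, the SDK or a library built for the format is the more maintainable route.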

Summary
• Start from a template document
  - Easy replication of existing [client] documents
  - Use the DocumentReflector to generate Open XML code
  - Refactor your report data into the generated code
  - Learn from the reflected / generated code
• Open XML code is cleaner, more readable and more maintainable than its COM counterpart
• Open XML documents can be consumed using applications and platforms from vendors other than Microsoft

Resources (web-sites & blogs)
• Open XML Format SDK 2.0
  - http://url.ie/tik
• Microsoft’s Open XML portal
  - http://www.openxmldeveloper.org/
• If you are interested in Open XML / ODF conversion
  - http://sourceforge.net/projects/odf-converter
• http://www.twitter.com/openxml
• Microsoft folks:
  - Brian Jones http://blogs.msdn.com/brian_jones/
  - Doug Mahugh http://blogs.msdn.com/dmahugh/
  - Kevin Boske http://blogs.msdn.com/kevinboske/
  - Erika Ehrli http://blogs.msdn.com/erikaehrli/
  - Eric White http://blogs.msdn.com/ericwhite/

Resources (web-sites & blogs)
• Word 2007 Content Control Toolkit on CodePlex
  - http://www.codeplex.com/dbe
• Matthew Scott’s Content Controls and CustomXML Channel 9 video
  - http://url.ie/u05
• Wouter van Vugt
  - http://blogs.code-counsel.net/Wouter/default.aspx
• A collection of Open XML resources:
  - http://www.craigmurphy.com/blog/?p=871
  - Including these slides and C# source code

Resources (Books)
• Open XML Explained, Wouter van Vugt
  - http://openxmldeveloper.org/articles/1970.aspx

Contact Information
Craig Murphy
http://www.twitter.com/CAMURPHY
Updated slides, notes and source code:
http://www.CraigMurphy.com
http://www.CraigMurphy.com/blog
Questions

epicenter2010 Open Xml

  • 1.
  • 2. Trinity College, Dublin: 8 – 11 June 2010 An Introduction to Open XML CRAIG MURPHY
  • 3. An Introduction to Open XML Housekeeping  Mobile ‘phones  Fire Exits  Toilets 3
  • 4. An Introduction to Open XML 4 Session Overview  This session will provide an explanation and demonstration of how we can programmatically create and use WordML and ExcelML documents  I will be using the Open XML SDK to make life easier  No manual creation and management of .zip files / containers  Let System.IO.Packaging, etc. take care of that  Avoids a discussion about code bloat, XML bloat and performance (which is actually very good)  It won’t be a political view of the “document wars” debate  There will be no XPS vs PDF vs Open XML vs ODF / OpenDocument content!
  • 5. An Introduction to Open XML If you learn one thing from my session…  On this day…June 8th…  1978: Woman takes world sailing record Yachtswoman Naomi James breaks the solo round-the-world sailing record by two days.
  • 6. An Introduction to Open XML Office 2010 – First Run 6
  • 7. An Introduction to Open XML Disclaimer  This session includes some content from Microsoft slide decks  Not going to be an in-depth look at the Open XML API  Code and demonstrations to get you started  Simplified version of the methods I use to generate custom reports in a non-production version of production application!  I’m a developer, not a designer!  No flashy graphics or fancy documents   Let’s ignore the i4i injunction a Judge in Texas imposed on Microsoft Word! 7
  • 8. An Introduction to Open XML About Me  60+ presentations delivered:  IMTC 2008, epicenter 2009  NRW06, NRW07  DeveloperDeveloperDeveloper (UK / Ireland Community Events)  Scottish Developers  Agile Scotland  British Computer Society (BCS)  UK Borland User Group (DDG)  Visual Basic User Group (VBUG)  VBUG .net Winter 2001 conference  XML One 2001  60+ articles/book reviews published:  The Delphi Magazine  developers’ magazine (Dotnet Developers’ Group - DDG)  ASPToday.com (now Wiley, previously Wrox)  ASP.NET Pro, International Developer  CSharpCorner, DeveloperFusion 8 Open XML XML XSLT XQuery XML Schema SOAP WML IntraWeb Web Services C# InterOp with Delphi RUP UML TDD in C#, VB.net and Delphi 8 Scrum
  • 9. An Introduction to Open XML 9 Agenda  Motivation  The Tools  What: Open XML SDK 2, API Design  How: Demos, Code Generation, Injection, Content Controls  Why: Summary  Resources
  • 10. An Introduction to Open XML 10 Motivation  There are times when we are too focused on application development  New/useful tools techniques are passed by  60-90 minute sessions like these, personally, help me save time by:  Identifying new/useful tools and techniques  Demonstrating new/useful tools and techniques  Your takeaway: is Open XML something you should be investigating further, or not as the case may be  I have been using Excel automation (COM type libraries) for report creation…since 1999  Gone through the “macro dilemma” – to use macros or not?  For Win32 Borland Delphi applications  For Win32 .net C# applications
  • 11. An Introduction to Open XML 11 The Tools  Visual Studio 2010 Professional  Open XML SDK 2 RTM (March 2010)  Sits inside the .NET 3.5 SP1 space (more about this later on); SDK makes use of LINQ  Office 2010 Standard  Only required for viewing documents  Unlike COM-based automation, an Office client is not required  A boon if you are preparing reports server-side  Previously used  Visual Studio 2008 Professional  Office 2007  Open XML SDK CTPs
  • 12. An Introduction to Open XML 12 Agenda  Motivation  The Tools  What: Open XML SDK 2, API Design  How: Demos - Manual, Code Generation, Injection  Why: Summary  Resources
  • 13. An Introduction to Open XML Open XML SDK 2  Productivity Tool  DocumentReflector for code generation  OpenXMLClassExplorer explore the Open XML markup and the ECMA 376 specification  OpenXMLDiff graphically compare Open XML files  OpenXMLValidator to validate entire documents or “document parts” against Office 2007 or Office 2010 file formats 13
  • 14. An Introduction to Open XML What is Open XML?  …an open standard for word-processing documents, presentations, and spreadsheets that can be freely implemented by multiple applications on different platforms  …faithful representation of existing word-processing documents, presentations, and spreadsheets that are encoded in binary formats defined by Microsoft® Office applications, i.e. tightly coupled  …purpose of the Open XML standard is to de-couple documents created by Microsoft Office applications so that they can be manipulated by other applications independent of proprietary formats and without the loss of data https://connect.microsoft.com/content/content.aspx?ContentID=9521&SiteID=589&wa=wsignin1.0 14
  • 15. An Introduction to Open XML Before…Open XML SDK V2  Namespaces, element names and attributes were irksome to remember and to get right  Generally, constants were used to make managing namespaces, etc. that bit easier  Lack of strong typing  Code would compile  May produce incorrect results at run-time 15 <w:document xmlns:w='http://schemas.openxmlformats.org/wordprocessingml/2006/main'> <w:body><w:p><w:r><w:t>some text</w:t></w:r></w:p> </w:body> </w:document>
  • 16. An Introduction to Open XML Now…Open XML SDK V2  Strongly Typed Object Model  Node identification using strings is a thing of the past  Loosely typed System.Xml.Linq.XElement usage can be replaced  e.g. DocumentFormat.OpenXml.WordProcessing.Paragraph  Spelling mistakes are caught by compile-time type checking  Obviously strong typing is preferable 16 AFTER var paragraphs = doc.MainDocumentPart .Document.Body.Elements<Paragraph>() .Select BEFORE var paragraphs = doc.MainDocumentPart .GetXDocument() .Element(w + "document") .Element(w + "body") .Elements(w + "p") .Select
  • 17. An Introduction to Open XML API Design 17
  • 18. An Introduction to Open XML API Design System Support 18  .Net Framework 3.5 – The Open XML SDK leverages the advanced technology provided by .Net Framework 3.5, especially LINQ To XML, which makes manipulating XML much easier and more intuitive  System.IO.Packaging – The Open XML SDK needs to be able to add/remove parts contained within the Open XML Format packages. Included as part of .Net Framework 3.0 were a set of generic packaging APIs capable of adding and removing parts of OPC (Open Package Convention) conforming packages. Given that Open XML Formats are based on OPC, the SDK uses System.IO.Packaging APIs to open, edit and save Open XML Packages  Open XML Schemas – The Open XML SDK is based on Open XML Formats, which are represented and described as schemas. These schemas make up the foundation of the Open XML SDK, since the SDK enables Open XML developers to build solutions on top of Open XML Formats
  • 19. An Introduction to Open XML API Design Open XML File Format Base Level  Stream Reading/Writing  includes stream reader and writer interfaces targeting Open XML elements and attributes  similar to XmlReader/XmlWriter, easier to use as the interfaces are Open XML aware  Open XML Low Level DOM  Manipulate the Open XML tree directly by working with strongly typed objects and classes instead of traditional XML nodes  Awareness of namespaces as well as element/attribute names is reduced  Intellisense for properties, etc.  Leverages LINQ  Open XML Packaging API  Sits above System.IO.Packaging (.NET 3.0)  allows developers to manipulate Open XML parts with strongly typed classes and objects  Shipped in Open XML SDK v1.0 19
  • 20. An Introduction to Open XML API Design Validation & Helpers  Validation Layer  Open XML base layer does not guarantee creation of valid Open XML documents!  Our reliance on XML Schema, XSD files, is reduced if not removed  The SDK takes care of it on our behalf  Helper Functions  Work directly on the XML elements and are functionally limited by the file format standard  e.g. deletion of a WordML paragraph – a helper function may ensure that all additional steps are taken to leave the document is a valid state… 20
  • 21. An Introduction to Open XML The Importance of Validation  http://blogs.msdn.com/brian_jones/archive/2009/04/08/ announcing-the-release-of-the-open-xml-sdk-version-2- april-2009-ctp.aspx 21 <w:body> <w:p> <w:r> <w:t>hello world</w:t> </w:r> </w:p> ... </w:body> <w:body> <w:p> <w:t>hello world</w:t> </w:p> ... </w:body>
  • 22. An Introduction to Open XML 22 Agenda  Motivation  The Tools  What: Open XML SDK 2, API Design  How: Demos, Code Generation, Injection, Content Controls  Why: Summary  Resources
  • 23. An Introduction to Open XML WordML Document Structure 23  Take a .docx, an .xlsx or a .pptx file, rename it as a .zip file  Open using Compressed Folders or your favourite zip utility  Very readable, but without the SDK, difficult to manage, especially in code
  • 24. An Introduction to Open XML Document Parts  A document part is…  analogous to a file on the file system  stored inside the package in a specific location reachable via a URI  stored with a specific content type  mainly XML but other native types as well  Images, sounds, video, OLE objects  Content type is enforced  Example: cannot tag JPEG part as GIF  [Open Excel - sample file – look for the image] 24
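Content types for the parts are recorded in the package's `[Content_Types].xml` stream, either per file extension (`Default`) or per part (`Override`). The following Python sketch shows one way to look up a part's content type. The sample XML and the function name `content_type_for` are assumptions made for illustration, and the lookup ignores details such as case-insensitive part-name comparison.

```python
import xml.etree.ElementTree as ET

CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"

# A cut-down, illustrative [Content_Types].xml for a spreadsheet with an image.
SAMPLE_TYPES = f"""<Types xmlns="{CT_NS}">
  <Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
  <Default Extension="jpeg" ContentType="image/jpeg"/>
  <Override PartName="/xl/workbook.xml" ContentType="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet.main+xml"/>
</Types>"""


def content_type_for(part_name, types_xml):
    """Resolve a part's content type: an Override wins, then a Default by extension."""
    root = ET.fromstring(types_xml)
    for override in root.findall(f"{{{CT_NS}}}Override"):
        if override.get("PartName") == part_name:
            return override.get("ContentType")
    # Text after the last dot, so "/_rels/.rels" yields "rels".
    extension = part_name.rpartition(".")[2].lower()
    for default in root.findall(f"{{{CT_NS}}}Default"):
        if default.get("Extension", "").lower() == extension:
            return default.get("ContentType")
    return None
```

Because the JPEG part resolves to `image/jpeg` through its extension, a part with that extension cannot quietly be treated as a GIF, which is the point the slide makes about enforcement.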
  • 25. An Introduction to Open XML ExcelML Document Structure 25  Relationships are stored in XML streams in the package  Ties elements inside the package to each other  Allows navigation of document without parsing parts  Package relationships stream URI: /_rels/.rels  Part relationships stream URI: _rels/[partname].rels
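Since relationships are plain XML, the main part of a package can be found by reading `/_rels/.rels` alone, without opening any other part. The sketch below parses a relationships stream held in a string. The sample stream and the function names `read_relationships` and `main_part` are assumptions for this illustration.

```python
import xml.etree.ElementTree as ET

REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
OFFICE_DOCUMENT = (
    "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument"
)

# An illustrative package-level relationships stream (/_rels/.rels) for a workbook.
SAMPLE_RELS = f"""<Relationships xmlns="{REL_NS}">
  <Relationship Id="rId1" Type="{OFFICE_DOCUMENT}" Target="xl/workbook.xml"/>
</Relationships>"""


def read_relationships(rels_xml):
    """Return (Id, Type, Target) tuples for every relationship in the stream."""
    root = ET.fromstring(rels_xml)
    return [
        (rel.get("Id"), rel.get("Type"), rel.get("Target"))
        for rel in root.findall(f"{{{REL_NS}}}Relationship")
    ]


def main_part(rels_xml):
    """Return the target of the officeDocument relationship, or None if absent."""
    for _rel_id, rel_type, target in read_relationships(rels_xml):
        if rel_type == OFFICE_DOCUMENT:
            return target
    return None
```

The same parsing applies to part-level streams, which list the parts that a given part refers to, such as the worksheets of a workbook.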
  • 26. An Introduction to Open XML 26 demo WordML and ExcelML
  • 27. An Introduction to Open XML Content Controls  New in Word 2007  Manageable via the Word Content Control Toolkit  Programmatic access to specific “fields” within a document  “Bindable”  Can be bound to XML nodes  Makes use of the customXML folder 27
  • 28. An Introduction to Open XML Enabling the Developer ribbon – Word 2007 28
  • 29. An Introduction to Open XML Enabling the Developer ribbon – Word 2010 29
  • 30. An Introduction to Open XML Why Use Content Controls?  In situations where small amounts of information are collected from many users:  How often have you seen a spreadsheet being e-mailed to hundreds of users, asking them to fill in “some” cells?  Give them a Word document with Content Controls  Use a custom-written .NET application that aggregates the information in the Content Controls into an Excel spreadsheet 30
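The aggregation step can be sketched without the SDK. The slide describes a .NET application; the Python below only illustrates the idea, and it reads the values straight from the document markup rather than from a bound customXML part. It assumes each content control carries a `w:tag` naming the field. The sample markup, the tag names and the function name `content_control_values` are made up for this example.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}

# Illustrative document.xml content with two block-level content controls.
SAMPLE_DOCUMENT = f"""<w:document xmlns:w="{W_NS}"><w:body>
  <w:sdt>
    <w:sdtPr><w:tag w:val="FullName"/></w:sdtPr>
    <w:sdtContent><w:p><w:r><w:t>Ada Lovelace</w:t></w:r></w:p></w:sdtContent>
  </w:sdt>
  <w:sdt>
    <w:sdtPr><w:tag w:val="Department"/></w:sdtPr>
    <w:sdtContent><w:p><w:r><w:t>Engineering</w:t></w:r></w:p></w:sdtContent>
  </w:sdt>
</w:body></w:document>"""


def content_control_values(document_xml):
    """Map each tagged content control (w:sdt) to the text it contains."""
    root = ET.fromstring(document_xml)
    values = {}
    for sdt in root.iter(f"{{{W_NS}}}sdt"):
        tag = sdt.find("w:sdtPr/w:tag", NS)
        if tag is None:
            continue  # untagged controls cannot be matched to a field name
        name = tag.get(f"{{{W_NS}}}val")
        # A value may be split over several runs, so join every w:t in the content.
        pieces = [t.text or "" for t in sdt.findall("w:sdtContent//w:t", NS)]
        values[name] = "".join(pieces)
    return values
```

Running this over each returned document gives one dictionary per respondent, which could then be written out as spreadsheet rows.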
  • 31. An Introduction to Open XML 31 demo Content Controls CustomXML in Word 2007 / Word 2010
  • 32. An Introduction to Open XML Deployment  All that you need to deploy are:  Your OpenXML-enabled application  DocumentFormat.OpenXml.dll  WindowsBase.dll  .NET (VPC test…)  c:\Program Files\Reference Assemblies\Microsoft\Framework\v3.0\WindowsBase.dll  http://blogs.msdn.com/dmahugh/archive/2006/12/14/finding-windowsbase-dll.aspx 33
  • 33. An Introduction to Open XML 34 Agenda  Motivation  The Tools  What: Open XML SDK 2, API Design  How: Demos - Manual, Code Generation, Injection  Why: Summary  Resources
  • 34. An Introduction to Open XML Summary  Open XML is little more than a moderately complex XML document  XML is readily accessible  in the .NET framework  in VB6  in Java  in Python, etc.  An Office installation is not required  Office client not required on the server  Enables Office document creation from non-Microsoft platforms  “…it’s just zip, it’s just XML…” - Doug Mahugh  http://channel9.msdn.com/posts/AdamKinney/Open-XML-File-Formats 35
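To make the "it's just zip, it's just XML" point concrete, here is a minimal Python sketch that writes a three-part WordprocessingML package using only the standard library, with no Office installation and no SDK. The function name `build_docx` is an assumption for this example, and the output has not been checked here against Word or the SDK's validator.

```python
import io
import zipfile
from xml.sax.saxutils import escape

CONTENT_TYPES = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
  <Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
  <Default Extension="xml" ContentType="application/xml"/>
  <Override PartName="/word/document.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>
</Types>"""

ROOT_RELS = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
  <Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument" Target="word/document.xml"/>
</Relationships>"""

DOCUMENT_TEMPLATE = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:body><w:p><w:r><w:t>{text}</w:t></w:r></w:p></w:body>
</w:document>"""


def build_docx(text):
    """Return the bytes of a minimal .docx package holding one paragraph."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("[Content_Types].xml", CONTENT_TYPES)
        package.writestr("_rels/.rels", ROOT_RELS)
        # Escape &, < and > so arbitrary text cannot break the XML.
        package.writestr("word/document.xml", DOCUMENT_TEMPLATE.format(text=escape(text)))
    return buffer.getvalue()
```

Writing the result of `build_docx("hello world")` to a file with a .docx extension gives a package containing the content types stream, the package relationships and the main document part.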
  • 35. An Introduction to Open XML Summary  Start from a template document  Easy replication of existing [client] documents  Use the DocumentReflector to generate Open XML code  Refactor your report data into the generated code  Learn from the reflected / generated code  Open XML code is cleaner, more readable and more maintainable than its COM counterpart  Open XML documents can be consumed using applications and platforms from vendors other than Microsoft 36
  • 36. An Introduction to Open XML 37 Resources (web-sites & blogs)  Open XML Format SDK 2.0  http://url.ie/tik  Microsoft’s Open XML portal  http://www.openxmldeveloper.org/  If you are interested in Open XML / ODF conversion  http://sourceforge.net/projects/odf-converter  http://www.twitter.com/openxml  Microsoft folks:  Brian Jones http://blogs.msdn.com/brian_jones/  Doug Mahugh http://blogs.msdn.com/dmahugh/  Kevin Boske http://blogs.msdn.com/kevinboske/  Erika Ehrli http://blogs.msdn.com/erikaehrli/  Eric White http://blogs.msdn.com/ericwhite/
  • 37. An Introduction to Open XML Resources (web-sites & blogs)  Word 2007 Content Control Toolkit on CodePlex  http://www.codeplex.com/dbe  Matthew Scott’s Content Controls and CustomXML Channel 9 video  http://url.ie/u05  Wouter van Vugt  http://blogs.code-counsel.net/Wouter/default.aspx  A collection of Open XML resources:  http://www.craigmurphy.com/blog/?p=871  Including these slides and C# source code 38
  • 38. An Introduction to Open XML 39 Resources (Books) Open XML Explained Wouter van Vugt http://openxmldeveloper.org/articles/1970.aspx
  • 39. An Introduction to Open XML Contact Information Craig Murphy http://www.twitter.com/CAMURPHY Updated slides, notes and source code: http://www.CraigMurphy.com http://www.CraigMurphy.com/blog

Editor's notes

  1. Using the Productivity Tool, you can: Generate Open XML SDK source code based on document content. The source code could be used to regenerate all or part of the document. Compare source and target Open XML documents to highlight the differences. You can reveal the differences in the document part structure as well as the content differences. Based on those differences, you can generate source code that employs the Open XML SDK 2.0 to create the target document from the source. Validate documents. You can validate an entire document, specific document parts or a segment of content against Office 2007 or Office 2010 file formats. Display documentation for the Open XML SDK 2.0, the ISO/IEC 29500 Open XML File Formats standard, and the Microsoft Office implementer notes.
  2. This content set provides documentation and guidance for the strongly-typed classes in the Open XML SDK 2.0 for Microsoft Office. Welcome to the Open XML SDK 2.0 for Microsoft Office. The SDK is built on the System.IO.Packaging API and provides strongly-typed classes to manipulate documents that adhere to the Office Open XML File Formats Specification. The Office Open XML File Formats specification is an open, international, ECMA-376, Second Edition and ISO/IEC 29500 standard. The Open XML file formats are useful for developers because they are an open standard and are based on well-known technologies: ZIP and XML. The Open XML SDK 2.0 simplifies the task of manipulating Open XML packages and the underlying Open XML schema elements within a package. The Open XML SDK 2.0 encapsulates many common tasks that developers perform on Open XML packages, so that you can perform complex operations with just a few lines of code.
  3. Stream Reading/Writing – This component includes stream reader and writer interfaces specifically targeting Open XML elements and attributes. The readers and writers behave similarly to XmlReader/XmlWriter, but are easier to use since the interfaces are Open XML aware. Open XML Low Level DOM – This component represents the XML wrapper of the Open XML schemas. Developers are able to use this component to manipulate the Open XML tree directly by working with strongly typed objects and classes instead of traditional XML nodes that require developers to be aware of namespaces as well as element/attribute names. The major advantage of having strongly typed classes and objects is that developers can easily see what properties are defined on a given class through IntelliSense. For example, a developer will know exactly what properties and children can exist on a Paragraph object. In addition, working with objects removes the need to remember namespaces and element/attribute/value names, since these concepts are implicitly defined by the classes. This component leverages many of the designs of LINQ in order to further improve the ease of use of this SDK. Open XML Packaging API – This component is built on top of the .NET Framework 3.0 System.IO.Packaging component. Instead of providing generic access to the parts contained in the Open XML Package, this component allows developers to manipulate Open XML parts with strongly typed classes and objects. This component has already shipped as the Open XML SDK v1.0.
  4. The Validation layer provides validation support when developing Open XML documents. Manipulating Open XML Formats by using the Open XML Base layer makes it much easier for developers to work on the Open XML tree, but doing so does not guarantee the production of valid Open XML documents. This validation layer assists developers by letting them validate created Open XML documents against the Open XML schemas and additional syntax constraints as defined in the standard. Instead of relying on XSD files, prose within the standard and observed application behaviors, developers are able to leverage the SDK to cover much of this manual work. Manipulating the Open XML files requires that developers be familiar with the standard so that they won’t corrupt the files by breaking certain constraints in the standard. This difficulty becomes apparent when working with the Open XML base layer and is evidenced in the results of the validation layer. For example, deleting a paragraph in a WordprocessingML document is not simply a matter of deleting the paragraph node. There are a variety of extra steps required to delete a paragraph and maintain the integrity of a valid Open XML document. The SDK provides higher level helper functions or code snippets that can deal with common complex file format operations. These helper functions or snippets make the appropriate XML and part/relationship modifications when performing complex tasks. These helper functions or snippets don’t abstract away from the actual XML itself, but rather perform operations on the XML elements by taking advantage of the validation awareness. For example, deleting a paragraph element in a WordprocessingML document may result in corruption. A potential helper function would perform this delete operation and do the necessary extra steps to clean the resulting XML to ensure validity. These delete helper functions or snippets can be applied to other elements that are hard to delete, like tables and comments.
In other words, these higher level functions or snippets operate directly on the XML elements and are constrained, in terms of functionality, by the file format standard itself.
  5. <w:body> <w:p> <w:t>hello world</w:t> </w:p> ... </w:body>
  6. CustomXML content controls: http://channel9.msdn.com/posts/Rory/Matthew-Scott-Application-Development-using-the-Open-XML-File-Formats/