XML For Dummies, Chapter 8: Understanding and Using DTDs (it-slideshares.blogspot.com)
1. XML For Dummies
Book authors: Lucinda Dykes & Ed Tittel
Slides prepared by: Son TN
Chapter 8: Understanding and Using DTDs
2. Contents
- What’s a DTD?
- Inspecting the XML Prolog
- Reading a DTD
- Using Element Declarations
- Declaring Attributes
- Understanding Notations
- Calling a DTD
3. 8.1 What’s a DTD?
A Document Type Definition (DTD):
- Defines the syntax, grammar, and semantics of a document type.
- Defines the document structure: which elements, attributes, entities, and so on are permitted, and how the document elements are related and structured.
- Is referenced by, or defined in, XML documents, but is not itself XML.
- Enables validation of XML documents by a validating XML parser.
- Can be referenced by more than one XML document.
- May reference other DTDs.
4. 8.1 What’s a DTD? (cont.)
Table 8-1 deciphers some of the terms.
5. 8.1.1 When to use a DTD
XML doesn’t require you to use a DTD; whether to jump off that particular bridge is up to you. There are several reasons to use a DTD:
- To create and manage large sets of documents for your company.
- To define clearly what markup may be used in certain documents and how markup should be sequenced.
- To provide a common frame of reference for documents that many users can share.
Standardization and control are what DTDs are all about!
6. 8.1.2 When NOT to use a DTD
You may not need to use a DTD if:
- You’re working with only one or a few small documents.
- You’re using a nonvalidating processor to handle your XML documents.
Let the XML documents or data that you work with drive you toward or away from creating formal document descriptions.
7. 8.2 Inspecting the XML Prolog
The XML prolog is the first thing that a processor (or human eye, for that matter) sees in an XML document. An XML prolog may include the following items:
- XML declaration
- DOCTYPE declaration
- Comments
- Processing instructions
- White space
8. 8.2.1 Examining the XML declaration
A declaration is markup that tells an XML processor what to do. Declarations don’t add structure or define document elements. They provide instructions for a processor, such as what type of document to process and what standards to use.
The XML declaration includes a version attribute and, optionally, encoding and standalone attributes:
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
This declaration says:
- This is an XML document.
- The version of XML is 1.0.
- The character encoding is UTF-8.
- An external document may be needed to complete the document content (standalone="no").
9. 8.2.2 Discovering the DOCTYPE
The document type (DOCTYPE) declaration is markup that tells the processor where it can find a specific DTD. Here’s the basic markup of a DOCTYPE declaration:
    <!DOCTYPE books SYSTEM "bookstore.dtd">
- <!DOCTYPE marks the start of the DOCTYPE declaration.
- books is the name of the document type; in a valid document it must match the name of the root element.
- SYSTEM "bookstore.dtd" tells the processor to fetch an external document, in this case a file named bookstore.dtd.
10. 8.2.3 Understanding comments
Comments: use them and read them! Use comments to include text that explains a document better (humans love that sort of thing) without that text being displayed or even processed. The correct format is:
    <!-- comment text -->
For example:
    <!-- Include your comment here -->
You have two rules to live by when you’re using comments:
- Never nest a comment inside another comment, and never place one inside a tag.
- Never include -- (double hyphen) within the comment text. A single hyphen is allowed, except immediately before the closing -->.
Using comments enables you to leave human-style instructions addressed to someone who reads the markup without disrupting the document’s structure.
11. 8.2.4 Processing instructions
Processing instructions are like comments addressed to machines; they provide a way to send instructions to computer programs or applications. All processing instructions must begin with <? and end with ?>, and follow this format:
    <?name data?>
A common example of a processing instruction in XML documents is a reference to a stylesheet:
    <?xml-stylesheet type="text/css" href="bookstore.css"?>
12. 8.2.5 How about that white space?
XML considers four characters to be whitespace: the carriage return (\r, or ch(13)), the linefeed (\n, or ch(10)), the tab (\t), and the space (' ').
In XML documents, there are two types of whitespace:
- Significant whitespace is part of the document content and should be preserved.
- Insignificant whitespace is used when editing XML documents for readability.
The XML specification allows you to add white space outside markup; it’s ignored when the document is processed. When you write markup, consider adding a line of white space between sections.
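The prolog items described in slides 7 to 12 can be combined as in the following minimal sketch. The comment text and the placeholder inside the root element are illustrative assumptions; bookstore.dtd and bookstore.css are the file names used in the slides, and their contents are not shown here, so validity against that DTD is not implied.

```xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!-- Prolog comment: the processor ignores this text -->
<?xml-stylesheet type="text/css" href="bookstore.css"?>

<!DOCTYPE books SYSTEM "bookstore.dtd">

<books>
  <!-- the content required by bookstore.dtd would go here -->
</books>
```

If an XML declaration is present, it must be the very first thing in the file; comments, processing instructions, and blank lines may then appear before or after the DOCTYPE declaration.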
13.
14.
15. 8.4.1 Using the EMPTY element type
Empty elements are like boxes you put in place but want left empty. To use them, first you have to point them out to the processor by declaring them. Such a declaration looks like this:
    <!ELEMENT Name EMPTY>
In our example DTD, the source element is an empty element:
    <!ELEMENT source EMPTY>
If (on the other hand) you want your element to serve as a catch-all box that you can put anything in, you may want to use another type of content specification: ANY.
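As a rough illustration of EMPTY and ANY side by side, the sketch below declares both in an internal DTD and then uses them. Only the source element comes from the slides; bookInfo, title, note, and the sourceName attribute are hypothetical names chosen for this example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE bookInfo [
  <!ELEMENT bookInfo (title, source, note)>
  <!ELEMENT title (#PCDATA)>
  <!-- source carries no content; any data lives in its attribute -->
  <!ELEMENT source EMPTY>
  <!ATTLIST source sourceName CDATA #REQUIRED>
  <!-- note may hold text and any declared element, in any order -->
  <!ELEMENT note ANY>
]>
<bookInfo>
  <title>Sample Title</title>
  <source sourceName="Warehouse"/>
  <note>Signed copy of <title>Sample Title</title></note>
</bookInfo>
```

An EMPTY element is normally written with the self-closing form shown here. ANY is convenient while drafting a DTD, but it gives up most of the control that a DTD is meant to provide.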
16.
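17. To allow an element to contain only text, declare it with #PCDATA in parentheses:
    <!ELEMENT Name (#PCDATA)>
To allow an element to contain both text and other elements, use a mixed content declaration (in that case, don’t forget the asterisk!):
    <!ELEMENT Name (#PCDATA | Child1 | Child2)*>
For example:
    <!ELEMENT author (#PCDATA | publisher)*>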
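To show what the mixed content declaration permits, here is a small sketch that uses the author declaration from the slide. The publisher declaration and all of the text content are assumptions made for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE author [
  <!-- Mixed content: text and publisher elements, any number, any order -->
  <!ELEMENT author (#PCDATA | publisher)*>
  <!ELEMENT publisher (#PCDATA)>
]>
<author>Jane Doe, published by <publisher>Example Press</publisher></author>
```

In a mixed content declaration, #PCDATA must come first, and the order and number of the child elements cannot be constrained.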
18. 8.4.3 Using element content models
An element content model describes the child elements that an element can contain. The basic structure is:
    <!ELEMENT Name (childName)>
Examples:
    <!ELEMENT books (book+)>
    <!ELEMENT customer (custNumber, lastName, firstName, address, city, state, zip, phone, email)>
The element content model uses connectors and occurrence indicators to control the order and number of times that elements can occur: a comma separates children that must appear in sequence, | separates alternatives, ? means zero or one, * means zero or more, and + means one or more.
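The sketch below puts each connector and occurrence indicator to work in one small internal DTD. The books and book names follow the slide; subtitle, author, isbn, sku, and all sample values are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE books [
  <!-- + : one or more book elements -->
  <!ELEMENT books (book+)>
  <!-- , : fixed order   ? : optional   * : zero or more   | : one or the other -->
  <!ELEMENT book (title, subtitle?, author*, (isbn | sku))>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT subtitle (#PCDATA)>
  <!ELEMENT author (#PCDATA)>
  <!ELEMENT isbn (#PCDATA)>
  <!ELEMENT sku (#PCDATA)>
]>
<books>
  <book>
    <title>First Book</title>
    <author>A. Writer</author>
    <author>B. Writer</author>
    <isbn>0000000000</isbn>
  </book>
  <book>
    <title>Second Book</title>
    <subtitle>A Sequel</subtitle>
    <sku>SB-2</sku>
  </book>
</books>
```

Both book elements are valid: the first omits the optional subtitle and repeats author, while the second has no author at all and picks sku instead of isbn.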
19. 8.4.4 Declaring Attributes
You need to include attribute-list declarations in your DTD whenever you want elements to use associated attributes. The attribute-list declaration lists all attributes that may be used within a given element and also defines each attribute’s type and default value. The basic format for an attribute-list declaration is:
    <!ATTLIST element-name attribute-name datatype default-value>
Examples:
    <!ATTLIST customer custType CDATA #REQUIRED>
    <!ATTLIST price priceType (Retail | Wholesale) #REQUIRED>
The following list defines the terms that appear in attribute-list declarations:
- Element name
- Attribute name
- Datatype: CDATA (character data), ID, IDREF, IDREFS, ENTITY, ENTITIES, NMTOKEN, NMTOKENS, NOTATION, or an enumerated list
- Default value: #REQUIRED, #IMPLIED, #FIXED, or a literal value
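A possible way to see several datatypes and default kinds together is the sketch below. The custType attribute and its Retail/Wholesale values echo the slide; the orders, order, custId, placedBy, note, and currency names are assumptions for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE orders [
  <!ELEMENT orders (customer+, order+)>
  <!ELEMENT customer (#PCDATA)>
  <!ELEMENT order EMPTY>
  <!-- ID must be unique in the document; custType defaults to Retail -->
  <!ATTLIST customer
    custId   ID                   #REQUIRED
    custType (Retail | Wholesale) "Retail">
  <!-- IDREF must match some ID; note is optional; currency is fixed -->
  <!ATTLIST order
    placedBy IDREF #REQUIRED
    note     CDATA #IMPLIED
    currency CDATA #FIXED "USD">
]>
<orders>
  <customer custId="c1">Jane Doe</customer>
  <customer custId="c2" custType="Wholesale">Acme Books</customer>
  <order placedBy="c2" note="Rush delivery"/>
</orders>
```

A validating parser treats the first customer as if custType="Retail" had been written, and it supplies currency="USD" on the order even though the document never mentions it.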
20. 8.4.5 Discovering Entities
An entity declaration defines an alias for a block of text. You can attach a name to a specific block of text and then insert the whole block by using just that name. An entity declaration in a DTD looks like this:
    <!ENTITY entityName "replacementText">
entityName is the name of the entity and is used to call up the replacement text in your document.
The two main classifications of entities are general entities and parameter entities:
- A general entity is an abbreviation for data that becomes part of the content of an XML document.
- A parameter entity is an abbreviation for data that becomes part of the content of a DTD.
21. 8.4.6 General entities
The XML specification supports two types of general entities:
- Internal entities hold their values in the entity declaration.
- External entities point to an external file.
Internal entities
To declare a general internal entity, you must use the following syntax:
    <!ENTITY entityName "replacementText">
For example:
    <!ENTITY store1 "River Valley Center">
Five commonly used internal entities are already defined as part of XML: &lt; (<), &gt; (>), &amp; (&), &apos; ('), and &quot; (").
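As a brief sketch of how an internal general entity is referenced, the document below declares store1 as in the slide and uses it next to two of the predefined entities. The store element and the sentence around the references are made up for this example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE store [
  <!ELEMENT store (#PCDATA)>
  <!ENTITY store1 "River Valley Center">
]>
<store>Visit &store1; &amp; save when you spend &lt; 20 dollars.</store>
```

After the parser expands the references, the element content reads: Visit River Valley Center & save when you spend < 20 dollars.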
22. 8.4.6 General entities (cont.)
External entities
External entities help you integrate external documents and files into your XML document. In general, you use them in one of two ways:
- As a mechanism to divide your document into logical pieces.
- To reference images, multimedia clips, and other non-XML files.
To declare an external entity, you use the following syntax:
    <!ENTITY entityName SYSTEM "system-identifier">
Use the following syntax to refer to a public identifier not stored on your system:
    <!ENTITY entityName PUBLIC "public-identifier" "system-identifier">
The benefit of using external entities is that they’re reusable. They are subject to three important limitations:
- You can’t use an entity before you define it.
- Your entity references have to do something.
- The entity has to refer to data that’s in the XML document.
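The first use, dividing a document into logical pieces, might look like the sketch below. The catalog and section names and the two file names are hypothetical; the sketch assumes each external file contains one or more complete section elements.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE catalog [
  <!ELEMENT catalog (section+)>
  <!ELEMENT section (#PCDATA)>
  <!-- External parsed entities: the content lives in separate files -->
  <!ENTITY fiction    SYSTEM "fiction.xml">
  <!ENTITY nonfiction SYSTEM "nonfiction.xml">
]>
<catalog>
  &fiction;
  &nonfiction;
</catalog>
```

A validating parser reads the referenced files and inserts their content at the reference points. Nonvalidating parsers are allowed to skip external entities, so this technique is safest when you control the processor.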
23. 8.4.7 Parameter entities
A parameter entity is an entity created specifically to help you use shortcuts when you write a DTD. Parameter entities don’t refer to content in XML documents at all. They may be internal or external, and they must be declared before they can be used.
Internal parameter entities
Internal parameter entities work well to eliminate the need to repeat commonly used element and attribute declarations. The general syntax for an internal parameter entity declaration is:
    <!ENTITY % entityName "replacementText">
External parameter entities
Use external parameter entities to carve DTDs into bite-size bits of declarations that are easy to read and manipulate. You can then save each bit in a separate file and create one parameter entity in the master DTD for each individual file.
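One way an internal parameter entity can stand in for repeated declarations is sketched below. The nameDecls entity name and the sample data are assumptions; lastName and firstName echo the customer example in slide 18.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE customer [
  <!-- The parameter entity holds two complete element declarations -->
  <!ENTITY % nameDecls "<!ELEMENT firstName (#PCDATA)> <!ELEMENT lastName (#PCDATA)>">
  <!ELEMENT customer (lastName, firstName)>
  <!-- Referencing the entity inserts both declarations here -->
  %nameDecls;
]>
<customer>
  <lastName>Doe</lastName>
  <firstName>Jane</firstName>
</customer>
```

Inside an internal DTD subset, a parameter entity reference may only appear between declarations, as it does here. Using a reference inside a declaration (for example, to share part of an attribute list) is permitted only in an external DTD file.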
24. 8.4.7 Parameter entities (cont.)
External parameter entities
These kinds of references are called external parameter entity references because they refer to information that’s external to the DTD in which they appear:
    <!-- Master DTD for book information, sales data, and customer information -->
    <!ENTITY % Bks SYSTEM "book.dtd">
    <!ENTITY % Sls SYSTEM "sales.dtd">
    <!ENTITY % Cust SYSTEM "customer.dtd">
    %Bks;
    %Sls;
    %Cust;
25. 8.5 Understanding Notations
In XML, you may come across data that you would like to include in your documents that is not XML. Notations allow you to include that data in your documents by describing its format and allowing your application to recognize and handle it. The format for a notation is:
    <!NOTATION name SYSTEM "external_ID">
The name identifies the format used in the document, and the external_ID identifies the notation, usually with a MIME type. For example, to include a GIF image in your XML document:
    <!NOTATION GIF SYSTEM "image/gif">
You can also use a PUBLIC identifier instead of SYSTEM (both keywords must be uppercase):
    <!NOTATION GIF PUBLIC "-//IETF/NOSGML Media Type image/gif//EN" "http://www.isi.edu/in-notes/iana/assignments/media-types/image/gif">
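A notation declaration does nothing until something refers to it. The sketch below shows one common arrangement, which goes a step beyond the slide: an unparsed entity tied to the GIF notation with the NDATA keyword, and an ENTITY-typed attribute that names that entity. The book, cover, and coverImage names and the cover.gif file are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE book [
  <!NOTATION GIF SYSTEM "image/gif">
  <!-- Unparsed entity: NDATA ties the file to the GIF notation -->
  <!ENTITY coverImage SYSTEM "cover.gif" NDATA GIF>
  <!ELEMENT book (title, cover)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT cover EMPTY>
  <!-- An ENTITY-typed attribute takes the name of an unparsed entity -->
  <!ATTLIST cover image ENTITY #REQUIRED>
]>
<book>
  <title>Sample Title</title>
  <cover image="coverImage"/>
</book>
```

The parser does not read cover.gif itself. It passes the entity's identifiers and the notation to the application, which decides how to handle the image.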
26. 8.5 Calling a DTD
DTDs come in two flavors: internal and external.
- Internal DTDs are entirely contained in the XML prolog of an XML document.
- External DTDs are contained in an external file and are referenced in the DOCTYPE declaration of an XML document.
27. 8.5.1 Internal DTDs
If your DTD is short and simple, and you don’t need to include it in a large group of XML documents, you may want to use it as an internal DTD. To add an internal DTD to your XML document, you include it within the DOCTYPE declaration:
    <!DOCTYPE rootElement [
    ... the entire DTD goes here ...
    ]>
Example:
    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE books [
    <!ELEMENT books (book+, totalCost, customer)>
    <!ELEMENT book (bookInfo, salesInfo)>
    ...
    ]>
    <books>
    <book contentType="Fiction" format="Hardback">
    <bookInfo>
    ...
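Because the slide's example is abbreviated, a complete document with an internal DTD may help. The sketch below is a deliberately simplified structure, not the book's bookstore DTD: title and price are hypothetical elements, and only the books and book names and the format attribute echo the slide.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE books [
  <!ELEMENT books (book+)>
  <!ELEMENT book (title, price)>
  <!ATTLIST book format (Hardback | Paperback) #REQUIRED>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT price (#PCDATA)>
]>
<books>
  <book format="Hardback">
    <title>Sample Title</title>
    <price>19.99</price>
  </book>
</books>
```

Everything a validating parser needs is in this one file, which is the main attraction of an internal DTD for small, self-contained documents.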
28. 8.5.2 External DTDs
Using external DTDs is a great idea, because you can then share a single DTD among any group of XML documents. To use an external DTD with an XML document, simply reference the external DTD in the DOCTYPE declaration in the XML prolog of your XML document:
    <!DOCTYPE rootElement SYSTEM "dtd.url">
Example:
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <!DOCTYPE books SYSTEM "bookstore.dtd">
    <books>
    ...
29. 8.5.3 When to use an internal or external DTD
The inside view: internal DTD subsets
- A single file processes faster than multiple files.
- Validity and well-formedness are kept in the same place.
- You can use internal DTDs on a local system without connecting to the Internet.
Calling for outside support: referencing external DTDs
- They’re recyclable.
- They’re versatile.
- They’re easy to change.
- They’re timesavers.
Two are sometimes better than one
Combining internal and external DTDs isn’t much different from using either one alone. Live by these two major rules when mixing the two types:
- An XML processor always reads the internal subset first, so its entity and attribute-list declarations take precedence over later ones.
- Entities declared in the internal subset can be referenced in the external subset.
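A single DOCTYPE declaration can name an external DTD and carry an internal subset at the same time, as in the sketch below. The storeName entity and the placeholder comment are assumptions; bookstore.dtd is the file name used in the slides, and its contents are not shown here.

```xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE books SYSTEM "bookstore.dtd" [
  <!-- Internal subset: read first, so this declaration is the one
       that counts if bookstore.dtd also declares storeName -->
  <!ENTITY storeName "River Valley Center">
]>
<books>
  <!-- the content required by bookstore.dtd would go here -->
</books>
```

This pattern lets a shared DTD stay unchanged while an individual document overrides or adds a few entity declarations of its own.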
30. 8.6 Summary
- Defining DTDs
- Knowing when and why to use a DTD
- Using an XML prolog
- Exploring an XML DTD
- Declaring elements and their attributes
- Declaring an entity
- Noting notations
- Including internal and external DTDs
- Choosing between internal and external DTDs