<span class='text_page_counter'>(1)</span><div class='page_container' data-page=1>
(Course Code XM301)
<b>July 2004 Edition</b>
The information contained in this document has not been submitted to any formal IBM test and is distributed on an “as is” basis without
any warranty either express or implied. The use of this information or the implementation of any of these techniques is a customer
responsibility and depends on the customer’s ability to evaluate and integrate them into the customer’s operational environment. While
each item may have been reviewed by IBM for accuracy in a specific situation, there is no guarantee that the same or similar results will
result elsewhere. Customers attempting to adapt these techniques to their own environments do so at their own risk.
<b> © Copyright International Business Machines Corporation 2001, 2004. All rights reserved.</b>
<b>This document may not be reproduced in whole or in part without the prior written permission of IBM.</b>
Note to U.S. Government Users — Documentation related to restricted rights — Use, duplication or disclosure is subject to restrictions
set forth in GSA ADP Schedule Contract with IBM Corp.
<b>Course materials may not be reproduced in whole or in part </b>
<b>without the prior written permission of IBM.</b>
<b>© Copyright IBM Corp. 2001, 2004</b> <b>Contents</b> <b>iii</b>
<b>Trademarks</b>
<b>Course Description</b>
<b>Agenda</b>
<b>iv </b> <b>Introduction to XML</b> <b>© Copyright IBM Corp. 2001, 2004</b>
<xsl:choose> Element
<xsl:choose> Example
Elements to Generate Output (XML to XML)
<xsl:element> Element
<xsl:attribute>
XML to XML Example (1 of 2)
XML to XML Example (2 of 2)
Numbers, Sorting, and Functions
Working with Numbering in XSLT
<xsl:number> Element format Attribute Values
<xsl:number> Example
<xsl:sort> Element
<xsl:sort> Attributes
Sort Example
XPath/XSLT Functions
Other Elements
Attribute Value Templates
Attribute Value Templates Example
XSLT Processors
Xalan
XSL Resources from IBM
<b>Appendix B. Additional Information for XML Schema</b>
<b>Appendix C. What’s New in WebSphere Studio V5.1.1</b>
<b>Appendix D. Additional Information and Examples</b>
<b>Appendix E. Bibliography and References</b>
<b>Appendix F. Acronyms and Abbreviations</b>
<b>Appendix G. Glossary</b>
<b>Trademarks</b>
The reader should recognize that the following terms, which appear in the content of this
training document, are official trademarks of IBM or other companies:
IBM® is a registered trademark of International Business Machines Corporation.
The following are trademarks of International Business Machines Corporation in the United
States, other countries, or both:
AFS®, AIX®, alphaWorks®, AS/400®, CICS®, ClearCase®, Database 2™, DB2®,
DB2 Universal Database™, DFS™, Distributed Relational Database Architecture™, Domino®,
DRDA®, Encina®, Everyplace®, IMS™, Lotus Enterprise Integrator®, Lotus Notes®,
Lotus®, MQSeries®, MVS™, NetRexx™, Network Station®, Notes®,
Open Blueprint®, OS/2®, OS/390®, RACF®, RDN™, RS/6000®,
S/390®, SecureWay®, Tivoli®, Tivoli Enterprise™, Tivoli Management Environment®, TME®,
TME 10™, TXSeries®, VisualAge®
Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the
United States, other countries, or both.
Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft
Corporation in the United States, other countries, or both.
UNIX is a registered trademark of The Open Group in the United States and other
countries.
SET and the SET Logo are trademarks owned by SET Secure Electronic Transaction LLC.
Other company, product and service names may be trademarks or service marks of others.
<b>Course Description</b>
This course provides an introduction to XML (eXtensible Markup
Language). It prepares designers, analysts, developers, testers, and administrators to use
XML and its related technologies in the context of building e-business
applications. The course is a 2.5-day classroom course with hands-on
lab exercises that reinforce the lecture material.
This course is designed for information technology individuals,
including enterprise application architects, designers, developers, and
content modelers and creators.
Knowledge of Internet technologies is required. Some experience with
using HTML would be helpful, but is not necessary.
After completing this course, you should be able to:
• Describe the important XML standards and recommend their use in
business applications
• Define XML documents using namespaces, DTD, or Schema
• Develop and test XML processing applications
• Use XSLT to transform XML documents as necessary
• Identify open areas in XML, such as security, and emerging
technologies such as DB support, XHTML, Web Services, XLink,
and so forth. Plan for their incorporation into XML processing
applications
<b>Agenda</b>
Unit 1 - Introduction to XML and Related Technologies
Unit 2 - Issues in Electronic Information Exchange
Unit 3 - What Is XML?
XML Basics Lab
Unit 4 - WebSphere Studio Application Developer Overview
Introduction to WebSphere Studio Application Developer Lab
Unit 5 - Document Type Definition (DTD)
DTD Lab
Unit 6 - XML Namespaces
XML Namespaces Lab
Unit 7 - XML Schema
XML Schema Lab
Unit 8 - XPath - XML Path Language
XPath Lab
Unit 9 - XSL - eXtensible Stylesheet Language Part 1
XSLT Lab Part 1 - Simple Transforms
Unit 9 - XSL - eXtensible Stylesheet Language Part 2
XSLT Lab Part 2 - Conditional Transforms
<b>Unit 1. Introduction to XML and Related Technologies</b>
This unit describes the audience, prerequisites, and overall objectives
for XM301. The overall agenda for the course is also covered.
Figure 1-1. Introduction XM3014.1
© Copyright IBM Corporation 2004
Job Role
Figure 1-2. Course Description
Figure 1-3. Audience
Figure 1-4. Prerequisites
Figure 1-5. Course Objectives (1 of 2)
Figure 1-6. Course Objectives (2 of 2)
Figure 1-7. Agenda - Day 1
Figure 1-8. Agenda - Day 2
Figure 1-9. Agenda - Day 3
Figure 1-10. Unit Summary
<b>Unit 2. Issues in Electronic Information Exchange</b>
This unit examines the different ways in which information is
exchanged in modern computer systems, identifying issues in each
case. The discussion is restricted to what is exchanged (the content),
not how it is exchanged (the mechanism). A set of messaging criteria
is developed that, if met, will reduce the impact of the issues
identified.
This unit shows some of the business drivers for XML, and gives
examples of how XML is being used by businesses today.
After completing this unit, you should be able to:
• Describe the types of information exchange that occur in modern
computer systems
• Describe information exchange issues that exist in modern
computer systems
• Describe what is needed to address many of the issues that exist in
information exchange
Figure 2-1. Unit Objectives
Figure 2-2. Electronic Information Exchange (1 of 2)
Figure 2-3. Electronic Information Exchange (2 of 2)
Figure labels: System 1 (Sales); Application (CRM); System 2 (Accounting); Intercompany; Inter-System; Inter-Application; Intra-Application
Figure 2-4. Intra-Application Information Exchange
In a well-structured application, information flows between three different layers:
The presentation layer (often called the View): presents information to the user
and collects information from the user. This layer is often coupled to a particular
presentation technology, for example, Presentation Manager, X-Windows, and
so forth. Therefore, it often must change significantly when the presentation
mode changes.
The processing layer (often called the Controller): operates on the information
in accordance with the functional requirements of the application.
The business layer (often called the Model or Business Model): maintains the
operational constraints that govern the business as a whole. It ensures that no
individual application contradicts those rules by performing an operation that is
inconsistent with those constraints.
Figure labels: Presentation Layer (View); Process Layer (Controller); Business Layer (Model)
Figure 2-5. Agile Views - Multiple Client/Device Support
Prior to the arrival of the World Wide Web, applications were largely presented via
workstations or dumb terminals, and required (relatively) infrequent modification of their
presentation layer. The World Wide Web has changed this.
Now, the addition of the mobile work force and use of handheld devices presents new
opportunities for business and new challenges for application developers.
Applications must be presented via:
Cell phones and handhelds (Wireless Markup Language, WML)
Web Browsers (HTML, Style Sheets, JavaScript)
PDF
And so forth
Many Web applications suffer from coupling issues where applications habitually
generate output that combines Presentation information (font, color, and so forth) with
business information (bank balance, product information, and so forth) making it difficult
to reuse the data stream.
Ideally, the presentation layer would emit/consume a generic, structured information
stream that can be filtered for the target device.
<i>An external rendering engine worries about how it looks, while the application worries </i>
about what should be viewed.
Enables speedy, low-cost support for new client devices.
Need a View-independent, structured
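The idea of one view-independent data stream filtered for several target devices can be sketched as follows. This is a minimal illustration, not part of the course material: the `account` document, its element names, and the two renderer functions are assumptions made up for the example, and Python's standard `xml.etree.ElementTree` module stands in for whatever rendering engine is actually used.

```python
import xml.etree.ElementTree as ET

# Hypothetical view-independent business data: no fonts or colors inside.
ACCOUNT_XML = """<account>
  <owner>A. Customer</owner>
  <balance currency="USD">1250.75</balance>
</account>"""


def render_html(xml_text: str) -> str:
    """Render for a Web browser; presentation details live only here."""
    acct = ET.fromstring(xml_text)
    bal = acct.find("balance")
    return (f"<p><b>{acct.findtext('owner')}</b>: "
            f"{bal.text} {bal.get('currency')}</p>")


def render_text(xml_text: str) -> str:
    """Render the same data stream for a small text-only device."""
    acct = ET.fromstring(xml_text)
    bal = acct.find("balance")
    return f"{acct.findtext('owner')} {bal.get('currency')} {bal.text}"


html_view = render_html(ACCOUNT_XML)
text_view = render_text(ACCOUNT_XML)
print(html_view)
print(text_view)
```

Supporting a new client device then means adding another renderer, while the application that produces the data stream is left unchanged.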
Figure 2-6. Inter-Application Information Exchange
Ideally, the design of a system takes into account all the operations that it will
perform and the applications that will perform them.
It is rare that enough information exists to perform such an analysis and rarer still
that the design remains stable as the applications that compose the system are
constructed (typically at disparate points in time).
Technology does not stand still; it is common to see applications built late in
the life of a system using technology that is completely different from that
used by the initial ones, for example, COBOL versus Java.
Experience has shown that it is best to focus on the application at hand and allow
the plans for a system to evolve as further applications are built based on new
knowledge of the problem and new technologies.
Figure 2-7. Context-free Communication
As much as possible, eliminate assumptions from the way in which information is
exchanged.
This means that the information that flows between applications should not be
coupled to a particular technology or to an assumption about how it will be used.
When possible, send an application domain entity, for example, a Purchase
Order rather than the individual pieces, for example, a total, an item description,
and so forth.
Don't use a message that is bound to an implementation technology, for
example, a Serialized Java Object (a Java-specific bit stream).
Ideally, the communication medium would be based on simple, ubiquitous
technology, for example, plain text.
The information should be structured and self-describing to eliminate the need for context
awareness in the receiver.
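The points above can be illustrated with a small sketch in which a whole purchase order travels as self-describing text. The element and attribute names (`purchaseOrder`, `item`, `unitPrice`, and so on) are hypothetical, chosen only for this example, and Python's standard `xml.etree.ElementTree` module is used purely as a convenient text-based tool.

```python
import xml.etree.ElementTree as ET


def build_purchase_order(po_number: str, items: list[tuple[str, int, str]]) -> str:
    """Sender: serialize the whole domain entity as structured text."""
    po = ET.Element("purchaseOrder", {"number": po_number})
    for description, quantity, unit_price in items:
        item = ET.SubElement(po, "item", {"quantity": str(quantity)})
        ET.SubElement(item, "description").text = description
        ET.SubElement(item, "unitPrice").text = unit_price
    return ET.tostring(po, encoding="unicode")


def order_total(message: str) -> float:
    """Receiver: needs only a text parser and the element names."""
    po = ET.fromstring(message)
    return sum(int(item.get("quantity")) * float(item.findtext("unitPrice"))
               for item in po.iter("item"))


message = build_purchase_order(
    "PO-1", [("Widget", 2, "3.50"), ("Gadget", 1, "10.00")])
total = order_total(message)
print(message)
print(total)
```

The receiver here could equally be written in Java or COBOL, since nothing in the message is tied to the sender's implementation technology.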
Figure 2-8. B2B Intercompany Information Exchange
In this case, the presentation is focused on the Business-to-Business (B2B)
relationships that exist in e-business.
In such cases, the systems involved often talk to multiple business partners;
sometimes for the same service where selection is based on price, availability,
and so forth, for example, Credit Transaction Validation.
<b>Scenario 1</b>
Communicate directly with business partners; potential for 'n' communication protocols.
<b>Scenario 2</b>
Communicate with business partners through an intermediate 'Marketplace' vendor. Forced to evolve at the rate of the intermediary.
Diagram labels: companies C1, C2, C3, ... Cn; marketplace M.
Figure 2-9. Need to Establish Common Ground for Communication
Requires an implementation-independent, vendor-neutral
markup language for describing information; enabling the
Figure 2-10. Inter-system Information Exchange
The exchange of information between systems is subject to most of the problems
discussed so far except, perhaps, view coupling. Typically, this sort of
communication does not involve a presentation layer.
When laying out the infrastructure in which systems will reside, it is wise to
establish a means of insulating systems from one another with a layer that is
devoid of implementation and process coupling ... let's call this the Interface Layer
(it's also known as an Abstraction of the System).
The role of the interface layer is to capture the semantics of a system as seen
from an external point of view, and to represent it as a dialog, with messages
providing the units of communication in the dialog.
As long as the definition of the system doesn't change, the dialog (the interface to
the system) should remain stable. The implementation may change significantly.
System1 System2
Figure 2-11. Exchanging Messages
There are other differences, for example, the likely use of Message-Oriented
Middleware (MOM) in system integration, but this presentation is focused on
<i>the information being exchanged, not on the exchange mechanism.</i>
Figure 2-12. The Semantic Web
Requires self-describing information
decoupled from View details
Figure 2-13. A Common Solution?
Figure 2-14. Checkpoint Questions (1 of 2)
Figure 2-15. Checkpoint Questions (2 of 2)
Figure 2-16. Unit Summary
<b>Unit 3. What Is XML?</b>
In this unit, the basic elements of XML are explained.
After completing this unit, you should be able to:
• Describe the basic rules of XML
• Identify what makes XML well-formed
• List the components that make up an XML document
• Differentiate between XML and HTML
• Describe the internationalization support in XML
• Define some best practices for XML
• Checkpoint
Figure 3-1. Unit Objectives
Although XML is stable and mature, the supporting technologies are evolving rapidly.
<b>Keep up with the changes at:</b>
Figure 3-2. What Is XML?
People often talk about this 'XML' and that 'XML', or this 'XML file'; what they are
really referring to is XML markup text encapsulating specific data.
As long as the XML text or definitions follow the set of syntax rules, any data can be
represented.
Figure 3-3. Example Tree Representation of XML
This example shows a typical XML document and how it is represented as a tree of nodes.
This conceptual depiction of XML is important to understand.
book is the root element but ROOT is the highest point in the tree or hierarchy: think of
ROOT as the location of a pointer used to keep track of where you are.
<b><?xml version="1.0"?></b>
<b><book></b>
<b> <author></b>
<b> Tom Wolfe</b>
<b> </author></b>
<b> <title></b>
<b> The Right Stuff</b>
<b> </title></b>
<b> <price></b>
<b> $6.00</b>
<b> </price></b>
<b></book></b>
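To make the tree view concrete, the sketch below parses the book document and walks its nodes. It is only an illustration: the closing `</price>` and `</book>` tags are assumed (they are what well-formedness requires), the element content is written on one line, and the `walk` helper is a made-up name; the parser is Python's standard `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

BOOK_XML = """<?xml version="1.0"?>
<book>
  <author>Tom Wolfe</author>
  <title>The Right Stuff</title>
  <price>$6.00</price>
</book>"""


def walk(element, depth=0):
    """Yield (depth, tag, text) for each element, parents before children."""
    text = (element.text or "").strip()
    yield depth, element.tag, text
    for child in element:
        yield from walk(child, depth + 1)


root = ET.fromstring(BOOK_XML)  # the root element, <book>
nodes = list(walk(root))
for depth, tag, text in nodes:
    print("  " * depth + tag + (": " + text if text else ""))
```

The indentation of the printed lines mirrors the depth of each node below the root element.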
Figure 3-4. A Simple XML Document - Basic Structure
Textual data between tags is also referred to as content.
Tagged elements of any sort are also known as markup.
Sometimes the term body is used to refer to anything between a start tag and an end tag.
<b><?xml version="1.0"?></b> "Optional" first line; only required if
encoding IS NOT UTF-8 or UTF-16*
<b><book></b> Root element start tag
<b> <title></b>
<b> Alphabet from A to Z</b>
<b> </title></b>
First child element with data
<b> <isbn number="1112-23-4356" /></b> Empty element (no data)
<b> <author></b> Begin element tag
<b> <firstName>Boreng</firstName></b>
<b> <lastName>Riter</lastName></b> Nested child elements
<b> </author></b> End element tag
<b> <chapter title="Letter A"></b>
<b> The letter A is the first in</b>
<b> the alphabet. It is also the</b>
<b> first of five vowels.</b>
<b> </chapter></b>
Element containing an attribute and
<b> <!-- The rest of the letter</b>
<b> chapters are missing --></b> Comment
<b> <chapter title="Letter Z"></b>
<b> The letter Z is the last</b>
<b> letter in the alphabet. </b>
<b> </chapter></b>
Last element in document
<b></book></b> Root element end tag
Figure 3-5. A Simple XML Document - Basic Nomenclature
These definitions will be important when we discuss the XML Schema definition language
in a later chapter.
We introduce these terms here in preparation for their use then.
The XML instance on the previous page consists of:
<i><b>One main element book</b></i>
<i><b>Subelements title, isbn, author, chapter, and comment</b></i>
<b>Author contains other subelements firstName and lastName</b>
<i><b>ISBN and chapter contain attributes number and title, respectively</b></i>
<b>Title, firstName, and lastName contain only strings:</b>
<i><b>Elements that contain numbers, strings, dates, and so forth (TBD) but no </b></i>
<i>subelements (or attributes) are said to have simple types</i>
<b>ISBN and chapter carry attributes; author has subelements:</b>
Elements that contain subelements or carry attributes are said to have
<i>complex types</i>
Attributes always have simple types (that is, they are numbers, strings,
dates, and so forth).
<i>TBD -- In a later chapter we describe XML Schemas which have access to </i>
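The simple/complex distinction described above can be checked mechanically. The following sketch is an illustration only: it uses a shortened copy of the book document, a hypothetical helper named `kind`, and Python's standard `xml.etree.ElementTree` parser rather than an XML Schema processor.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample document (chapter text abbreviated).
DOC = """<book>
  <title>Alphabet from A to Z</title>
  <isbn number="1112-23-4356" />
  <author>
    <firstName>Boreng</firstName>
    <lastName>Riter</lastName>
  </author>
  <chapter title="Letter A">The letter A is the first in the alphabet.</chapter>
</book>"""


def kind(element) -> str:
    """'complex' if the element has subelements or attributes, else 'simple'."""
    return "complex" if len(element) > 0 or element.attrib else "simple"


kinds = {el.tag: kind(el) for el in ET.fromstring(DOC).iter()}
for tag, label in kinds.items():
    print(f"{tag}: {label}")
```

Under this rule `title`, `firstName`, and `lastName` come out as simple, while `book`, `isbn`, `author`, and `chapter` come out as complex.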
Figure 3-6. Basics of <i>Well-formed XML (1 of 2)</i>
As you can see, creating an XML instance will be a rather straightforward task.
All other elements are nested inside the root element
<i>For every opening tag "<...>" there must be a matching closing tag </i>
"</...>"
<i>The exception is an empty (no content or body) tag "<.../>"</i>
<i>A nested tag-pair may not overlap another tag</i>
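These rules can be tried out with any XML parser. The sketch below is a minimal check, assuming Python's standard `xml.etree.ElementTree` module; the helper name `is_well_formed` and the sample strings are invented for the illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


samples = {
    "nested inside root": "<book><title>T</title></book>",
    "empty element": "<book><isbn number='1'/></book>",
    "missing closing tag": "<book><title>T</book>",
    "overlapping tags": "<book><b><i>T</b></i></book>",
    "two root elements": "<book/><book/>",
}
results = {name: is_well_formed(text) for name, text in samples.items()}
for name, ok in results.items():
    print(f"{name}: {'well-formed' if ok else 'NOT well-formed'}")
```

Only the first two samples are accepted; each of the other three breaks one of the rules listed above.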
Figure 3-7. Basics of <i>Well-formed XML (2 of 2)</i>
Version 1.1 is about to emerge. Many of the current XML instances lack this declaration.
It is often useful to identify the processing instructions, of which the XML declaration is but
one, as the prolog; the actual XML instance material, that between the root element's opening
and closing tags, may then be referred to as the XML document.
All tag and attribute names, attribute values, and data must comply
with XML naming rules.
<i>That is, all attribute values must be in quotes.</i>
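The quoting rule for attribute values is easy to confirm. In this sketch, which assumes Python's standard `xml.etree.ElementTree` module and a hypothetical helper named `parse_error`, either kind of quote is accepted but an unquoted value is rejected.

```python
import xml.etree.ElementTree as ET


def parse_error(text: str):
    """Return None if the text parses, otherwise the parser's error message."""
    try:
        ET.fromstring(text)
        return None
    except ET.ParseError as err:
        return str(err)


double_quoted = parse_error('<chapter title="Letter A"/>')
single_quoted = parse_error("<chapter title='Letter A'/>")
unquoted = parse_error("<chapter title=A/>")

print("double quotes:", double_quoted)
print("single quotes:", single_quoted)
print("no quotes:", unquoted)
```

This differs from HTML, where browsers traditionally tolerate unquoted attribute values.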
Figure 3-8. Element Rules - Rule 1. Single Root Element
XML is a markup language. Tags form the basis of all markup languages.
The purpose of an element tag is to identify the data and child tags held
within it.
The root element should have a name that provides a good definition of all the data
contained in the document.
The first physical line in this sample is there because of Rule 6, which we shall cover later.
Figure 3-9. Element Rules - Rule 2. Element Tag Rules
The empty-element notation (< ... />) originated with XML. SGML, an ISO standard (ISO 8879),
has since been amended (the Web SGML Adaptations) to accommodate this syntax.
Empty elements are practical and common when the only associated information is
enclosed within the element's attributes.
For empty-element tags, a space before the tag's terminator (" />") is optional in XML; it is
commonly included for readability.
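As a quick check of how a parser treats empty elements, the sketch below compares three spellings of the same element. It assumes Python's standard `xml.etree.ElementTree` module; the `summary` helper is a made-up name. In this parser, the form with a space, the form without a space, and the explicit start-tag/end-tag pair all produce the same result.

```python
import xml.etree.ElementTree as ET

spellings = [
    '<isbn number="1112-23-4356" />',      # empty-element tag, with a space
    '<isbn number="1112-23-4356"/>',       # empty-element tag, no space
    '<isbn number="1112-23-4356"></isbn>', # start tag followed by end tag
]


def summary(xml_text: str):
    """Reduce a parsed element to (tag, attributes, text, child count)."""
    el = ET.fromstring(xml_text)
    return el.tag, dict(el.attrib), el.text, len(el)


summaries = [summary(s) for s in spellings]
for spelling, result in zip(spellings, summaries):
    print(spelling, "->", result)
```

All the information in such an element is carried by its attributes, which is why the empty-element form is a convenient shorthand.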
Figure 3-10. Element Rules - Rule 3. Element Nesting
There is no limit to the depth of children in XML, but an overly large number may indicate a
poor design.
If an XML document does not have an associated DTD or Schema, then all whitespace is
retained since a processor does not know if it is considered textual data or just for
aesthetics. DTDs and Schemas are covered in later sections.
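A small sketch of this behavior, assuming Python's standard xml.etree.ElementTree parser (which does not read a grammar here) and a made-up course fragment: the indentation between tags is handed to the application as ordinary text rather than being discarded.

```python
import xml.etree.ElementTree as ET

# An indented fragment with no DTD or Schema attached.
root = ET.fromstring("<course>\n  <name>Java Programming</name>\n</course>")
name = root.find("name")

# Whitespace before the first child is kept as the parent's text.
print(repr(root.text))
# Whitespace after the child's end tag is kept as the child's "tail".
print(repr(name.tail))
# The real data is untouched.
print(repr(name.text))
```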
Figure 3-11. Element Nesting Example XM3014.1
Indentation and other whitespace are only for human readability, but they add "fat" to a
document's size and processing requirements.
This is only an issue with very large XML documents.
It is important to realize that an XML instance is treated by its processor/parser as one,
continuous stream of characters, some of which are recognized by the parser as "special."
As a consequence, when the parser reports an error its location is where the parser
gave up, which may be far beyond where the actual error occurred.
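To make the point about error locations concrete, here is a hedged sketch using Python's standard xml.etree.ElementTree parser. The sample document and the helper name error_line are assumptions for this example: the teacher start tag on line 2 is never closed, yet the parser only notices the problem when it reaches the course end tag on line 5.

```python
import xml.etree.ElementTree as ET

broken = (
    "<course>\n"                        # line 1
    "  <teacher>\n"                     # line 2: never closed (the real mistake)
    "    <name>Paul Thompson</name>\n"  # line 3
    "  <student/>\n"                    # line 4
    "</course>\n"                       # line 5: where the parser gives up
)


def error_line(text):
    """Return the line number the parser reports, or None if parsing succeeds."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        line, _column = err.position
        return line
    return None


reported_line = error_line(broken)
print("mistake made on line 2, parser reports line", reported_line)
```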
Figure 3-12. Element Rules - Rule 4. XML Naming Rules XM3014.1
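Element names may not begin with the letters "xml" in any combination of upper and lower case; such names are reserved by the W3C.
The following words are keywords in DTD declarations. The XML 1.0 grammar does not actually forbid them as element names, but they are best avoided to prevent confusion: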
<b> • DOCTYPE </b>
<b> • ELEMENT </b>
<b> • ATTLIST </b>
<b> • ENTITY</b>
Colons (":"), while technically legal in tag names, should not be used as they are reserved
for use with Namespaces.
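The sketch below tries a handful of candidate element names against Python's standard xml.etree.ElementTree parser. It is only an approximation of the naming rules: the parser enforces the name grammar (for example, a name cannot start with a digit or contain a space) but does not enforce conventions such as the reserved "xml" prefix. The helper name accepts_element_name and the candidate list are assumptions for this example.

```python
import xml.etree.ElementTree as ET


def accepts_element_name(name):
    """Return True if the parser accepts <name/> as a well-formed document."""
    try:
        ET.fromstring("<%s/>" % name)
        return True
    except ET.ParseError:
        return False


candidates = [
    "course",         # letters only
    "course-roster",  # hyphen allowed after the first character
    "_id",            # underscore may start a name
    "ELEMENT",        # a DTD keyword, but a legal element name
    "1course",        # may not start with a digit
    "-name",          # may not start with a hyphen
    "course roster",  # may not contain a space
]

results = {name: accepts_element_name(name) for name in candidates}
for name, ok in results.items():
    print("%-15s %s" % (name, "accepted" if ok else "rejected"))
```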
Figure 3-13. Rule 4... Tag Naming - Samples XM3014.1
Figure 3-14. Rule 4... Element Content (1 of 2): General XM3014.1
PCDATA is parsed character data.
A "snippet" is a piece of a larger, legitimate XML file.
Figure 3-15. Rule 4... Element Content (2 of 2): Data XM3014.1
Figure 3-16. Rule 4... PCDATA - Parsed Character Data XM3014.1
XML differentiates between markup characters and text characters by providing special
XML escape characters (&lt; &gt; &amp; &apos; and &quot;) to be used in XML PCDATA.
Only regular parsed character data is allowed inside an attribute's value.
The special characters "<" and "&" must always be represented as escape
characters.
The others may appear non-escaped in some places in XML, but it is best to just use the
escape characters all the time.
These escape characters are independent of the encoding chosen.
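As an illustration of escaping in PCDATA, the sketch below uses escape from Python's standard xml.sax.saxutils module to prepare a line of script text, then parses it back with xml.etree.ElementTree. The script element and the sample text are assumptions for this example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = "if (a < b && a < 0)"

# escape() replaces &, < and > with their escape characters.
escaped = escape(raw)
print(escaped)

# The escaped text is safe to place between tags; the parser restores it.
round_trip = ET.fromstring("<script>" + escaped + "</script>").text
print(round_trip)

# The raw text is not safe: "<" and "&" are read as markup.
try:
    ET.fromstring("<script>" + raw + "</script>")
    raw_accepted = True
except ET.ParseError:
    raw_accepted = False
print("raw text accepted:", raw_accepted)
```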
Figure 3-17. Rule 4... CDATA - Character Data XM3014.1
The 5 XML escape characters will not be interpreted (that is, changed to the non-escaped
character) in CDATA sections, so they should not be used. If you put &lt; in the CDATA, you
will see &lt; in the output, not "<". So use the actual characters.
Encoding refers to the character set for the entire document, so it does apply to CDATA as
well.
CDATA sections cannot be nested.
CDATA will retain spaces.
While XML escape characters are not to be used in CDATA, you must be aware of how the
'down-line' applications of the XML will use the CDATA.
Common usage: JavaScript in the XML and specialized HTML
Browsers may have problems with some special characters, which must then be
represented in hex.
example: micro sign (µ) = &#xB5;
example: ampersand (&) = &#x26;
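The behavior of escape characters inside CDATA can be checked with a short sketch, here using Python's standard xml.etree.ElementTree parser and a made-up script element. The "<" and "&&" characters pass through untouched, and an escape character written inside the CDATA section stays exactly as typed.

```python
import xml.etree.ElementTree as ET

# Literal "<" and "&&" are fine inside CDATA; "&lt;" is NOT interpreted there.
document = "<script><![CDATA[if (a < b && a < 0) { s = '&lt;'; }]]></script>"

cdata_text = ET.fromstring(document).text
print(cdata_text)
```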
Figure 3-18. Rule 4... CDATA Examples XM3014.1
Both 'script' element examples are valid. Which one you would use would depend on the
behavior of the application/browser which will use the transformed XML and therefore the
CDATA.
This topic is important to XSLT processing.
<b><script><![CDATA[</b>
<b>function matchwo(a,b) {</b>
<b> if (a < b && a < 0)</b>
<b> { return 1 }</b>
<b> else</b>
<b> { return 0 }</b>
<b>}</b>
<b>]]></script></b>
<b><script><![CDATA[</b>
<b>function matchwo(a,b) {</b>
<b> if (a < b && a < 0)</b>
<b> { return 1 }</b>
<b> else</b>
<b> { return 0 }</b>
<b>}</b>
<b>]]></script></b>
<b><nameXML></b>
<b> <![CDATA[</b>
<b> <name common="freddy" breed="springer-spaniel"></b>
<b> Sir Frederick of Ledyard's End</b>
<b> </name></b>
<b> ]]></b>
<b></nameXML></b>
Figure 3-19. Element Rules - Rule 5. Element Attributes XM3014.1
Attribute naming follows the same rules as element naming.
An element may contain zero or more attributes within its start tag.
Attributes add extra information about the element. This may include "key"
information or other identifying details.
Name collisions are common in XML, as shown in the attributes of the first example. Using
Namespaces resolves this sort of issue.
An attribute value cannot contain the same style of quote that delimits it; that is, style="monty's" is
valid, style='monty's' is invalid.
<b><title type="section" number="1">XML overview</title></b>
<b><title type="boat" state="FL">Yacht</title></b>
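The quoting rule for attribute values can be tried out with the sketch below, which assumes Python's standard xml.etree.ElementTree parser and a hypothetical item element. It also shows one common way around the restriction: writing the apostrophe as the escape character &apos;.

```python
import xml.etree.ElementTree as ET

double_quoted = """<item style="monty's"/>"""       # apostrophe inside double quotes
single_quoted = """<item style='monty's'/>"""       # apostrophe ends the value early
escaped_quote = """<item style='monty&apos;s'/>"""  # apostrophe written as &apos;


def style_of(text):
    """Return the style attribute, or None if the text is not well formed."""
    try:
        return ET.fromstring(text).get("style")
    except ET.ParseError:
        return None


print(style_of(double_quoted))
print(style_of(single_quoted))
print(style_of(escaped_quote))
```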
Figure 3-20. Element Rules - Rule 6. XML Declaration (1 of 2) XM3014.1
All XML documents should begin with this declaration and, if present, it MUST be at the very first position of the file
(that is, no blank lines, comments, or spaces before it).
The XML version is almost always "1.0", and it must appear within the "<?xml" declaration
if that declaration is used. It indicates the version of XML to which the Document Entity must
conform. Note that "xml" is written in lower case.
"standalone" is included here for completeness: if it is not used, the parser assumes the
appropriate value, and most users do not include it. We will have more to say on this in our
discussions of the grammars we can apply to XML instances. A value of "yes" means the document
that follows can stand alone; that is, without requiring a grammar document to complete its
information.
The XML Declaration is an optional first line in all XML documents:
<b><?xml version="1.0" ?></b>
<b><?xml version="1.0" encoding="UTF-8" ?></b>
<b><?xml version="1.0" standalone="yes"?></b>
If this declaration is used, the version attribute is mandatory.
<b>The encoding attribute indicates the character encoding used in the </b>
document; if UTF-8 or UTF-16 is used it may be omitted.
ASCII is a subset of UTF-8 and need not be declared.
Comments are <b>not</b> allowed before this statement.
<i>The XML Declaration follows the syntax of a Processing Instruction or PI, </i>
which is described on a subsequent chart, but it is considered to be
unique and is treated separately in the 1.0 XML specification.
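The placement and version rules for the XML Declaration can be checked with a sketch like the one below. It assumes Python's standard xml.etree.ElementTree parser, a hypothetical course root element, and a helper named is_well_formed; as the caution on this chart suggests, other tools may be more lenient.

```python
import xml.etree.ElementTree as ET


def is_well_formed(data):
    """Return True if the parser accepts data as well-formed XML."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


samples = {
    "declaration first":  b'<?xml version="1.0" encoding="UTF-8"?>\n<course/>',
    "no declaration":     b'<course/>',
    "space before it":    b' <?xml version="1.0"?><course/>',
    "comment before it":  b'<!-- roster --><?xml version="1.0"?><course/>',
    "version left out":   b'<?xml encoding="UTF-8"?><course/>',
}

results = {label: is_well_formed(data) for label, data in samples.items()}
for label, ok in results.items():
    print("%-18s %s" % (label, "accepted" if ok else "rejected"))
```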
<b>GENERAL NOTE OF CAUTION</b>: You cannot always rely on a browser or
tool to enforce the specifications completely or correctly. Nor are the
<i>specifications always written in language that, to a particular reader, is </i>
<i>unambiguous. Still, the best advice is when in doubt, refer to the </i>
Figure 3-21. Element Rules - Rule 6. XML Declaration (2 of 2) XM3014.1
The last point may be problematic if, say, the associated DTD file is not readily available for
inspection. You will see in later sections that we can override the attribute values in our
XML instance from within a DTD or XML Schema file.
This may not appear to be a problem at the outset, but over time we may forget that we are
overriding some values.
As XML instances grow in length and complexity this may become a serious source of
confusion.
A best practice is to design the XML instance data to contain ALL the data so that, from an
internal data perspective, it does stand alone.
<b>The standalone attribute is included here for completeness: it is used to </b>
indicate if this XML document depends on information declared externally to
<i>this document (in an external DTD, for example); value may be yes </i>
or no.
A value of "yes" indicates there are no external markup declarations; if
there are no external markup declarations, the declaration has no
meaning.
A value of "no" indicates there are or may be such external markup
declarations; if there are such declarations but there is no standalone
declaration, "no" is assumed.
. . . so it is typically not used.
In any event, the inclusion in the XML instance of references to external
entities, such as those in an embedded DTD, does not change its
<i>standalone status.</i>
<b>A bigger issue associated with the stand-alone attribute is that of defining or </b>
<i>setting values in any entity that may be external to the XML instance. </i>
Arguably, the principal reason for using XML is that it explicitly defines the
elements it includes. If attribute values are overridden then the XML
instance before us is no longer declarative.
Figure 3-22. Comments XM3014.1
Comments can go anywhere in the XML except:
Inside the actual element tags
Comments are a good thing.
Use them just as you would in a program.
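A brief sketch of where comments may and may not appear, assuming Python's standard xml.etree.ElementTree parser, a hypothetical course document, and a helper named is_well_formed:

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# A comment between elements is fine.
inside_content = "<course><!-- fall term --><name>Java</name></course>"
# A comment after the root element is fine too.
after_root = "<course/><!-- end of roster -->"
# A comment inside the tag itself is not allowed.
inside_tag = "<course <!-- fall term --> ></course>"

for label, text in [("inside content", inside_content),
                    ("after root", after_root),
                    ("inside tag", inside_tag)]:
    print("%-15s %s" % (label, "accepted" if is_well_formed(text) else "rejected"))
```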
Figure 3-23. Internationalization and Encoding (1 of 2) XM3014.1
A good way to test that the encoding is correct is by viewing the XML file in IE 5.0 or later.
There are two error messages you may receive from IE or from a parser:
1. An invalid character was found in text content.
You will get this error message if a character in the document does not match the
encoding attribute.
2. Switch from current encoding to specified encoding not supported.
You will get this error message if there is a mismatch between the encoding used when
saving the file and the encoding specified in the declaration. The common problem is that the file has been saved in
a single-byte encoding while the encoding attribute specifies a double-byte encoding, or vice versa.
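The first kind of error can be reproduced without a browser. The sketch below is an assumption-laden illustration using Python's standard xml.etree.ElementTree parser: the same text is saved as ISO-8859-1 bytes twice, once with a declaration that wrongly claims UTF-8 and once with a declaration that matches. The exact error wording differs from the IE messages quoted above.

```python
import xml.etree.ElementTree as ET

template = '<?xml version="1.0" encoding="{enc}"?><name>Jos\u00e9</name>'

# Saved as ISO-8859-1, but the declaration claims UTF-8.
mismatched = template.format(enc="UTF-8").encode("iso-8859-1")
# Saved as ISO-8859-1 and declared as ISO-8859-1.
consistent = template.format(enc="ISO-8859-1").encode("iso-8859-1")


def parsed_text(data):
    """Return the element text, or None if the bytes cannot be parsed."""
    try:
        return ET.fromstring(data).text
    except ET.ParseError:
        return None


print("mismatched:", parsed_text(mismatched))
print("consistent:", parsed_text(consistent))
```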
Figure 3-24. Internationalization and Encoding (2 of 2) XM3014.1
Figure 3-25. Processing Instruction XM3014.1
If a comment is inserted between the XML Declaration and a PI such as the one shown,
Studio will not consider it an error.
A demo file is available in the XM301 Lectures folder, Unit 3.
This PI, although useful, does NOT define a <i>grammar</i> for the XML document in which it
is used; we will talk about grammars in subsequent chapters.
To reemphasize: the XML Declaration, while it may look like a PI, is treated as special!
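The slide's PI is not reproduced in these notes, so the sketch below assumes a typical xml-stylesheet instruction with a hypothetical roster.xsl file. Using Python's standard xml.dom.minidom, it shows that a comment may sit between the XML Declaration and the PI, and that the parser reports the PI as a node of its own while the XML Declaration is not reported as a node.

```python
from xml.dom import minidom

source = (
    b'<?xml version="1.0"?>\n'
    b'<!-- course roster -->\n'
    b'<?xml-stylesheet type="text/xsl" href="roster.xsl"?>\n'
    b'<course/>'
)

doc = minidom.parseString(source)

# Top-level nodes, in document order: comment, PI, root element.
kinds = [node.nodeType for node in doc.childNodes]
pi = doc.childNodes[1]

print(kinds)
print(pi.target)  # the PI's name
print(pi.data)    # everything after the name, passed to the application as-is
```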
Figure 3-26. Well-formed versus Valid XM3014.1
All XML parsers must check XML documents for well-formedness.
XML parsers are classified as validating or non-validating; a validating parser also checks the document against its DTD or Schema.
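The difference can be seen with a non-validating parser. In the sketch below, which assumes Python's standard xml.etree.ElementTree (a non-validating parser) and a made-up course document, the embedded DTD says course must be empty. The document is well formed but not valid, and the parser accepts it without complaint.

```python
import xml.etree.ElementTree as ET

# Well formed, but NOT valid: the DTD declares <course> as EMPTY.
document = """<!DOCTYPE course [
  <!ELEMENT course EMPTY>
]>
<course>not empty</course>"""

root = ET.fromstring(document)  # no error: only well-formedness is checked
print(root.tag, repr(root.text))
```

A validating parser given the same input would be expected to report a validity error.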
Figure 3-27. HTML versus XML (1 of 2) XM3014.1
All markup tags in HTML are directed at visual composition. No consideration is given to
the actual semantics of the data.
XML markup tags are based solely on the data content.
Clean separation of data and presentation
XML is about structured information interchange
HTML is about presentation and browsing
<b><course></b>
<b> <name>Java Programming</name></b>
<b> <department>EECS</department></b>
<b> <teacher></b>
<b> <name>Paul Thompson</name></b>
<b> </teacher></b>
<b> <student></b>
<b> <name>Ron Jones</name></b>
<b> </student></b>
<b> <student></b>
<b> <name>Uma Abingdon</name></b>
<b> </student></b>
<b> <student></b>
<b> <name>Lindsay Garmon</name></b>
<b> </student></b>
<b></course></b>
Figure 3-28. HTML versus XML (2 of 2) XM3014.1
These two source listings really show fundamental differences between HTML and XML.
While both contain text marked up by tags, their meaning is entirely different.
Which would you rather parse and insert into a database?
<b><html></b>
<b><title>Course Roster</title></b>
<b><body></b>
<b><center></b>
<b> <h1>Course Roster</h1></b>
<b> <h2>XML Programming</h2></b>
<b> <h3>Department: EECS</h3></b>
<b> <p></b>
<b> <table border=2></b>
<b> <tr></b>
<b> <th>Teacher</th></b>
<b> <td>Paul Thompson</td></b>
<b> </tr><tr></b>
<b> <th>Student<br>List</th></b>
<b> <td>Ron Jones<br></b>
<b> Uma Abingdon<br></b>
<b> Lindsay Garmon</b>
<b> </td></b>
<b> </tr></b>
<b> </table></b>
<b></center></b>
<b></body></b>
<b></html></b>
<b><?xml version="1.0"?></b>
<b><course></b>
<b> <name>Java Programming</name></b>
<b> <department>EECS</department></b>
<b> <teacher></b>
<b> <name>Paul Thompson</name></b>
<b> </teacher></b>
<b> <student></b>
<b> <name>Ron Jones</name></b>
<b> </student></b>
<b> <student></b>
<b> <name>Uma Abingdon</name></b>
<b> </student></b>
<b> <student></b>
<b> <name>Lindsay Garmon</name></b>
<b> </student></b>
<b></course></b>
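To suggest an answer to the question in the notes, the sketch below pulls the roster out of the XML listing with Python's standard xml.etree.ElementTree and turns it into rows of the kind one might insert into a database table. The row layout (course, teacher, student) is an assumption for this example; no database is actually used.

```python
import xml.etree.ElementTree as ET

roster_xml = """<?xml version="1.0"?>
<course>
  <name>Java Programming</name>
  <department>EECS</department>
  <teacher>
    <name>Paul Thompson</name>
  </teacher>
  <student><name>Ron Jones</name></student>
  <student><name>Uma Abingdon</name></student>
  <student><name>Lindsay Garmon</name></student>
</course>"""

root = ET.fromstring(roster_xml)

# The tags say what each value means, so each lookup is a simple path.
course = root.findtext("name")
teacher = root.findtext("teacher/name")
students = [s.findtext("name") for s in root.findall("student")]

# One row per student, ready for an INSERT statement of your choosing.
rows = [(course, teacher, student) for student in students]
for row in rows:
    print(row)
```

Doing the same with the HTML listing would mean guessing which heading or table cell holds which piece of data.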
Figure 3-29. HTML and XML Key Differences XM3014.1
HTML has a fixed tag set. In XML there is no predefined tag set. The allowed tags in an
XML document are defined in its DTD or Schema, if it has one.
XHTML is an effort to correct the sins of HTML's past. It is a new XML technology that
consists of an HTML-specific DTD that defines the valid HTML tags.
Unfortunately, many of today's browsers will not recognize XHTML documents properly!
<b>HTML</b> versus <b>XML</b>

HTML: Predefined tags define how to present data.
XML: Defines its own tags to identify data.

HTML: Allows missing end tags. <b><br></b> and <b><p></b>
XML: Requires matching end tags. <b><name>test</name></b>

HTML: Attributes do not require quotes. <b><img src=myDog.jpeg></b>
XML: Attributes must be quoted. <b><book isbn="3432"></book></b>

HTML: Attributes do not require a value. <b><input type=radio checked></b>
XML: Attributes must have a value. <b><device type="radio" /></b>

HTML: Tolerates non-nested tags. <b><H1><center>Hello!</H1></center></b>
XML: Strict nesting and tag matching rules. <b><H1><center>Hello!</center></H1></b>

HTML: Browsers will almost always do a "best guess" on ill-formed HTML.
XML: XML parsers will generate a fatal exception for well-formedness violations.

HTML: Does not support empty elements, but allows single start tags. <b><br> and <hr></b>
XML: Provides for empty elements. <b><device type="radio" /></b>

HTML: Is not case sensitive.
XML: Is case sensitive.
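Several of these differences can be checked mechanically. The sketch below feeds five HTML habits, and an XML-style rewrite of each, to Python's standard xml.etree.ElementTree parser. The sample fragments and the helper name is_well_formed are assumptions for this example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Each entry: (HTML habit, XML-style rewrite of the same fragment).
pairs = {
    "missing end tag":         ("<p>one<br>two</p>", "<p>one<br/>two</p>"),
    "unquoted attribute":      ("<img src=myDog.jpeg />", '<img src="myDog.jpeg" />'),
    "attribute without value": ('<input type="radio" checked />',
                                '<input type="radio" checked="checked" />'),
    "overlapping tags":        ("<H1><center>Hello!</H1></center>",
                                "<H1><center>Hello!</center></H1>"),
    "mismatched case":         ("<Name>test</name>", "<name>test</name>"),
}

results = {label: (is_well_formed(html), is_well_formed(xml))
           for label, (html, xml) in pairs.items()}

for label, (html_ok, xml_ok) in results.items():
    print("%-25s HTML habit accepted: %-5s XML form accepted: %s"
          % (label, html_ok, xml_ok))
```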
Figure 3-30. Checkpoint Questions (1 of 3) XM3014.1
Figure 3-31. Checkpoint Questions (2 of 3) XM3014.1
Figure 3-32. Checkpoint Questions (3 of 3) XM3014.1
Figure 3-33. Unit Summary XM3014.1
The status of various XML technologies (W3C Activities) can be found on the W3C Web site.
<b>Unit 4. WebSphere Studio Application Developer Overview</b>
This unit describes IBM WebSphere Studio Application Developer.
This is an overview of the broad features and organization of this
application development tool.
After completing this unit, you should be able to:
• Describe the WebSphere Studio family of tools
• State the role of WebSphere Studio Workbench in the WebSphere
Studio tools
• Describe basic features of WebSphere Studio Application
Developer
• Review
Figure 4-1. Unit Objectives XM3014.1
Figure 4-2. Roles-based Development XM3014.1
There are four distinct development roles shown here:
<b> • Enterprise Integrator</b>
<b> • Bean Provider</b>
<b> • Application Assembler</b>
<b> • Page Producer</b>
Tooling needs to support each of these roles and permit easy management and integration
of the developed assets.
<b>Role:</b> Enterprise Integrator, Bean Provider, Application Assembler, Page Producer, Web Master
<b>Workarea:</b> Connection Data, Business Logic Data, Application Flow, Page Layout and Content
<b>Products:</b> JavaBeans and EJBs; JavaBeans and EJBs; Servlets, JSPs, and JavaBeans; HTML, JSPs, and MIME Types; Metrics
Figure 4-3. Development Environment Goals XM3014.1
The development environment should support the tasks performed by the developers.
It should be configurable and customizable for each individual developer.
Tools need to accommodate the rapid change in available technologies.
Unified by a new tooling platform
Provide multilevel vendor integration
Figure 4-4. IBM WebSphere Studio Family XM3014.1
The IBM WebSphere Studio family is built on a common development platform (as opposed to
being a set of separate development tools).
Figure 4-5. Family Contents XM3014.1
The flagship products in the WebSphere Studio brand (Version 5) are:
<b> • WebSphere Studio Application Developer</b>
<b> • WebSphere Studio Enterprise Developer</b>
Focused on development of Web Services, JSPs, Servlets, XML and
J2EE and database applications in a team environment
Focused on Enterprise Integration using the J2EE Connector
Architecture
Figure 4-6. WebSphere Studio Workbench XM3014.1
The Workbench is not a tool, that is, it is not in itself a product that is for sale. It is an open
and portable tool platform providing an integration technology. The Workbench can be
thought of as a set of Java frameworks and a set of development tools geared for tool
builders.
Figure 4-7. WebSphere Studio Workbench Rationale XM3014.1
The Workbench offers its greatest support for tool builders; making it easy to add plug-ins
(tools) to the overall IDE. This allows quick "time-to-market" of tools supporting emerging
technologies.
The underlying framework, which adds to the tool builders' productivity, gives end users a
common look and feel.
Globalization, distributed debug, Team, SCM
Figure 4-8. WebSphere Studio Application Developer XM3014.1
Start the WebSphere Studio Application Developer
Start -> Programs -> IBM WebSphere Studio -> Application Developer 5.1
Workbench opens when you launch Application Developer
Within the workbench -- open the perspectives, views, and editors
Figure 4-9. Terminology XM3014.1
The workbench window displays one or more perspectives that contain views and editors.
You can quickly switch between perspectives and views using the shortcut buttons which
appear on the shortcut bar.
Screen labels: <b>Shortcut Bar</b>, <b>Source Pane</b>, <b>Outline Pane</b>, <b>Task Sheet</b>, <b>Navigator Pane</b>, <b>Editor</b>, <b>Views</b>
Figure 4-10. Perspectives XM3014.1
Figure 4-11. Views XM3014.1
Views support editors and provide alternative presentations or navigation of the information
in your workbench. For example, the Navigator displays projects and other resources you
are working with.
A view might appear by itself, or stacked with other views in a tabbed notebook.
On Windows platforms, views can be undocked from the main workbench window and
appear as floating windows on the desktop. Undocked views can also be docked back into
the main workbench window.
More info on the Application Developer menu: Help --> Navigating Workbench
Figure 4-12. Editors XM3014.1
The key thing to note about editors is the Open-save-close life cycle. You must explicitly
save the corresponding resource after making changes.
Figure 4-13. Online Help XM3014.1
<b>Tip:</b> Press F1 to display pop-up information about the selected task.
To hide the navigation frame, click the Hide Navigation button on the Help view's toolbar.
<b>Note: Your product may include more than one information set (a collection of </b>
documentation topics). When you run a search, only the current information set is
searched. The current information set is shown in the drop-down list at the top of the Help
view. To search another information set, select it from the list, and run the search again.
Figure 4-14. Cheat Sheets XM3014.1
Figure 4-15. Application Developer Design Points XM3014.1
Reduced learning curve through the consolidation of tooling to one platform. For example,
with customizable perspectives, one could customize Application Developer to look similar
to other Java IDEs.
Figure 4-16. Tooling XM3014.1
Figure 4-17. Java IDE (1 of 3) XM3014.1
A default JRE can be selected for the Workbench with Window -> Preferences. A
project-specific JRE is selected in the Launch Configuration dialog.
For more on hot method replace, refer to the foil at the end of the unit.
Figure 4-18. Java IDE (2 of 3) XM3014.1
JDI: Java Debugging Interface. The JDI is a high-level Java API providing information
useful for debuggers and similar systems needing access to the running state of a Java
virtual machine.
Figure 4-19. Java IDE (3 of 3) XM3014.1
Starting with V5.1, Application Developer adds support for UML visualization. You can
select existing components and have the system generate the UML diagrams, or you
can start with a blank diagram and develop components from the diagram, or use a
combination of the two approaches. These features help developers understand existing
components better by producing UML that represents them, and also
assist developers in generating components based on the UML diagrams.
The entire class diagram or portions may be exported in bmp, jpg, or gif image formats.
Figure 4-20. J2EE Tooling (1 of 2) XM3014.1
Figure 4-21. J2EE Tooling (2 of 2) XM3014.1
WebSphere Studio provides a Web-based Universal Test Client where you can test your
Enterprise JavaBeans (EJBs) and other objects. Using this test client, you can test the
home and remote interface methods of your enterprise beans. By calling the methods and
passing user-defined arguments you can test methods to ensure that they work correctly.
Allows for versioning of unit test environment
Figure 4-22. Portlet Tooling XM3014.1
There are actually two related plug-ins. The first, WebSphere Portal Toolkit, ships with all
offerings of WebSphere Portal V4.x. The second, WebSphere Everyplace Toolkit, ships with
WebSphere Everyplace Server.
The test environment interacts with a developer configuration of WebSphere Portal Server
running on WebSphere Application Server Advanced Single Server Edition (AEs). This is
facilitated by the Remote WebSphere Server configuration.
Figure 4-23. Data Tooling (1 of 2) XM3014.1
Create Tables/Views/Indexes/Keys
Generate DDL
Connect to and view existing relational database objects
Metadata generated as XMI
SELECT, INSERT, UPDATE, DELETE supported
Metadata generated as XMI
Figure 4-24. Data Tooling (2 of 2) XM3014.1
Figure 4-25. Web Tooling (1 of 2) XM3014.1
The Web Site Designer is new with 5.1. The configuration of the entire Web site is
maintained in the Web Site Configuration object. The choice of static or dynamic web sites
and the Palette view are also newly introduced in release 5.1.
Examples of the drawer labels in the Palette view are: HTML, Free Layout, JSP, Java
Server pages, and Site Parts. The Site Parts include items such as Vertical and Horizontal
Navigation Bars, which help to maintain consistency in the look and feel of pages across
the site.
Figure 4-26. Web Tooling (2 of 2) XM3014.1
Figure 4-27. XML Tooling (1 of 3) XM3014.1
Code Assist for building XML documents
Visual tooling for working with DTDs
Create DTDs from existing documents
Generate an XML Schema from a DTD
Generate JavaBeans for creating/manipulating XML documents
Generate an HTML form from a DTD
Figure 4-28. XML Tooling (2 of 3) XM3014.1
Figure 4-29. XML Tooling (3 of 3) XM3014.1
XPath expressions can be used to search through XML documents, extracting information
from the nodes (such as an element or attribute).
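As a rough illustration of the idea, the sketch below uses the limited XPath subset in Python's standard xml.etree.ElementTree module, not the XPath tooling in WebSphere Studio. The sample document and its values are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; not taken from the course materials.
DOC = """<addresses>
  <address type="home"><city>Sheboygan</city><state>WI</state></address>
  <address type="work"><city>Raleigh</city><state>NC</state></address>
</addresses>"""

root = ET.fromstring(DOC)

# Select every <city> element anywhere below the root and read its text.
cities = [city.text for city in root.findall(".//city")]

# Select the <state> child of the <address> whose type attribute is "work".
work_states = [s.text for s in root.findall("./address[@type='work']/state")]

print(cities)
print(work_states)
```

The first expression extracts information from element nodes; the second filters on an attribute node before stepping to a child element.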
Figure 4-30. Performance/Trace Tooling XM3014.1
Heap
Stack
Class/Method details
Object References
Figure 4-31. Team Development XM3014.1
Figure 4-32. Web Services Tooling (1 of 2) XM3014.1
Browse UDDI registry to locate Web Service (Web Services
Explorer)
Generate JavaBean proxy for existing Web Services
Create new Web Services from JavaBeans, databases
Wrap existing artifacts such as SOAP and HTTP GET/POST
accessible services
Generate Java client proxy to Web Services
Create new WSDL files
Create ports, port types, messages, bindings, operations, types within
WSDL files
Figure 4-33. Web Services Tooling (2 of 2) XM3014.1
Deploy Web Services to WebSphere or Tomcat Servers
Built-in test client allows for immediate testing of local and remote
Web Services
Figure 4-34. Standards Support XM3014.1
Figure 4-35. Review XM3014.1
Figure 4-36. Unit Summary XM3014.1
<b>© Copyright IBM Corp. 2001, 2004</b> <b>Unit 5. Document Type Definition (DTD)</b> <b>5-1</b>
This unit covers XML 1.0 DTDs, which provide a way to define the
structure of an XML document. DTDs provide an additional level of
After completing this unit, you should be able to:
• Describe the reasons for using DTDs
• Define well-formed versus valid documents
• Define the grammar rules for an XML document using DTD
• Describe the difference between non-validating and validating
processors
• Describe examples of DTDs being used in business
• Describe best practices used in DTDs
• Define the limitations of DTDs
• Describe the status of the DTD in the industry
• Checkpoint
Figure 5-1. Unit Objectives XM3014.1
Figure 5-2. Review: Well-Formed XML XM3014.1
This is a quick review of the important rules for XML well-formedness. It's important to
recognize that the well-formedness rules are very simple.
Figure 5-3. Why Do We Need DTDs? XM3014.1
The difficulty with well-formedness is that the rules are very simple.
Quite often we want to express more complicated constraints, such as:
• The element <message> can only have two children, <greeting> and <farewell>, and
  the two children must appear in that order.
• The element <message> may have an optional urgent attribute.
What if we want the computer to be able to verify that an XML document meets these kinds
of constraints?
What if we want to have reusable pieces of text between two XML documents?
Figure 5-4. What Is a DTD? XM3014.1
A Document Type Definition is essentially the framework or skeleton of an XML document.
It defines which elements are allowed, which attributes are allowed for each element, and
whether such elements or attributes are required or optional. XML Schemas (often referred
enhancements. An XML document that conforms to its specified DTD or XML Schema is
said to be valid.
The DTD can be a separate file or it can also be embedded in the XML file. In fact, the DTD
contents can be split across an external file and the XML file.
Figure 5-5. What is Allowed in a DTD? (1 of 2) XM3014.1
Similar material can also be found in the WSAD IE 5.1 help file for DTD.
This page and the next list the elements you may use in a DTD file.
Figure 5-6. What is Allowed in a DTD? (2 of 2) XM3014.1
Figure 5-7. XML and DTD Example XM3014.1
Here's a simple example of an XML document on the left, and the DTD rules that describe
it on the right. We're not going to go into the details of the rules right here -- that's what the
rest of this unit is about. We just wanted you to have an idea of how an XML file and its
related DTD might look.
<street>1401 Main Street</street>
<city>Sheboygan</city>
<state>WI</state>
<country>USA</country>
</address>
<!ELEMENT address (name, street+,
                   city, state, zip?, country)>
<!ELEMENT name (title?, first-name,
Figure 5-8. What Is Allowed. . .Declaring Elements XM3014.1
Here's our introduction to declaring elements. An element declaration begins with
<!ELEMENT followed by the name of the element being declared and then the content
model for the element.
Here's a sample declaration for an element called greeting that accepts #PCDATA (text),
along with two <greeting> elements that are valid according to this declaration. The second
<greeting> element is using a CDATA section to quote its contents.
Remember, element names must start with a letter or underscore, however, the letters xml,
Namespaces), a period or alphanumeric characters may follow the first character (while
technically legal, an underscore-period combination is not recommended).
#PCDATA (parsed character data) indicates that only text and entities can be included in
the element. This data will be examined by the parser for entities and markup. Parsed
character data cannot contain the characters "&" or "<" literally; these need to be
represented by their respective entities (refer to the slide Built-in Entities). The
character ">" is normally escaped as well, and must be escaped when it follows "]]".
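The difference between escaped #PCDATA and a CDATA section can be tried out with a few lines of Python. This is only a sketch using the standard library's non-validating parser; the <greetings> wrapper element and the sample strings are assumptions added for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical document: the <greetings> wrapper is not part of the foil.
DOC = """<!DOCTYPE greetings [
  <!ELEMENT greetings (greeting+)>
  <!ELEMENT greeting (#PCDATA)>
]>
<greetings>
  <greeting>Hello &amp; welcome</greeting>
  <greeting><![CDATA[Use <b> & </b> freely here]]></greeting>
</greetings>"""

root = ET.fromstring(DOC)

# Both forms give the application plain text once parsed.
texts = [g.text for g in root.findall("greeting")]

# escape() replaces &, < and > with the built-in entity references.
escaped = escape("1 < 2 & 3 > 2")

print(texts)
print(escaped)
```

The first <greeting> relies on the built-in entity for "&", while the second quotes its contents in a CDATA section so the markup characters need no escaping.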
Figure 5-9. Element Content Models XM3014.1
The content is the stuff in between the element's start and end tag.
There are four types of content models in XML 1.0 DTDs.
Types of DTD Content models
<b> • EMPTY</b>
<b> • ANY</b>
<b> • Element only - this includes child elements</b>
<b> • Mixed - this includes child elements and text</b>
Figure 5-10. EMPTY Content Model XM3014.1
The EMPTY content model is used for an element that will have no content whatsoever.
Note that such an element may have as many attributes as it likes.
To specify the EMPTY content model, provide the word EMPTY for the content model.
The two examples on the foil show two elements that are valid with an empty content
model.
Empty elements are not much use unless they have attributes. We'll learn more about
declaring attributes in a bit.
An EMPTY element can be very useful for testing snippets of XML. There is an example of
this later in this chapter.
Figure 5-11. ANY Content Model XM3014.1
Contrary to what you might expect, the ANY content model does not allow you to put
anything you like between the start and end tag. When you use the ANY content model,
whatever markup you supply must be well-formed XML. Moreover, the
elements that you use must be declared in the DTD as well. So for the third example on the
foil, the <galaxy> element must be declared in the DTD for the document.
To specify the ANY content model provide the word ANY for the content model.
Figure 5-12. Elements Content Model XM3014.1
If the content of an element consists solely of child elements, the element is said to have
element content.
The element content model is specified by content model particles that are combinations of
either element names or other content model particles.
The table describes the operators that can be used to form these combinations.
In the table, a or b can be either content particles or element names.
To create the content model of a followed by b, use the comma (,).
To create the content model of a or b, use the vertical bar (|).
To repeat a content particle at least once, use the (+).
To repeat a content particle zero or more times, use the (*).
To allow a content particle to be absent or present exactly once, use the (?).
sequence       <!ELEMENT name (a,b)>
choice         <!ELEMENT name (a|b)>
one            <!ELEMENT name (a)>
one or more    <!ELEMENT name (a)+>
zero or more   <!ELEMENT name (a)*>
zero or one    <!ELEMENT name (a)?>
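The operators in the table behave much like regular-expression operators, which suggests a quick way to experiment with them. The sketch below is an assumption-laden illustration, not how a validating parser is implemented: it handles element content only (not EMPTY, ANY, or mixed content), and the function names are made up for this example.

```python
import re

# ASCII-oriented approximation of an element name inside a content model.
NAME = r"[A-Za-z_][\w.\-]*"

def content_model_to_regex(model):
    """Translate a DTD element-content model into a regular expression.

    Each child name becomes a group matching 'name;', so a list of child
    elements can be encoded as one string such as 'fname;lname;'.
    """
    compact = re.sub(r"\s+", "", model)
    pattern = re.sub(NAME, lambda m: "(?:" + re.escape(m.group(0)) + ";)", compact)
    # A comma means "followed by", which is plain concatenation in a regex.
    # The operators |, +, * and ? already mean the same thing in both notations.
    return re.compile(pattern.replace(",", ""))

def matches(model, children):
    """Return True if the sequence of child element names fits the model."""
    encoded = "".join(name + ";" for name in children)
    return content_model_to_regex(model).fullmatch(encoded) is not None

print(matches("(a,b)", ["a", "b"]))
print(matches("(a|b)", ["a", "b"]))
print(matches("(a)+", []))
print(matches("(a)*", []))
```

The same helper can be pointed at the content models used in the examples that follow, for instance matches("(order-item+,delivery-address,order-date?)", ["order-item", "delivery-address"]).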
Figure 5-13. Elements Content Examples (1 of 3) XM3014.1
The first example specifies that <person> has a content model that accepts an <fname>
followed by an <lname> or an <lname> followed by an <fname>. The matches show all the
possible permutations.
<!ELEMENT person ((fname,lname)|(lname,fname))>
<!ELEMENT fname (#PCDATA)>
<!ELEMENT lname (#PCDATA)>
<person>
  <lname>Smith</lname>
  <fname>John</fname>
</person>
<person>
  <fname></fname>
  <lname>Smith</lname>
</person>
<person>
  <fname/>
  <lname>Smith</lname>
</person>
Figure 5-14. Elements Content Examples (2 of 3) XM3014.1
The second example specifies that an <order> is a sequence of at least one <order-item>
followed by a <delivery-address>, followed by an optional <order-date>.
The valid XML shows:
1. One <order-item>, a <delivery-address>, and no <order-date>.
2. Two <order-item>s, a <delivery-address>, and no <order-date>.
3. Two <order-item>s, a <delivery-address>, and an <order-date>.
Declaration:
<!ELEMENT order (order-item+,delivery-address,order-date?)>
<!-- Child elements defined as containing #PCDATA -->
Valid XML fragments:
<order>
  <order-item>item1</order-item>
  <delivery-address>123 State Street</delivery-address>
</order>
<order>
  <order-item>item3</order-item>
  <order-item>item4</order-item>
  <delivery-address>123 State Street</delivery-address>
</order>
<order>
  <order-item>item5</order-item>
  <order-item>item6</order-item>
  <delivery-address>123 State Street</delivery-address>
  <order-date>July 5, 2001</order-date>
</order>
Figure 5-15. Elements Content Examples (3 of 3) XM3014.1
This example says that a phone book is one or more <page> elements. Each <page> is a
<heading> followed by at least one <entry> or <advert>; there may be more than one of
either, and they may appear in any order.
The valid XML shows a single <page> with a <heading> followed by one <advert>.
The invalid examples are invalid because:
1. The <page> contains two <entry> elements but no <heading>.
2. The <page> is empty, so both the <heading> and the required <entry> or <advert> are missing.
Declaration:
<!ELEMENT phonebook (page)+>
<!ELEMENT page (heading, (entry|advert)+)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT entry (#PCDATA)>
<!ELEMENT advert (#PCDATA)>
Valid XML fragment:
<phonebook>
  <page>
    <heading>The whole town</heading>
    <advert>Fred's Fish n' Chips - 123-4567</advert>
  </page>
</phonebook>
Invalid XML fragments:
<phonebook><page><entry/><entry/></page></phonebook>
<phonebook><page/></phonebook>
Figure 5-16. Mixed Content Model XM3014.1
Elements that have the mixed content model can contain (parsed) character data. In
addition to the character data, mixed content models may also contain child elements
interspersed with the character data. If a mixed content model contains child elements, it
can specify which elements may appear, but the child elements can appear in any order,
and any number of times.
The valid XML shows:
1. An element with character data content only.
2. An element allowing a single child element in addition to the character data content.
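To see what mixed content looks like to an application, here is a small sketch using Python's standard library parser. The element names para and emph are assumptions chosen for the example; the declaration uses the (#PCDATA | child)* form that mixed content requires.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names; the internal DTD subset declares mixed content.
DOC = """<!DOCTYPE para [
  <!ELEMENT para (#PCDATA | emph)*>
  <!ELEMENT emph (#PCDATA)>
]>
<para>DTDs are <emph>optional</emph> but <emph>useful</emph>.</para>"""

para = ET.fromstring(DOC)

# ElementTree keeps the text before the first child in .text and the text
# that follows each child in that child's .tail.
pieces = [para.text]
for child in para:
    pieces.append("<" + child.tag + ">" + (child.text or ""))
    pieces.append(child.tail)

print(pieces)
```

The child element appears twice, interleaved with character data, which is exactly the freedom (any order, any number of times) that the mixed content model grants.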
Figure 5-17. What Is Allowed. . .Declaring Attributes XM3014.1
The syntax for declaring attributes looks like this:
<!ATTLIST followed by:
• elementName - the name of the element for which the attribute is being declared.
• attributeName - the name of the attribute being declared.
• attributeType - the data type (see the Attribute Types table).
• attributeDefault - the attribute's default behavior.
To declare multiple attributes, you can write multiple ATTLIST declarations or repeat the
(attributeName attributeType attributeDefault) part as necessary.
Figure 5-18. Organizational Note XM3014.1
Figure 5-19. Attribute Types XM3014.1
CDATA attributes contain character data. Whitespace crunching is not performed.
We covered this on previous charts.
The ID data type contains a string value that must be unique within the document. No element
type may have more than one ID attribute specified, and the declared ID attribute must
be #IMPLIED or #REQUIRED. ID-valued attributes can be combined with IDREF- and
IDREFS-valued attributes to create cross-references within an XML document.
An IDREF must contain a value that is specified in an ID-valued attribute elsewhere in the
document. IDREFS is a space-separated list of ID values.
ENTITY and ENTITIES are the name or a space separated list of entity names. (More on
entities in a moment).
NMTOKENs are strings composed of the legal characters in an XML element name -- they
are not the same as XML element names, because the first character of an XML element
name may not contain some of the characters that are legal as the first character of an
NMTOKEN.
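The difference between a name and a name token can be sketched with two regular expressions. These patterns are ASCII-only approximations written for this example; the real XML 1.0 productions also admit a large set of non-ASCII characters.

```python
import re

# ASCII-only approximations of the XML 1.0 Name and Nmtoken productions.
# A Name restricts its first character; an Nmtoken does not.
NAME = re.compile(r"[A-Za-z_:][A-Za-z0-9_.:\-]*")
NMTOKEN = re.compile(r"[A-Za-z0-9_.:\-]+")

def is_name(value):
    """True if value looks like an XML Name (usable as an ID value)."""
    return NAME.fullmatch(value) is not None

def is_nmtoken(value):
    """True if value looks like an XML name token."""
    return NMTOKEN.fullmatch(value) is not None

for candidate in ["e00001", "00001", "-draft", "medium large"]:
    print(candidate, is_name(candidate), is_nmtoken(candidate))
```

Under these approximations "00001" and "-draft" are acceptable NMTOKEN values but not names, while "medium large" is neither because of the embedded space.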
Attribute Type    Description
String Type
CDATA             Used to declare an attribute whose value may contain
                  arbitrary character data. Whitespace crunching is not done.
                  This is the only attribute type permitting attribute values that
                  are not made up of XML name tokens.
Tokenized Type
NMTOKEN           Used to declare an attribute whose value must conform to
                  the definition of an Nmtoken (name token) in XML 1.0.
NMTOKENS          Allows multiple NMTOKENs separated by white space.
ID                Used to declare an attribute whose value must be unique
                  within the XML document.
IDREF, IDREFS     The value of the attribute must refer to an ID value declared
                  elsewhere in the document. IDREFS allows multiple IDREF values
                  separated by white space (see NMTOKENS).
ENTITY            The value of the attribute must be the name of a declared ENTITY.
ENTITIES          Allows multiple ENTITY names separated by whitespace.
Enumerated Type
NOTATION          References a <!NOTATION declaration in the DTD.
ENUMERATION       Attributes have a specified list of acceptable NMTOKEN
                  values.
These were introduced earlier.
NOTATION valued attributes must contain the name of a NOTATION declared elsewhere in
the document. (More on NOTATION later).
Figure 5-20. Attribute Default Declarations XM3014.1
Every attribute must specify a default type. The possible values for the default type are:
• #REQUIRED: Indicates that the attribute must occur in every instance of the element;
  its type may still be an enumerated list.
• #IMPLIED: Indicates that the attribute or the attribute's value can remain unspecified.
• #FIXED value: Indicates that this attribute, when used, has a single (fixed) value. This
  value must appear immediately after the keyword and be in quotes.
• enumerated list: Gives a list of choices in parentheses, each separated by an "or"
  operator (|). A default value (from the enumerated list) may be given after the list and must
  be in quotes. If a default value is declared, when the attribute is not present, the
  element is treated as if the attribute were present with the declared default value.
Default Declaration   Description
#REQUIRED             The attribute must be present
#IMPLIED              The attribute does not need to be present
                      and no default value was supplied
attribute-value       If the attribute's value is not present,
                      "attribute-value" is supplied as a default
                      value
Figure 5-21. Attribute Default Declaration Examples XM3014.1
Here we've declared a few attributes with the various default types. Size has a default
value, type is required, and manufacturer is fixed.
Let's look at how the examples come out.
For the valid examples:
• <shirt type="short"/> will also pick up the default value "large" for size, and the fixed value
  "Levi" for manufacturer.
• <shirt type="short" size="large"/> will pick up the fixed value "Levi" for manufacturer.
For the invalid examples:
• <shirt/> is missing the required type attribute.
• <shirt type="short" size="medium large"/> is invalid because "medium large" isn't in the
  enumerated value list for size.
• <shirt type="short" manufacturer="Gap"/> is invalid because "Gap" isn't the fixed value for
  manufacturer.
<!ELEMENT shirt (#PCDATA)>
<!ATTLIST shirt type CDATA #REQUIRED>
<!ATTLIST shirt collar CDATA #IMPLIED>
<!ATTLIST shirt size (small|medium|large) "large">
<!ATTLIST shirt manufacturer CDATA #FIXED "Levi">
<shirt type="short">cotton</shirt>
<shirt type="short" size="large">wool</shirt>
<shirt type="short" manufacturer="Levi">denim</shirt>
<shirt type="short sleeve" collar="button-down"></shirt>
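How a parser supplies default and fixed values can be observed with Python's standard library. This is a sketch under one important assumption: the underlying expat parser is non-validating, so it applies defaults declared in the internal DTD subset but does not reject a missing #REQUIRED attribute or a wrong #FIXED value.

```python
import xml.etree.ElementTree as ET

# The shirt declarations from the foil, placed in an internal DTD subset
# so that the parser can read them together with the document.
DOC = """<!DOCTYPE shirt [
  <!ELEMENT shirt (#PCDATA)>
  <!ATTLIST shirt type CDATA #REQUIRED>
  <!ATTLIST shirt collar CDATA #IMPLIED>
  <!ATTLIST shirt size (small|medium|large) "large">
  <!ATTLIST shirt manufacturer CDATA #FIXED "Levi">
]>
<shirt type="short">cotton</shirt>"""

shirt = ET.fromstring(DOC)

# Only type was written in the start tag; size and manufacturer come from
# the attribute declarations. collar is #IMPLIED, so nothing is supplied.
attrs = dict(shirt.attrib)

print(sorted(attrs.items()))
```

The application sees size and manufacturer as if they had been typed into the document, while collar stays absent because #IMPLIED carries no default.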
Figure 5-22. Attribute Alternate Declaration XM3014.1
<!ELEMENT shirt (#PCDATA)>
<!ATTLIST shirt size (small|medium|large) "large"
                collar CDATA #IMPLIED
                type CDATA #REQUIRED
Figure 5-23. Attribute Types: Tokenized Types: IDREFS Example XM3014.1
This foil shows a declaration for an implied attribute of type IDREFS.
According to the syntax rules for IDs, an ID value cannot begin with a digit. That is why the
serialNumber values begin with a letter.
Aside from naming rules, manager2 could have any value as long as there is an element
whose ID attribute has that value.
Consequently, an employee could be self-managed!
The uniqueness constraint applies to IDs, not to IDREFs, so the employee could be
self-managed twice: both manager1 and manager2 could have the same value.
<!ATTLIST elementName attributeName IDREF defaultDecl>
<!ELEMENT employee (#PCDATA)>
<!ATTLIST employee serialNumber ID #REQUIRED>
<!ATTLIST employee manager1 IDREF #IMPLIED>
<!ATTLIST employee manager2 IDREFS #IMPLIED>
<employee serialNumber="e00001">Joe Smith</employee>
<employee serialNumber="e00002">Bill Smith</employee>
<employee serialNumber="e00003" manager1="e00001">John Smith</employee>
<employee serialNumber="e00004" manager1="e00001"
          manager2="e00002 e00001">John Smith</employee>
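The ID and IDREF rules are enforced by a validating parser, but the checks themselves are simple enough to sketch by hand. In the example below the <staff> wrapper element and the check_references function are assumptions introduced for illustration, and the attribute names are hard-coded instead of being read from the DTD.

```python
import xml.etree.ElementTree as ET

# The <staff> wrapper is an assumption so the fragments form one document.
DOC = """<staff>
  <employee serialNumber="e00001">Joe Smith</employee>
  <employee serialNumber="e00002">Bill Smith</employee>
  <employee serialNumber="e00003" manager1="e00001">John Smith</employee>
  <employee serialNumber="e00004" manager1="e00001"
            manager2="e00002 e00001">John Smith</employee>
</staff>"""

def check_references(xml_text):
    """Return (duplicate IDs, dangling references) for an employee list."""
    employees = ET.fromstring(xml_text).findall("employee")
    ids = [e.get("serialNumber") for e in employees]
    # ID rule: each value may appear only once in the document.
    duplicates = sorted({i for i in ids if ids.count(i) > 1})
    # IDREF / IDREFS rule: every referenced value must match some ID.
    # split() copes with both a single value and a space-separated list.
    dangling = []
    for e in employees:
        for attr in ("manager1", "manager2"):
            for ref in e.get(attr, "").split():
                if ref not in ids:
                    dangling.append((e.get("serialNumber"), attr, ref))
    return duplicates, dangling

print(check_references(DOC))
```

Note that the repeated reference to e00001 in the last employee is not reported: uniqueness is required of IDs, not of the references to them.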
Figure 5-24. Attribute Types: Tokenized Types: ENTITY Example XM3014.1
This foil shows a declaration for an implied attribute of type ENTITY.
As you can see there are several concepts involved that we have yet to discuss.
Not the least of which is "what is an 'entity'?"
You will find this and the next chart useful on the job when you need to create or
understand a DTD that uses these concepts.
The concepts themselves are described on subsequent charts.
<!ATTLIST elementName attributeName ENTITY defaultDecl>
<!ELEMENT employee (#PCDATA)>
<!ATTLIST employee companyName ENTITY #REQUIRED>
<!ENTITY company
   SYSTEM [system identifier missing in source] NDATA txt>
<!NOTATION txt
   SYSTEM "file:///C:/Windows/System32/notepad.exe">
<employee companyName="company">Joe Smith</employee>
ENTITY is also used in its own right as another element of a DTD; this is
covered in subsequent charts. Here we focus on ENTITY as an attribute.
NDATA and NOTATION are concepts we have yet to discuss.
Figure 5-25. Attribute Types: Tokenized Types: ENTITIES Example XM3014.1
ENTITIES provide a mechanism for including data from multiple sources.
As you can see there are several concepts involved that we have yet to discuss.
You will find this and the next chart useful on the job when you need to create or
understand a DTD that uses these concepts.
While DTDs may be lacking in several important aspects (listed later), they can still be very
complex!
Like the ENTITY example, we need to define several concepts for this chart to be
understood. The explanations follow.
Syntax:
<!ATTLIST elementName attribName ENTITIES defaultDecl>
Declaration:
<!ELEMENT employee (#PCDATA)>
<!ATTLIST employee companyAtts ENTITIES #REQUIRED>
<!ENTITY company "IBM">
<!ENTITY division "19">
<!ENTITY branch [remainder of declaration missing in source]
Valid XML fragment:
Figure 5-26. DTDs Part II XM3014.1
Figure 5-27. Declaring ENTITYs: an Internal, Parsed ENTITYs Example XM3014.1
Here is an example.
But we just told you that entities are related to separate storage units, and the entity
declaration that we just saw fit completely into the DTD. This kind of entity is called an
internal entity and is not associated with a separate physical storage unit. Let's look at how
to declare the same entity as an external entity, in a separate physical storage unit.
<!ENTITY entityName "replacementText">
&entityName;
<!ENTITY xmlExpert "Ron Smith">
<!ENTITY topic "XML Documents">
<response>For additional help with &topic;,
Please contact &xmlExpert;.</response>
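The replacement of internal parsed entities can be watched in action with a short script. This sketch places the two declarations from the foil in an internal DTD subset and relies on Python's standard library parser, which expands internal entities while parsing.

```python
import xml.etree.ElementTree as ET

# The entity declarations and the <response> element from the foil,
# combined into one document with an internal DTD subset.
DOC = """<!DOCTYPE response [
  <!ENTITY xmlExpert "Ron Smith">
  <!ENTITY topic "XML Documents">
]>
<response>For additional help with &topic;,
Please contact &xmlExpert;.</response>"""

root = ET.fromstring(DOC)

# By the time the application sees the text, each &name; reference has
# been replaced by the entity's replacement text.
print(root.text)
```

Changing the replacement text in one declaration updates every place the entity is referenced, which is the reuse benefit that entities are meant to provide.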
Figure 5-28. Declaring ENTITYs: an External, Parsed ENTITYs Example XM3014.1
In this case, where the entity declaration includes a public identifier, the parser must
understand how to handle the "publicURI" identifier. Traditionally this form is only used
when the parser was hard-coded to handle the identifier, or when you are creating your own
parser to handle entity replacement.
According to section 4.2.2 (External Entities) of the XML 1.0 specification: "Definition: In
addition to a system identifier, an external identifier <b>may</b> [<i>emphasis added</i>] include a
public identifier. An XML processor attempting to retrieve the entity's content <b>may</b>
[<i>emphasis added</i>] use the public identifier to try to generate an alternative URI
reference. If the processor is unable to do so, it <b>must</b> [<i>emphasis added</i>] use the URI
reference specified in the system literal...."
Here is their example:
<!ENTITY open-hatch
SYSTEM " />
<!ENTITY open-hatch
PUBLIC "-//Textuality//TEXT Standard open-hatch boilerplate//EN"
<!ENTITY entityName SYSTEM "systemURI">
<!ENTITY entityName PUBLIC "publicURI" "systemURI">*
<b>*refer to the Notes.</b>
<!ENTITY copyrightInfo SYSTEM "file:///c:/legal/boilerplate.txt">
Copyright 2003, IBM. All rights reserved.
<notices>This application was developed using WebSphere Studio.
&copyrightInfo;</notices>
This application was developed using WebSphere Studio.
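Here is a hedged sketch of how an application can resolve an external parsed entity itself, using the expat bindings in Python's standard library. The in-memory `FILES` table is a hypothetical stand-in for the file system, so nothing is actually read from `c:/legal`; the DOCTYPE wrapper is also our assumption.

```python
import xml.parsers.expat

DOC = b"""<?xml version="1.0"?>
<!DOCTYPE notices [
  <!ENTITY copyrightInfo SYSTEM "file:///c:/legal/boilerplate.txt">
]>
<notices>This application was developed using WebSphere Studio.
&copyrightInfo;</notices>"""

# Hypothetical in-memory stand-in for the separate physical storage unit.
FILES = {
    "file:///c:/legal/boilerplate.txt": b"Copyright 2003, IBM. All rights reserved.",
}

def expand(document):
    """Return the character data of the document with external entities resolved."""
    chunks = []
    parser = xml.parsers.expat.ParserCreate()

    def on_external_entity(context, base, system_id, public_id):
        # Parse the replacement text with a sub-parser tied to the main parse.
        sub_parser = parser.ExternalEntityParserCreate(context)
        sub_parser.CharacterDataHandler = chunks.append
        sub_parser.Parse(FILES[system_id], True)
        return 1  # non-zero tells expat to continue

    parser.CharacterDataHandler = chunks.append
    parser.ExternalEntityRefHandler = on_external_entity
    parser.Parse(document, True)
    return "".join(chunks)

result = expand(DOC)
print(result)
```

Only the system identifier is used for lookup here; a public identifier, if present, would arrive in `public_id` and could be mapped to an alternative location.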
<!ENTITY hatch-pic
SYSTEM "../grafix/OpenHatch.gif"
NDATA gif >
Find out more at: />
Be aware that an external entity may not recursively reference itself, either directly or
indirectly.
Figure 5-29. Unparsed Entity Declarations: a Review XM3014.1
Here's an example of unparsed entity use:
First we declare a notation called jpeg and associate it with a photoshop.exe somewhere
on the local machine.
Then we declare an external unparsed entity called prod17792 and add the NDATA jpeg
clause to specify the notation.
The rest of the DTD declares an empty element item with an ENTITY valued attribute
called picture.
You can see in the XML instance document that we supply prod17792 (the name of the
entity) as the value of the picture attribute of item.
This is how you can associate a piece of unparsed/binary data with a portion of an XML
document.
<b><!ENTITY entityName SYSTEM "URI" NDATA notationName></b>
<b><!NOTATION jpeg SYSTEM</b>
<b>"file:///c:/Program Files/Photoshop/photoshop.exe"></b>
<b><!ENTITY prod17792 SYSTEM "prod17792.jpg" NDATA jpeg></b>
<b><!ELEMENT item EMPTY></b>
<b><!ATTLIST item picture ENTITY #REQUIRED></b>
<b><item picture='prod17792'/></b>
<b>Rules:</b>
Unparsed entities can only be external entities. In order to declare an unparsed
entity, you start with a regular external entity declaration and before the closing
angle bracket you insert NDATA and the name of a notation. This associates a
notation name with the unparsed entity.
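An XML processor does not read unparsed data; it only reports the declarations so that the application can act on them. The sketch below, which assumes the slide's declarations sit in an internal DTD subset, uses the expat bindings in Python's standard library to collect the notation and the unparsed entity, then follows the picture attribute to them:

```python
import xml.parsers.expat

DOC = b"""<?xml version="1.0"?>
<!DOCTYPE item [
  <!NOTATION jpeg SYSTEM "file:///c:/Program Files/Photoshop/photoshop.exe">
  <!ENTITY prod17792 SYSTEM "prod17792.jpg" NDATA jpeg>
  <!ELEMENT item EMPTY>
  <!ATTLIST item picture ENTITY #REQUIRED>
]>
<item picture='prod17792'/>"""

notations = {}   # notation name -> system identifier
unparsed = {}    # entity name -> (system identifier, notation name)
pictures = []    # values of the picture attribute, in document order

def on_notation(name, base, system_id, public_id):
    notations[name] = system_id

def on_entity(name, is_parameter, value, base, system_id, public_id, notation_name):
    # Unparsed entities are the ones that carry a notation name.
    if notation_name is not None:
        unparsed[name] = (system_id, notation_name)

def on_start(tag, attrs):
    if tag == "item":
        pictures.append(attrs["picture"])

parser = xml.parsers.expat.ParserCreate()
parser.NotationDeclHandler = on_notation
parser.EntityDeclHandler = on_entity
parser.StartElementHandler = on_start
parser.Parse(DOC, True)

system_id, notation = unparsed[pictures[0]]
print(system_id, notation, notations[notation])
```

What to do with the file and the helper application named by the notation is entirely up to the application.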
Figure 5-30. Parameter ENTITYs XM3014.1
The parameter entity replacement works like regular entity replacement. The parser will
substitute the replacement text, and then continue evaluating the DTD from the point of
replacement.
Parameter entities are entities that are meant to be used in the DTD. Parameter entities
are very useful if you want to reuse portions of an attribute list declaration or if you want to
reuse parts of a complex content model specification.
Parameter entities are the primary tool that is available to help you structure a complex
DTD.
<!ENTITY % parameterEntityName "replacementText">
%parameterEntityName;
<!ENTITY % commonAtts "make CDATA #IMPLIED
model CDATA #IMPLIED">
<!ELEMENT phone (#PCDATA)>
<!ATTLIST phone %commonAtts;
type (rotary | touch-tone) #IMPLIED>
<!ELEMENT phone (#PCDATA)>
<!ATTLIST phone make CDATA #IMPLIED
model CDATA #IMPLIED
type (rotary | touch-tone) #IMPLIED>
Figure 5-31. Parameter ENTITYs - Another Example XM3014.1
In this example the commonAtts parameter entity is used to represent common attributes
<b><!ENTITY % commonAtts</b>
<b> "typeID ID #REQUIRED</b>
<b> make CDATA #IMPLIED</b>
<b> model CDATA #IMPLIED"></b>
<b><!ELEMENT car (#PCDATA)></b>
<b><!ATTLIST car %commonAtts;></b>
<b><!ELEMENT computer (#PCDATA)></b>
<b><!ATTLIST computer %commonAtts;></b>
<b><!ELEMENT phone (#PCDATA)></b>
<b><!ATTLIST phone %commonAtts;</b>
<b> type (rotary|digital) #IMPLIED></b>
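To see what the parser ends up evaluating, you can model parameter entity replacement as plain text substitution. The function below is a toy illustration written for this note, not a DTD parser; the name `expand_parameter_entities` is hypothetical and the sketch ignores details such as the spaces that a real processor adds around the replacement text.

```python
import re

# Replacement text of the commonAtts parameter entity from the slide.
PARAMETER_ENTITIES = {
    "commonAtts": "typeID ID #REQUIRED\n    make CDATA #IMPLIED\n    model CDATA #IMPLIED",
}

def expand_parameter_entities(dtd_text, entities):
    """Replace every %name; reference with its replacement text (toy model)."""
    return re.sub(r"%([A-Za-z_][\w.-]*);", lambda match: entities[match.group(1)], dtd_text)

declaration = "<!ATTLIST phone %commonAtts;\n    type (rotary|digital) #IMPLIED>"
expanded = expand_parameter_entities(declaration, PARAMETER_ENTITIES)
print(expanded)
```

The expanded declaration gives phone the three common attributes plus its own type attribute, which is what the parser evaluates after substitution.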
Figure 5-32. What Is Allowed. . . Declaring Comments XM3014.1
To insert a comment in a DTD (or an XML document for that matter) place the comment
text inside <!-- and -->.
Comments cannot be nested. No space is required after <!-- or before -->, although one is
commonly added for readability. The characters "--" may not be used within the comment, so a
comment also cannot end with "--->". The same comment syntax is usable within HTML, XML and
XSL documents.
Figure 5-33. Joining a DTD to an XML Instance XM3014.1
Overriding (changing) the data contained in an XML instance may cause confusion for
other users of the instance.
The application of an XSL transform or a processor program (for example, DOM, SAX,
or similar) may be a better alternative.
Figure 5-34. External DTD Subset XM3014.1
Up until now we've described some of the contents of a DTD without showing how to
actually place those declarations in a file so that they can be used to validate a document.
Recall that the DTD may be in an external file, embedded directly in an XML file, or split
across an external file and the XML file. Let's look at placing the DTD declarations in an
external file. The part of the DTD that goes into the external file is called the external DTD
subset.
The external DTD subset is an external parsed entity, even though DTD declarations are not
elements. Therefore you should supply a text declaration at the beginning of the external
DTD subset. This is especially important if the document and the DTD are going to be using
different character encodings.
In the example below, the file message.dtd contains the declarations of three elements,
message, greeting and farewell. The DTD may have its own encoding declaration (which
may be different from the encoding of documents that reference the DTD file).
The file hello.xml references this DTD using a DOCTYPE declaration. The DOCTYPE
declaration specifies the name of the root element of the document, message in the
DTD subset. This means that potentially any element declaration in the DTD can serve as
the root element. It is up to the DOCTYPE writer to specify this. Following the name of the
root element is the keyword SYSTEM followed by a URI reference that the local machine
can use to locate the actual file containing the external DTD.
Figure 5-35. Internal DTD Subset XM3014.1
Up until now we've described some of the contents of a DTD without showing how to
actually place those declarations in a file so that they can be used to validate a document.
Recall that the DTD may be in an external file, embedded directly in an XML file, or split
across an external file and the XML file. Let's look at placing the DTD declarations in an
external file. The part of the DTD that goes into the external file is called the external DTD
subset.
The external DTD subset is an entity even though DTD declarations are not elements.
Therefore you need to supply a text declaration at the beginning of the external DTD
subset. This is especially important if the document and the DTD are going to be using
different character encodings.
In the example below, the file message.dtd contains the declarations of three elements,
message, greeting and farewell. The DTD may have its own encoding declaration (which
may be different from the encoding of documents that reference the DTD file).
The file hello.xml references this DTD using a DOCTYPE declaration. The DOCTYPE
declaration specifies the name of the root element of the document, message in the
DTD subset. This means that potentially any element declaration in the DTD can serve as
the root element. It is up to the DOCTYPE writer to specify this. Following the name of the
root element is the keyword SYSTEM followed by a URI reference that the local machine
can use to locate the actual file containing the external DTD.
Figure 5-36. Split DTD Subsets XM3014.1
The example on this foil shows a DTD with an entity called destination in both the internal
and external subsets. The internal subset is read first, and the first declaration of an
entity is the binding one, so the declaration for destination in the internal subset
overrides the declaration in the external subset. This leaves the messages "Hello, cruel
world" and "Goodbye, cruel world" after entity expansion has occurred.
This allows local entity declarations in the internal subset to override entity declarations in
the external subset.
A best practice would be to include a comment drawing attention to the intent of this
internal subset to override a value set in the external subset.
Embedding DOCTYPE declarations and the DTD within the XML file:
<b>Filename: hello.xml</b>
<b><?xml version="1.0" encoding="UTF-8"?></b>
<b><!DOCTYPE message SYSTEM "message.dtd" [</b>
<b> <!ENTITY destination "cruel world"></b>
<b> <!-- overrides destination in message.dtd --></b>
<b>]></b>
<b><message></b>
<b><greeting>Hello, &destination;</greeting></b>
<b><farewell>Goodbye, &destination;</farewell></b>
<b></message></b>
<b>Filename: message.dtd</b>
<b><!ELEMENT message (greeting, farewell)></b>
<b> <!ELEMENT greeting (#PCDATA)></b>
<b> <!ELEMENT farewell (#PCDATA)></b>
<b> <!ENTITY destination "World"></b>
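The override works because the first declaration of an entity is the binding one and the internal subset is read before the external subset. The sketch below demonstrates the first-declaration rule with Python's standard ElementTree module. Because that parser does not load message.dtd, the second declaration in the internal subset is our stand-in for the one in the external file.

```python
import xml.etree.ElementTree as ET

DOC = """<?xml version="1.0"?>
<!DOCTYPE message [
  <!ENTITY destination "cruel world">  <!-- read first, so this one is binding -->
  <!ENTITY destination "World">        <!-- stand-in for the declaration in message.dtd -->
]>
<message>
<greeting>Hello, &destination;</greeting>
<farewell>Goodbye, &destination;</farewell>
</message>"""

message = ET.fromstring(DOC)
greeting = message.find("greeting").text
farewell = message.find("farewell").text
print(greeting)
print(farewell)
```

The later declaration is silently ignored, which is why a comment that flags the intended override is worth adding.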
Figure 5-37. Whitespace and DTDs XM3014.1
Whitespace is whitespace, isn't it? Not if you are a validating XML processor. There are two
kinds of whitespace:
• Whitespace in #PCDATA element content (between the same start and end tag pair). You
only know this if you have a DTD.
• Whitespace in non-character data content.
Whitespace that is not in #PCDATA element content is ignorable.
Parsers report whitespace and ignorable whitespace differently. The parser does not
actually discard the ignorable white space -- this is the application's job. But the parser can
use different data structures / callback routines in order to report ignorable versus not
ignorable whitespace.
Figure 5-38. Ignorable Whitespace Example XM3014.1
This slide shows an example XML document and DTD, and shows which whitespace is
ignorable and which whitespace is not.
Again, it is up to the application to decide what to do about ignorable whitespace. An XML
processor will report all of the whitespace and indicate whether or not it is ignorable.
<b><?xml version='1.0'?></b>
<b><!DOCTYPE example [</b>
<b><!ELEMENT example (source-code)></b>
<b>]></b>
<b><example> <-- ignorable</b>
<b><source-code> <-- not ignorable</b>
<b> int i; <-- not ignorable</b>
<b> i = 0; <-- not ignorable</b>
<b></source-code></b> <b><-- ignorable</b>
<b></example></b>
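The sketch below shows the application's side of this with Python's standard ElementTree module, which is non-validating and therefore hands over all whitespace without marking any of it as ignorable. Two things are assumptions of ours: the extra declaration for source-code, and the hand-written `ELEMENT_CONTENT` set, which stands in for what a validating processor would learn from the DTD.

```python
import xml.etree.ElementTree as ET

DOC = """<?xml version='1.0'?>
<!DOCTYPE example [
<!ELEMENT example (source-code)>
<!ELEMENT source-code (#PCDATA)>
]>
<example>
<source-code>
  int i;
  i = 0;
</source-code>
</example>"""

# Elements whose content model contains only child elements (assumed by hand here).
ELEMENT_CONTENT = {"example"}

def significant_text(element):
    """Return the element's leading text, or None if it is ignorable whitespace."""
    text = element.text or ""
    if element.tag in ELEMENT_CONTENT and text.strip() == "":
        return None
    return element.text

example = ET.fromstring(DOC)
source = example.find("source-code")
print(repr(example.text), repr(significant_text(example)))
print(repr(significant_text(source)))
```

The whitespace inside source-code is kept exactly as written, because it is character data.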
Figure 5-39. Validating versus Non-validating Processors XM3014.1
Validating processors are straightforward. The XML spec tells implementors exactly what a
validating processor must do (in fact, they must do everything).
Non-validating processors have options because the XML spec says that a non-validating
processor may do certain things, but is not required to do them. Unfortunately, every parser
A non-validating processor must check the document entity, including the internal subset.
If there is an external DTD subset, it may or may not:
• normalize attribute values from the external subset
• replace internal entity text from the external subset
• supply attribute defaults from the external subset
Since the behavior of non-validating processors is up to implementors, you need to be
careful when working with a non-validating processor if you have complicated attribute
values or use entities.
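For declarations in the internal subset the behavior is not optional. As a small check, the sketch below uses Python's standard ElementTree module, which is built on the non-validating expat parser, and shows an attribute default from the internal subset being supplied. The phone document and its default value are made up for this illustration.

```python
import xml.etree.ElementTree as ET

DOC = """<?xml version="1.0"?>
<!DOCTYPE phone [
  <!ELEMENT phone (#PCDATA)>
  <!ATTLIST phone type (rotary|touch-tone) "touch-tone">
]>
<phone>555-0100</phone>"""

phone = ET.fromstring(DOC)
phone_type = phone.get("type")
print(phone_type)
```

If the same ATTLIST were moved to an external DTD subset, this parser would not read it and `phone.get("type")` would return None, which is the kind of difference between processors that you need to watch for.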
Figure 5-40. Example DTDs XM3014.1
Many organizations are producing DTDs for various applications. Here are some examples:
<b> • cXML - </b>
<b> - cXML is a streamlined protocol intended for consistent communication of business </b>
documents between procurement applications, e-commerce hubs and suppliers.
The current standard includes documents for setup (company details and
transaction profiles), catalogue content, application integration (including the
widely-used PunchOut feature), original, change and delete purchase orders and
responses to all of these requests, as well as new order confirmation and ship notice
documents (cXML analogues of EDI 855 and 856 transactions).
<b> • RosettaNet - </b>
<b> - RosettaNet Partner Interface Processes (PIPs™) define business processes </b>
between trading partners. RosettaNet dictionaries provide a common set of
properties for PIPs™. The RosettaNet Business Dictionary designates the
standards expedite the alignment of business processes between trading partners.
<b> • RSS - </b>
<b> - The RDF Site Summary format was originally developed by Netscape and is widely </b>
used across the World Wide Web for the purpose of syndicating news articles.
<b> • DocBook - </b>
<b> - DocBook is an XML version of the SGML DocBook DTDs that are widely used in the </b>
production of documentation which can be rendered into multiple output formats.
<b> • OFX - </b>
Figure 5-41. What's Wrong with DTDs? XM3014.1
There are a number of problems with DTDs, which are listed on the chart.
These problems have led to the creation of a number of alternate languages for defining
the structure of XML grammars. The two leading contenders are W3C's XML Schema, and
OASIS's RELAX NG.
XML syntax, strong typing, constraints
Figure 5-42. Status of DTDs XM3014.1
DTD's are a part of the XML 1.0 recommendation. They are a stable technology and widely
XML Schema is the W3C approved replacement for DTD's, but this is a new technology
and has not reached broad usage at the time of this writing.
Figure 5-43. Tooling XM3014.1
The tooling for DTDs is simple at its base. You can use the same editor that you
use to edit an XML file to edit a DTD, because both are plain text.
There are also many tools for working with DTDs.
The commercially available XML Spy is a popular graphical tool for working with XML,
DTDs and XML Schema.
There are many parsers that perform validation using a DTD. This is true of all of the
parsers available from the Apache Software Foundation.
Figure 5-44. Checkpoint Questions (1 of 2) XM3014.1
Figure 5-45. Checkpoint Questions (2 of 2) XM3014.1
Figure 5-46. Unit Summary XM3014.1
In this section you have learned about:
<b> • XML 1.0 DTDs</b>
<b> • Element declarations</b>
<b> • Attribute declarations</b>
<b> • Comments</b>
<b> • Entity declarations</b>
<b> - General</b>
<b> - Parameter</b>
<b> • Notation declarations</b>
<b> • The difference between validating and non-validating processors</b>
<b> • Example DTDs</b>
<b> • Best Practices</b>
<b>Unit 6. XML Namespaces</b>
This unit describes the XML Namespaces Facility.
After completing this unit, you should be able to:
• Describe the reasons for using namespaces
• Describe the syntax used in namespaces
• Define and illustrate an example using namespaces
• Define myths about namespaces
• Define problems with namespaces
• List and define the best practices to use when using namespaces
• Describe the status of namespaces in the industry
• Checkpoint
Figure 6-1. Unit Objectives XM3014.1
Figure 6-2. Problem: Element and Attribute Names can be Ambiguous XM3014.1
The double use of title in the example illustrates the need for a namespace solution in XML.
We need to be able to tell that the two title elements in this document are not the same
element. Even though the elements have the same name, they have different meanings to
the application. Using the context to disambiguate the two uses is not a generally
applicable solution.
<b><catalogEntry></b>
<b> <book></b>
<b> <title>this book</title></b>
<b> <isbn>0001</isbn></b>
<b> <author></b>
<b> <title>Dr.</title></b>
<b> <lastName>Expert</lastName></b>
<b> <firstName>Iman</firstName></b>
<b> </author></b>
<b> </book></b>
Figure 6-3. Elaboration XM3014.1
URIs are not actually used for lookup, only as a reference. The only purpose is to give the
namespace a unique name. Sometimes the URI is a pointer to a web page, which provides
information about the namespace, but this is not required. The URI is not looked up as part
of XML parsing or processing.
The application is responsible for deciding what to do with the names.
Figure 6-4. Namespaces: The Big Idea XM3014.1
Recall that URI stands for Uniform Resource Identifier.
Figure 6-5. XML Namespaces XM3014.1
URIs are not actually used for lookup, only as a reference. The only purpose is to give the
namespace a unique name. Sometimes the URI is a pointer to a web page, which provides
information about the namespace, but this is not required. The URI is not looked up as part
of XML parsing or processing.
The application is responsible for deciding what to do with the names.
Figure 6-6. Qualified Names (QNames) XM3014.1
You can think of a QName like <books:title> as being equivalent to the following Clark
Notation:
{ />
<books : title >
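Python's standard ElementTree module happens to report element names in this braces-plus-local-name form, so it makes a convenient way to see the equivalence. The namespace URI below is a hypothetical placeholder chosen for this sketch, not the one used on the slide.

```python
import xml.etree.ElementTree as ET

BOOKS_NS = "http://example.com/books"  # hypothetical namespace URI

DOC = (
    f"<books:book xmlns:books='{BOOKS_NS}'>"
    "<books:title>Tom Sawyer</books:title>"
    "</books:book>"
)

book = ET.fromstring(DOC)
title = book[0]
print(title.tag)

# A different prefix bound to the same URI yields the same expanded name.
other_title = ET.fromstring(f"<b:title xmlns:b='{BOOKS_NS}'/>")
print(other_title.tag == title.tag)
```

The prefix is only a local abbreviation; the URI plus the local name is what identifies the element.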
Figure 6-7. Declaring Namespaces (1 of 2) XM3014.1
Note that you can declare a namespace on any element that you like, not just the root
element.
Figure 6-8. Declaring Namespaces (2 of 2) XM3014.1
Now let's look at an example with nested elements.
<b> <title>Tom Sawyer</title></b>
<b></book></b>
<b><books:book</b>
<b> xmlns:books=' /></b>
<b> books:hardcover='true'></b>
<b> <books:title</b>
<b> xmlns:books=' /></b>
<b> Tom Sawyer</b>
<b> </books:title></b>
<b></books:book></b>
Figure 6-9. Namespace Scope XM3014.1
Note that every element or attribute name that is in the namespace has the appropriate
namespace prefix in front of it.
<b><books:book</b>
<b>xmlns:books=' /></b>
<b>books:hardcover='true'></b>
<b><books:title></b>
Figure 6-10. Default Namespaces XM3014.1
Once you have specified the default namespace, all unprefixed elements in the scope of
In our example, we set the books namespace to the default and get rid of all the prefixes on
element names. We still need the prefix on the attribute names because default
namespaces don't apply to attributes.
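The difference between elements and attributes under a default namespace can be checked with a few lines of Python. This is a minimal sketch using the standard ElementTree module; the namespace URI is a hypothetical placeholder.

```python
import xml.etree.ElementTree as ET

BOOKS_NS = "http://example.com/books"  # hypothetical namespace URI

# Default namespace only: the attribute is unprefixed.
unprefixed = ET.fromstring(
    f"<book xmlns='{BOOKS_NS}' hardcover='true'><title>Tom Sawyer</title></book>"
)

# Default namespace plus a prefix for the same URI, used on the attribute.
prefixed = ET.fromstring(
    f"<book xmlns='{BOOKS_NS}' xmlns:books='{BOOKS_NS}' books:hardcover='true'>"
    "<title>Tom Sawyer</title></book>"
)

print(unprefixed.tag, unprefixed[0].tag)   # both elements are in the namespace
print(list(unprefixed.attrib))             # attribute is in no namespace
print(list(prefixed.attrib))               # attribute is in the books namespace
```

Declaring the same URI twice looks redundant, but the prefixed declaration is the only way to place the attribute in the namespace.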
Figure 6-11. Example - Default Namespaces XM3014.1
The result of these apparent duplications is to put the hardcover attribute inside a
namespace.
Figure 6-12. Documents with Multiple Namespaces XM3014.1
All that we did to enable this was add two more namespace declarations, and then add the
new elements and use the appropriate namespace prefix.
In the case of the isbn element, we declared the namespace that it needed on the element
itself -- you can declare namespaces on any element that you like, not just the root
element. When you do this, the prefix is only in scope for the element it was declared on
and that element's content. You can also change the default namespace for a particular
element by redefining the default namespace on that element. Again, the scope will be the
element that the declaration is attached to, together with its content.
Figure 6-13. Elements with No Namespace XM3014.1
The unprefixed <title> element is in no namespace, because there is no default null
namespace.
In order to repair this example, we need to prefix title with the books namespace prefix
again.
WRONG!
<b><book</b>
<b>xmlns=' /></b>
<b>xmlns:amazon=' /></b>
<b> </b> <b><title>Tom Sawyer</title></b>
<b> </b> <b><isbn xmlns=""></b>
<b> </b> <b>0140390839 </b>
<b> </b> <b><amazon:skuNo>A25</amazon:skuNo></b>
<b></book></b>
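To see which names end up in which namespace, you can parse a cleaned-up variant of this document and print the expanded names. The sketch below uses Python's standard ElementTree module. Both namespace URIs are hypothetical placeholders, and the element nesting is our well-formed reading of the slide, not a copy of it.

```python
import xml.etree.ElementTree as ET

BOOKS_NS = "http://example.com/books"    # hypothetical namespace URI
AMAZON_NS = "http://example.com/amazon"  # hypothetical namespace URI

DOC = f"""<book xmlns='{BOOKS_NS}' xmlns:amazon='{AMAZON_NS}'>
  <title>Tom Sawyer</title>
  <isbn xmlns="">0140390839</isbn>
  <amazon:skuNo>A25</amazon:skuNo>
</book>"""

book = ET.fromstring(DOC)
tags = [child.tag for child in book]
for tag in tags:
    print(tag)
```

The empty declaration xmlns="" removes the default namespace for isbn, so that element is reported with a bare name, while title inherits the default from book.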
Figure 6-14. Attributes and Namespaces XM3014.1
There are two interacting rules that affect attributes and namespaces:
<b> • Attributes are not affected by a default namespace declaration.</b>
<b> • Attributes on a single element must be unique.</b>
In the example above, the <bad> element is invalid because there are two unprefixed att
attributes. In the second invalid element the two attributes are the same because ns1 and
ns2 are two prefixes for the same namespace URI. Therefore, the two attribute names are
identical.
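The two rules can be tried against a namespace-aware parser. The sketch below uses Python's standard ElementTree module; the documents are our reconstruction of the kind of elements described above, with a made-up namespace URI, and the helper name `parses` is ours.

```python
import xml.etree.ElementTree as ET

NS_URI = "urn:example:ns"  # hypothetical namespace URI

def parses(xml_text):
    """Return True if the namespace-aware parser accepts the text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# Two unprefixed attributes with the same name.
bad_unprefixed = parses("<bad att='1' att='2'/>")

# Two prefixes bound to the same URI, so both attributes have the same expanded name.
bad_same_uri = parses(
    f"<r xmlns:ns1='{NS_URI}' xmlns:ns2='{NS_URI}'><bad ns1:att='1' ns2:att='2'/></r>"
)

# An unprefixed attribute and a prefixed one are different names.
good = parses(f"<r xmlns:ns1='{NS_URI}'><valid att='1' ns1:att='2'/></r>")

print(bad_unprefixed, bad_same_uri, good)
```

Note that the second case only fails because the parser compares expanded names; a parser with namespace processing turned off would see two different attribute names.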
It should be obvious that the first <valid> element is valid -- a and b are unprefixed, and a is
Figure 6-15. Namespace Processing XM3014.1
SAX2
DOM Level 2
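As an illustration of namespace-aware processing through a DOM Level 2 style API, here is a short sketch using the minidom module from Python's standard library. The namespace URI is a hypothetical placeholder.

```python
from xml.dom import minidom

BOOKS_NS = "http://example.com/books"  # hypothetical namespace URI

DOC = (
    f"<books:book xmlns:books='{BOOKS_NS}' books:hardcover='true'>"
    "<books:title>Tom Sawyer</books:title>"
    "</books:book>"
)

dom = minidom.parseString(DOC)

# Look nodes up by namespace URI and local name instead of by prefixed name.
title = dom.getElementsByTagNameNS(BOOKS_NS, "title")[0]
hardcover = dom.documentElement.getAttributeNS(BOOKS_NS, "hardcover")

print(title.namespaceURI, title.prefix, title.localName)
print(title.firstChild.data, hardcover)
```

Code written this way keeps working if the document author picks a different prefix for the same namespace.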
Figure 6-16. Example: Use of Namespaces XM3014.1
Here's an example of namespaces in use:
Here we have an imaginary record that might be used in an airline's airplane fleet inventory.
For each airplane, we want to know which manufacturer provided each major part of the
This example shows how we could use namespaces to identify which components came
from which manufacturers.
An application that processed this document could then use the namespaces to determine
which manufacturer's diagnostic equipment would be needed to perform a full maintenance
cycle on a particular airplane.
While not required, it is a best practice to collect all the namespace definitions in one place,
especially in large, composite files.
Figure 6-17. Problems with Namespaces XM3014.1
Namespace recommendation after XML 1.0 - because the namespace recommendation
came after XML 1.0, it's not really part of the spec. This means there are places where
namespaces and XML 1.0 don't fit together.
DTDs don't really integrate well - We've shown you an ad hoc solution for using a fixed
set of namespaces with a DTD, but that solution doesn't really satisfy a lot of desires that
users have for namespaces.
Testing equality of namespaces is a pain - there's no easy way to test equality of two
namespaces except to get the two namespace URIs and compare them character by
character.
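The character-by-character comparison looks like this in practice. This is a minimal sketch using Python's standard ElementTree module; the URIs are hypothetical and the helper `namespace_of` is ours.

```python
import xml.etree.ElementTree as ET

def namespace_of(element):
    """Return the namespace URI of an element, or None if it has no namespace."""
    tag = element.tag
    if tag.startswith("{"):
        return tag[1:].split("}", 1)[0]
    return None

a = ET.fromstring("<x:title xmlns:x='http://example.com/books'/>")
b = ET.fromstring("<y:title xmlns:y='http://example.com/books'/>")
c = ET.fromstring("<z:title xmlns:z='http://EXAMPLE.com/books'/>")

# Different prefixes, identical URI strings: the same namespace.
same_ab = namespace_of(a) == namespace_of(b)
# URIs that differ only in case: different namespaces.
same_ac = namespace_of(a) == namespace_of(c)

print(same_ab, same_ac)
```

Two URIs that would lead a browser to the same place still name different namespaces if the strings differ in any character.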
Figure 6-18. Best Practices XM3014.1
When to use namespaces:
<b> • When you think your DTD/Schema will be used outside your organization. </b>
<b> • When you think you will need to combine your DTD/Schema with other grammars.</b>
<b> • As a practical note, this means that anybody doing serious grammar work really ought </b>
to be using namespaces.
Performance implications:
<b> • Namespace processing slows down the parser and increases memory usage. The </b>
parser needs to look at all the namespace declarations and QNames. Even if you turn
off namespace processing in your parser, there will still be a performance impact
because your input document will still be larger (because of namespace declarations
and QNames) than if you were not using namespaces.
Don't use relative URIs for namespace identifiers; they were deprecated after the
namespaces recommendation was published.
choose carefully.
Don't declare more than one prefix for a namespace URI - there's no reason to do it and it
will cause confusion to someone else.
Figure 6-19. Status of Namespaces XM3014.1
The Namespaces in XML Recommendation was published in January 1999 - it is a stable recommendation.
Supported by most parsers relative to DTDs.
Much better support with XML Schema.
Namespaces are ready for use, especially now that XML Schema has reached
recommendation status.
Figure 6-20. More Information XM3014.1
 • XML Namespaces FAQ (NamespacesFAQ.htm)
 • XML.com article about Namespace Myths (namespaces/index.html)
 • James Clark's notes on XML Namespaces
Figure 6-21. Checkpoint Questions XM3014.1
Figure 6-22. Unit Summary XM3014.1
<b>Unit 7. XML Schema</b>
This unit presents an introduction to the essential features of the W3C
XML Schema language.
After completing this unit, you should be able to:
• List and describe the reasons for using XML Schemas
• List the key new features of Schemas
• Define the grammar rules of an XML document using the syntax of
XML Schemas
• List and define best practices for using XML Schemas
• Describe the status of XML Schemas in the industry
Figure 7-1. Unit Objectives XM3014.1