
<span class='text_page_counter'>(1)</span><div class='page_container' data-page=1>

<i>Introduction to XML and Related Technologies</i>



(Course Code XM301)


Student Notebook


ERC 4.1


IBM Certified Course Material





</div>
<span class='text_page_counter'>(2)</span><div class='page_container' data-page=2>

<b>July 2004 Edition</b>


The information contained in this document has not been submitted to any formal IBM test and is distributed on an “as is” basis without
any warranty either express or implied. The use of this information or the implementation of any of these techniques is a customer
responsibility and depends on the customer’s ability to evaluate and integrate them into the customer’s operational environment. While
each item may have been reviewed by IBM for accuracy in a specific situation, there is no guarantee that the same or similar results will
result elsewhere. Customers attempting to adapt these techniques to their own environments do so at their own risk.


<b> © Copyright International Business Machines Corporation 2001, 2004. All rights reserved.</b>


<b>This document may not be reproduced in whole or in part without the prior written permission of IBM.</b>


Note to U.S. Government Users — Documentation related to restricted rights — Use, duplication or disclosure is subject to restrictions
set forth in GSA ADP Schedule Contract with IBM Corp.




</div>
<span class='text_page_counter'>(3)</span><div class='page_container' data-page=3>



<b>Course materials may not be reproduced in whole or in part </b>
<b>without the prior written permission of IBM.</b>


<b>© Copyright IBM Corp. 2001, 2004</b> <b>Contents</b> <b>iii</b>

<b>Contents</b>



<b>Trademarks . . . xi</b>


<b>Course Description . . . xiii</b>


<b>Agenda . . . xv</b>


</div>
<span class='text_page_counter'>(4)</span><div class='page_container' data-page=4>



</div>
<span class='text_page_counter'>(5)</span><div class='page_container' data-page=5>



</div>
<span class='text_page_counter'>(6)</span><div class='page_container' data-page=6>



</div>
<span class='text_page_counter'>(7)</span><div class='page_container' data-page=7>



</div>
<span class='text_page_counter'>(8)</span><div class='page_container' data-page=8>



</div>
<span class='text_page_counter'>(9)</span><div class='page_container' data-page=9>



</div>
<span class='text_page_counter'>(10)</span><div class='page_container' data-page=10>



<xsl:choose> Element . . . 9-38
<xsl:choose> Example . . . 9-39
Elements to Generate Output (XML to XML) . . . 9-40
<xsl:element> Element . . . 9-42
<xsl:attribute> . . . 9-43
XML to XML Example (1 of 2) . . . 9-44
XML to XML Example (2 of 2) . . . 9-45
Numbers, Sorting, and Functions . . . 9-46
Working with Numbering in XSLT . . . 9-47
<xsl:number> Element format Attribute Values . . . 9-49
<xsl:number> Example . . . 9-50
<xsl:sort> Element . . . 9-51
<xsl:sort> Attributes . . . 9-52
Sort Example . . . 9-53
XPath/XSLT Functions . . . 9-54
Other Elements . . . 9-56
Attribute Value Templates . . . 9-57
Attribute Value Templates Example . . . 9-58
XSLT Processors . . . 9-59
Xalan . . . 9-60
XSL Resources from IBM . . . 9-61
XSL References . . . 9-62
Checkpoint Questions . . . 9-63
Unit Summary . . . 9-64
<b>Appendix A. Introduction to Databases and XML . . . A-1</b>


<b>Appendix B. Additional Information for XML Schema . . . B-1</b>


<b>Appendix C. What’s New in WebSphere Studio V5.1.1 . . . C-1</b>


<b>Appendix D. Additional Information and Examples . . . D-1</b>


<b>Appendix E. Bibliography and References . . . E-1</b>


<b>Appendix F. Acronyms and Abbreviations . . . F-1</b>


<b>Appendix G. Glossary . . . G-1</b>


</div>
<span class='text_page_counter'>(11)</span><div class='page_container' data-page=11>


<b><sub>Trademarks</sub></b>



The reader should recognize that the following terms, which appear in the content of this
training document, are official trademarks of IBM or other companies:

IBM® is a registered trademark of International Business Machines Corporation.

The following are trademarks of International Business Machines Corporation in the United
States, or other countries, or both:

AFS®, AIX®, alphaWorks®, AS/400®, CICS®, ClearCase®, Database 2™, DB2®,
DB2 Universal Database™, DFS™, Distributed Relational Database Architecture™,
Domino®, DRDA®, Encina®, Everyplace®, IMS™, Lotus Enterprise Integrator®,
Lotus Notes®, Lotus®, MQSeries®, MVS™, NetRexx™, Network Station®, Notes®,
Open Blueprint®, OS/2®, OS/390®, RACF®, RDN™, RS/6000®, S/390®, SecureWay®,
Tivoli®, Tivoli Enterprise™, Tivoli Management Environment®, TME®, TME 10™,
TXSeries®, VisualAge®

Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the
United States, other countries, or both.

Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft
Corporation in the United States, other countries, or both.

UNIX is a registered trademark of The Open Group in the United States and other
countries.

SET and the SET Logo are trademarks owned by SET Secure Electronic Transaction LLC.

Other company, product and service names may be trademarks or service marks of others.


</div>
<span class='text_page_counter'>(12)</span><div class='page_container' data-page=12>



</div>
<span class='text_page_counter'>(13)</span><div class='page_container' data-page=13>


<b><sub>Course Description</sub></b>



<b>Introduction to XML and Related Technologies </b>


<b>Duration: 2.5 days</b>



<b>Purpose</b>



This course provides an introduction to XML (eXtensible Markup
Language) and related technologies. Students will gain the conceptual
and practical knowledge required to work with XML. The course will
build the basic skills to enable architects, designers, analysts,
developers, testers, and administrators to use XML and its related
technologies in the context of building e-business applications. The
course is a 2.5-day classroom course with hands-on lab exercises that
reinforce the lecture material.


<b>Audience</b>



This course is designed for information technology professionals,
including enterprise application architects, designers, developers, and
content modelers and creators.


<b>Prerequisites</b>



Knowledge of Internet technologies is required. Some experience with
using HTML would be helpful, but is not necessary.


<b>Objectives</b>



After completing this course, you should be able to:


• Describe the important XML standards and recommend their use in
business applications
• Define XML documents using namespaces, DTD, or Schema
• Develop and test XML processing applications
• Use XSLT to transform XML documents as necessary
• Identify open areas in XML, such as security, and emerging
technologies such as DB support, XHTML, Web Services, XLink,
and so forth, and plan for their incorporation into XML processing
applications


</div>
<span class='text_page_counter'>(14)</span><div class='page_container' data-page=14>



</div>
<span class='text_page_counter'>(15)</span><div class='page_container' data-page=15>


<b><sub>Agenda</sub></b>



<b>Day 1</b>

Unit 1 - Introduction to XML and Related Technologies
Unit 2 - Issues in Electronic Information Exchange
Unit 3 - What Is XML?
XML Basics Lab
Unit 4 - WebSphere Studio Application Developer Overview
Introduction to WebSphere Studio Application Developer Lab
Unit 5 - Document Type Definition (DTD)
DTD Lab
Unit 6 - XML Namespaces
XML Namespaces Lab

<b>Day 2</b>

Unit 7 - XML Schema
XML Schema Lab
Unit 8 - XPath - XML Path Language
XPath Lab
Unit 9 - XSL - eXtensible Stylesheet Language Part 1
XSLT Lab Part 1 - Simple Transforms

<b>Day 3</b>

Unit 9 - XSL - eXtensible Stylesheet Language Part 2
XSLT Lab Part 2 - Conditional Transforms


</div>
<span class='text_page_counter'>(16)</span><div class='page_container' data-page=16>



</div>
<span class='text_page_counter'>(17)</span><div class='page_container' data-page=17>


<b><sub>Unit 1. Introduction to XML and Related Technologies</sub></b>



<b>What This Unit is About</b>



This unit describes the audience, prerequisites, and overall objectives
for XM301. The overall agenda for the course is also covered.


<b>What You Should Be Able to Do</b>



</div>
<span class='text_page_counter'>(18)</span><div class='page_container' data-page=18>



Figure 1-1. Introduction XM3014.1


<i><b>Notes:</b></i>



© Copyright IBM Corporation 2004


<b>Introduction</b>



XM301 Introduction to XML and Related Technologies


Instructor:



Please introduce yourself and provide your:


Name and organization


Job Role



</div>
<span class='text_page_counter'>(19)</span><div class='page_container' data-page=19>

Figure 1-2. Course Description


<b>Course Description</b>



This course is designed to introduce students to the fundamentals
of XML and its significant derivative and companion technologies: XML
Schema, Namespaces, XPath, and XSL Transformations.

Document Type Definitions (DTDs) are also introduced.

The focus of the course is on the creation, specification, and
processing of XML documents.

The course is 2.5 days in length and provides extensive hands-on
labs throughout.



</div>
<span class='text_page_counter'>(20)</span><div class='page_container' data-page=20>

Figure 1-3. Audience


<b>Audience</b>



</div>
<span class='text_page_counter'>(21)</span><div class='page_container' data-page=21>

Figure 1-4. Prerequisites


<b>Prerequisites</b>



Prerequisites:



</div>
<span class='text_page_counter'>(22)</span><div class='page_container' data-page=22>

Figure 1-5. Course Objectives (1 of 2)


<b>Course Objectives (1 of 2)</b>



After completing this course, you should be able to:

• Describe and differentiate the use of HTML and XML
• Enumerate the rules of a well-formed XML document
• Create and maintain XML documents
• Describe the purpose and use of Document Type Definitions (DTDs)
• Create DTDs describing the validation rules for specific XML instances*
• Describe the purpose and use of XML Schema
• Enumerate the benefits of XML Schema over DTDs
• Create XML Schemas describing the validation rules for specific XML instances*



</div>
<span class='text_page_counter'>(23)</span><div class='page_container' data-page=23>

Figure 1-6. Course Objectives (2 of 2)


<b>Course Objectives (2 of 2)</b>



After completing this course, you should be able to:

• Describe the purpose of XML Namespaces
• Declare and use XML Namespaces in an XML document*
• Describe the use of XPath in the context of XSLT and XML Schema
• Create XPath expressions that locate specific information in an XML instance*
• Describe the use of XSL in the processing of XML documents
• Create an XSL Transformation to transform an XML document into some other instance*



</div>
<span class='text_page_counter'>(24)</span><div class='page_container' data-page=24>

Figure 1-7. Agenda - Day 1


<b>Agenda - Day 1</b>



Welcome and Introductions


Issues in Information Exchange


What is XML?



Lab Exercise



Overview of IBM WebSphere Studio Application Developer


Lab Exercise



Document Type Definitions


Lab Exercise



</div>
<span class='text_page_counter'>(25)</span><div class='page_container' data-page=25>

Figure 1-8. Agenda - Day 2


<b>Agenda - Day 2 </b>



XML Schema


Lab Exercise


XPath



Lab Exercise



</div>
<span class='text_page_counter'>(26)</span><div class='page_container' data-page=26>

Figure 1-9. Agenda - Day 3


<b>Agenda - Day 3</b>



</div>
<span class='text_page_counter'>(27)</span><div class='page_container' data-page=27>

Figure 1-10. Unit Summary


<b>Unit Summary</b>



We've looked at the overall course objectives and a day-by-day agenda.



</div>
<span class='text_page_counter'>(28)</span><div class='page_container' data-page=28>



</div>
<span class='text_page_counter'>(29)</span><div class='page_container' data-page=29>


<b><sub>Unit 2. Issues in Electronic Information Exchange</sub></b>



<b>What This Unit is About</b>



This unit examines the different ways in which information is
exchanged in modern computer systems, identifying issues in each
case. The discussion is restricted to what is exchanged (the content),
not how it is exchanged (the mechanism). A set of messaging criteria
is developed that, if met, will reduce the impact of the issues
identified.

This unit shows some of the business drivers for XML, and gives
examples of how XML is being used by businesses today.


<b>What You Should Be Able to Do</b>



After completing this unit, you should be able to:


• Describe the types of information exchange that occur in modern
computer systems


• Describe information exchange issues that exist in modern
computer systems


• Describe what is needed to address many of the issues that exist in
information exchange


<b>How You Will Check Your Progress</b>


Accountability:


</div>
<span class='text_page_counter'>(30)</span><div class='page_container' data-page=30>

Figure 2-1. Unit Objectives


<b>Unit Objectives</b>



After completing this unit, you should be able to:

• Describe the types of information exchange that occur in modern computer systems
• Describe information exchange issues that exist in modern computer systems



</div>
<span class='text_page_counter'>(31)</span><div class='page_container' data-page=31>

Figure 2-2. Electronic Information Exchange (1 of 2)


<b>Electronic Information Exchange (1 of 2)</b>



Electronic information exchange is a simple concept:

Electronically encoded information of one sort or another moves
among software units during the execution of some domain (business)
related function.

There are several contexts for information exchange:

• Intra-application - information movement among the parts of an application.
• Inter-application - information movement between applications in the same company system.
• Intercompany - information movement between companies.
• Inter-system - information movement between systems in the same company.



</div>
<span class='text_page_counter'>(32)</span><div class='page_container' data-page=32>

Figure 2-3. Electronic Information Exchange (2 of 2)


<b>Electronic Information Exchange (2 of 2)</b>



[Figure: the four exchange contexts (Intercompany, Inter-System, Inter-Application, Intra-Application) illustrated with Company 1, Company 2, and Company 3; System 1 (Sales) and System 2 (Accounting); and the Ordering and CRM applications.]


</div>
<span class='text_page_counter'>(33)</span><div class='page_container' data-page=33>

Figure 2-4. Intra-Application Information Exchange



<b>Intra-Application Information Exchange</b>



In a well-structured application, information flows between three different layers:

• The presentation layer (often called the View): presents information to the user
and collects information from the user. This layer is often coupled to a particular
presentation technology, for example, Presentation Manager, X-Windows, and
so forth. Therefore, it often must change significantly when the presentation
mode changes.
• The processing layer (often called the Controller): operates on the information
in accordance with the functional requirements of the application.
• The business layer (often called the Model or Business Model): maintains the
operational constraints that govern the business as a whole. It ensures that no
individual application contradicts those rules by performing an operation that is
inconsistent with those constraints.


[Figure: Presentation Layer (View), Process Layer (Controller), Business Layer (Model)]
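The three layers can be illustrated with a minimal sketch. It is written in Python purely for illustration; the course material does not prescribe a language, and every name here (`AccountModel`, `WithdrawalController`, `render_text`, and the no-overdraft rule) is a hypothetical example rather than part of the course.

```python
from dataclasses import dataclass


class BusinessRuleError(Exception):
    """Raised when an operation would violate a business constraint."""


@dataclass
class AccountModel:
    """Business layer (Model): owns the data and the business constraints."""
    balance: int  # whole currency units, to keep the sketch simple

    def withdraw(self, amount: int) -> None:
        # Hypothetical business rule: accounts may not be overdrawn.
        if amount <= 0 or amount > self.balance:
            raise BusinessRuleError("withdrawal violates account constraints")
        self.balance -= amount


class WithdrawalController:
    """Processing layer (Controller): carries out one application function."""

    def __init__(self, model: AccountModel) -> None:
        self.model = model

    def handle(self, raw_amount: str) -> dict:
        # Convert the user's input, apply it to the model, and return
        # plain data with no presentation details attached.
        try:
            self.model.withdraw(int(raw_amount))
            status = "ok"
        except (ValueError, BusinessRuleError):
            status = "rejected"
        return {"status": status, "balance": self.model.balance}


def render_text(result: dict) -> str:
    """Presentation layer (View): the only code that knows how output looks."""
    return f"Status: {result['status']} | Balance: {result['balance']}"


controller = WithdrawalController(AccountModel(balance=100))
print(render_text(controller.handle("30")))   # accepted by the model
print(render_text(controller.handle("500")))  # rejected by the business rule
```

Because `render_text` is the only function that knows about output formatting, a change of presentation technology would, in this sketch, touch that function alone.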


</div>
<span class='text_page_counter'>(34)</span><div class='page_container' data-page=34>

Figure 2-5. Agile Views - Multiple Client/Device Support


<b>Agile Views - Multiple Client/Device Support</b>



Prior to the arrival of the World Wide Web, applications were largely presented via
workstations or dumb terminals, and required (relatively) infrequent modification of their
presentation layer. The World Wide Web has changed this.

Now, the addition of the mobile work force and the use of handheld devices present new
opportunities for business and new challenges for application developers.

Applications must be presented via:

• Cell phones and handhelds (Wireless Markup Language, WML)
• Web browsers (HTML, style sheets, JavaScript)
• PDF
• And so forth

Many Web applications suffer from coupling issues: they habitually
generate output that combines presentation information (font, color, and so forth) with
business information (bank balance, product information, and so forth), making it difficult
to reuse the data stream.

Ideally, the presentation layer would emit/consume a generic, structured information
stream that can be filtered for the target device.

<i>An external rendering engine worries about how it looks, while the application worries
about what should be viewed.</i>
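As a rough sketch of this idea, the Python example below has the application emit one view-independent XML stream and leaves the look of the output to separate renderers. The element names (`account`, `owner`, `balance`) and both renderer functions are assumptions made up for this illustration; only the standard-library `xml.etree.ElementTree` module is used.

```python
import xml.etree.ElementTree as ET

# Hypothetical view-independent stream emitted by the application.
ACCOUNT_XML = """
<account id="12345">
  <owner>A. Customer</owner>
  <balance currency="USD">250.00</balance>
</account>
"""


def render_html(doc: ET.Element) -> str:
    """Renderer for a Web browser: adds markup and emphasis."""
    balance = doc.find("balance")
    return (
        f"<p><b>{doc.findtext('owner')}</b>: "
        f"{balance.text} {balance.get('currency')}</p>"
    )


def render_plain(doc: ET.Element) -> str:
    """Renderer for a small text-only device: no styling at all."""
    balance = doc.find("balance")
    return f"{doc.findtext('owner')} {balance.text} {balance.get('currency')}"


account = ET.fromstring(ACCOUNT_XML)
print(render_html(account))
print(render_plain(account))
```

Supporting a new client device would then mean adding another renderer, while the code that produces `ACCOUNT_XML` stays unchanged.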


Enables speedy, low-cost support for new client devices.
Need a View-independent, structured


</div>
<span class='text_page_counter'>(35)</span><div class='page_container' data-page=35>

Figure 2-6. Inter-Application Information Exchange



<b>Inter-Application Information Exchange</b>



Ideally, the design of a system takes into account all the operations that it will
perform and the applications that will perform them.


It is rare that enough information exists to perform such an analysis and rarer still
that the design remains stable as the applications that compose the system are
constructed (typically at disparate points in time).


Technology does not stand still; it is common to see applications built late in
the life of a system using technology that is completely different from that
used by the initial ones, for example, COBOL versus Java.


Experience has shown that it is best to focus on the application at hand and allow
the plans for a system to evolve as further applications are built based on new
knowledge of the problem and new technologies.


The way that applications communicate


should not make assumptions about



</div>
<span class='text_page_counter'>(36)</span><div class='page_container' data-page=36>

Figure 2-7. Context-free Communication



<b>Context-free Communication</b>



As much as possible, eliminate assumptions from the way in which information is
exchanged.


This means that the information that flows between applications should not be
coupled to a particular technology or to an assumption about how it will be used.


When possible, send an application domain entity, for example, a Purchase
Order rather than the individual pieces, for example, a total, an item description,
and so forth.


Don't use a message that is bound to an implementation technology, for
example, a Serialized Java Object (a Java-specific bit stream).


Ideally, the communication medium would be based on simple, ubiquitous
technology, for example, straight text.


The information should be structured and self-describing to eliminate the need for context
awareness in the receiver.
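The guidelines above can be sketched in Python as follows. The example sends a whole purchase order as structured, self-describing text rather than as a language-specific object stream. The element names (`purchaseOrder`, `item`, `description`, `quantity`, `unitPrice`) and both helper functions are hypothetical, chosen only for this illustration.

```python
import xml.etree.ElementTree as ET


def build_purchase_order(order_id: str, items: list[tuple[str, int, str]]) -> str:
    """Serialize a whole purchase order as self-describing XML text."""
    order = ET.Element("purchaseOrder", id=order_id)
    for description, quantity, unit_price in items:
        item = ET.SubElement(order, "item")
        ET.SubElement(item, "description").text = description
        ET.SubElement(item, "quantity").text = str(quantity)
        ET.SubElement(item, "unitPrice").text = unit_price
    return ET.tostring(order, encoding="unicode")


def read_purchase_order(text: str) -> list[str]:
    """A receiver needs only an XML parser and the element names."""
    order = ET.fromstring(text)
    return [item.findtext("description") for item in order.findall("item")]


message = build_purchase_order("PO-1", [("Widget", 2, "9.99"), ("Gadget", 1, "24.50")])
print(message)
print(read_purchase_order(message))
```

The receiver in this sketch makes no assumption about the sender's implementation language; it relies only on plain text and the names carried inside the message.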


Requires a structured information


(text) format that supports the



</div>
<span class='text_page_counter'>(37)</span><div class='page_container' data-page=37>

Figure 2-8. B2B Intercompany Information Exchange


<b>B2B Intercompany Information Exchange</b>



In this case, the presentation is focused on the Business-to-Business (B2B)
relationships that exist in e-business.

In such cases, the systems involved often talk to multiple business partners,
sometimes for the same service (for example, Credit Transaction Validation),
where selection is based on price, availability, and so forth.


<b>Scenario 1</b>

Communicate directly with business partners; potential for 'n' communication protocols.

<b>Scenario 2</b>

Communicate with business partners through an intermediate 'Marketplace' vendor.
Forced to evolve at the rate of the intermediary.

[Figure: nodes labelled C1, C2, C3, Cn, and M]


</div>
<span class='text_page_counter'>(38)</span><div class='page_container' data-page=38>

Figure 2-9. Need to Establish Common Ground for Communication


<b>Need to Establish Common Ground for Communication</b>

Technology issues aside, it is clear that successful, unfettered B2B
information exchange depends greatly on the creation of
implementation-independent, vendor-neutral languages in which to
conduct business.

Markup languages have existed as a means to embed semantics in
electronic documents (for example, SGML).

SGML was created as a language for describing documents. B2B
communication may benefit from a similar solution, that is, use a
markup language to describe information.

Such a language could be used to describe documents that whole
industries agree on as a means to exchange the information they
need to conduct business.



Requires an implementation-independent, vendor-neutral
markup language for describing information; enabling the


</div>
<span class='text_page_counter'>(39)</span><div class='page_container' data-page=39>

Figure 2-10. Inter-system Information Exchange


<b>Inter-system Information Exchange</b>




The exchange of information between systems is subject to most of the problems
discussed so far except, perhaps, view coupling. Typically, this sort of
communication does not involve a presentation layer.


When laying out the infrastructure in which systems will reside, it is wise to
establish a means of insulating systems from one another with a layer that is
devoid of implementation and process coupling ... let's call this the Interface Layer
(it's also known as an Abstraction of the System).


The role of the interface layer is to capture the semantics of a system as seen
from an external point of view, and to represent it as a dialog, with messages
providing the units of communication in the dialog.


As long as the definition of the system doesn't change, the dialog (the interface to
the system) should remain stable. The implementation may change significantly.
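
The idea of a stable dialog in front of a changeable implementation can be sketched in a few lines of Python. This is only an illustration, not a prescribed design: the message vocabulary (`priceRequest`, `priceReply`) and both lookup functions are hypothetical names invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical implementation 1: prices held in a dictionary.
def lookup_v1(isbn):
    return {"1112-23-4356": "6.00"}.get(isbn, "unknown")

# Hypothetical implementation 2: same answers, different internals.
def lookup_v2(isbn):
    table = [("1112-23-4356", "6.00")]
    return next((price for key, price in table if key == isbn), "unknown")

def handle(message, implementation=lookup_v1):
    """Interface layer: an XML message comes in, an XML message goes out."""
    request = ET.fromstring(message)
    reply = ET.Element("priceReply", isbn=request.get("isbn"))
    reply.text = implementation(request.get("isbn"))
    return ET.tostring(reply, encoding="unicode")

request = '<priceRequest isbn="1112-23-4356"/>'
# The dialog is identical whichever implementation sits behind it.
print(handle(request, lookup_v1))
print(handle(request, lookup_v2))
```

A caller only ever sees the two message shapes, so swapping `lookup_v1` for `lookup_v2` does not change the interface.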


System1 System2


</div>
<span class='text_page_counter'>(40)</span><div class='page_container' data-page=40>



Figure 2-11. Exchanging Messages XM3014.1


<i><b>Notes:</b></i>






<b>Exchanging Messages</b>



Exchanging messages between systems has a lot in common with exchanging messages between B2B business partners.

The exception is that though inter-system information exchange requires an established protocol (the interface), the system does not necessarily benefit from that protocol being an accepted standard for B2B communication.



There are other differences, for example, the likely use of Message-Oriented Middleware (MOM) in system integration, but this presentation is focused on
<i>the information being exchanged, not on the exchange mechanism.</i>


So, in common with B2B communication:



</div>
<span class='text_page_counter'>(41)</span><div class='page_container' data-page=41>



Figure 2-12. The Semantic Web XM3014.1


<i><b>Notes:</b></i>





<b>The Semantic Web</b>




Requires self-describing information
decoupled from View details

An extension of the WWW.



The Web becomes an active (rather than passive) information space.

Separation of content from presentation is necessary.

That is, Model-View separation.

HTML doesn't have this.

Look at the browser compatibility problem as evidence for the need for this.

In order for the Web to reason, it must be possible to identify the units that are going to be reasoned about.



</div>
<span class='text_page_counter'>(42)</span><div class='page_container' data-page=42>



Figure 2-13. A Common Solution? XM3014.1


<i><b>Notes:</b></i>






<b>A Common Solution?</b>



Collecting all the observations together, an information solution addressing each of these issues would be:

a. A view-independent, structured information stream.

b. A structured information (text) format that supports the expression of semantics.

c. An implementation-independent, vendor-neutral markup language for describing information, enabling the creation of domain-specific business languages.

d. Self-describing, decoupled from view details.

In short:

"A text-based, vendor-neutral markup language that supports the expression of semantics."



</div>
<span class='text_page_counter'>(43)</span><div class='page_container' data-page=43>



Figure 2-14. Checkpoint Questions (1 of 2) XM3014.1


<i><b>Notes:</b></i>






<b>Checkpoint Questions (1 of 2)</b>



1. Which of the following will reduce the coupling related to Electronic Information Exchange? (Select all that apply)

a. Create messages that are context-free.
b. Use system interfaces to hide implementation details.
c. Combine view information and data in each message.
d. Use messages that are vendor-neutral and implementation-independent.
e. All of the above.

2. Text-based messages are preferred because: (Select all that apply)

a. They are implementation-neutral.
b. All software technologies can read/write them.
c. It's easier to debug messaging problems.
d. They can be spell checked.



</div>
<span class='text_page_counter'>(44)</span><div class='page_container' data-page=44>



Figure 2-15. Checkpoint Questions (2 of 2) XM3014.1


<i><b>Notes:</b></i>





<b>Checkpoint Questions (2 of 2)</b>



3. In general, the properties a message should exhibit are:


(Select all that apply)



a. Self-describing



b. Predictable structure



</div>
<span class='text_page_counter'>(45)</span><div class='page_container' data-page=45>



Figure 2-16. Unit Summary XM3014.1


<i><b>Notes:</b></i>






<b>Unit Summary</b>



Having completed this unit, you should be able to:



Describe the types of information exchange that occur in modern computer systems

Describe the information exchange issues that exist in modern computer systems



</div>
<span class='text_page_counter'>(46)</span><div class='page_container' data-page=46>



</div>
<span class='text_page_counter'>(47)</span><div class='page_container' data-page=47>


<b><sub>Unit 3. What Is XML?</sub></b>



<b>What This Unit is About</b>



In this unit, the basic elements of XML are explained.


<b>What You Should Be Able to Do</b>



After completing this unit, you should be able to:
• Describe the basic rules of XML



• Identify what makes XML well-formed


• List the components that make up an XML document
• Differentiate between XML and HTML


• Describe the internationalization support in XML
• Define some best practices for XML


<b>How You Will Check Your Progress</b>


Accountability:


• Checkpoint


</div>
<span class='text_page_counter'>(48)</span><div class='page_container' data-page=48>



Figure 3-1. Unit Objectives XM3014.1


<i><b>Notes:</b></i>



Although XML is stable and mature, the supporting technologies are evolving rapidly.
<b>Keep up with the changes at:</b>




<b>Unit Objectives</b>



After completing this unit, you should be able to:


Describe the basic rules of XML




Describe what it means for an XML document to be well-formed


List the components that make up an XML document



Differentiate between XML and HTML



</div>
<span class='text_page_counter'>(49)</span><div class='page_container' data-page=49>



Figure 3-2. What Is XML? XM3014.1


<i><b>Notes:</b></i>



Usually people will talk about this 'XML' and that 'XML' or this 'XML file' and what they are
really referring to is XML markup text encapsulating specific data.


As long as the XML text follows the set of syntax rules, any data can be represented.




<b>What Is XML?</b>



At its core XML is text formatted to follow a well-defined set of rules.


XML documents consist primarily of tags and text.




If you've ever seen the source to an HTML document, then the <i>XML structure should look familiar.</i>



This text may be stored/represented in:


A normal file stored on disk



A message being sent over HTTP



A character string in a programming language


A CLOB (character large object) in a database


Any other way textual data can be used



XML documents do not need to exist as documents; they may be:


Byte streams sent between applications



Fields in a database record



Collections of XML Infoset information items
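
To make the point concrete, the same XML text can be handed to a parser as a character string or as a byte stream. The sketch below assumes Python's standard `xml.etree.ElementTree` module, which is this example's choice and not something the course prescribes; the in-memory `BytesIO` object stands in for a file, socket, or HTTP body.

```python
import io
import xml.etree.ElementTree as ET

text = "<book><title>The Right Stuff</title></book>"

# 1. A character string held in a program.
from_string = ET.fromstring(text)

# 2. A byte stream, as it might arrive from a file, socket, or HTTP body.
stream = io.BytesIO(text.encode("utf-8"))
from_stream = ET.parse(stream).getroot()

# Both routes yield the same element structure.
print(from_string.find("title").text)
print(from_stream.find("title").text)
```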



</div>
<span class='text_page_counter'>(50)</span><div class='page_container' data-page=50>



Figure 3-3. Example Tree Representation of XML XM3014.1


<i><b>Notes:</b></i>



This example shows a typical XML document and how it is represented as a tree of nodes.
This conceptual depiction of XML is important to understand.



book is the root element but ROOT is the highest point in the tree or hierarchy: think of
ROOT as the location of a pointer used to keep track of where you are.
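
The tree view can be reproduced with any XML parser. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` (an assumption of this example); `outline` is a hypothetical helper written here to print one line per node, indented by depth.

```python
import xml.etree.ElementTree as ET

document = """<?xml version="1.0"?>
<book>
  <author>Tom Wolfe</author>
  <title>The Right Stuff</title>
  <price>$6.00</price>
</book>"""

def outline(element, depth=0):
    """Return one indented line per node, children after their parent."""
    text = (element.text or "").strip()
    line = "  " * depth + element.tag + (f': "{text}"' if text else "")
    lines = [line]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines

root = ET.fromstring(document)  # the root element, book
print("\n".join(outline(root)))
```

Each level of indentation in the printed outline corresponds to one level of the tree.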




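XML documents should be thought of as a hierarchical tree structure.

<b>Example Tree Representation of XML</b>

[Tree diagram: ROOT at the top, with <book> beneath it; <book> has three children, <author> ("Tom Wolfe"), <title> ("The Right Stuff"), and <price> ("$6.00"). The tree is shown as equivalent to the XML text below.]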



<b><?xml version="1.0"?></b>
<b><book></b>
<b> <author></b>
<b> Tom Wolfe</b>
<b> </author></b>
<b> <title></b>
<b> The Right Stuff</b>
<b> </title></b>
<b> <price></b>
<b> $6.00</b>
<b> </price></b>
<b></book></b>


</div>
<span class='text_page_counter'>(51)</span><div class='page_container' data-page=51>



Figure 3-4. A Simple XML Document - Basic Structure XM3014.1


<i><b>Notes:</b></i>



Textual data between tags is also referred to as content.
Tagged elements of any sort are also known as markup.


Sometimes the term body is used to refer to anything between a start tag and an end tag.




<b><?xml version="1.0"?></b> "Optional" first line; only required if encoding IS NOT UTF-8 or UTF-16*
<b><book></b> Root element start tag
<b> <title></b> First child element with data
<b> Alphabet from A to Z</b>
<b> </title></b>
<b> <isbn number="1112-23-4356" /></b> Empty element (no data)
<b> <author></b> Begin element tag
<b> <firstName>Boreng</firstName></b> Nested child elements
<b> <lastName>Riter</lastName></b>
<b> </author></b> End element tag
<b> <chapter title="Letter A"></b> Element containing an attribute and parsed character data (PCDATA) [TBD]
<b> The letter A is the first in</b>
<b> the alphabet. It is also the</b>
<b> first of five vowels.</b>
<b> </chapter></b>
<b> <!-- The rest of the letter</b> Comment
<b> chapters are missing --></b>
<b> <chapter title="Letter Z"></b> Last element in document
<b> The letter Z is the last</b>
<b> letter in the alphabet. </b>
<b> </chapter></b>
<b></book></b> Root element end tag
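
The parts labelled in this document can be read back programmatically. This is a minimal sketch assuming Python's standard `xml.etree.ElementTree`; the chapter text is shortened for space, and by default this parser discards the comment.

```python
import xml.etree.ElementTree as ET

document = """<?xml version="1.0"?>
<book>
  <title>Alphabet from A to Z</title>
  <isbn number="1112-23-4356" />
  <author>
    <firstName>Boreng</firstName>
    <lastName>Riter</lastName>
  </author>
  <chapter title="Letter A">The letter A is the first in the alphabet.</chapter>
  <!-- The rest of the letter chapters are missing -->
  <chapter title="Letter Z">The letter Z is the last letter in the alphabet.</chapter>
</book>"""

book = ET.fromstring(document)

print(book.tag)                           # root element name
print([child.tag for child in book])      # child elements, in document order
print(book.find("isbn").attrib)           # attributes of the empty element
print(book.find("author/lastName").text)  # content of a nested child element
print([c.get("title") for c in book.findall("chapter")])
```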


</div>
<span class='text_page_counter'>(52)</span><div class='page_container' data-page=52>



Figure 3-5. A Simple XML Document - Basic Nomenclature XM3014.1


<i><b>Notes:</b></i>



These definitions will be important when we discuss the XML Schema definition language
in a later chapter.


We introduce these terms here in preparation for their use then.





<b>A Simple XML Document - Basic Nomenclature</b>



The XML instance on the previous page consists of:

One main element, <b>book</b>.
Subelements <b>title</b>, <b>isbn</b>, <b>author</b>, and <b>chapter</b>, plus a comment (a comment is markup, not an element).
<b>author</b> contains other subelements, <b>firstName</b> and <b>lastName</b>.
<b>isbn</b> and <b>chapter</b> carry the attributes <b>number</b> and <b>title</b>, respectively.

<b>title</b>, <b>firstName</b>, and <b>lastName</b> contain only strings:
Elements that contain numbers, strings, dates, and so forth (TBD) but no subelements (or attributes) are said to have <i>simple types</i>.

<b>isbn</b> and <b>chapter</b> carry attributes; <b>author</b> has subelements:
Elements that contain subelements or carry attributes are said to have <i>complex types</i>.

Attributes always have simple types (that is, they are numbers, strings, dates, and so forth).
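
The simple/complex distinction can be checked mechanically. The sketch below assumes Python's standard `xml.etree.ElementTree`; `kind` is a hypothetical helper that applies the definition literally (any subelement or attribute makes an element complex), and the sample document is a shortened version of the one on the previous page.

```python
import xml.etree.ElementTree as ET

def kind(element):
    """Complex if the element has subelements or attributes, else simple."""
    return "complex" if len(element) > 0 or element.attrib else "simple"

book = ET.fromstring(
    '<book><title>Alphabet from A to Z</title>'
    '<isbn number="1112-23-4356" />'
    '<author><firstName>Boreng</firstName><lastName>Riter</lastName></author>'
    '<chapter title="Letter A">The letter A is the first.</chapter></book>'
)

for element in book.iter():
    print(element.tag, kind(element))
```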


<i>TBD -- In a later chapter we describe XML Schemas which have access to </i>


</div>
<span class='text_page_counter'>(53)</span><div class='page_container' data-page=53>



Figure 3-6. Basics of <i>Well-formed XML (1 of 2)</i> XM3014.1


<i><b>Notes:</b></i>



As you can see, creating an XML instance will be a rather straightforward task.




<i><b>Basics of Well-formed XML (1 of 2)</b></i>



XML documents are considered to be <i>well-formed</i> when they adhere to a set of five rules that define basic XML syntax and structure, plus a sixth for worldwide conformity.

1. There must be a single root element:
All other elements are nested inside the root element.

2. Elements must be properly terminated:
For every opening tag "<...>" there must be a matching closing tag "</...>".
The exception is an empty (no content or body) tag "<.../>".

3. Elements must be properly nested underneath a parent tag (except for the single root element):
A nested tag pair may not overlap another tag.


</div>
<span class='text_page_counter'>(54)</span><div class='page_container' data-page=54>



Figure 3-7. Basics of <i>Well-formed XML (2 of 2)</i> XM3014.1


<i><b>Notes:</b></i>



Version 1.1 is about to emerge. Many of the current XML instances lack this declaration.
It is often useful to identify the processing instructions, of which the XML declaration is but
one, as the prolog; the actual XML instance material, that between the root element open
and closing tags, may then be referred to as the XML document.




<i><b>Basics of Well-formed XML (2 of 2)</b></i>



4. Tag names are case sensitive:
All tag and attribute names, attribute values, and data must comply with XML naming rules.

5. Attributes, extra information that can be provided for elements, must be properly quoted:
That is, all attribute values must be in quotes.

6. The first line should/must contain the special tag that identifies the version of the XML specification to apply:



</div>
<span class='text_page_counter'>(55)</span><div class='page_container' data-page=55>



Figure 3-8. Element Rules - Rule 1. Single Root Element XM3014.1


<i><b>Notes:</b></i>



XML is a markup language. Tags form the basis of all markup languages.

The purpose of an element's tags is to identify the data and child elements held within them.


The root element should have a name that provides a good definition of all the data
contained in the document.


The first physical line in this sample is there because of Rule 6, which we shall cover later.





<b>Element Rules - Rule 1. Single Root Element</b>



All XML documents must have a single root element.



Legal:

<b><?xml version="1.0"?> </b>
<b><colors></b>
<b> <color>red</color></b>
<b> <color>green</color></b>
<b></colors></b>

<i>Colors is the root element for this XML.</i>

Not legal:

<b><?xml version="1.0"?> </b>
<b><color>red</color></b>
<b><color>green</color></b>
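
A parser enforces the single-root rule. The following is a minimal sketch, assuming Python's standard `xml.etree.ElementTree` as the parser; `is_well_formed` is a hypothetical helper written for this example that turns the parser's exception into a yes/no answer.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

legal = '<?xml version="1.0"?><colors><color>red</color><color>green</color></colors>'
not_legal = '<?xml version="1.0"?><color>red</color><color>green</color>'

print(is_well_formed(legal))      # one root element: colors
print(is_well_formed(not_legal))  # two top-level color elements
```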



</div>
<span class='text_page_counter'>(56)</span><div class='page_container' data-page=56>



Figure 3-9. Element Rules - Rule 2. Element Tag Rules XM3014.1


<i><b>Notes:</b></i>




The empty element notation (<... />) was introduced by XML. SGML, which is an ISO standard rather than a W3C recommendation, was amended to accommodate this syntax.


Empty elements are practical and common when the only associated information is
enclosed within the element's attributes.


For empty-element tags, a space before the tag's terminator ("/>") is optional; it is commonly included for readability.




<b>Element Rules - Rule 2. Element Tag Rules</b>



Elements consist of start and end tags.
The end tag is identified by the /.
Example: <b><color>red</color></b>

Elements may contain attributes within the start tag.
Example: <b><book isbn="34323"></book> </b>
Note: The attribute is isbn.

Empty elements contain no child elements or data.
These elements can be represented with a special shorthand notation.
Example: <b><record key="123"></record></b>
Can be shortened to: <b><record key="123" /></b> (preferred)
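
The long and short forms of an empty element describe the same thing to a parser. This is a small sketch assuming Python's standard `xml.etree.ElementTree`; the third form, with no space before "/>", is added by this example to show that this parser accepts it too.

```python
import xml.etree.ElementTree as ET

long_form = ET.fromstring('<record key="123"></record>')
short_form = ET.fromstring('<record key="123" />')
no_space = ET.fromstring('<record key="123"/>')

for record in (long_form, short_form, no_space):
    # Same name, same attributes, no children, no text.
    print(record.tag, record.attrib, len(record), record.text)
```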



</div>
<span class='text_page_counter'>(57)</span><div class='page_container' data-page=57>



Figure 3-10. Element Rules - Rule 3. Element Nesting XM3014.1


<i><b>Notes:</b></i>



There is no limit to the depth of children in XML, but an overly large number may indicate a
poor design.


If an XML document does not have an associated DTD or Schema, then all whitespace is retained since a processor does not know if it is considered textual data or just for aesthetics. DTDs and Schemas are covered in later sections.
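
Whitespace retention is easy to observe. The sketch below assumes Python's standard `xml.etree.ElementTree`, which reports the text before an element's first child as `text` and the text after a child's end tag as `tail` (a feature of this particular API, not of XML itself).

```python
import xml.etree.ElementTree as ET

indented = ET.fromstring("<shirt>\n  <color>red</color>\n</shirt>")
compact = ET.fromstring("<shirt><color>red</color></shirt>")

# Whitespace between tags is reported as text, not discarded.
print(repr(indented.text))     # the newline and indentation before <color>
print(repr(indented[0].tail))  # the newline after </color>
print(repr(compact.text))      # nothing between the tags
```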




<b>Element Rules - Rule 3. Element Nesting</b>



Elements must be properly nested.




The end tags of inner elements must occur before the end tags of outer elements.



</div>
<span class='text_page_counter'>(58)</span><div class='page_container' data-page=58>



Figure 3-11. Element Nesting Example XM3014.1


<i><b>Notes:</b></i>



Indentation and other whitespace is only for human readability, but adds "fat" to a document's size and processing requirements.


This is only an issue with huge XML documents.


It is important to realize that an XML instance is treated by its processor/parser as one,
continuous stream of characters, some of which are recognized by the parser as "special."


As a consequence, when the parser reports an error its location is where the parser
gave up, which may be far beyond where the actual error occurred.




<b>Element Nesting Example</b>



Legal:

<b><?xml version="1.0"?> </b>
<b><shirt></b>
<b> <style>Polo</style></b>
<b> <color>red</color></b>
<b> <size>large</size></b>
<b></shirt></b>

<i>All elements are properly nested.</i>

Not legal:

<b><?xml version="1.0"?> </b>
<b><shirt></b>
<b> <style></b>
<b> <size>large</b>
<b> <color>red</b>
<b> Polo</b>
<b> </style></b>
<b> </size></color></b>
<b></shirt></b>

<i>The element tags are mixed up and not ordered.</i>

Best Practice: Use indentation to represent the document's hierarchy.
Important if your document will likely be read by humans.
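
The "not legal" shirt document can be fed to a parser to see how a nesting error is reported. This is a minimal sketch assuming Python's standard `xml.etree.ElementTree`, whose `ParseError` carries a `position` attribute with the line and column at which the parser gave up.

```python
import xml.etree.ElementTree as ET

not_legal = """<?xml version="1.0"?>
<shirt>
  <style>
    <size>large
    <color>red
    Polo
  </style>
  </size></color>
</shirt>"""

try:
    ET.fromstring(not_legal)
    result = "well-formed"
except ET.ParseError as error:
    line, column = error.position  # where the parser gave up
    result = f"not well-formed at line {line}: {error}"

print(result)
```

The reported line is the point where the mismatch became undeniable to the parser, which is not necessarily where the author made the mistake.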



</div>
<span class='text_page_counter'>(59)</span><div class='page_container' data-page=59>



Figure 3-12. Element Rules - Rule 4. XML Naming Rules XM3014.1


<i><b>Notes:</b></i>



Elements may not use the W3C-reserved Namespace prefix or begin with the letters "XML" in any case.
Element names may not include words reserved by the XML specification. These include:
<b> • DOCTYPE </b>


<b> • ELEMENT </b>
<b> • ATTLIST </b>
<b> • ENTITY</b>


Colons (":"), while technically legal in tag names, should not be used as they are reserved
for use with Namespaces.




<b>Element Rules - Rule 4. XML Naming Rules</b>



XML name construction:
The first character must be A-Z, a-z, or _ (underscore).
Any number of subsequent letters, numbers, hyphens, periods, colons, and underscore characters.

XML names are case sensitive.
Names cannot contain spaces.
Names must not have a prefix of xml in any case combination (such names are reserved).

Best Practice: Brevity in tag names is not necessary.
Use descriptive names for elements and attributes.
<b><Queue></b> or <b><que></b> is far better than <b><q>.</b>

Best Practice: Maintain standard naming conventions and quoting.



</div>
<span class='text_page_counter'>(60)</span><div class='page_container' data-page=60>



Figure 3-13. Rule 4... Tag Naming - Samples XM3014.1


<i><b>Notes:</b></i>





<b>Rule 4. Tag Naming - Samples</b>



<b>Legal</b>: title, book.isdn, lastName, _street, addrLine1, name:first
<b>Not Legal</b>: 1name, -street, &name
<b>Comments</b>: Examples of legal and illegal element names.

<b>Legal</b>: <b><color> red </color></b> and <b><SIZE> small </SIZE></b>
<b>Not Legal</b>: <b><color> red </COLOR></b> and <b><SIZE> small <SiZe></b>
<b>Comments</b>: Element names are case sensitive and start and end tags must match.

<b>Legal</b>: <b><fname> John </fname></b>
<b>Not Legal</b>: <b><f name> John </f name></b>
<b>Comments</b>: Element names must not contain spaces.

<b>Legal</b>: <b><nameXML> John </nameXML></b>
<b>Not Legal</b>: <b><xmlName> John </xmlName></b>
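
The naming rules as stated on the slide translate directly into a pattern. The sketch below is an ASCII-only simplification written for this example (the XML specification also admits many non-ASCII letters, which are ignored here); `NAME` and `is_legal_name` are hypothetical names.

```python
import re

# First character: letter or underscore. Then letters, digits,
# periods, underscores, colons, and hyphens. ASCII only.
NAME = re.compile(r"[A-Za-z_][A-Za-z0-9._:-]*")

def is_legal_name(name):
    if name.lower().startswith("xml"):  # reserved prefix, in any case
        return False
    return NAME.fullmatch(name) is not None

for candidate in ["title", "book.isdn", "_street", "name:first", "nameXML",
                  "1name", "-street", "&name", "f name", "xmlName"]:
    print(candidate, is_legal_name(candidate))
```

Note that `nameXML` passes because only a leading "xml" is reserved.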



</div>
<span class='text_page_counter'>(61)</span><div class='page_container' data-page=61>



Figure 3-14. Rule 4... Element Content (1 of 2): General XM3014.1


<i><b>Notes:</b></i>



PCDATA is parsed character data.


A "snippet" is a piece of a larger, legitimate XML file.





<b>Rule 4. Element Content (1 of 2): General</b>



An XML instance is composed of elements expressed in tag pairs (except for empty tags) plus optional attributes that always have quoted values and optional data that appears between the element start tag and the element end tag.

Mixed content - element content that contains data (PCDATA is shown) and other elements.



Example (snippet):



<b><title><ref>XML</ref> Example</title></b>


<b><chapter></b>



<b>Chapter information</b>



</div>
<span class='text_page_counter'>(62)</span><div class='page_container' data-page=62>



Figure 3-15. Rule 4... Element Content (2 of 2): Data XM3014.1


<i><b>Notes:</b></i>





<b>Rule 4. Element Content (2 of 2): Data</b>




Element data content is handled in one of two ways:

1. Parsed Character Data (PCDATA): is examined by the XML parser to discover XML content embedded within it.



</div>
<span class='text_page_counter'>(63)</span><div class='page_container' data-page=63>



Figure 3-16. Rule 4... PCDATA - Parsed Character Data XM3014.1


<i><b>Notes:</b></i>



XML differentiates between markup characters and text characters by providing special
XML escape characters to be used in XML PCDATA.


Only regular parsed character data is allowed inside an attribute's value.

The special characters "<" and "&" must always be represented as escape characters.

The others may appear non-escaped in some places in XML, but it is best to just use the escape characters all the time.


These escape characters are independent of the encoding chosen.





<b>Rule 4. PCDATA - Parsed Character Data</b>



Predefined entities exist to address ambiguous syntax situations, situations where the literal would be interpreted as part of the XML document syntax rather than its content.



Examples:



<b><range>&gt; 6 &amp; &lt; 20</range></b>


<b><quotes characters="'&quot;'"/></b>



<b>Entity</b>

<b>Description</b>

<b>Character</b>
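
The two examples on this page can be generated and read back with standard tools. This is a minimal sketch assuming Python's `xml.sax.saxutils` helpers (`escape` and `quoteattr`) and `xml.etree.ElementTree`; it shows escaping on the way out and the parser restoring the literal characters on the way in.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

content = "> 6 & < 20"
escaped = escape(content)  # escapes &, < and >
print(escaped)

# Parsing turns the entity references back into the literal characters.
element = ET.fromstring(f"<range>{escaped}</range>")
print(element.text)

# quoteattr chooses a quote style and escapes the value for attribute use.
value = "'\"'"  # an apostrophe, a double quote, an apostrophe
quotes = ET.fromstring(f"<quotes characters={quoteattr(value)}/>")
print(quotes.get("characters"))
```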



</div>
<span class='text_page_counter'>(64)</span><div class='page_container' data-page=64>



Figure 3-17. Rule 4... CDATA - Character Data XM3014.1


<i><b>Notes:</b></i>



The 5 XML escape characters will not be interpreted (that is, changed to the non-escaped character) in CDATA sections, so they should not be used. If you put &lt; in the CDATA, you will see &lt; in the output, not "<". So use the actual characters.


Encoding refers to the character set for the entire document, so it does apply to CDATA as
well.



CDATA sections cannot be nested.
CDATA will retain spaces.


While XML escape characters are not to be used in CDATA, you must be aware of how the
'down-line' applications of the XML will use the CDATA.


Common usage: JavaScript in the XML and specialized HTML


Browsers may have problems with some special characters, which must then be represented as numeric character references.

Example: micro sign (µ) = &#181;




<b>Rule 4. CDATA - Character Data</b>



Syntax:



<b><![CDATA[ ...Anything can go here... ]]></b>



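Note: Anything except the literal string "<b>]]></b>", which ends the section.
<b>In ordinary parsed content, write "]]>" as "]]&gt;"; inside a CDATA section entity references are not expanded, so the section has to be split in two around the sequence.</b>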



CDATA is not parsed and is treated as-is.



Useful for embedding other languages within the XML.


HTML documents.



XML documents.



JavaScript source.



Or any other text with a lot of special characters.



Generally speaking, the escaping rules inside a CDATA section are those of the embedded language.
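
The difference between CDATA and parsed content shows up as soon as the text is read back. The sketch below assumes Python's standard `xml.etree.ElementTree`; the element name `x` is a placeholder chosen for this example.

```python
import xml.etree.ElementTree as ET

script = ET.fromstring("<script><![CDATA[ if (a < b && a < 0) { return 1 } ]]></script>")
print(script.text)  # markup characters arrive untouched

# Entity references are not expanded inside a CDATA section...
print(ET.fromstring("<x><![CDATA[&lt;]]></x>").text)
# ...whereas in ordinary parsed content they are.
print(ET.fromstring("<x>&lt;</x>").text)
```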



</div>
<span class='text_page_counter'>(65)</span><div class='page_container' data-page=65>



Example: ampersand (&) = &#38;


</div>
<span class='text_page_counter'>(66)</span><div class='page_container' data-page=66>



Figure 3-18. Rule 4... CDATA Examples XM3014.1


<i><b>Notes:</b></i>



Both 'script' element examples are valid. Which one you would use would depend on the
behavior of the application/browser which will use the transformed XML and therefore the
CDATA.


This topic is important to XSLT processing.





<b>Rule 4. CDATA Examples</b>



These script elements contain JavaScript:

<b><script><![CDATA[</b>
<b>function matchwo(a,b) {</b>
<b> if (a < b && a < 0)</b>
<b> { return 1 }</b>
<b> else</b>
<b> { return 0 }</b>
<b>}</b>
<b>]]></script></b>

<b><script><![CDATA[</b>
<b>function matchwo(a,b) {</b>
<b> if (a < b &#38;&#38; a < 0)</b>
<b> { return 1 }</b>
<b> else</b>
<b> { return 0 }</b>
<b>}</b>
<b>]]></script></b>

This nameXML element stores actual XML to be treated as text:

<b><nameXML></b>
<b> <![CDATA[</b>
<b> <name common="freddy" breed="springer-spaniel"></b>
<b> Sir Frederick of Ledyard's End</b>
<b> </name></b>
<b> ]]></b>
<b></nameXML></b>


</div>
<span class='text_page_counter'>(67)</span><div class='page_container' data-page=67>



Figure 3-19. Element Rules - Rule 5. Element Attributes XM3014.1


<i><b>Notes:</b></i>



Attribute naming follows the same rules as element naming.



An element may contain zero or more attributes within its start tag.


Attributes provide extra information to the meaning of the element. This may include "key"
information or other identifying details.


Name collisions are common in XML as shown in the attributes of the first example. Using
Namespaces resolves these sorts of issues.

You cannot use, inside an attribute value, the same style of quote that delimits it; that is, style="monty's" is valid, style='monty's' is invalid.




<b>Element Rules - Rule 5. Element Attributes</b>



Attributes are used to attach information to elements.



Attributes consist of a name="value" pair, where the name is a legal XML name. This is often referred to as a "key-value" pair.

Attributes are placed in the start tag of the element to which they apply.

An element may have several attributes, each uniquely named.

Examples:

<b><title type="section" number="1">XML overview</title></b>
<b><title type="boat" state="FL">Yacht</title></b>

Notice the different usage of the attribute "type" in the two elements; semantically they are not the same.



Attributes must have a value.
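
The attribute rules can be exercised against a parser. This is a minimal sketch assuming Python's standard `xml.etree.ElementTree`; `is_well_formed` is a hypothetical helper, and the malformed variants of the slide's examples were written for this illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

checks = {
    '<title type="section" number="1">XML overview</title>': "quoted values, unique names",
    "<shirt style=\"monty's\"/>": "apostrophe inside double quotes",
    "<shirt style='monty's'/>": "apostrophe ends the value early",
    '<title type=section>XML overview</title>': "value not quoted",
    '<title type="a" type="b">XML overview</title>': "duplicate attribute name",
    '<title type>XML overview</title>': "attribute without a value",
}
for text, note in checks.items():
    print(is_well_formed(text), note)
```

Only the first two documents are accepted; each of the others breaks one of the rules listed above.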



</div>
<span class='text_page_counter'>(68)</span><div class='page_container' data-page=68>



Figure 3-20. Element Rules - Rule 6. XML Declaration (1 of 2) XM3014.1

<i><b>Notes:</b></i>



All XML documents should begin with this tag; when it is present, it MUST be at the first position of the file (that is, no blank lines or comments or spaces before the tag).

The current version of all XML documents is "1.0" and must appear within the "<?xml" tag if that tag is used. It indicates the version of XML to which the Document Entity must conform.

"standalone" is included here for completeness: if it is not used, it is automatically set to the correct value, and most users do not include it. We will have more to say on this in our discussions of the grammars we can apply to XML instances. "yes" means the document that follows can stand alone; that is, without requiring a grammar document to complete its information.




<b>Element Rules - Rule 6. XML Declaration (1 of 2)</b>



The XML Declaration is the optional first line of an XML document:
<b><?xml version="1.0" ?></b>


<b><?xml version="1.0" encoding="UTF-8" ?></b>
<b><?xml version="1.0" standalone="yes"?></b>


If this declaration is used, the version attribute is mandatory.


<b>The encoding attribute indicates the character encoding used in the </b>
document; if UTF-8 or UTF-16 is used it may be omitted.


ASCII is a subset of UTF-8 and need not be declared.
Comments are <i><b><sub>not</sub></b></i> allowed before this statement.


<i>The XML Declaration follows the syntax of a Processing Instruction or PI, </i>
which is described on a subsequent chart, but it is considered to be
unique and is treated separately in the 1.0 XML specification.
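
The placement and content rules for the XML Declaration can be observed directly. This is a minimal sketch, assuming Python's standard xml.etree.ElementTree module (an assumption, not course tooling); the helper name parses and the sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET


def parses(document):
    """Return True if the parser accepts the document as well-formed."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


results = {
    # The declaration is optional, but if present it must come first.
    "declaration first": parses('<?xml version="1.0"?><doc/>'),
    "no declaration": parses('<doc/>'),
    "space before declaration": parses(' <?xml version="1.0"?><doc/>'),
    # If the declaration is used, the version attribute is mandatory.
    "version missing": parses('<?xml encoding="UTF-8"?><doc/>'),
    # The declaration name is lowercase.
    "uppercase name": parses('<?XML version="1.0"?><doc/>'),
}

for label, accepted in results.items():
    print(label, "->", "accepted" if accepted else "rejected")
```

As the general note of caution on this chart suggests, other parsers and browsers may be more lenient than the one used here.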


<b>GENERAL NOTE OF CAUTION</b>: You cannot always rely on a browser or
tool to completely or correctly enforce the specifications. Nor are the


<i>specifications always written in language that, to a particular reader, is </i>
<i>unambiguous. Still, the best advice is when in doubt, refer to the </i>


</div>
<span class='text_page_counter'>(69)</span><div class='page_container' data-page=69>



<b>© Copyright IBM Corp. 2001, 2004</b> <b>Unit 3. What Is XML?</b> <b>3-23</b>





Figure 3-21. Element Rules - Rule 6. XML Declaration (2 of 2) XM3014.1

<i><b>Notes:</b></i>



The last point may be problematic if, say, the associated DTD file is not readily available for
inspection. You will see in later sections that we can override the attribute values in our
XML instance from within a DTD or XML Schema file.


This may not appear to be a problem at the outset, but over time we may forget that we are
overriding some values.


As XML instances grow in length and complexity this may become a serious source of
confusion.


A best practice is to design the XML instance data to contain ALL the data so that, from an
internal data perspective, it does stand alone.




<b>The standalone attribute is included here for completeness:</b> it indicates whether this XML document depends on information declared externally to the document (in a DTD, for example); its value may be yes or no.


A value of "yes" indicates there are no external markup declarations; if
there are no external markup declarations, the declaration has no


meaning.


A value of "no" indicates there are or may be such external markup
declarations; if there are such declarations but there is no standalone
declaration, "no" is assumed.


. . . so it is typically not used.


In any event, the inclusion in the XML instance of references to external
entities, such as those in an embedded DTD, does not change its


<i>standalone status.</i>


<b>A bigger issue associated with the stand-alone attribute is that of defining or </b>
<i>setting values in any entity that may be external to the XML instance. </i>


Arguably, the principal reason for using XML is that it explicitly defines the
elements it includes. If attribute values are overridden then the XML
instance before us is no longer declarative.
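
To make the concern concrete, here is a minimal sketch of a grammar supplying an attribute value that never appears in the element itself. It assumes Python's standard xml.etree.ElementTree module; the order element and its attributes are hypothetical, and the DTD is embedded in the instance rather than held in an external file so that the example is self-contained. Note that a DTD default supplies a value only when the instance omits the attribute.

```python
import xml.etree.ElementTree as ET

# The start tag carries only the id attribute; the status value comes
# from the attribute-list declaration in the embedded DTD.
document = """<?xml version="1.0" standalone="yes"?>
<!DOCTYPE order [
  <!ATTLIST order status CDATA "pending">
]>
<order id="42"/>"""

order = ET.fromstring(document)
reported = dict(order.attrib)

print(reported)
```

The application sees a status attribute that a reader of the order element alone would not expect, which is the source of confusion described above.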


</div>
<span class='text_page_counter'>(70)</span><div class='page_container' data-page=70>



Figure 3-22. Comments XM3014.1


<i><b>Notes:</b></i>



Comments can go anywhere in the XML except:


Before the XML Declaration


Inside the actual element tags
Comments are a good thing.


Use them just as you would in a program.




<b>Comments</b>



<b><!-- --></b>

Defines a comment.



A space after the beginning and before the trailing hyphens is


recommended but not required.



<b><?xml version="1.0"?></b>
<b><!-- This is a comment. They can go anywhere</b>
<b>     inside an XML document except within an element tag.</b>
<b>--></b>
<b><book></b>
<b>  <chapter>A is the first letter</chapter></b>
<b>  <!-- Here is another comment. --></b>
<b>  <chapter>Z is the last letter</chapter></b>
<b></book></b>



Improper usage:
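
The two restrictions on comment placement can be checked with a parser. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; the helper name parses and the improper fragments are hypothetical illustrations, not the examples from the original chart.

```python
import xml.etree.ElementTree as ET


def parses(document):
    """Return True if the parser accepts the document as well-formed."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


# Proper usage: after the XML Declaration and between elements.
proper = parses(
    '<?xml version="1.0"?>'
    '<!-- This is a comment. -->'
    '<book>'
    '<chapter>A is the first letter</chapter>'
    '<!-- Here is another comment. -->'
    '<chapter>Z is the last letter</chapter>'
    '</book>'
)

# Improper usage: a comment inside an element tag.
inside_tag = parses('<book <!-- not allowed here --> ></book>')

# Improper usage: a comment before the XML Declaration.
before_declaration = parses('<!-- too early --><?xml version="1.0"?><book/>')

print(proper, inside_tag, before_declaration)
```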



</div>
<span class='text_page_counter'>(71)</span><div class='page_container' data-page=71>



Figure 3-23. Internationalization and Encoding (1 of 2) XM3014.1


<i><b>Notes:</b></i>



A good way to test that the encoding is correct is by viewing the XML file in IE 5.0 or later.
There are two error messages you may receive from IE or from a parser:


1. An invalid character was found in text content.


You will get this error message if a character in the document does not match the
encoding attribute.


2. Switch from current encoding to specified encoding not supported.


You will get this error message if the encoding used to save the file differs from the
encoding specified in the declaration. The common problem is that the file has been saved in
a single-byte encoding while the encoding attribute specifies a double-byte encoding, or vice versa.





<b>Internationalization and Encoding (1 of 2)</b>



Support for different character encodings is provided through the


encoding attribute of the XML Declaration.



<b><?xml version="1.0" encoding="charset"?></b>



The encoding attribute indicates the set of characters that are


permitted in the document.



In the absence of an encoding declaration, Unicode UTF-8 or


UTF-16 characters may be used.



Documents exchanged via network may be presented to the



</div>
<span class='text_page_counter'>(72)</span><div class='page_container' data-page=72>



Figure 3-24. Internationalization and Encoding (2 of 2) XM3014.1


<i><b>Notes:</b></i>





<b>Internationalization and Encoding (2 of 2)</b>



It is very important that the editor and operating system used to


write and save an XML document support the encoding specified in


the XML Declaration.



Sample encoding declarations:



<b>Latin-1 (Western European; a superset of ASCII)</b>



<b><?xml version="1.0" encoding="ISO-8859-1"?></b>


<b>16-bit Unicode</b>



<b><?xml version="1.0" encoding="UTF-16"?></b>



<b><?xml version="1.0" encoding="ISO-10646-UCS-2"?></b>


<b>...</b>




<b>Japanese</b>



<b><?xml version="1.0" encoding="ISO-2022-JP"?></b>


<b><?xml version="1.0" encoding="Shift_JIS"?></b>


<b>...</b>
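
The effect of a mismatch between the saved bytes and the declared encoding can be reproduced without an editor. This is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; the name element and the helper rejected are hypothetical, and the error text reported by this parser differs from the Internet Explorer messages quoted in the notes.

```python
import xml.etree.ElementTree as ET


def rejected(raw_bytes):
    """Return True if the parser refuses the byte sequence."""
    try:
        ET.fromstring(raw_bytes)
        return False
    except ET.ParseError:
        return True


# Saved as Latin-1 and declared as Latin-1: declaration and bytes agree.
matching = '<?xml version="1.0" encoding="ISO-8859-1"?><name>café</name>'.encode("iso-8859-1")
decoded = ET.fromstring(matching).text

# Saved as Latin-1 but declared as UTF-8: the byte for "é" is not valid UTF-8.
wrong_charset = '<?xml version="1.0" encoding="UTF-8"?><name>café</name>'.encode("iso-8859-1")

# Saved in a single-byte encoding but declared as a double-byte encoding.
wrong_width = '<?xml version="1.0" encoding="UTF-16"?><name>cafe</name>'.encode("utf-8")

print(decoded, rejected(wrong_charset), rejected(wrong_width))
```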



</div>
<span class='text_page_counter'>(73)</span><div class='page_container' data-page=73>



Figure 3-25. Processing Instruction XM3014.1


<i><b>Notes:</b></i>



If a comment is inserted between the XML Declaration and a PI such as the one shown,
Studio will not consider it an error.


A demo file is available in the XM301 Lectures folder, Unit 3.


This PI, although useful, does NOT define a <i>grammar for the XML document in which it </i>


is used: we will talk about grammars in subsequent chapters.


To reemphasize: the XML Declaration, while it may look like a PI, is treated as special!





<b>Processing Instruction</b>



Syntax <? target arg*?>



Processing Instruction is often abbreviated as PI in


documentation.



A feature inherited from SGML.



Used to embed application-specific instructions in documents.


The target name immediately follows "<?" and is used to


associate the PI with an application.



May include zero or more arguments.


May be preceded by comments.



For example:

<?xml-stylesheet href="common.css" type="text/css"?>
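
An application can read the target and the arguments of a PI once the document has been parsed. The following is a minimal sketch, assuming Python's standard xml.dom.minidom module; the doc root element is hypothetical.

```python
from xml.dom import minidom

document = minidom.parseString(
    '<?xml version="1.0"?>'
    '<?xml-stylesheet href="common.css" type="text/css"?>'
    '<doc/>'
)

# Collect (target, arguments) for every PI at the top level of the document.
instructions = [
    (node.target, node.data)
    for node in document.childNodes
    if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
]

print(instructions)
```

The XML Declaration does not show up in the list, which is consistent with its being treated separately from ordinary PIs.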



</div>
<span class='text_page_counter'>(74)</span><div class='page_container' data-page=74>



Figure 3-26. Well-formed versus Valid XM3014.1


<i><b>Notes:</b></i>



All XML parsers must check XML documents for being well formed.
XML parsers are classified as being validating, or non-validating.





<b>Well-formed versus Valid</b>



A well-formed XML document:



Consists of XML elements that are properly nested within one another.


Has a unique root element.



Follows the XML naming conventions.



Follows the XML rules for quoting attributes.


Has tags that are properly terminated.



All XML parsers check for well-formedness.



<i>A valid XML document has an associated vocabulary and obeys the </i>


structural rules specified by that vocabulary.



Associated vocabulary is typically defined by either a DTD or an


XML Schema.



XML parsers may be validating or non-validating depending upon


whether or not they can apply an associated grammar.
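
The well-formedness checks listed above can be exercised one at a time. This is a minimal sketch, assuming Python's standard xml.etree.ElementTree module, which acts as a non-validating parser; the function name well_formedness_error and the sample fragments are hypothetical.

```python
import xml.etree.ElementTree as ET


def well_formedness_error(document):
    """Return None for a well-formed document, else the parser's message."""
    try:
        ET.fromstring(document)
        return None
    except ET.ParseError as error:
        return str(error)


samples = {
    "nested, single root": "<book><chapter>A</chapter></book>",
    "two root elements": "<book/><book/>",
    "overlapping tags": "<b><i>text</b></i>",
    "unquoted attribute": "<book id=1/>",
    "unterminated tag": "<book><chapter>A</book>",
    "name starts with digit": "<1book/>",
}

report = {label: well_formedness_error(text) for label, text in samples.items()}

for label, message in report.items():
    print(label, "->", message or "well-formed")
```

A pass here says nothing about validity: checking a document against a DTD or an XML Schema requires a validating parser.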



</div>
<span class='text_page_counter'>(75)</span><div class='page_container' data-page=75>



Figure 3-27. HTML versus XML (1 of 2) XM3014.1


<i><b>Notes:</b></i>



All markup tags in HTML are directed at visual composition. No consideration is given to
the actual semantics of the data.


XML markup tags are based solely on the data content.
Clean separation of data and presentation




XML is about structured information
interchange


HTML is about presentation and
browsing


<b>HTML versus XML (1 of 2)</b>



<b><course></b>
<b>  <name>Java Programming</name></b>
<b>  <department>EECS</department></b>
<b>  <teacher></b>
<b>    <name>Paul Thompson</name></b>
<b>  </teacher></b>
<b>  <student></b>
<b>    <name>Ron Jones</name></b>
<b>  </student></b>
<b>  <student></b>
<b>    <name>Uma Abingdon</name></b>
<b>  </student></b>
<b>  <student></b>
<b>    <name>Lindsay Garmon</name></b>
<b>  </student></b>


</div>
<span class='text_page_counter'>(76)</span><div class='page_container' data-page=76>



Figure 3-28. HTML versus XML (2 of 2) XM3014.1


<i><b>Notes:</b></i>



These two source listings really show fundamental differences between HTML and XML.
While both contain text marked up by tags, their meaning is entirely different.


Which would you rather parse and insert into a database?
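
As a rough answer to that question, the XML form can be turned into database-style rows in a few lines. This is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; the closing course tag is added here as an assumption because the listing on the chart is cut off, and the row layout is hypothetical.

```python
import xml.etree.ElementTree as ET

course_xml = """<?xml version="1.0"?>
<course>
  <name>Java Programming</name>
  <department>EECS</department>
  <teacher><name>Paul Thompson</name></teacher>
  <student><name>Ron Jones</name></student>
  <student><name>Uma Abingdon</name></student>
  <student><name>Lindsay Garmon</name></student>
</course>"""

course = ET.fromstring(course_xml)

# The element names say what each value means, so no guessing is needed.
course_name = course.findtext("name")
department = course.findtext("department")
teacher = course.findtext("teacher/name")

# One row per student: (course, department, teacher, student).
rows = [
    (course_name, department, teacher, student.findtext("name"))
    for student in course.findall("student")
]

for row in rows:
    print(row)
```

Extracting the same rows from the HTML version would mean relying on the position of table cells and line breaks, which carry layout rather than meaning.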





<b>HTML versus XML (2 of 2)</b>



<b>HTML</b>

<b><html></b>
<b><title>Course Roster</title></b>
<b><body></b>
<b><center></b>
<b>  <h1>Course Roster</h1></b>
<b>  <h2>XML Programming</h2></b>
<b>  <h3>Department: EECS</h3></b>
<b>  <p></b>
<b>  <table border=2></b>
<b>    <tr></b>
<b>      <th>Teacher</th></b>
<b>      <td>Paul Thompson</td></b>
<b>    </tr><tr></b>
<b>      <th>Student<br>List</th></b>
<b>      <td>Ron Jones<br></b>
<b>          Uma Abingdon<br></b>
<b>          Lindsay Garmon</b>
<b>      </td></b>
<b>    </tr></b>
<b>  </table></b>
<b></center></b>
<b></body></b>
<b></html></b>

<b>XML</b>

<b><?xml version="1.0"?></b>
<b><course></b>
<b>  <name>Java Programming</name></b>
<b>  <department>EECS</department></b>
<b>  <teacher></b>
<b>    <name>Paul Thompson</name></b>
<b>  </teacher></b>
<b>  <student></b>
<b>    <name>Ron Jones</name></b>
<b>  </student></b>
<b>  <student></b>
<b>    <name>Uma Abingdon</name></b>
<b>  </student></b>
<b>  <student></b>
<b>    <name>Lindsay Garmon</name></b>
<b>  </student></b>


</div>
<span class='text_page_counter'>(77)</span><div class='page_container' data-page=77>



Figure 3-29. HTML and XML Key Differences XM3014.1


<i><b>Notes:</b></i>



HTML has a fixed tag set. In XML there is no predefined tag set. The allowed tags in an
XML document are defined in its DTD or Schema.


XHTML is an effort to correct the sins of HTML's past. It is a new XML technology that
consists of an HTML-specific DTD that defines the valid HTML tags.


Unfortunately, many of today's browsers will not recognize XHTML documents properly!




<b>HTML and XML Key Differences</b>



<b>HTML:</b> Predefined tags define how to present data.
<b>XML:</b> Defines its own tags to identify data.

<b>HTML:</b> Allows missing end tags. <b><br></b> and <b><p></b>
<b>XML:</b> Requires matching end tags. <b><name>test</name></b>

<b>HTML:</b> Attributes do not require quotes. <b><img src=myDog.jpeg></b>
<b>XML:</b> Attributes must be quoted. <b><book isdn="3432"></book></b>

<b>HTML:</b> Attributes do not require a value. <b><input type=radio checked></b>
<b>XML:</b> Attributes must have a value. <b><device type="radio" /></b>

<b>HTML:</b> Tolerates non-nested tags. <b><H1><center>Hello!</H1></center></b>
<b>XML:</b> Strict nesting and tag matching rules. <b><H1><center>Hello!</center></H1></b>

<b>HTML:</b> Browsers will almost always do a "best guess" on ill-formed HTML.
<b>XML:</b> XML Parsers will generate a fatal exception for well-formedness violations.

<b>HTML:</b> Does not support empty elements, but allows single start tags. <b><br> and <hr></b>
<b>XML:</b> Provides for empty elements. <b><device type="radio" /></b>

<b>HTML:</b> Is not case sensitive.


</div>
<span class='text_page_counter'>(78)</span><div class='page_container' data-page=78>



Figure 3-30. Checkpoint Questions (1 of 3) XM3014.1


<i><b>Notes:</b></i>






<b>Checkpoint Questions (1 of 3)</b>



1. Basic XML can be described as:



A. A hierarchical structure of tagged elements, attributes and text.


B. All the HTML tags plus a set of new XML only tags.



C. Object-oriented structure of rows and columns.


D. Processing instructions (PIs) for text data.


E. Textual data with tags for visual presentation.


2. Which of these XML fragments is not well-formed?



A. <root><class>XML</class></root>


B. <class><root>XML</root></class>


C. <root><class id="XML"></root>



D. <root>XML<class id="XML"/>XML</root>



</div>
<span class='text_page_counter'>(79)</span><div class='page_container' data-page=79>



Figure 3-31. Checkpoint Questions (2 of 3) XM3014.1


<i><b>Notes:</b></i>






<b>Checkpoint Questions (2 of 3)</b>



3. XML Comments are allowed (Select all that apply):


A. Before the XML Declaration



B. Anywhere



C. Between element tags


D. Before the root element


E. All of the Above



4. Which of these XML elements with attributes is not well-formed?


A. <name first='<b>Tony</b>' LAST="<b>Romeo</b>" />

B. <name name="<b>Tony</b>" NAME="<b>ROMEO</b>" />

C. <_name_ first-name="<b>Tony</b>" last-name="<b>Romeo</b>"/>

D. <b><name="Tony Romeo" /></b>

E. <name name="<b>first='Tony' last='Romeo</b>'" />



</div>
<span class='text_page_counter'>(80)</span><div class='page_container' data-page=80>



Figure 3-32. Checkpoint Questions (3 of 3) XM3014.1



<i><b>Notes:</b></i>





<b>Checkpoint Questions (3 of 3)</b>



5. Which of these comments regarding HTML and XML is not true?


A. HTML markup is focused on presentation.



B. XML markup is based on defining the data.


C. XML is based on HTML.



D. HTML tags are not case sensitive.


E. XML tags are case sensitive.



</div>
<span class='text_page_counter'>(81)</span><div class='page_container' data-page=81>



Figure 3-33. Unit Summary XM3014.1


<i><b>Notes:</b></i>



The status of various XML technologies (W3C Activities) can be found at:





<b>Unit Summary</b>



Having completed this unit, you should be able to:


Describe the basic rules of XML



Describe what it means for an XML document to be well-formed


List the components that make up an XML document



</div>
<span class='text_page_counter'>(82)</span><div class='page_container' data-page=82>



</div>
<span class='text_page_counter'>(83)</span><div class='page_container' data-page=83>


<b>Unit 4. WebSphere Studio Application Developer Overview</b>



<b>What This Unit is About</b>



This unit describes IBM WebSphere Studio Application Developer.
This is an overview of the broad features and organization of this
application development tool.


<b>What You Should Be Able to Do</b>



After completing this unit, you should be able to:


• Describe role-based development


• Describe the WebSphere Studio family of tools


• State the role of WebSphere Studio Workbench in the WebSphere
Studio tools


• Describe basic features of WebSphere Studio Application
Developer


<b>How You Will Check Your Progress</b>


Accountability:


• Review


<b>References</b>



</div>
<span class='text_page_counter'>(84)</span><div class='page_container' data-page=84>



Figure 4-1. Unit Objectives XM3014.1


<i><b>Notes:</b></i>





After completing this unit, you should be able to:


Describe role-based development




Describe the WebSphere Studio family of tools



State the role of WebSphere Studio Workbench in the WebSphere


Studio tools



Describe basic features of WebSphere Studio Application


Developer



Describe the major sets of tooling provided by WebSphere Studio


Application Developer



</div>
<span class='text_page_counter'>(85)</span><div class='page_container' data-page=85>



Figure 4-2. Roles-based Development XM3014.1


<i><b>Notes:</b></i>



There are four distinct development roles shown here:
<b> • Enterprise Integrator</b>


<b> • Bean Provider</b>


<b> • Application Assembler</b>
<b> • Page Producer</b>



Tooling needs to support each of these roles and permit easy management and integration
of the developed assets.




Developing Web applications requires more than just writing Java code.

<b>One tool, many user perspectives</b>

<b>Tool:</b> WebSphere Studio Tooling

<b>Role:</b> Enterprise Integrator. <b>Workarea:</b> Connection Data. <b>Products:</b> JavaBeans, EJBs.

<b>Role:</b> Bean Provider. <b>Workarea:</b> Business Logic Data. <b>Products:</b> JavaBeans, EJBs.

<b>Role:</b> Application Assembler. <b>Workarea:</b> Application Flow. <b>Products:</b> Servlets, JSPs, JavaBeans.

<b>Role:</b> Page Producer. <b>Workarea:</b> Page Layout and Content. <b>Products:</b> HTML, JSPs, MIME Types.

<b>Role:</b> Web Master. <b>Workarea:</b> Operational Environment. <b>Products:</b> Configuration Data, Site Usage Metrics.



</div>
<span class='text_page_counter'>(86)</span><div class='page_container' data-page=86>



Figure 4-3. Development Environment Goals XM3014.1



<i><b>Notes:</b></i>



The development environment should support the tasks performed by the developers.
It should be configurable and customizable for each individual developer.


Tools need to accommodate the rapid change in available technologies.




<b>Development Environment Goals</b>



Create a new Development Environment that will:


Be based on a new open, highly pluggable platform



Unified by a new tooling platform
Provide multilevel vendor integration


Provide a role-based development model where the assets are


the focus, not the tool



Provide a common repository solution for all assets and tools


Provide rapid support for new standards and technologies



</div>
<span class='text_page_counter'>(87)</span><div class='page_container' data-page=87>




Figure 4-4. IBM WebSphere Studio Family XM3014.1


<i><b>Notes:</b></i>



The IBM WebSphere Studio family is applied to a development platform (as opposed to a
set of development tools).




<b>IBM WebSphere Studio Family</b>



Provide a sturdy Web/Java development platform in the industry


Open tooling and run-time support



Open programming model



Provide in-depth Enterprise connectivity


EJB/J2EE Tooling



Enterprise Connectivity/Enterprise Access Builders


Provide integrated end-to-end development



Built-in Unit Test Environment


Incremental compilation



Flexible debugging support



</div>
<span class='text_page_counter'>(88)</span><div class='page_container' data-page=88>



Figure 4-5. Family Contents XM3014.1


<i><b>Notes:</b></i>



The flagship products in the WebSphere Studio brand (Version 5) are:
<b> • WebSphere Studio Application Developer</b>


<b> • WebSphere Studio Enterprise Developer</b>




<b>Family Contents</b>



WebSphere Studio Products (V5) :



WebSphere Studio Application Developer

Includes all of Site Developer functionality



Focused on development of Web Services, JSPs, Servlets, XML and
J2EE and database applications in a team environment


WebSphere Studio Enterprise Developer


Includes all of Application Developer functionality


Focused on Enterprise Integration using the J2EE Connector
Architecture



</div>
<span class='text_page_counter'>(89)</span><div class='page_container' data-page=89>



Figure 4-6. WebSphere Studio Workbench XM3014.1


<i><b>Notes:</b></i>



The Workbench is not a tool, that is, it is not in itself a product that is for sale. It is an open
and portable tool platform providing an integration technology. The Workbench can be
thought of as a set of Java frameworks and a set of development tools geared for tool
builders.




<b>WebSphere Studio Workbench</b>



Workbench is:



Not a tool, not a product, not for sale



A portable, universal tool platform and integration technology


The basis for an open source project



Workbench has:



Frameworks and services that enable tool builders to focus on tool building



Tools to help tool builders build tools


Java Development Tools (JDT)


</div>
<span class='text_page_counter'>(90)</span><div class='page_container' data-page=90>



Figure 4-7. WebSphere Studio Workbench Rationale XM3014.1


<i><b>Notes:</b></i>



The Workbench offers its greatest support for tool builders; making it easy to add plug-ins
(tools) to the overall IDE. This allows quick "time-to-market" of tools supporting emerging
technologies.


The underlying framework, which adds to the tool builder's productivity, gives end users a
common look and feel.




<b>WebSphere Studio Workbench Rationale </b>



End-users (Web application developers)



No more on-site integration, tools just work together


Common, easy-to-use interface




Common code, project, file management system


Same tool platform regardless of development role


Same look and feel regardless of tool vendor



Tool Builders



Seamless integration and interoperability with IBM AD tools and


WebSphere Software Platform



Seamless integration with other Workbench tools


Enterprise ready, off the shelf



Globalization, distributed debug, Team, SCM


Easy construction and deployment platform for tools



</div>
<span class='text_page_counter'>(91)</span><div class='page_container' data-page=91>



Figure 4-8. WebSphere Studio Application Developer XM3014.1


<i><b>Notes:</b></i>





Start the WebSphere Studio Application Developer



Start -> Programs -> IBM WebSphere Studio -> Application Developer 5.1
Workbench opens when you launch Application Developer


Within the workbench -- open the perspectives, views, and editors


</div>
<span class='text_page_counter'>(92)</span><div class='page_container' data-page=92>



Figure 4-9. Terminology XM3014.1


<i><b>Notes:</b></i>



The workbench window displays one or more perspectives that contain views and editors.
You can quickly switch between perspectives and views using the shortcut buttons which
appear on the shortcut bar.




[Screen capture of the workbench window, with these areas labeled: <b>Shortcut Bar</b>, <b>Navigator Pane</b>, <b>Source Pane</b>, <b>Outline Pane</b>, <b>Task Sheet</b>, <b>Editor</b>, <b>Views</b>]


</div>
<span class='text_page_counter'>(93)</span><div class='page_container' data-page=93>



Figure 4-10. Perspectives XM3014.1


<i><b>Notes:</b></i>





<b>Perspectives</b>



A group of related views and editors


To open a Perspective:



Select via Window -> Open Perspective


Some Perspectives:



Java: to develop and test Java programs



Server: to configure, run, and manage test servers



</div>
<span class='text_page_counter'>(94)</span><div class='page_container' data-page=94>



Figure 4-11. Views XM3014.1


<i><b>Notes:</b></i>



Views support editors and provide alternative presentations or navigation of the information
in your workbench. For example, the Navigator displays projects and other resources you
are working with.


A view might appear by itself, or stacked with other views in a tabbed notebook.
On Windows platforms, views can be undocked from the main workbench window and
appear as floating windows on the desktop. Undocked views can also be docked back into
the main workbench window.


More info on the Application Developer menu: Help --> Navigating Workbench




<b>Views</b>



A view displays specialized information. For example:


Bookmarks view displays all bookmarks in workbench.



A view might appear alone in a single pane, or several views might


be stacked within a single tabbed pane.



Views can be undocked/docked from the main workbench window.



Information updates on a view are saved immediately.



</div>
<span class='text_page_counter'>(95)</span><div class='page_container' data-page=95>



Figure 4-12. Editors XM3014.1


<i><b>Notes:</b></i>



The key thing to note about editors is the Open-save-close life cycle. You must explicitly
save the corresponding resource after making changes.




<b>Editors</b>



An editor is used to edit or browse a resource.



Modifications made in the editor follow an open-save-close life


cycle.



An editor can contribute to the Workbench menu bar.


Examples:



Java Source Editor




Web Deployment Descriptor Editor


Web Site Configuration Editor


JSP Editor



</div>
<span class='text_page_counter'>(96)</span><div class='page_container' data-page=96>



Figure 4-13. Online Help XM3014.1


<i><b>Notes:</b></i>



<b>Tip:</b> Press F1 to display an infopop (context-sensitive help) for the selected task.


To hide the navigation frame, click the Hide Navigation button on the Help view's toolbar.
<b>Note:</b> Your product may include more than one information set (a collection of
documentation topics). When you run a search, only the current information set is
searched. The current information set is shown in the drop-down list at the top of the Help
view. To search another information set, select it from the list, and run the search again.




<b>Online Help</b>



To learn more about the Workbench, select Help -> Help Contents



Select Application Developer information



Select Getting Started



</div>
<span class='text_page_counter'>(97)</span><div class='page_container' data-page=97>



Figure 4-14. Cheat Sheets XM3014.1


<i><b>Notes:</b></i>





<b>Cheat Sheets</b>



Guide developer through an application development process


Sequence of documented steps with relevant documentation


Displayed in workbench pane



Task-related tools are automatically launched or have launch icons in the cheat sheet



</div>
<span class='text_page_counter'>(98)</span><div class='page_container' data-page=98>




Figure 4-15. Application Developer Design Points XM3014.1


<i><b>Notes:</b></i>



Consolidating the tooling on one platform reduces the learning curve. For example,
with customizable perspectives, you can make Application Developer look similar
to other Java IDEs.




<b>Application Developer Design Points</b>



Performance



Customizable Perspectives



Promote role-based development (Web Developer, Java Developer, DBA, and so forth)



Reduces the learning curve



All perspectives work on the same project artifacts



Pluggable development environment


Java and ActiveX plug-in support



IBM and ISVs use the same plug-in architecture to extend the Workbench




Support for automated builds


Apache.org "Ant" support



</div>
<span class='text_page_counter'>(99)</span><div class='page_container' data-page=99>



Figure 4-16. Tooling XM3014.1


<i><b>Notes:</b></i>





<b>Tooling</b>



Java IDE


J2EE Tooling


Portlet Tooling


Data Tooling


Web Tooling


XML Tooling



</div>
<span class='text_page_counter'>(100)</span><div class='page_container' data-page=100>




Figure 4-17. Java IDE (1 of 3) XM3014.1


<i><b>Notes:</b></i>



A default JRE can be selected for the Workbench with Window -> Preferences. A
project-specific JRE is selected in the Launch Configuration dialog.


For more on hot method replace, refer to the foil at the end of the unit.




<b>Java IDE (1 of 3)</b>



Ships with SDK 1.3


Pluggable JRE Support



Defined at project and workbench level


Hot Method Replace



Dynamically replace Java classes during debug



Enabled when Application Server V5 runs in debug mode


Java Snippet Support (Scrapbook)



Task Sheet (All Problems Page)


Code Assist



Refactoring Support




Rename/move support for method/class/package


Fix all dependencies for renamed element



</div>
<span class='text_page_counter'>(101)</span><div class='page_container' data-page=101>



Figure 4-18. Java IDE (2 of 3) XM3014.1


<i><b>Notes:</b></i>



JDI: Java Debug Interface. The JDI is a high-level Java API providing information
useful for debuggers and similar systems needing access to the running state of a Java
virtual machine.




<b>Java IDE (2 of 3)</b>



Faster IDE



Smart Compilation



No lengthy compile/build/run steps



Pluggable Framework, in-place tool launching


Running class/code with errors




Precise reference searching


Text and Java-based



JDI-based debugger for local/remote debugging


Run code with errors



</div>
<span class='text_page_counter'>(102)</span><div class='page_container' data-page=102>



Figure 4-19. Java IDE (3 of 3) XM3014.1


<i><b>Notes:</b></i>



Starting with V5.1, Application Developer adds support for UML visualization. You can
select existing components and have the system generate the UML diagrams, or you
can start with a blank diagram and develop components from the diagram, or use a
combination of the two approaches. These features help developers understand existing
components better by producing UML that represents them, and also assist developers
in generating components based on the UML diagrams.


The entire class diagram or portions may be exported in bmp, jpg, or gif image formats.




<b>Java IDE (3 of 3)</b>



UML Class Diagram Editing and Visualization



Support for Java classes and EJB components



Diagrams generated from existing classes/components


New diagrams built and used to develop corresponding components



Typical Class Diagram Editor operations:


Create classes, packages, and interfaces


Create extends and implements relationships


Create methods and fields



Refactor components


Add EJB relationships


Add EJBQL queries



</div>
<span class='text_page_counter'>(103)</span><div class='page_container' data-page=103>



Figure 4-20. J2EE Tooling (1 of 2) XM3014.1


<i><b>Notes:</b></i>





<b>J2EE Tooling (1 of 2)</b>




J2EE 1.3



EJB 2.0 Support



Servlet 2.3, JSP 1.2 Support



J2EE Perspective provides views and editors for EJB/Servlet/JSP developers



Object-relational Mapping for EJBs



Top-down/Bottom-up/Meet-in-the-middle


All metadata exposed as XMI



No hidden metadata



EAR and WEB Deployment Descriptor Editors


Forms-based (no need to directly edit XML)


Source view also available



Struts Support



</div>
<span class='text_page_counter'>(104)</span><div class='page_container' data-page=104>



Figure 4-21. J2EE Tooling (2 of 2) XM3014.1


<i><b>Notes:</b></i>




WebSphere Studio provides a Web-based Universal Test Client where you can test your
Enterprise JavaBeans (EJBs) and other objects. Using this test client, you can test the
home and remote interface methods of your enterprise beans. By calling the methods and
passing user-defined arguments you can test methods to ensure that they work correctly.




<b>J2EE Tooling (2 of 2)</b>



Connector Projects



J2EE Connector Architecture (JCA) based


EJB Test Client – Universal Test Client



HTML-based



J2EE programming model


Built-in JNDI registry Browser


Unit Test Environment for J2EE



WebSphere Application Server V4 or V5 and Apache Tomcat


Create multiple projects with different Server configurations/instances



Allows for versioning of unit test environment


</div>
<span class='text_page_counter'>(105)</span><div class='page_container' data-page=105>



Figure 4-22. Portlet Tooling XM3014.1


<i><b>Notes:</b></i>



There are actually two related plug-ins. The first, WebSphere Portal Toolkit, ships with all
offerings of WebSphere Portal V4.x. The second, WebSphere Everyplace Toolkit, ships with
WebSphere Everyplace Server.


The test environment interacts with a developer configuration of WebSphere Portal Server
running on WebSphere Application Server Advanced Single Server Edition (AEs). This is
facilitated by the Remote WebSphere Server configuration.




<b>Portlet Tooling</b>



Wizards to create Portlet Application


Management of Deployment Descriptors



web.xml


portlet.xml



Multiple portlets per application




Integrated development and test environment


Full use of debugger



</div>
<span class='text_page_counter'>(106)</span><div class='page_container' data-page=106>



Figure 4-23. Data Tooling (1 of 2) XM3014.1


<i><b>Notes:</b></i>





<b>Data Tooling (1 of 2)</b>



Data Perspective



Provides views geared for DBAs to:


Create Databases


Create Tables/Views/Indexes/Keys
Generate DDL


Connect to and view existing relational database objects

Online and off-line support for working with databases



Metadata generated as XMI


SQL Query Builder and SQL Wizards



Visually construct SQL statements


SELECT, INSERT, UPDATE, DELETE supported
Metadata generated as XMI


</div>
<span class='text_page_counter'>(107)</span><div class='page_container' data-page=107>



Figure 4-24. Data Tooling (2 of 2) XM3014.1


<i><b>Notes:</b></i>





<b>Data Tooling (2 of 2)</b>



DB2 Stored Procedures



Create / Build and Register / Debug / Drop a stored procedure or User Defined Function (UDF)



SQL or Java-based


SQLJ Files



Create / Build / Debug SQLJ




</div>
<span class='text_page_counter'>(108)</span><div class='page_container' data-page=108>



Figure 4-25. Web Tooling (1 of 2) XM3014.1


<i><b>Notes:</b></i>



The Web Site Designer is new with 5.1. The configuration of the entire Web site is
maintained in the Web Site Configuration object. The choice of static or dynamic Web sites
and the Palette view are also newly introduced in release 5.1.


Examples of the drawer labels in the Palette view are: HTML, Free Layout, JSP, Java
Server pages, and Site Parts. The Site Parts include items such as Vertical and Horizontal
Navigation Bars, which help to maintain consistency in the look and feel of pages across
the site.




<b>Web Tooling (1 of 2)</b>



Web Site Designer



Provide site-level views of Web project



Graphical and detail tabular views of site structure


Page Designer




Provides page-level view of Web project components


HTML and JSP editing



WYSIWYG page design, source editing and page preview


Choice of static or dynamic Web project



Appropriate tool support loaded at project creation time


Palette View



</div>
<span class='text_page_counter'>(109)</span><div class='page_container' data-page=109>



Figure 4-26. Web Tooling (2 of 2) XM3014.1


<i><b>Notes:</b></i>





<b>Web Tooling (2 of 2)</b>



Multiple markup types (WML, cHTML) and pervasive device support


Built in Servlet, Database, and JavaBean Wizards



Built-in JSP Debugging



Site Style Sheet and Page Template Support



Links View



View an HTML/JSP page and all links referenced in the page



Parsing and link management update links when resources are renamed or moved



Jakarta JSP Taglibs



Specify in project Properties or New…Project to include



</div>
<span class='text_page_counter'>(110)</span><div class='page_container' data-page=110>



Figure 4-27. XML Tooling (1 of 3) XM3014.1


<i><b>Notes:</b></i>





<b>XML Tooling (1 of 3)</b>



XML Tooling provides integrated tools/perspectives to create XML-based components:



XML Source Editor


DTD/Schema validation



Code Assist for building XML documents

DTD Editor



Visual tooling for working with DTDs
Create DTDs from existing documents
Generate an XML Schema from a DTD


Generate JavaBeans for creating/manipulating XML documents
Generate an HTML form from a DTD


XML Schema Editor



</div>
<span class='text_page_counter'>(111)</span><div class='page_container' data-page=111>



Figure 4-28. XML Tooling (2 of 3) XM3014.1


<i><b>Notes:</b></i>





<b>XML Tooling (2 of 3)</b>



XSL Editor



Edit/create and validate XSL




XSL Debug and Transformation Tool


Trace XSL transformation



Examine relationships between the result node, the template rule, and the source node



XML to/from Relational Databases



Generate XML, XSL, XSD from an SQL Query


RDB/XML Mapping Editor



Map columns in a table to elements and attributes in an XML document
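The idea of mapping table columns to XML elements and attributes can be sketched outside the tool. The following is a minimal illustration in Python, not the output of the RDB/XML Mapping Editor; the `employee` table, its columns, and the choice of which column becomes an attribute are all assumptions made for the example.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical in-memory table standing in for a relational database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id TEXT, name TEXT)")
conn.execute("INSERT INTO employee VALUES ('X04913', 'John Smith')")

root = ET.Element("employees")
for emp_id, name in conn.execute("SELECT id, name FROM employee ORDER BY id"):
    row = ET.SubElement(root, "employee", id=emp_id)  # id column -> attribute
    ET.SubElement(row, "name").text = name            # name column -> child element

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Whether a column is better represented as an attribute or as a child element is a design decision; identifiers often become attributes, while free text usually fits an element.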



</div>
<span class='text_page_counter'>(112)</span><div class='page_container' data-page=112>



Figure 4-29. XML Tooling (3 of 3) XM3014.1


<i><b>Notes:</b></i>



XPath expressions can be used to search through XML documents, extracting information
from the nodes (such as an element or attribute).
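As a rough illustration of what such expressions look like, the sketch below uses Python's standard `xml.etree.ElementTree` module, which supports only a limited subset of XPath. The sample document is hypothetical and is not taken from the XPath Expressions Wizard.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
DOC = """
<employees>
  <employee id="X04913"><name>John Smith</name></employee>
  <employee id="X05120"><name>Mary McGoon</name></employee>
</employees>
"""

root = ET.fromstring(DOC)

# Select element nodes: every <name> anywhere below the root.
names = [node.text for node in root.findall(".//name")]

# Select an element by attribute value, then read that attribute.
match = root.find(".//employee[@id='X05120']")
matched_id = match.get("id")

print(names)
print(matched_id)
```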




<b>XML Tooling (3 of 3)</b>




XPath Expressions Wizard


Create XPath expressions


XML to XML Mapping Editor



</div>
<span class='text_page_counter'>(113)</span><div class='page_container' data-page=113>



Figure 4-30. Performance/Trace Tooling XM3014.1


<i><b>Notes:</b></i>





<b>Performance/Trace Tooling</b>



Built-in tooling helps developers isolate and fix performance problems in their Web applications



Profiling and Logging Perspective allows developers to:



Attach to local/remote agents for capturing performance data


JVM Monitoring



Heap
Stack



Class/Method details
Object References

Resource Monitors


Execution patterns
CPU usage


</div>
<span class='text_page_counter'>(114)</span><div class='page_container' data-page=114>



Figure 4-31. Team Development XM3014.1


<i><b>Notes:</b></i>





<b>Team Development</b>



Workbench integration occurs through a pluggable, adapter-based design:



A published framework API allows any SCM provider to add an adapter to integrate their SCM into the Workbench



Application Developer ships with a CVS plug-in



</div>
<span class='text_page_counter'>(115)</span><div class='page_container' data-page=115>



Figure 4-32. Web Services Tooling (1 of 2) XM3014.1


<i><b>Notes:</b></i>





<b>Web Services Tooling (1 of 2)</b>



Tools to Construct Web Services:


Discover



Browse UDDI registry to locate Web Service (Web Services
Explorer)


Generate JavaBean proxy for existing Web Services

Create / Transform



Create new Web Services from JavaBeans, databases

Build



Wrap existing artifacts such as SOAP and HTTP GET/POST
accessible services


Generate Java client proxy to Web Services



Maintain Web Services Description Language (WSDL) files (WSDL Editor)



Create new WSDL files


Create ports, port types, messages, bindings, operations, types within
WSDL files


</div>
<span class='text_page_counter'>(116)</span><div class='page_container' data-page=116>



Figure 4-33. Web Services Tooling (2 of 2) XM3014.1


<i><b>Notes:</b></i>





<b>Web Services Tooling (2 of 2)</b>



Tools to Construct Web Services:


Deploy



Deploy Web Services to WebSphere or Tomcat Servers

Test



Built-in test client allows for immediate testing of local and remote
Web Services



Publish



</div>
<span class='text_page_counter'>(117)</span><div class='page_container' data-page=117>



Figure 4-34. Standards Support XM3014.1


<i><b>Notes:</b></i>





<b>Standards Support</b>



EJB 2.0



J2EE 1.2 and 1.3


Servlet 2.3



JSP 1.2


JRE 1.3



Web Services Description Language (WSDL) 1.1



Web Services Interoperability (WS-I) Basic Profile 1.0


Apache SOAP 2.3




XML DTD 1.0 10/2000 Revision


XML Namespaces 1/99 Version


XML Schema 5/2001 Version



</div>
<span class='text_page_counter'>(118)</span><div class='page_container' data-page=118>



Figure 4-35. Review XM3014.1


<i><b>Notes:</b></i>





<b>Review</b>



Name some of the roles in Web application development.



What is the name of the Application Developer perspective you would usually use for EJB development?



</div>
<span class='text_page_counter'>(119)</span><div class='page_container' data-page=119>




Figure 4-36. Unit Summary XM3014.1


<i><b>Notes:</b></i>





<b>Unit Summary</b>



Having completed this unit, you should be able to describe:


The concept of Role-Based Development



The WebSphere Studio Family



The WebSphere Studio Workbench in the context of WebSphere Studio products



</div>
<span class='text_page_counter'>(120)</span><div class='page_container' data-page=120>



</div>
<span class='text_page_counter'>(121)</span><div class='page_container' data-page=121>



<b>© Copyright IBM Corp. 2001, 2004</b> <b>Unit 5. Document Type Definition (DTD)</b> <b>5-1</b>



<b><sub>Unit 5. Document Type Definition (DTD)</sub></b>



<b>What This Unit is About</b>



This unit covers XML 1.0 DTDs, which provide a way to define the
structure of an XML document. DTDs provide an additional level of
syntactic checking.


<b>What You Should Be Able to Do</b>



After completing this unit, you should be able to:
• Describe the reasons for using DTDs
• Define well-formed versus valid documents
• Define the grammar rules for an XML document using DTDs
• Describe the difference between non-validating and validating processors
• Describe examples of DTDs being used in business
• Describe best practices used in DTDs
• Define the limitations of DTDs
• Describe the status of the DTD in the industry


<b>How You Will Check Your Progress</b>


Accountability:


• Checkpoint


</div>
<span class='text_page_counter'>(122)</span><div class='page_container' data-page=122>




Figure 5-1. Unit Objectives XM3014.1


<i><b>Notes:</b></i>





<b>Unit Objectives</b>



After completing this unit, you should be able to:


Describe the reasons for using DTDs



Define well-formed versus valid documents



Define the grammar rules for an XML document using DTDs


Describe the difference between non-validating and validating processors



Describe examples of DTDs being used in business


Describe best practices used in DTDs



Define the limitations of DTDs



</div>
<span class='text_page_counter'>(123)</span><div class='page_container' data-page=123>




Figure 5-2. Review: Well-Formed XML XM3014.1


<i><b>Notes:</b></i>



This is a quick review of the important rules for XML well-formedness. It's important to
recognize that the well-formedness rules are very simple.




<b>Review: Well-Formed XML</b>



Has the optional XML declaration as its first line; an encoding declaration is required if the encoding is not UTF-8 or UTF-16.



<b><?xml version="1.0"?></b>



Matching start and end element tags with correct syntax.


<b><tag>data</tag></b>



Defines attributes within start tag and quotes correctly.


<b><tag attribute="x">data</tag></b>



Correct nesting of elements.


<b><employee></b>



<b> <name>John Smith</name></b>


<b> <id>X04913</id></b>



<b></employee></b>




...and Single Root and XML naming constraints
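A parser can check these rules mechanically. The sketch below is one way to do so with Python's standard `xml.etree.ElementTree` module; the helper name `is_well_formed` and the sample strings are assumptions made for illustration, modeled on the fragments above.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the text parses as well-formed XML, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical samples: one correct document and three rule violations.
samples = {
    "correct nesting": "<employee><name>John Smith</name><id>X04913</id></employee>",
    "crossed tags": "<employee><name>John Smith</employee></name>",
    "unquoted attribute": "<tag attribute=x>data</tag>",
    "two root elements": "<a/><b/>",
}

for label, text in samples.items():
    print(label, "->", is_well_formed(text))
```

Only the first sample parses; each of the others breaks one of the rules listed on the slide (nesting, attribute quoting, single root).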



</div>
<span class='text_page_counter'>(124)</span><div class='page_container' data-page=124>



Figure 5-3. Why Do We Need DTDs? XM3014.1


<i><b>Notes:</b></i>



The difficulty with well-formedness is that its rules are very simple.
Quite often we want to express more complicated constraints, such as:


The element <message> can have only two children, <greeting> and <farewell>, and
the two children must appear in that order.


The element <message> may have an optional urgent attribute.


What if we want the computer to be able to verify that an XML document meets these kinds
of constraints?


What if we want to have reusable pieces of text between two XML documents?




What if we want some additional constraints:



<b><message urgent="yes"> </b>


<b> <greeting>hi</greeting></b>



<b> <farewell>bye</farewell> </b>


<b></message></b>



Can only have two specific children (greeting, farewell).


The greeting child must precede the farewell child.


Message may have an optional urgent attribute.



What if we want to define and publish the structure an XML document is to conform to?



What if we want the computer to be able to verify that an XML document meets these kinds of constraints?



What if we want to have reusable pieces of text between two XML documents?
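Without a DTD, constraints like these have to be checked by application code. The sketch below hand-codes the three message rules in Python to show what that burden looks like; the function name `meets_constraints` is hypothetical, and a real project would normally state the rules declaratively in a DTD instead.

```python
import xml.etree.ElementTree as ET


def meets_constraints(xml_text):
    """Hand-coded check of the three <message> rules listed above."""
    root = ET.fromstring(xml_text)  # raises ParseError if not well-formed
    children = [child.tag for child in root]
    return (
        root.tag == "message"
        and children == ["greeting", "farewell"]  # exactly two children, in order
        and set(root.attrib) <= {"urgent"}         # urgent is the only attribute allowed
    )


ok = '<message urgent="yes"><greeting>hi</greeting><farewell>bye</farewell></message>'
swapped = "<message><farewell>bye</farewell><greeting>hi</greeting></message>"

print(meets_constraints(ok))
print(meets_constraints(swapped))
```

Both documents are well-formed, so a parser alone accepts them; only the extra check distinguishes the one that follows the intended structure.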



</div>
<span class='text_page_counter'>(125)</span><div class='page_container' data-page=125>



Figure 5-4. What Is a DTD? XM3014.1


<i><b>Notes:</b></i>



A Document Type Definition is essentially the framework or skeleton of an XML document.
It defines which elements are allowed, which attributes are allowed for each element, and
whether such elements or attributes are required or optional. XML Schemas (often referred
to as Schemas) extend the functionality of the DTD by adding data typing and other
enhancements. An XML document that conforms to its specified DTD or XML Schema is
said to be valid.


The DTD can be a separate file or it can also be embedded in the XML file. In fact, the DTD
contents can be split across an external file and the XML file.




Blueprint of a document's structure.


Contains a series of declarations.


DTDs



Can be a separate file from the XML document.


Can be embedded within the XML file.



Can be split between a separate file and the XML file.


DTDs define:



The elements that can or must appear.


How often the elements can appear.


How the elements can be nested.



Allowable, required and default attributes.


But note: the use of DTDs is optional.



An XML document that obeys the rules in a DTD is said to be <i>valid.</i>
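To make the embedded form concrete, the sketch below places a small DTD inside the XML file itself (the internal subset) and reads its declarations with Python's standard `xml.parsers.expat` module. The DTD shown is an assumption written for this illustration. Note that expat is a non-validating processor: it reads the declarations but does not reject a document that breaks them.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat

# Hypothetical DTD embedded in the document's internal subset.
DTD = """<?xml version="1.0"?>
<!DOCTYPE message [
  <!ELEMENT message (greeting, farewell)>
  <!ELEMENT greeting (#PCDATA)>
  <!ELEMENT farewell (#PCDATA)>
]>
"""
VALID = DTD + "<message><greeting>hi</greeting><farewell>bye</farewell></message>"
INVALID = DTD + "<message><farewell>bye</farewell><greeting>hi</greeting></message>"

declared_elements = []
parser = expat.ParserCreate()
# Called once for each <!ELEMENT ...> declaration the parser reads.
parser.ElementDeclHandler = lambda name, model: declared_elements.append(name)
parser.Parse(VALID, True)
print(declared_elements)

# A non-validating parser accepts the out-of-order document without complaint.
root = ET.fromstring(INVALID)
print([child.tag for child in root])
```

Catching the second document requires a validating processor, which the Python standard library does not provide.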




</div>
<span class='text_page_counter'>(126)</span><div class='page_container' data-page=126>



Figure 5-5. What is Allowed in a DTD? (1 of 2) XM3014.1


<i><b>Notes:</b></i>



Similar material can also be found in the WSAD IE 5.1 help file for DTD.
This page and the next list the declarations you may use in a DTD file.




<i><b>What Is Allowed in a DTD? (1 of 2)</b></i>



Element type declaration <!ELEMENT . . .>



A syntax for formally describing what an element type is and what type of data it can contain. Its basic format is: <!ELEMENT name (content-model)>, where name is the element type and (content-model) is the type of data the element can contain.


The pages that follow explain "content" more fully.



Attribute list <!ATTLIST . . .>



A list of attributes for an element. Attribute lists enable you to group together all related attributes for an element. For a document to be valid, every attribute used on an element must be declared in an attribute list for that element.
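As an illustration of an attribute list, the sketch below declares two attributes in an internal DTD subset and inspects them with Python's standard library. The attribute names, types, and defaults are assumptions chosen for the example. It also shows one visible effect of a declaration: a declared default value is supplied when the document omits the attribute.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat

# Hypothetical declarations: urgent has a default value, id is required.
DOC = """<!DOCTYPE message [
  <!ELEMENT message (#PCDATA)>
  <!ATTLIST message
            urgent (yes|no) "no"
            id     CDATA    #REQUIRED>
]>
<message id="m1">hi</message>
"""

declared = {}


def on_attlist(element, attribute, att_type, default, required):
    # Called once for each attribute declared in an <!ATTLIST ...>.
    declared[attribute] = {
        "element": element,
        "type": att_type,
        "default": default,
        "required": bool(required),
    }


parser = expat.ParserCreate()
parser.AttlistDeclHandler = on_attlist
parser.Parse(DOC, True)

root = ET.fromstring(DOC)
print(sorted(declared))
print(root.get("urgent"))  # not written in the document; comes from the DTD default
```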



</div>
<span class='text_page_counter'>(127)</span><div class='page_container' data-page=127>



Figure 5-6. What is Allowed in a DTD? (2 of 2) XM3014.1


<i><b>Notes:</b></i>





<i><b>What Is Allowed in a DTD? (2 of 2)</b></i>



Entity <!ENTITY . . .>



A shortcut used to represent complex strings or symbols that would otherwise be impossible, difficult or repetitive to include by hand.



There are built-in (predefined) entities, too.


Notation <!NOTATION . . .>



A means of associating a binary description, typically stored external to the DTD or XML file, with an entity or attribute. For example: to include an image such as a GIF or JPEG image.


<i>Comments: same syntax as in XML documents: <!-- any legal comment text --></i>
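The sketch below shows an entity acting as a shortcut. It declares a hypothetical internal entity named `company` and uses it alongside three of the predefined entities; Python's standard `xml.etree.ElementTree` parser expands all of them while reading the document.

```python
import xml.etree.ElementTree as ET

# The entity name and its replacement text are assumptions for this example.
DOC = """<!DOCTYPE note [
  <!ENTITY company "International Business Machines Corporation">
]>
<note>&company; &lt;ibm.com&gt; &amp; partners</note>
"""

root = ET.fromstring(DOC)
print(root.text)
```

The application sees only the expanded text, so changing the entity declaration updates every place that refers to it.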




</div>
<span class='text_page_counter'>(128)</span><div class='page_container' data-page=128>



Figure 5-7. XML and DTD Example XM3014.1


<i><b>Notes:</b></i>



Here's a simple example of an XML document on the left, and the DTD rules that describe
it on the right. We're not going to go into the details of the rules right here; that's what the
rest of this unit is about. We just wanted you to have an idea of how an XML file and its
related DTD might look.




<b>XML and DTD Example</b>


<b><?xml version='1.0'?></b>
<b><address></b>
<b><name></b>
<b><title>Mrs.</title></b>
<b><first-name>Mary</first-name></b>
<b><last-name>McGoon</last-name></b>
<b></name></b>


<b><street>1401 Main Street</street></b>
<b><city>Sheboygan</city></b>


<b><state>WI</state></b>


<b><zip>38472</zip></b>


<b><country>USA</country></b>
<b></address></b>


<b><!ELEMENT address (name, street+,</b>


<b> city, state, zip?, country)></b>
<b><!ELEMENT name (title?, first-name,</b>


</div>
<span class='text_page_counter'>(129)</span><div class='page_container' data-page=129>



Figure 5-8. What Is Allowed. . .Declaring Elements XM3014.1


<i><b>Notes:</b></i>



Here's our introduction to declaring elements. An element declaration begins with
<!ELEMENT followed by the name of the element being declared and then the content
model for the element.


Here's a sample declaration for an element called greeting that accepts #PCDATA (text),
along with two <greeting> elements that are valid according to this declaration. The second
<greeting> element is using a CDATA section to quote its contents.


Remember, element names must start with a letter or an underscore. Names beginning with
the letters xml (in any combination of upper and lower case) are reserved by the W3C; xsl,
xsi and xsd are conventional prefixes that you should also avoid, and future development may
reserve other "x--" prefixes. The colon character is also reserved (see Unit 5.
Namespaces). Letters, digits, periods, hyphens and underscores may follow the first character
(while technically legal, an underscore-period combination is not recommended).


#PCDATA (parsed character data) indicates that only text and entity references can be
included in the element. This data will be examined by the parser for entities and markup.
Parsed character data cannot contain the literal characters "&" or "<"; these must be
represented by their respective entities. The character ">" should be escaped as well, and
must be when it follows "]]" (refer to the slide Built-in Entities).
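
Here's a minimal sketch of how parsed character data reaches an application. It assumes Python's standard-library xml.etree.ElementTree as the parser (the course does not prescribe one), and the greeting text is made up for illustration.

```python
import xml.etree.ElementTree as ET

# "&" and "<" must be written as entity references inside parsed character data.
escaped = ET.fromstring("<greeting>Fish &amp; Chips &lt; 5</greeting>")

# A CDATA section lets the same characters appear literally.
quoted = ET.fromstring("<greeting><![CDATA[Fish & Chips < 5]]></greeting>")

# Both spellings deliver the same character data to the application.
print(escaped.text)
print(quoted.text)

# A raw "&" in parsed character data is a well-formedness error.
try:
    ET.fromstring("<greeting>Fish & Chips</greeting>")
    raw_ampersand_accepted = True
except ET.ParseError:
    raw_ampersand_accepted = False
```

Note that this parser checks well-formedness only; it does not check the document against the greeting declaration.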




Syntax:

<b><!ELEMENT elementName (contentModel)></b>

An element declaration in the DTD, and the corresponding element in the XML document:

Declaration (DTD):

<b><!ELEMENT greeting (#PCDATA)></b>

Corresponding valid XML fragments:

<b><greeting>Hello, World!</greeting></b>

<b><greeting><![CDATA[G'day!]]></greeting></b>



</div>
<span class='text_page_counter'>(130)</span><div class='page_container' data-page=130>



Figure 5-9. Element Content Models XM3014.1


<i><b>Notes:</b></i>



The content is the stuff in between the element's start and end tag.
There are four types of content models in XML 1.0 DTDs.


Types of DTD Content models
<b> • EMPTY</b>


<b> • ANY</b>


<b> • Element only - this includes child elements</b>
<b> • Mixed - this includes child elements and text</b>




<b>Element Content Models</b>



The content of an element is described by a content specification.


Types of DTD Content Models




EMPTY


ANY


Elements


Mixed



</div>
<span class='text_page_counter'>(131)</span><div class='page_container' data-page=131>



Figure 5-10. EMPTY Content Model XM3014.1


<i><b>Notes:</b></i>



The EMPTY content model is used for an element that will have no content whatsoever.
Note that such an element may have as many attributes as it likes.


To specify the EMPTY content model, provide the word EMPTY for the content model.
The two examples on the foil show two elements that are valid with an empty content
model.


Empty elements are not much use unless they have attributes. We'll learn more about
declaring attributes in a bit.


An EMPTY element can be very useful for testing snippets of XML. There is an example of
this later in this chapter.
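
As a small illustration, here are the two spellings of an empty element as an application sees them. This is a sketch that assumes Python's standard-library xml.etree.ElementTree, which does not read the DTD's content models; the id attribute is hypothetical and only shows that an empty element can still carry attributes.

```python
import xml.etree.ElementTree as ET

# Start tag immediately followed by end tag.
long_form = ET.fromstring("<placeholder></placeholder>")

# Empty-element tag, here carrying a (hypothetical) attribute.
short_form = ET.fromstring('<placeholder id="p1"/>')


def is_empty(element):
    """True when the element has no child elements and no character data."""
    return len(element) == 0 and not element.text


print(is_empty(long_form), is_empty(short_form))
print(short_form.attrib)
```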





<b>EMPTY Content Model</b>



Element is to have no data. It may have attributes.


Declaration (DTD):



<b><!ELEMENT placeholder EMPTY></b>



Valid XML:



<b><placeholder></placeholder></b>


or


<b><placeholder/></b>



</div>
<span class='text_page_counter'>(132)</span><div class='page_container' data-page=132>



Figure 5-11. ANY Content Model XM3014.1


<i><b>Notes:</b></i>



Contrary to what you might expect, the ANY content model does not allow you to put
anything you like between the start and end tags. If the content you supply contains
markup, it must be well-formed XML. Moreover, any elements that you use must themselves
be declared in the DTD. So for the third example on the foil, the <galaxy> element must be
declared in the DTD for the document.


To specify the ANY content model provide the word ANY for the content model.





<b>ANY Content Model</b>



Can contain ANY data or well-formed XML.


Elements you use must be declared in DTD.



Declaration (DTD):



<b><!ELEMENT universe ANY></b>

<b><!ELEMENT galaxy (#PCDATA)></b>

Valid XML fragments:

<b><universe/></b> or <b><universe></universe></b>

<b><universe>the whole universe</universe></b>



<b><universe></b>



</div>
<span class='text_page_counter'>(133)</span><div class='page_container' data-page=133>



Figure 5-12. Elements Content Model XM3014.1



<i><b>Notes:</b></i>



If the content of an element consists solely of child elements, the element is said to have
element content.


The element content model is specified by content model particles, which are combinations
of element names or other content model particles.

The table describes the operators that can be used to form these combinations.
In the table, a and b can each be either a content particle or an element name.

To create the content model of a followed by b, use the comma (,).
To create the content model of a or b, use the vertical bar (|).
To repeat a content particle at least once, use the plus sign (+).
To repeat a content particle zero or more times, use the asterisk (*).
To allow a content particle to be absent or present exactly once, use the question mark (?).
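
One way to get a feel for these operators is to notice that they behave like regular-expression operators applied to the list of child element names. The sketch below is our own illustration in Python, not how any particular parser works. It handles element-only content models (no #PCDATA, EMPTY or ANY), and the function names are hypothetical.

```python
import re


def content_model_to_regex(model):
    """Translate an element-only content model into a compiled regular expression.

    Each element name becomes a group matching that name plus one trailing
    space; "," (sequence) becomes plain concatenation; "|", "+", "*", "?"
    and parentheses mean the same thing in both notations.
    """
    compact = re.sub(r"\s+", "", model)
    pattern = re.sub(
        r"[A-Za-z_][\w.\-]*",
        lambda match: "(?:" + re.escape(match.group(0)) + " )",
        compact,
    )
    return re.compile(pattern.replace(",", ""))


def matches(model, children):
    """Check a list of child element names against a content model."""
    sequence = "".join(name + " " for name in children)
    return content_model_to_regex(model).fullmatch(sequence) is not None


print(matches("(a,b)", ["a", "b"]))   # sequence
print(matches("(a|b)", ["b"]))        # choice
print(matches("(a)+", []))            # one or more needs at least one a
print(matches("(a)?", ["a", "a"]))    # optional allows at most one a
```

The trailing space after each name keeps names such as a and ab from being confused with each other.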




<b>Elements Content Model</b>



<i>The elements content model is specified by content model particles.</i>


Content model particles are element names, as represented by a and b, and the occurrence
indicators below.

<b><!ELEMENT name (particle structure)></b>




<b>Note: a or b may be a composite particle, that is, a = (c,d)</b>


Particle Syntax


sequence <b><!ELEMENT name (a,b)></b>


choice <b><!ELEMENT name (a|b)></b>


one <b><!ELEMENT name (a)></b>


one or more <b><!ELEMENT name (a)+></b>


zero or more <b><!ELEMENT name (a)*></b>


</div>
<span class='text_page_counter'>(134)</span><div class='page_container' data-page=134>



Figure 5-13. Elements Content Examples (1 of 3) XM3014.1


<i><b>Notes:</b></i>



The first example specifies that <person> has a content model that accepts an <fname>
followed by an <lname> or an <lname> followed by an <fname>. The matches show all the
possible permutations.



<b>Declaration:</b>



<b><!ELEMENT person ((fname,lname)|(lname,fname))></b>


<b><!ELEMENT fname (#PCDATA)></b>


<b><!ELEMENT lname (#PCDATA)></b>


<b>Valid XML:</b>


<b><person></b>


<b> <lname>Smith</lname></b>
<b> <fname>John</fname></b>
<b></person></b>


<b>Also valid XML:</b>


<b><person></b>


<b> <fname></fname></b>


<b> <lname>Smith</lname></b>
<b></person></b>


<b>Also valid XML:</b>


<b><person></b>


<b> <fname/></b>


<b> <lname>Smith</lname></b>
<b></person></b>


</div>
<span class='text_page_counter'>(135)</span><div class='page_container' data-page=135>



Figure 5-14. Elements Content Examples (2 of 3) XM3014.1


<i><b>Notes:</b></i>



The second example specifies that an <order> is a sequence of at least one <order-item>
followed by a <delivery-address>, followed by an optional <order-date>.


The valid XML shows:

1. One <order-item>, a <delivery-address> and no <order-date>.
2. Two <order-item> elements, a <delivery-address> and no <order-date>.
3. Two <order-item> elements, a <delivery-address> and an <order-date>.




Declaration:


<b><!ELEMENT order (order-item+,delivery-address,order-date?)></b>
<b><!-- Child elements defined as containing #PCDATA --></b>


Valid XML fragments:


<b><order></b>
<b>  <order-item>item1</order-item></b>
<b>  <delivery-address>123 State Street</delivery-address></b>
<b></order></b>

<b><order></b>
<b>  <order-item>item3</order-item></b>
<b>  <order-item>item4</order-item></b>
<b>  <delivery-address>123 State Street</delivery-address></b>
<b></order></b>

<b><order></b>
<b>  <order-item>item5</order-item></b>
<b>  <order-item>item6</order-item></b>
<b>  <delivery-address>123 State Street</delivery-address></b>
<b>  <order-date>July 5, 2001</order-date></b>
<b></order></b>


</div>
<span class='text_page_counter'>(136)</span><div class='page_container' data-page=136>



Figure 5-15. Elements Content Examples (3 of 3) XM3014.1


<i><b>Notes:</b></i>




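This example says that a <phonebook> contains at least one <page>, and that each <page>
consists of a <heading> followed by at least one <entry> or <advert>. There may be more
than one entry or advert, and they may appear in any order after the heading.

The valid XML shows one <page> with a <heading>, one <entry> and one <advert>.

The first invalid example is invalid because its <page> has no <heading>. The second is
invalid because an empty <page> lacks both the required <heading> and the required
<entry> or <advert>.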
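
The declarations in this figure can also be checked mechanically. The following sketch is our own illustration in Python, not a real validating parser. It copies the two element-only content models from the figure into a dictionary, treats each model as a regular expression over child element names, and walks the tree. The names MODELS, to_regex and invalid_elements are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Element-only content models copied from the declarations in this figure.
# heading, entry and advert are (#PCDATA) and are not checked here.
MODELS = {
    "phonebook": "(page)+",
    "page": "(heading,(entry|advert)+)",
}


def to_regex(model):
    """Turn a content model into a regex over 'name ' tokens."""
    compact = re.sub(r"\s+", "", model)
    pattern = re.sub(
        r"[A-Za-z_][\w.\-]*",
        lambda match: "(?:" + re.escape(match.group(0)) + " )",
        compact,
    )
    return re.compile(pattern.replace(",", ""))


def invalid_elements(element):
    """Return the tags of all elements whose children violate their model."""
    problems = []
    model = MODELS.get(element.tag)
    if model is not None:
        sequence = "".join(child.tag + " " for child in element)
        if to_regex(model).fullmatch(sequence) is None:
            problems.append(element.tag)
    for child in element:
        problems.extend(invalid_elements(child))
    return problems


valid = ET.fromstring(
    "<phonebook><page><heading>The whole town</heading>"
    "<entry>John Smith, 555-1212</entry>"
    "<advert>Fred's Fish n' Chips - 123-4567</advert></page></phonebook>"
)
no_heading = ET.fromstring("<phonebook><page><entry/><entry/></page></phonebook>")
empty_page = ET.fromstring("<phonebook><page/></phonebook>")

print(invalid_elements(valid), invalid_elements(no_heading), invalid_elements(empty_page))
```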




<b>Declaration:</b>


<b><!ELEMENT phonebook (page)+></b>


<b><!ELEMENT page (heading, (entry|advert)+)></b>
<b><!ELEMENT heading (#PCDATA)></b>


<b><!ELEMENT entry (#PCDATA)></b>
<b><!ELEMENT advert (#PCDATA)></b>


<b>Valid XML fragment:</b>


<b><phonebook></b>
<b>  <page></b>
<b>    <heading>The whole town</heading></b>
<b>    <entry>John Smith, 555-1212</entry></b>
<b>    <advert>Fred's Fish n' Chips - 123-4567</advert></b>
<b>  </page></b>
<b></phonebook></b>


<b>Invalid XML fragments:</b>


<b><phonebook><page><entry/><entry/></page></phonebook></b>
<b><phonebook><page/></phonebook></b>


</div>
<span class='text_page_counter'>(137)</span><div class='page_container' data-page=137>



Figure 5-16. Mixed Content Model XM3014.1


<i><b>Notes:</b></i>



Elements that have the mixed content model can contain (parsed) character data. In
addition to the character data, mixed content models may also contain child elements
interspersed with the character data. If a mixed content model contains child elements, it
can specify which elements may appear, but the child elements can appear in any order,
and any number of times.


The valid XML shows:



1. An element with character data content only.


2. An element allowing a single child element in addition to the character data content.
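
To see how mixed content looks to an application, here is a short sketch that assumes Python's standard-library xml.etree.ElementTree. In that API, character data before the first child is stored in the parent's text, and character data following a child is stored in that child's tail.

```python
import xml.etree.ElementTree as ET

review = ET.fromstring(
    "<review>This is a review of some <product>car</product>"
    " that goes on for pages of regular text.</review>"
)
product = review.find("product")

leading_text = review.text     # character data before the child element
child_text = product.text      # character data inside <product>
trailing_text = product.tail   # character data after </product>

print(repr(leading_text), repr(child_text), repr(trailing_text))
```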




<b>Mixed Content Model</b>



<i>Mixed content: elements that contain character data optionally interspersed with child elements.</i>

Two Cases of Declarations:

<b><!ELEMENT product (#PCDATA)></b>

<b><!ELEMENT review (#PCDATA | product)*></b>

Valid XML fragments:

<b><review>review text goes here</review></b>

<b><review>This is a review of some <product>car</product> that goes on for pages of regular text.</review></b>



</div>
<span class='text_page_counter'>(138)</span><div class='page_container' data-page=138>




Figure 5-17. What Is Allowed. . .Declaring Attributes XM3014.1


<i><b>Notes:</b></i>



The syntax for declaring attributes looks like this:
<!ATTLIST followed by

elementName - the name of the element we are declaring the attribute for.
attributeName - the name of the attribute being declared.
attributeType - the data type of the attribute (see the Attribute Type table).
defaultDecl - the attribute's default behavior.

To declare multiple attributes, you can write multiple ATTLIST declarations or repeat the
(attributeName attributeType defaultDecl) part as necessary.




<b>Option 1:</b>

<i><b><!ATTLIST elementName</b></i>
<i><b>   attributeName attributeType defaultDecl></b></i>

<b>Option 2:</b>

<i><b><!ATTLIST elementName</b></i>
<i><b>   attributeName attributeType defaultDecl</b></i>
<i><b>   ...</b></i>
<i><b>   attributeName attributeType defaultDecl></b></i>



<b>What Is allowed. . .Declaring Attributes</b>



Attribute-list declarations



</div>
<span class='text_page_counter'>(139)</span><div class='page_container' data-page=139>



Figure 5-18. Organizational Note XM3014.1


<i><b>Notes:</b></i>





<b>Organizational Note</b>



The next several charts identify possible choices for each syntactical piece on the previous chart.

These charts are followed by examples.

We then continue with the concepts identified in the "What is allowed in a DTD" chart:

ENTITY

ENTITIES

NOTATION

Our intent is to provide you with solid, tested examples you can use on your own projects.



</div>
<span class='text_page_counter'>(140)</span><div class='page_container' data-page=140>



Figure 5-19. Attribute Types XM3014.1


<i><b>Notes:</b></i>



CDATA attributes contain character data. Whitespace crunching is not performed.
We covered this on previous charts.


The ID data type contains a string value that must be unique within the document: no two
elements may carry the same ID value. No element type may have more than one ID attribute
declared, and a declared ID attribute must be either #IMPLIED or #REQUIRED. ID valued
attributes can be combined with IDREF and IDREFS valued attributes to create
cross-references within an XML document.

An IDREF must contain a value that is specified in an ID-valued attribute elsewhere in the
document. IDREFS is a space-separated list of such ID values.

ENTITY is the name of an unparsed entity declared in the DTD, and ENTITIES is a
space-separated list of such names. (More on entities in a moment.)

NMTOKENs are strings composed of the characters that are legal in an XML element name.
They are not the same as XML element names, because an NMTOKEN may begin with
characters (such as a digit, a hyphen or a period) that are not allowed as the first character
of an XML element name.
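
The difference between a name and an NMTOKEN is easiest to see side by side. The sketch below is a deliberately simplified, ASCII-only approximation of the two XML 1.0 productions, written in Python. The real productions also admit many non-ASCII characters, and the helper names are hypothetical.

```python
import re

# ASCII-only simplification of the XML 1.0 Name and Nmtoken productions.
NAME_START = r"[A-Za-z_:]"
NAME_CHAR = r"[A-Za-z0-9_:.\-]"
NAME = re.compile(NAME_START + NAME_CHAR + "*")
NMTOKEN = re.compile(NAME_CHAR + "+")


def is_name(value):
    """True if the value could be an element name (or an ID value)."""
    return NAME.fullmatch(value) is not None


def is_nmtoken(value):
    """True if the value consists only of name characters."""
    return NMTOKEN.fullmatch(value) is not None


for candidate in ("e00001", "00001", "-draft", "medium large"):
    print(candidate, is_name(candidate), is_nmtoken(candidate))
```

Every name is also an NMTOKEN, but not the other way around: a value made only of digits is a legal NMTOKEN and an illegal name.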




<b>Attribute Type</b> <b>Description</b>

String Type

CDATA Used to declare an attribute whose value may contain arbitrary character data. Whitespace crunching is not done. This is the only attribute type whose values are not restricted to name characters.

Tokenized Type

NMTOKEN Used to declare an attribute whose value must conform to the definition of an NMTOKEN in XML 1.0 (one or more name characters).

NMTOKENS Allows multiple NMTOKENs separated by white space.

ID Used to declare an attribute whose value must be unique within the XML document.

IDREF, IDREFS The value of the attribute must refer to an ID value declared elsewhere in the document. IDREFS allows multiple IDREF values separated by white space (compare NMTOKENS).

ENTITY Used to declare an attribute whose value must correspond to the name of a declared ENTITY.

ENTITIES Allows multiple ENTITY names separated by whitespace.

Enumerated Type

NOTATION References a <!NOTATION declaration in the DTD.

ENUMERATION Attributes have a specified list of acceptable NMTOKEN values.


</div>
<span class='text_page_counter'>(141)</span><div class='page_container' data-page=141>



These were introduced earlier.


NOTATION valued attributes must contain the name of a NOTATION declared elsewhere in
the document. (More on NOTATION later).


</div>
<span class='text_page_counter'>(142)</span><div class='page_container' data-page=142>



Figure 5-20. Attribute Default Declarations XM3014.1



<i><b>Notes:</b></i>



Every attribute declaration must specify a default declaration. The possible values are:

#REQUIRED: Indicates that the attribute must occur; its type may still be an enumerated list.

#IMPLIED: Indicates that the attribute may be omitted, in which case no default value is
supplied.

#FIXED value: Indicates that this attribute, when used, has a single (fixed) value. This
value must appear immediately after the keyword and be in quotes.

default value: A quoted value with no keyword is used when the attribute is not present;
the element is then treated as if the attribute were present with the declared default value.
With an enumerated list (a list of choices in parentheses, each separated by an "or"
operator), the default value must be one of the listed choices.




<b>Attribute Default Declarations</b>



<b>Default Declaration</b> <b>Description</b>

#REQUIRED The attribute must be present

#IMPLIED The attribute does not need to be present and no default value was supplied

<i>"attribute-value"</i> If the attribute’s value is not present, <i>"attribute-value"</i> is supplied as a default value


</div>
<span class='text_page_counter'>(143)</span><div class='page_container' data-page=143>



Figure 5-21. Attribute Default Declaration Examples XM3014.1


<i><b>Notes:</b></i>



Here we've declared a few attributes with the various default types. Size has a default
value, type is required, collar is implied, and manufacturer is fixed.

Let's look at how the examples come out.

For the valid examples:

<shirt type="short"/> will also pick up the default value "large" for size, and the fixed value
"Levi" for manufacturer.

<shirt type="short" size="large"/> will pick up the fixed value "Levi" for manufacturer.

For the invalid examples:

<shirt/> is missing the required "type=" attribute.

<shirt type="short" size="medium large"/> is invalid because "medium large" isn't in the
enumerated value list for size.

<shirt type="short" manufacturer="Gap"/> is invalid because "Gap" isn't the fixed value for
manufacturer.
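
You can watch the defaults being picked up with a few lines of code. This sketch assumes Python's standard-library xml.etree.ElementTree, whose underlying parser reads ATTLIST declarations in the internal DTD subset and supplies declared defaults. It is not a validating parser, so it would not reject the invalid examples.

```python
import xml.etree.ElementTree as ET

# The declarations from this figure, placed in an internal DTD subset.
DTD = """<!DOCTYPE shirt [
<!ELEMENT shirt (#PCDATA)>
<!ATTLIST shirt type CDATA #REQUIRED>
<!ATTLIST shirt collar CDATA #IMPLIED>
<!ATTLIST shirt size (small|medium|large) "large">
<!ATTLIST shirt manufacturer CDATA #FIXED "Levi">
]>
"""

shirt = ET.fromstring(DTD + '<shirt type="short">cotton</shirt>')
attributes = dict(shirt.attrib)

# size and manufacturer are filled in from the DTD; collar is #IMPLIED,
# so it is simply absent when the document does not specify it.
print(attributes)
```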




<b>Attribute Default Declaration Examples</b>



<b>Declaration:</b>



<b><!ELEMENT shirt (#PCDATA)></b>


<b><!ATTLIST shirt type CDATA #REQUIRED></b>
<b><!ATTLIST shirt collar CDATA #IMPLIED></b>


<b><!ATTLIST shirt size (small|medium|large) "large"></b>
<b><!ATTLIST shirt manufacturer CDATA #FIXED "Levi"></b>


<b>Valid XML:</b>



<b><shirt type="short">cotton</shirt> </b>


<b><shirt type="short" size="large">wool</shirt> </b>


<b><shirt type="short" manufacturer="Levi">denim</shirt></b>
<b><shirt type="short sleeve" collar="button-down"></shirt> </b>



<b>Invalid XML:</b>


<b><shirt></shirt></b>


</div>
<span class='text_page_counter'>(144)</span><div class='page_container' data-page=144>



Figure 5-22. Attribute Alternate Declaration XM3014.1


<i><b>Notes:</b></i>





<b>Attribute Alternate Declaration </b>



<b>Here is the same information presented using the second form of the attribute declaration statement.</b>



<b>Declaration:</b>



<b><!ELEMENT shirt (#PCDATA)></b>


<b><!ATTLIST shirt size (small|medium|large) "large"</b>
<b> collar CDATA #IMPLIED</b>


<b> type CDATA #REQUIRED</b>
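
As a quick check that the two forms are interchangeable, the sketch below parses the same shirt element against both. It assumes Python's standard-library xml.etree.ElementTree, and it assumes that the combined declaration carries the same four attributes as the one-per-declaration form (including manufacturer), since the chart says the information is the same.

```python
import xml.etree.ElementTree as ET

# First form: one ATTLIST declaration per attribute.
SEPARATE = """<!DOCTYPE shirt [
<!ELEMENT shirt (#PCDATA)>
<!ATTLIST shirt type CDATA #REQUIRED>
<!ATTLIST shirt collar CDATA #IMPLIED>
<!ATTLIST shirt size (small|medium|large) "large">
<!ATTLIST shirt manufacturer CDATA #FIXED "Levi">
]>"""

# Second form: one ATTLIST declaration listing every attribute.
COMBINED = """<!DOCTYPE shirt [
<!ELEMENT shirt (#PCDATA)>
<!ATTLIST shirt size         (small|medium|large) "large"
                collar       CDATA #IMPLIED
                type         CDATA #REQUIRED
                manufacturer CDATA #FIXED "Levi">
]>"""

INSTANCE = '<shirt type="short">cotton</shirt>'


def attributes_seen(doctype):
    """Parse INSTANCE under the given DOCTYPE and return its attributes."""
    return dict(ET.fromstring(doctype + "\n" + INSTANCE).attrib)


print(attributes_seen(SEPARATE) == attributes_seen(COMBINED))
```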


</div>
<span class='text_page_counter'>(145)</span><div class='page_container' data-page=145>



Figure 5-23. Attribute Types: Tokenized Types: IDREFS Example XM3014.1

<i><b>Notes:</b></i>



This foil shows a declaration for an implied attribute of type IDREFS.


According to the syntax rules for IDs, an ID value must be a legal XML name, so it cannot
begin with a digit. That is why the serialNumber values begin with a letter.

Aside from naming rules, manager2 could list any values as long as each one is the ID of
some element in the document.

Consequently, an employee could be self-managed!

The uniqueness constraint applies to IDs, not to IDREFs, so the employee could be
self-managed twice: both manager1 and manager2 could have the same value.
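
The ID and IDREF rules can be sketched as a small check. This is our own illustration in Python, not what a validating parser literally does: the attribute roles are hard-coded from the ATTLIST declarations in this figure rather than read from the DTD, and the <staff> wrapper element and the function name are hypothetical.

```python
import xml.etree.ElementTree as ET

# Attribute roles copied from the ATTLIST declarations in this figure.
ID_ATTRIBUTE = "serialNumber"
IDREF_ATTRIBUTES = ("manager1", "manager2")   # IDREF and IDREFS


def check_references(root):
    """Return a list of problems: missing or duplicate IDs, unknown references."""
    problems = []
    seen = set()
    for element in root.iter("employee"):
        value = element.get(ID_ATTRIBUTE)
        if value is None:
            problems.append("missing ID")
            continue
        if value in seen:
            problems.append("duplicate ID " + value)
        seen.add(value)
    for element in root.iter("employee"):
        for attribute in IDREF_ATTRIBUTES:
            # split() lets a single IDREF be handled as a one-item IDREFS list.
            for reference in element.get(attribute, "").split():
                if reference not in seen:
                    problems.append("unknown ID " + reference)
    return problems


STAFF = ET.fromstring("""<staff>
<employee serialNumber="e00001">Joe Smith</employee>
<employee serialNumber="e00002">Bill Smith</employee>
<employee serialNumber="e00003" manager1="e00001">John Smith</employee>
<employee serialNumber="e00004" manager1="e00001"
          manager2="e00002 e00001">John Smith</employee>
</staff>""")

print(check_references(STAFF))
```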




<b>Attribute Types: Tokenized Types: </b>


<b>IDREFS Example</b>



<b>Syntax:</b>



<b><!ATTLIST elementName attributeName IDREF defaultDecl></b>



<b>Declaration:</b>



<b><!ELEMENT employee (#PCDATA)></b>


<b><!ATTLIST employee serialNumber ID #REQUIRED></b>
<b><!ATTLIST employee manager1</b> <b>IDREF</b> <b>#IMPLIED></b>
<b><!ATTLIST employee manager2</b> <b>IDREFS</b> <b>#IMPLIED></b>


<b>Valid XML fragment:</b>



<b><employee serialNumber="e00001">Joe Smith</employee></b>
<b><employee serialNumber="e00002">Bill Smith</employee></b>
<b><employee serialNumber="e00003" manager1="e00001">John </b>
<b>Smith</employee></b>


<b><employee serialNumber="e00004" manager1="e00001" </b>
<b>manager2="e00002 e00001">John Smith</employee> </b>


</div>
<span class='text_page_counter'>(146)</span><div class='page_container' data-page=146>



Figure 5-24. Attribute Types: Tokenized Types: ENTITY Example XM3014.1

<i><b>Notes:</b></i>



This foil shows a declaration for an implied attribute of type ENTITY.


As you can see, there are several concepts involved that we have yet to discuss, not the
least of which is "what is an 'entity'?"



You will find this and the next chart useful on the job when you need to create or
understand a DTD that uses these concepts.


The concepts themselves are described on subsequent charts.




<b>Attribute Types: Tokenized Types: </b>


<b>ENTITY Example</b>



<b>Syntax:</b>



<b><!ATTLIST elementName attributeName ENTITY defaultDecl></b>


<b>Declaration:</b>



<b><!ELEMENT employee (#PCDATA)></b>


<b><!ATTLIST employee companyName ENTITY #REQUIRED></b>
<b><!ENTITY company </b>


<b> SYSTEM /><b> NDATA txt></b>


<b><!NOTATION txt </b>


<b> SYSTEM "file:///C:/Windows/System32/notepad.exe"></b>


<b>Valid XML fragment:</b>




<b><employee companyName="company">Joe Smith</employee></b>


ENTITY is also used in its own right as another element of a DTD; this is
covered in subsequent charts. Here we focus on ENTITY as an attribute.
NDATA and NOTATION are concepts we have yet to discuss.


</div>
<span class='text_page_counter'>(147)</span><div class='page_container' data-page=147>



Figure 5-25. Attribute Types: Tokenized Types: ENTITIES Example XM3014.1

<i><b>Notes:</b></i>



ENTITIES provide a mechanism for including data from multiple sources.


As you can see there are several concepts involved that we have yet to discuss.
You will find this and the next chart useful on the job when you need to create or
understand a DTD that uses these concepts.


While DTDs may be lacking in several important aspects (listed later), they can still be very
complex!


Like the ENTITY example, we need to define several concepts for this chart to be
understood. The explanations follow.





<b>Attribute Types: Tokenized Types: </b>


<b>ENTITIES Example</b>



<b>Syntax:</b>


<b><!ATTLIST elementName attribName ENTITIES defaultDecl></b>


<b>Declaration:</b>


<b><!ELEMENT employee (#PCDATA)></b>


<b><!ATTLIST employee companyAtts ENTITIES #REQUIRED></b>
<b><!ENTITY company "IBM"></b>


<b><!ENTITY</b> <b>division "19"></b>


<b><!ENTITY</b> <b>branch " />


Valid XML fragment:


</div>
<span class='text_page_counter'>(148)</span><div class='page_container' data-page=148>



Figure 5-26. DTDs Part II XM3014.1


<i><b>Notes:</b></i>



DTDs Part II




</div>
<span class='text_page_counter'>(149)</span><div class='page_container' data-page=149>



Figure 5-27. Declaring ENTITYs: an Internal, Parsed ENTITYs Example XM3014.1

<i><b>Notes:</b></i>



Here is an example.


But we just told you that entities are related to separate storage units, and the entity
declaration that we just saw fit completely into the DTD. This kind of entity is called an
internal entity and is not associated with a separate physical storage unit. Let's look at how
to declare the same entity as an external entity, in a separate physical storage unit.
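
To see the replacement happen, here is a short sketch that assumes Python's standard-library xml.etree.ElementTree, which expands internal entities declared in the internal DTD subset. The DOCTYPE wrapper is our addition so that the declarations and the response element from this figure form one document.

```python
import xml.etree.ElementTree as ET

DOCUMENT = """<!DOCTYPE response [
<!ENTITY xmlExpert "Ron Smith">
<!ENTITY topic "XML Documents">
]>
<response>For additional help with &topic;,
Please contact &xmlExpert;.</response>"""

response = ET.fromstring(DOCUMENT)

# Collapse the line break so the processed text prints on one line.
processed = " ".join(response.text.split())
print(processed)
```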




<b>Declaring ENTITYs: an Internal, Parsed </b>


<b>ENTITYs Example</b>



<b>Syntax:</b>



<!ENTITY entityName "replacementText">


<b>Usage:</b>



&entityName;

<b>Declaration:</b>




<!ENTITY xmlExpert "Ron Smith">


<!ENTITY topic "XML Documents">


<b>Valid XML:</b>



<response>For additional help with &topic;,
Please contact &xmlExpert;.</response>


<b>Processed XML:</b>



</div>
<span class='text_page_counter'>(150)</span><div class='page_container' data-page=150>



Figure 5-28. Declaring ENTITYs: an External, Parsed ENTITYs Example XM3014.1

<i><b>Notes:</b></i>



When the entity declaration supplies a public identifier, the parser must understand how to
handle the "publicURI" identifier. This is traditionally used only when the parser has been
hard-coded to handle it, or when you will be creating your own parser to handle entity
replacement.


According to 4.2.2 (External Entities) of the XML 1.0 specification: "Definition: In
addition to a system identifier, an external identifier <b>may</b> [<i>emphasis added</i>] include a
public identifier. An XML processor attempting to retrieve the entity's content <b>may</b>
[<i>emphasis added</i>] use the public identifier to try to generate an alternative URI
reference. If the processor is unable to do so, it <b>must</b> [<i>emphasis added</i>] use the URI
reference specified in the system literal...."
Here is their example:


<!ENTITY open-hatch


SYSTEM " /><!ENTITY open-hatch


PUBLIC "-//Textuality//TEXT Standard open-hatch boilerplate//EN"




<b>Declaring ENTITYs: an External, Parsed </b>


<b>ENTITYs Example</b>



<b>Syntax:</b>



<!ENTITY entityName SYSTEM "systemURI">


<!ENTITY entityName PUBLIC "publicURI" "systemURI">*
<b>*refer to the Notes.</b>


<b>Declaration:</b>



<!ENTITY copyrightInfo SYSTEM "file:///c:/legal/boilerplate.txt">

boilerplate.txt file:



Copyright 2003, IBM. All rights reserved.



<b>Valid XML:</b>



<notices>This application was developed using WebSphere Studio.


&copyrightInfo;</notices>


<b>Processed XML:</b>



This application was developed using WebSphere Studio.


</div>
<span class='text_page_counter'>(151)</span><div class='page_container' data-page=151>



<!ENTITY hatch-pic


SYSTEM "../grafix/OpenHatch.gif"
NDATA gif >


Find out more at: />


Be aware that an external entity may not recursively reference itself, either directly or
indirectly.
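
Whether an external parsed entity is actually fetched is up to the parser and the application. The sketch below is one possible arrangement, using Python's standard-library xml.parsers.expat: the parser reports the system identifier to a handler, and the handler supplies the entity's content. The RESOURCES dictionary is a hypothetical stand-in for reading boilerplate.txt from disk, and the DOCTYPE wrapper is our addition.

```python
import xml.parsers.expat

# Stand-in for the file system: maps a system identifier to the entity's content.
RESOURCES = {
    "file:///c:/legal/boilerplate.txt": "Copyright 2003, IBM. All rights reserved.",
}

DOCUMENT = """<!DOCTYPE notices [
<!ENTITY copyrightInfo SYSTEM "file:///c:/legal/boilerplate.txt">
]>
<notices>This application was developed using WebSphere Studio.
&copyrightInfo;</notices>"""


def expand(document, resources):
    """Return the document's character data with external entities included."""
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    parser.CharacterDataHandler = chunks.append

    def external_entity_ref(context, base, system_id, public_id):
        # Parse the entity's replacement text with a child parser.
        child = parser.ExternalEntityParserCreate(context)
        child.CharacterDataHandler = chunks.append
        child.Parse(resources[system_id], True)
        return 1  # non-zero tells the parser the entity was handled

    parser.ExternalEntityRefHandler = external_entity_ref
    parser.Parse(document, True)
    return " ".join("".join(chunks).split())


print(expand(DOCUMENT, RESOURCES))
```

Resolving entities from a fixed table like this, rather than opening whatever location the document names, is also a sensible precaution when documents come from untrusted sources.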


</div>
<span class='text_page_counter'>(152)</span><div class='page_container' data-page=152>



Figure 5-29. Unparsed Entity Declarations: a Review XM3014.1


<i><b>Notes:</b></i>



Here's an example of unparsed entity use:


First we declare a notation called jpeg and associate it with a photoshop.exe somewhere
on the local machine.


Then we declare an external unparsed entity called prod17792 and add the NDATA jpeg
clause to specify the notation.


The rest of the DTD declares an empty element item with an ENTITY valued attribute
called picture.


You can see in the XML instance document that we supply prod17792 (the name of the
entity) as the value of the picture attribute of item.


This is how you can associate a piece of unparsed/binary data with a portion of an XML
document.
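
Here is a sketch of how an application might follow that chain from attribute value to entity to notation. It assumes Python's standard-library xml.parsers.expat, which reports notation and unparsed entity declarations through handlers. The parser never opens the image itself, and the function name resolve_pictures is hypothetical.

```python
import xml.parsers.expat

DOCUMENT = """<?xml version='1.0'?>
<!DOCTYPE item [
<!NOTATION jpeg SYSTEM "file:///c:/Program Files/Photoshop/photoshop.exe">
<!ENTITY prod17792 SYSTEM "prod17792.jpg" NDATA jpeg>
<!ELEMENT item EMPTY>
<!ATTLIST item picture ENTITY #REQUIRED>
]>
<item picture='prod17792'/>"""


def resolve_pictures(document):
    """Return one (file, helper application) pair per <item> element."""
    notations = {}   # notation name -> system identifier of the helper
    unparsed = {}    # entity name -> (system identifier, notation name)
    pictures = []

    def on_notation(name, base, system_id, public_id):
        notations[name] = system_id

    def on_unparsed_entity(name, base, system_id, public_id, notation_name):
        unparsed[name] = (system_id, notation_name)

    def on_start(tag, attributes):
        if tag == "item":
            system_id, notation_name = unparsed[attributes["picture"]]
            pictures.append((system_id, notations[notation_name]))

    parser = xml.parsers.expat.ParserCreate()
    parser.NotationDeclHandler = on_notation
    parser.UnparsedEntityDeclHandler = on_unparsed_entity
    parser.StartElementHandler = on_start
    parser.Parse(document, True)
    return pictures


print(resolve_pictures(DOCUMENT))
```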




<b>Unparsed Entity Declarations: a Review</b>



<b>Syntax:</b>



<b><!ENTITY entityName SYSTEM "URI" NDATA notationName></b>



<b>Declaration:</b>



<b><!NOTATION jpeg SYSTEM</b>


<b>"file:///c:/Program Files/Photoshop/photoshop.exe"></b>
<b><!ENTITY prod17792 SYSTEM "prod17792.jpg" NDATA jpeg></b>


<b><!ELEMENT item EMPTY></b>


<b><!ATTLIST item picture ENTITY #REQUIRED></b>


<b>Valid XML:</b>



<b><item picture='prod17792'/></b>
<b>Rules:</b>


Unparsed entities can only be external entities. In order to declare an unparsed
entity, you start with a regular external entity declaration and before the closing
angle bracket you insert NDATA and the name of a notation. This associates a
notation name with the unparsed entity.


</div>
<span class='text_page_counter'>(153)</span><div class='page_container' data-page=153>



Figure 5-30. Parameter ENTITYs XM3014.1



<i><b>Notes:</b></i>



Parameter entity replacement works like regular entity replacement. The parser substitutes
the replacement text, and then continues evaluating the DTD from the point of
replacement.

Parameter entities are entities that are meant to be used in the DTD. They are very useful
if you want to reuse portions of an attribute list declaration or parts of a complex content
model specification. Note that XML 1.0 allows a parameter entity reference inside a
declaration, as shown on this chart, only in an external DTD; in the internal DTD subset,
references may appear only between declarations.

Parameter entities are the primary tool that is available to help you structure a complex
DTD.




<b>Parameter ENTITYs</b>



<b>Parameter entities:</b>



Can only be used in the DTD



Allow reuse of attribute lists and complex content model definitions



<b>Syntax:</b>



<!ENTITY % parameterEntityName "replacementText">


<b>Usage:</b>




%parameterEntityName;


<b>Declaration:</b>



<!ENTITY % commonAtts "make CDATA #IMPLIED
                       model CDATA #IMPLIED">
<!ELEMENT phone (#PCDATA)>
<!ATTLIST phone %commonAtts;
                type (rotary | touch-tone) #IMPLIED>


<b>Processed DTD:</b>



<!ELEMENT phone (#PCDATA)>
<!ATTLIST phone make CDATA #IMPLIED
                model CDATA #IMPLIED
                type (rotary | touch-tone) #IMPLIED>
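The substitution that turns the declaration into the processed DTD can be imitated with plain text replacement. The sketch below is a toy illustration in Python, not a DTD parser: the function name `expand_parameter_entities` and the regular expressions are assumptions made for this example, and only internal parameter entities with double-quoted replacement text are handled.

```python
import re

DTD = '''<!ENTITY % commonAtts "make CDATA #IMPLIED
                       model CDATA #IMPLIED">
<!ELEMENT phone (#PCDATA)>
<!ATTLIST phone %commonAtts;
                type (rotary | touch-tone) #IMPLIED>'''

# Matches <!ENTITY % name "replacement text"> declarations.
PE_DECL = re.compile(r'<!ENTITY\s+%\s+([\w.-]+)\s+"([^"]*)"\s*>\s*')
# Matches %name; references.
PE_REF = re.compile(r'%([\w.-]+);')


def expand_parameter_entities(dtd_text):
    """Toy textual expansion of parameter entities (not a real DTD parser)."""
    entities = {}

    def record(match):
        # Remember the replacement text and drop the declaration itself.
        entities[match.group(1)] = match.group(2)
        return ""

    body = PE_DECL.sub(record, dtd_text)
    # Replace each %name; reference with the recorded replacement text.
    return PE_REF.sub(lambda m: entities[m.group(1)], body)


processed = expand_parameter_entities(DTD)
print(processed)
```

A real XML processor is stricter than this sketch. In particular, XML 1.0 only allows parameter entity references inside markup declarations in the external DTD subset, so a declaration like the ATTLIST above belongs in an external DTD file.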


</div>
<span class='text_page_counter'>(154)</span><div class='page_container' data-page=154>



Figure 5-31. Parameter ENTITYs - Another Example XM3014.1


<i><b>Notes:</b></i>



In this example the commonAtts parameter entity is used to represent common attributes
for the three different elements: car, computer and phone.




<b><!ENTITY % commonAtts</b>


<b> "typeID ID #REQUIRED</b>
<b> make CDATA #IMPLIED</b>
<b> model CDATA #IMPLIED"></b>


<b><!ELEMENT car (#PCDATA)></b>
<b><!ATTLIST car %commonAtts;></b>


<b><!ELEMENT computer (#PCDATA)></b>
<b><!ATTLIST computer %commonAtts;></b>


<b><!ELEMENT phone (#PCDATA)></b>
<b><!ATTLIST phone %commonAtts;</b>


<b> type (rotary|digital) #IMPLIED></b>


</div>
<span class='text_page_counter'>(155)</span><div class='page_container' data-page=155>



Figure 5-32. What Is Allowed. . . Declaring Comments XM3014.1



<i><b>Notes:</b></i>



To insert a comment in a DTD (or an XML document, for that matter), place the comment
text between <!-- and -->.


Comments cannot be nested. A space after <!-- and before --> is conventional and aids
readability, but XML does not require it. The character sequence "--" may not appear within
the comment text, so a comment also cannot end with --->. The same comment syntax is
used in HTML, XML and XSL documents.
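The comment rules can be checked against a parser. This is a small sketch in Python, assuming the standard-library `xml.etree.ElementTree` module; the helper name `is_well_formed` and the sample documents are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the standard-library parser accepts the document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# A comment without surrounding spaces.
tight = is_well_formed("<doc><!--no spaces needed--></doc>")
# A comment formatted over several lines.
multiline = is_well_formed("<doc><!--\n   This is also\n   a comment\n--></doc>")
# A comment containing "--" in its text.
double_hyphen = is_well_formed("<doc><!-- a -- b --></doc>")

print(tight, multiline, double_hyphen)
```

The first two documents should be accepted, while the third should be rejected because of the "--" inside the comment text.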




<b>What Is Allowed. . . Declaring Comments</b>



Use comments to clarify the semantics of elements and attributes for those who are using the DTD to define conforming XML documents.


Syntax:



<b><!-- This is a comment --></b>



Whitespace may be used to format the comment:


<!--
   This is also
   a comment
-->



</div>
<span class='text_page_counter'>(156)</span><div class='page_container' data-page=156>



Figure 5-33. Joining a DTD to an XML Instance XM3014.1


<i><b>Notes:</b></i>



Overriding (changing) the data contained in an XML instance may cause confusion for
other users of the instance.


The application of an XSL transform or a processor program (for example, DOM, SAX,
or similar) may be a better alternative.




<b>Joining a DTD to an XML Instance</b>



Three ways to inform an XML instance that there is an associated DTD:

1. Embed the DTD content inside the XML instance;
2. Provide the URI where the DTD file resides;
3. Use a combination of 1. and 2.

Best practice: if the DTD will override one or more attribute values (not advised), set the 'standalone' attribute in the XML declaration to 'no' as a warning to users that they need to be aware.

Include a comment in the XML for each attribute whose value may be changed by the DTD file.

If the DTD file is large, include a comment near the beginning for each element that overrides a value in the associated XML



</div>
<span class='text_page_counter'>(157)</span><div class='page_container' data-page=157>



Figure 5-34. External DTD Subset XM3014.1


<i><b>Notes:</b></i>



Up until now we've described some of the contents of a DTD without showing how to
actually place those declarations in a file so that they can be used to validate a document.
Recall that the DTD may be in an external file, embedded directly in an XML file, or split
across an external file and the XML file. Let's look at placing the DTD declarations in an
external file. The part of the DTD that goes into the external file is called the external DTD
subset.


The external DTD subset is an entity even though DTD declarations are not elements.
Therefore you need to supply a text declaration at the beginning of the external DTD
subset. This is especially important if the document and the DTD are going to be using
different character encodings.


In the example below, the file message.dtd contains the declarations of three elements,
message, greeting and farewell. The DTD may have its own encoding declaration (which
may be different from the encoding of documents that reference the DTD file).


The file hello.xml references this DTD using a DOCTYPE declaration. The DOCTYPE
declaration specifies the name of the root element of the document, message in the




DTD and XML as separate files:



<b>Filename: hello.xml</b>



<b><?xml version="1.0" encoding="UTF-8"?></b>
<b><!DOCTYPE message SYSTEM "message.dtd"></b>
<b><message></b>
<b>   <greeting>Hello, World!</greeting></b>
<b>   <farewell>Goodbye, World!</farewell></b>
<b></message></b>



<b>Filename: message.dtd</b>



<b><!ELEMENT message (greeting,farewell)></b>
<b><!ELEMENT greeting (#PCDATA)></b>
<b><!ELEMENT farewell (#PCDATA)></b>



</div>
<span class='text_page_counter'>(158)</span><div class='page_container' data-page=158>




DTD subset. This means that potentially any element declaration in the DTD can serve as
the root element. It is up to the DOCTYPE writer to specify this. Following the name of the
root element is the keyword SYSTEM followed by a URI reference that the local machine
can use to locate the actual file containing the external DTD.
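The parts of a DOCTYPE declaration described above can be read back from a parser. The sketch below uses Python's standard-library `xml.parsers.expat` module; the helper name `read_doctype` is an assumption for this example. The parser used here is non-validating and does not open message.dtd, it only reports what the DOCTYPE declaration says.

```python
import xml.parsers.expat

HELLO_XML = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE message SYSTEM "message.dtd">
<message>
  <greeting>Hello, World!</greeting>
  <farewell>Goodbye, World!</farewell>
</message>"""


def read_doctype(text):
    """Return the root element name and system identifier from the DOCTYPE."""
    info = {}
    parser = xml.parsers.expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal_subset):
        info["root"] = name
        info["system_id"] = system_id
        info["has_internal_subset"] = bool(has_internal_subset)

    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(text, True)
    return info


doctype = read_doctype(HELLO_XML)
print(doctype)
```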


</div>
<span class='text_page_counter'>(159)</span><div class='page_container' data-page=159>



Figure 5-35. Internal DTD Subset XM3014.1


<i><b>Notes:</b></i>



The previous example placed the DTD declarations in an external file. Recall that the DTD
may be in an external file, embedded directly in an XML file, or split across an external
file and the XML file. Let's now look at embedding the DTD declarations directly in the
XML file. The part of the DTD that appears inside the DOCTYPE declaration, between the
square brackets, is called the internal DTD subset.


Because the internal DTD subset is part of the document entity, it needs no text declaration
of its own. It shares the XML declaration and the character encoding of the document.


In the example below, the file hello.xml contains the declarations of three elements,
message, greeting and farewell, followed by the document content itself.


The file hello.xml carries its DTD inside the DOCTYPE declaration. The DOCTYPE
declaration specifies the name of the root element of the document, message in the




DTD and XML as a combined file:



<b>Filename: hello.xml</b>



<b><?xml version="1.0" encoding="UTF-8"?></b>
<b><!DOCTYPE message [</b>
<b>   <!ELEMENT message (greeting,farewell)></b>
<b>   <!ELEMENT greeting (#PCDATA)></b>
<b>   <!ELEMENT farewell (#PCDATA)></b>
<b>]></b>
<b><message></b>
<b>   <greeting>Hello, World!</greeting></b>
<b>   <farewell>Goodbye, World!</farewell></b>
<b></message></b>



</div>
<span class='text_page_counter'>(160)</span><div class='page_container' data-page=160>



DTD subset. This means that potentially any element declaration in the DTD can serve as
the root element. It is up to the DOCTYPE writer to specify this. With an internal subset,
the name of the root element is followed by the declarations themselves, enclosed in square
brackets, rather than by the keyword SYSTEM and a URI reference to an external file.
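Declarations in an internal subset are visible to any parser that reads the document, because they are part of the document entity. As a hedged sketch, the Python standard-library `xml.parsers.expat` module can list the element declarations it encounters; the helper name `declared_elements` is made up for this example.

```python
import xml.parsers.expat

HELLO_XML = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE message [
  <!ELEMENT message (greeting,farewell)>
  <!ELEMENT greeting (#PCDATA)>
  <!ELEMENT farewell (#PCDATA)>
]>
<message>
  <greeting>Hello, World!</greeting>
  <farewell>Goodbye, World!</farewell>
</message>"""


def declared_elements(text):
    """Return the names of the elements declared in the internal DTD subset."""
    names = []
    parser = xml.parsers.expat.ParserCreate()

    def on_element_decl(name, model):
        # model describes the content model; only the name is kept here.
        names.append(name)

    parser.ElementDeclHandler = on_element_decl
    parser.Parse(text, True)
    return names


elements = declared_elements(HELLO_XML)
print(elements)
```

Reporting the declarations is not the same as validating against them; the parser in this sketch does not check the document content against the content models.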


</div>
<span class='text_page_counter'>(161)</span><div class='page_container' data-page=161>



Figure 5-36. Split DTD Subsets XM3014.1


<i><b>Notes:</b></i>



The example on this foil shows a DTD with an entity called destination in both the internal
and external subsets. The internal subset is processed before the external subset, and when
an entity is declared more than once, the first declaration is the one that is used. The
declaration for destination in the internal subset therefore overrides the declaration in the
external subset, leaving the messages "Hello, cruel world" and "Goodbye, cruel world" after
entity expansion has occurred.


This allows local entity declarations in the internal subset to override entity declarations in
the external subset.


A best practice would be to include a comment drawing attention to the intent of this
internal subset to override a value set in the external subset.
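The override relies on the rule that the first declaration of an entity is the binding one. The sketch below demonstrates that rule with Python's standard-library `xml.etree.ElementTree`. As a simplifying assumption, both declarations are placed in the internal subset, with the second one standing in for the declaration that would be read later from message.dtd, so that no external file is needed.

```python
import xml.etree.ElementTree as ET

DOC = """<!DOCTYPE message [
  <!ENTITY destination "cruel world">
  <!-- stands in for the later declaration in message.dtd -->
  <!ENTITY destination "World">
]>
<message>
  <greeting>Hello, &destination;</greeting>
  <farewell>Goodbye, &destination;</farewell>
</message>"""

root = ET.fromstring(DOC)
greeting = root.findtext("greeting")
farewell = root.findtext("farewell")
print(greeting)
print(farewell)
```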





Embedding DOCTYPE declarations and the DTD within the XML file:


<b>Filename: hello.xml</b>


<b><?xml version="1.0" encoding="UTF-8"?></b>
<b><!DOCTYPE message SYSTEM "message.dtd" [</b>
<b> <!ENTITY destination "cruel world"></b>


<b> <!-- overrides destination in message.dtd --></b>


<b>]></b>


<b><message></b>


<b><greeting>Hello, &destination;</greeting></b>
<b><farewell>Goodbye, &destination;</farewell></b>
<b></message></b>


<b>Filename: message.dtd</b>


<b><!ELEMENT message (greeting, farewell)></b>
<b><!ELEMENT greeting (#PCDATA)></b>
<b><!ELEMENT farewell (#PCDATA)></b>
<b><!ENTITY destination "World"></b>


</div>
<span class='text_page_counter'>(162)</span><div class='page_container' data-page=162>



Figure 5-37. Whitespace and DTDs XM3014.1


<i><b>Notes:</b></i>



Whitespace is whitespace, isn't it? Not if you are a validating XML processor. There are two
kinds of whitespace:


Whitespace in #PCDATA (character data) content, between the same start and end tag pair.
This whitespace is significant. You can only tell which elements hold character data if you
have a DTD.


Whitespace in element content, that is, between the child elements of an element whose
content model allows only elements. This whitespace is ignorable.


Parsers report whitespace and ignorable whitespace differently. The parser does not
actually discard the ignorable whitespace; that is the application's job. But the parser can
use different data structures or callback routines to report ignorable versus non-ignorable
whitespace.




<b>Whitespace and DTDs</b>



Whitespace is whitespace, isn't it?

Not if you are a validating XML processor.

Whitespace in #PCDATA element content (between the same start and end tag pair)

Only know this if you have a DTD

Whitespace in non-character data content



</div>
<span class='text_page_counter'>(163)</span><div class='page_container' data-page=163>



Figure 5-38. Ignorable Whitespace Example XM3014.1


<i><b>Notes:</b></i>



This slide shows an example XML document and DTD, and shows which whitespace is
ignorable and which whitespace is not.


Again, it is up to the application to decide what to do about ignorable whitespace. A validating
XML processor will report all of the whitespace and indicate whether or not it is ignorable.
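Discarding ignorable whitespace is left to the application. The following sketch shows one way an application might do that in Python with the standard-library `xml.etree.ElementTree`, which keeps all whitespace and does not read content models. The set `ELEMENT_ONLY` is therefore an assumption copied by hand from the DTD (elements whose content model allows only child elements), and the function name is made up for this example.

```python
import xml.etree.ElementTree as ET

DOC = """<example>
<source-code>
   int i;
   i = 0;
</source-code>
</example>"""

# Assumption: taken by hand from the DTD, where example has the
# content model (source-code) and so contains only child elements.
ELEMENT_ONLY = {"example"}


def drop_ignorable_whitespace(root, element_only):
    """Remove whitespace-only text found directly inside element-only elements."""
    for element in root.iter():
        if element.tag not in element_only:
            continue
        # Whitespace before the first child.
        if element.text is not None and not element.text.strip():
            element.text = None
        # Whitespace after each child.
        for child in element:
            if child.tail is not None and not child.tail.strip():
                child.tail = None
    return root


root = drop_ignorable_whitespace(ET.fromstring(DOC), ELEMENT_ONLY)
source = root.find("source-code")
print(repr(root.text), repr(source.text), repr(source.tail))
```

The whitespace inside source-code is left untouched because that element holds character data.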


<b><?xml version='1.0'?></b>
<b><!DOCTYPE example [</b>
<b>   <!ELEMENT example (source-code)></b>
<b>   <!ELEMENT source-code (#PCDATA)></b>
<b>]></b>
<b><example>         <-- ignorable</b>
<b><source-code>     <-- not ignorable</b>
<b>   int i;         <-- not ignorable</b>
<b>   i = 0;         <-- not ignorable</b>
<b></source-code>    <-- ignorable</b>
<b></example></b>


</div>
<span class='text_page_counter'>(164)</span><div class='page_container' data-page=164>



Figure 5-39. Validating versus Non-validating Processors XM3014.1


<i><b>Notes:</b></i>



Validating processors are straightforward. The XML spec tells implementors exactly what a
validating processor must do (in fact, they must do everything).


Non-validating processors have options because the XML spec says that a non-validating
processor may do certain things, but is not required to do them. Unfortunately, parser
implementors have chosen different subsets of these optional items to implement, so
non-validating parsers can behave a little differently from one another.


A non-validating processor must check the document entity, including the internal subset,
for well-formedness. If there is an external DTD subset, it may or may not:


 • normalize attribute values using declarations from the external subset
 • replace internal entity text declared in the external subset
 • supply attribute defaults from the external subset


Since the behavior of non-validating processors is up to implementors, you need to be
careful when working with a non-validating processor if you have complicated attribute
values or use entities.
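The behavior described above can be observed with the non-validating parser that ships with Python. In this sketch, which assumes the standard-library `xml.etree.ElementTree`, the pets document and its DTD are made up for illustration. The document deliberately violates its own content model, yet the parser accepts it while still applying the attribute default declared in the internal subset.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the DTD allows exactly one pet and no stray element.
DOC = """<!DOCTYPE pets [
  <!ELEMENT pets (pet)>
  <!ELEMENT pet EMPTY>
  <!ATTLIST pet type CDATA "dog">
]>
<pets><pet/><pet type="cat"/><stray/></pets>"""

# No validity error is raised, because the parser only checks well-formedness.
root = ET.fromstring(DOC)

# The default from the internal subset is supplied for the first pet.
types = [pet.get("type") for pet in root.findall("pet")]
children = [child.tag for child in root]
print(types, children)
```

Had the ATTLIST declaration been in an external DTD subset, this parser would not have read it in its default configuration, which is exactly the kind of difference the notes warn about.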




<b>Validating versus Non-validating Processors</b>



Validating processors will validate an XML document using the


DTD.



Processors will report validity errors.



Some behavior of parsers is up to implementors.


Parsers have options:



They check document entity including internal subset.


They report well-formedness errors.



If there is an external DTD subset, they may or may not:



Normalize attribute values.


</div>
<span class='text_page_counter'>(165)</span><div class='page_container' data-page=165>



Figure 5-40. Example DTDs XM3014.1


<i><b>Notes:</b></i>



Many organizations are producing DTDs for various applications. Here are some examples:
<b> • cXML - </b>


<b> - cXML is a streamlined protocol intended for consistent communication of business </b>
documents between procurement applications, e-commerce hubs and suppliers.
The current standard includes documents for setup (company details and
transaction profiles), catalogue content, application integration (including the
widely-used PunchOut feature), original, change and delete purchase orders and
responses to all of these requests, as well as new order confirmation and ship notice
documents (cXML analogues of EDI 855 and 856 transactions).


<b> • RosettaNet - </b>


<b> - RosettaNet Partner Interface Processes (PIPs™) define business processes </b>
between trading partners. RosettaNet dictionaries provide a common set of
properties for PIPs™. The RosettaNet Business Dictionary designates the
properties used in basic business activities. RosettaNet Technical Dictionaries




<b>Example DTDs</b>



W3C XHTML


cXML



B2B between procurement applications, e-commerce hubs and


suppliers.



RosettaNet



Business processes between trading partners and properties for


defining products.



RDF Site Summary (RSS)


Syndicating news articles.


DocBook



Production of documentation which can be rendered into multiple


output formats.



Open Financial Exchange(OFX).



</div>
<span class='text_page_counter'>(166)</span><div class='page_container' data-page=166>




standards expedite the alignment of business processes between trading partners.
<b> • RSS - </b>


<b> - The RDF Site Summary format was originally developed by Netscape and is widely </b>
used across the World Wide Web for the purpose of syndicating news articles.
<b> • DocBook - </b>


<b> - DocBook is an XML version of the SGML DocBook DTDs that are widely used in the </b>
production of documentation which can be rendered into multiple output formats.
<b> • OFX - </b>


</div>
<span class='text_page_counter'>(167)</span><div class='page_container' data-page=167>



Figure 5-41. What's Wrong with DTDs? XM3014.1


<i><b>Notes:</b></i>



There are a number of problems with DTDs, which are listed on the chart.


These problems have led to the creation of a number of alternative languages for defining
the structure of XML grammars. The two leading contenders are W3C's XML Schema and
OASIS's RELAX NG.





<b>What's Wrong with DTDs?</b>



No type support.

#PCDATA can be any string of characters (except tags):

<b><!ELEMENT zip (#PCDATA)></b>

DTD syntax is different from XML syntax.

There are some constraints DTDs cannot easily express:

Element x can occur from 4 to 17 times

XML Schema addresses many of the limitations of DTDs.

XML Schema is now a W3C recommendation.

Support for W3C Schema is new.

Features include:

XML syntax, strong typing, constraints
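When a DTD is the only schema available, checks like these have to be written in the application. The sketch below is a hypothetical Python example using the standard library; the order element, the five-digit rule for zip and the function name `check_constraints` are all assumptions chosen to mirror the two limitations listed on the slide.

```python
import re
import xml.etree.ElementTree as ET


def check_constraints(root):
    """Apply checks that a DTD cannot easily express; return a list of problems."""
    problems = []

    # Typing: a DTD can only say (#PCDATA). Assume zip must be five digits.
    zip_text = root.findtext("zip", default="")
    if not re.fullmatch(r"\d{5}", zip_text):
        problems.append("zip must be five digits")

    # Occurrence range: element x must occur from 4 to 17 times.
    count = len(root.findall("x"))
    if not 4 <= count <= 17:
        problems.append("x must occur 4 to 17 times")

    return problems


good = ET.fromstring("<order><zip>10504</zip>" + "<x/>" * 4 + "</order>")
bad = ET.fromstring("<order><zip>ten-five</zip><x/></order>")

good_problems = check_constraints(good)
bad_problems = check_constraints(bad)
print(good_problems)
print(bad_problems)
```

With a schema language that supports typing and occurrence ranges, rules of this kind can move out of application code and into the schema.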


</div>
<span class='text_page_counter'>(168)</span><div class='page_container' data-page=168>



Figure 5-42. Status of DTDs XM3014.1


<i><b>Notes:</b></i>



DTDs are a part of the XML 1.0 recommendation. They are a stable technology and widely
adopted. As we noted earlier, XML processors vary in behavior because the specification
leaves some of the behavior of non-validating processors optional. Most XML parsers
available today come with the capability to use DTDs to validate documents.


XML Schema is the W3C approved replacement for DTDs, but this is a new technology
and has not reached broad usage at the time of this writing.




Part of XML 1.0


Widely adopted



Variations in XML processors in accordance with varying definition


of non-validating



</div>
<span class='text_page_counter'>(169)</span><div class='page_container' data-page=169>



Figure 5-43. Tooling XM3014.1


<i><b>Notes:</b></i>



The tooling for DTDs is pretty simple at the base. You can use the same editor that you
use to edit an XML file to edit a DTD, since both are plain text.


There are also many tools for working with DTDs.


IBM's alphaWorks has a number of useful tools.


The commercially available XML Spy is a popular graphical tool for working with XML,
DTDs and XML Schema.


There are many parsers that perform validation using a DTD. This is true of all of the
parsers available from the Apache Software Foundation.




Can use any text editor

As long as the editor supports Unicode or the chosen encoding.

WebSphere Studio Application Developer

Provides guided editing for DTDs and documents that reference them

Can generate a DTD from sample XML.

Write sample XML that illustrates all the ways you'll use the data

Supports document validation

Free IBM alphaWorks tools to help you

Many validating parsers:

Apache's Xerces for Java, C++, Perl

JAXP (Java API for XML Processing) parsers



</div>
<span class='text_page_counter'>(170)</span><div class='page_container' data-page=170>



Figure 5-44. Checkpoint Questions (1 of 2) XM3014.1


<i><b>Notes:</b></i>





<b>Checkpoint Questions (1 of 2)</b>



1. Which DTD entry correctly depicts a phone number with an optional area code?

<b>a.<!ELEMENT phone ((areaCode)*, prefix, body)> </b>
<b>b.<!ELEMENT phone (areaCode?, prefix, body )> </b>
<b>c.<!ELEMENT phone?(areaCode, prefix, body )> </b>
<b>d.<!ELEMENT phone (areaCode, (prefix, body)+)> </b>

2. Which of the following is a limitation of DTD?

a. Non-XML syntax.

b. Does not easily allow range of values (for example, 5 to 1000 elements).

c. Does not provide proper typing of values (for example, integer versus string).



</div>
<span class='text_page_counter'>(171)</span><div class='page_container' data-page=171>



Figure 5-45. Checkpoint Questions (2 of 2) XM3014.1


<i><b>Notes:</b></i>





<b>Checkpoint Questions (2 of 2)</b>



3. Which DTD entry correctly depicts an optional attribute named
<b>type for a pet element, that defaults to the value "dog"? </b>

<b>a.<!ATTLIST pet type CDATA #IMPLIED> </b>
<b>b.<!ATTLIST type dog CDATA #FIXED "dog"> </b>
<b>c.<!ATTLIST pet type CDATA "dog"> </b>



</div>
<span class='text_page_counter'>(172)</span><div class='page_container' data-page=172>



Figure 5-46. Unit Summary XM3014.1


<i><b>Notes:</b></i>



In this section you have learned about:
<b> • XML 1.0 DTDs</b>
<b> • Element declarations</b>
<b> • Attribute declarations</b>
<b> • Comments</b>
<b> • Entity declarations</b>
<b>    - General</b>
<b>    - Parameter</b>
<b> • Notation declarations</b>
<b> • The difference between validating and non-validating processors</b>
<b> • Example DTDs</b>
<b> • Best Practices</b>




<b>Unit Summary</b>




In this section you have learned:


XML 1.0 DTDs



Element declarations


Attribute declarations


Entity declarations



General
Parameter


Notation declarations


Comments



The difference between validating and non validating processors


Example DTDs



</div>
<span class='text_page_counter'>(173)</span><div class='page_container' data-page=173>



</div>
<span class='text_page_counter'>(174)</span><div class='page_container' data-page=174>



</div>
<span class='text_page_counter'>(175)</span><div class='page_container' data-page=175>


<b><sub>Unit 6. XML Namespaces</sub></b>




<b>What This Unit is About</b>



This unit describes the XML Namespaces Facility.


<b>What You Should Be Able to Do</b>



After completing this unit, you should be able to:
• Describe the reasons for using namespaces
• Describe the syntax used in namespaces


• Define and illustrate an example using namespaces
• Define myths about namespaces


• Define problems with namespaces


• List and define the best practices to use when using namespaces
• Describe the status of namespaces in the industry


<b>How You Will Check Your Progress</b>


Accountability:


• Checkpoint


</div>
<span class='text_page_counter'>(176)</span><div class='page_container' data-page=176>



Figure 6-1. Unit Objectives XM3014.1



<i><b>Notes:</b></i>





<b>Unit Objectives</b>



After completing this unit, you should be able to:


Describe the reasons for using namespaces


Describe the syntax used in namespaces



Define and illustrate an example using namespaces


Define problems with namespaces



</div>
<span class='text_page_counter'>(177)</span><div class='page_container' data-page=177>



Figure 6-2. Problem: Element and Attribute Names can be Ambiguous XM3014.1

<i><b>Notes:</b></i>



The double use of title in the example illustrates the need for a namespace solution in XML.
We need to be able to tell that the two title elements in this document are not the same
element. Even though the elements have the same name, they have different meanings to
the application. Using the context to disambiguate the two uses is not a generally
applicable solution.
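The ambiguity is easy to reproduce in code. In this short Python sketch, which assumes the standard-library `xml.etree.ElementTree`, a search for title returns both the book title and the author's courtesy title, and nothing in the names themselves tells them apart.

```python
import xml.etree.ElementTree as ET

CATALOG = """<catalogEntry>
  <book>
    <title>this book</title>
    <isbn>0001</isbn>
    <author>
      <title>Dr.</title>
      <lastName>Expert</lastName>
      <firstName>Iman</firstName>
    </author>
  </book>
</catalogEntry>"""

root = ET.fromstring(CATALOG)

# Every element named title, in document order.
titles = [element.text for element in root.iter("title")]
print(titles)
```

An application could inspect the parent of each title to tell them apart, but that ties the code to one particular document structure.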





Consider the following XML document:



How does an application know that:



The first occurrence of title is a book title.



The second occurrence of title is a person's title.



Need a way to eliminate the ambiguity for the purpose of


processing.



<b>Problem: Element and Attribute Names Can Be Ambiguous</b>

<b><catalogEntry></b>
<b>   <book></b>
<b>      <title>this book</title></b>
<b>      <isbn>0001</isbn></b>
<b>      <author></b>
<b>         <title>Dr.</title></b>
<b>         <lastName>Expert</lastName></b>
<b>         <firstName>Iman</firstName></b>
<b>      </author></b>
<b>   </book></b>
<b></catalogEntry></b>


</div>
<span class='text_page_counter'>(178)</span><div class='page_container' data-page=178>



Figure 6-3. Elaboration XM3014.1


<i><b>Notes:</b></i>



URIs are not actually used for lookup, only as identifiers. Their only purpose is to give the
namespace a unique name. Sometimes the URI is a pointer to a web page, which provides
information about the namespace, but this is not required. The URI is not looked up as part
of XML parsing or processing.


The application is responsible for deciding what to do with the names.




Some possibilities:

Adopt industry standard document formats and naming conventions

This approach works at the document level, a good example is ebXML, refer to

Problems:

No industry is an island, industries interact: who decides?

Naming standards down to the element/attribute level are too brittle

Use verbose element names, for example, bookTitle, courtesyTitle

Problem: naming becomes fundamentally difficult, there is no way to know if a name is already in use, further, the data and/or its model may not belong to the consuming application.

Solution

Use some name qualifier that is already established as unique, for example, a domain-name-qualified URI (uniform resource identifier).

Domain names are already managed and maintained as unique. This approach was developed into XML Namespaces.



</div>
<span class='text_page_counter'>(179)</span><div class='page_container' data-page=179>



Figure 6-4. Namespaces: The Big Idea XM3014.1


<i><b>Notes:</b></i>



Recall that URI stands for Uniform Resource Identifier.





<b>Namespaces: The Big Idea</b>



In concept, each element name and attribute name could be expressed as: URI+name, for example, <title> might become:

< />

There are two problems with this format:

1. It is not well-formed XML under the 1.0 specification.
2. It is a lot of typing.

If it were possible to create a synonym for the URI and replace occurrences of the URI with that synonym, the amount of typing would be reduced and, if handled correctly, the result would be compatible with XML 1.0

For example, specify books=" and code the element as <books:title>

This concept forms the basis of the XML Namespace specification.



</div>
<span class='text_page_counter'>(180)</span><div class='page_container' data-page=180>



Figure 6-5. XML Namespaces XM3014.1


<i><b>Notes:</b></i>




URIs are not actually used for lookup, only as identifiers. Their only purpose is to give the
namespace a unique name. Sometimes the URI is a pointer to a web page, which provides
information about the namespace, but this is not required. The URI is not looked up as part
of XML parsing or processing.


The application is responsible for deciding what to do with the names.




<b>XML Namespaces</b>



<b>The Namespace specification refers to these two-part names as Qualified Names or QNames</b>

For the purposes of XML namespaces, URIs are considered identical when they match character for character. If URIs are different, they represent different Namespaces.

<b>Note: There is no network lookup associated with the use of URIs </b>in this specification, it is a lexical convention only.

<i>URIs are not checked by the processor to ensure they exist.</i>


The Namespace specification deals with the mechanics of


associating a URI qualifier (aka namespace) with element and



</div>
<span class='text_page_counter'>(181)</span><div class='page_container' data-page=181>




<b>© Copyright IBM Corp. 2001, 2004</b> <b>Unit 6. XML Namespaces</b> <b>6-7</b>




Figure 6-6. Qualified Names (QNames) XM3014.1


<i><b>Notes:</b></i>



You can think of a QName like <books:title> as being equivalent to the following Clark
Notation:


{ />
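As an illustration of Clark Notation, Python's xml.etree.ElementTree happens to store element names in exactly this {URI}localPart form. The sketch below is only an example under that assumption; the URI is a placeholder, not the one used in the course slides.

```python
import xml.etree.ElementTree as ET

# 'http://example.com/books' is a placeholder URI chosen for this sketch.
root = ET.fromstring(
    '<books:title xmlns:books="http://example.com/books">Tom Sawyer</books:title>'
)

clark_name = root.tag  # the expanded name, written in Clark Notation
built_name = str(ET.QName("http://example.com/books", "title"))  # same name built by hand
print(clark_name)
```

The prefix does not appear in the expanded name at all; only the URI and the local part survive.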




<b>Qualified Names (QNames)</b>



QNames are used in place of element and attribute names.


QNames have a prefix and a local part. They look like this:

prefix:localPart

<books:title>


At all times, the prefix should be thought of as shorthand for the actual URI/namespace.



That is, the above is really < />

prefix



</div>
<span class='text_page_counter'>(182)</span><div class='page_container' data-page=182>



Figure 6-7. Declaring Namespaces (1 of 2) XM3014.1


<i><b>Notes:</b></i>



Note that you can declare a namespace on any element that you like, not just the root
element.




The syntax of a namespace declaration is:



<b><prefix:elementName xmlns:prefix='URI'/></b>


The following example declares the namespace, <b>assigns it a prefix of 'books', and identifies the book element as a member of that namespace.</b>



<b><books:book</b>
<b> xmlns:books=' /></b>


Attributes <i>may also be assigned to a namespace.</i> As with elements, attributes are prefixed as follows:



<b><books:book</b>
<b> xmlns:books=' /></b>
<b> books:hardcover='true'/></b>



<i>Attributes are not automatically in a namespace</i>
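The difference between a prefixed and an unprefixed attribute can be seen in a small Python sketch. It assumes xml.etree.ElementTree as the parser; the URI and the pages attribute are invented for illustration and do not come from the course example.

```python
import xml.etree.ElementTree as ET

BOOKS = "http://example.com/books"  # placeholder URI, not taken from the course text

# 'pages' is a hypothetical unprefixed attribute added for contrast.
doc = f'<books:book xmlns:books="{BOOKS}" books:hardcover="true" pages="224"/>'
book = ET.fromstring(doc)

# Attribute keys use the same '{uri}local' form as element names.
attribute_names = sorted(book.attrib)
print(attribute_names)
```

The prefixed attribute is reported with the namespace URI, while the unprefixed one is reported with its bare name, even though its element is in the books namespace.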



</div>
<span class='text_page_counter'>(183)</span><div class='page_container' data-page=183>



Figure 6-8. Declaring Namespaces (2 of 2) XM3014.1


<i><b>Notes:</b></i>



Now let's look at an example with nested elements.




<b>Declaring Namespaces (2 of 2)</b>



Suppose a document without namespaces looked like:


<b><book hardcover='true'></b>


<b> <title>Tom Sawyer</title></b>
<b></book></b>


One way to use a namespace is:



<b><books:book</b>
<b> xmlns:books=' /></b>
<b> books:hardcover='true'></b>
<b> <books:title</b>
<b> xmlns:books=' /></b>
<b> Tom Sawyer</b>
<b> </books:title></b>
<b></books:book></b>


It is clear that declaring the namespace on every single element becomes unwieldy (and error-prone).



</div>
<span class='text_page_counter'>(184)</span><div class='page_container' data-page=184>



Figure 6-9. Namespace Scope XM3014.1


<i><b>Notes:</b></i>



Note that every element or attribute name that is in the namespace has the appropriate
namespace prefix in front of it.




<b>Namespace Scope</b>




When a namespace prefix is declared, it remains in scope for:
 • The element where it is declared and the attributes of that element.
 • Descendant elements (and their attributes) of the element where it is declared, unless the prefix is redefined on a nested element.

QNames are still required; the namespace is not assumed.


Applying this technique, the previous example becomes:



<b><books:book</b>
<b> xmlns:books=' /></b>
<b> books:hardcover='true'></b>
<b> <books:title>Tom Sawyer</books:title></b>
<b></books:book></b>
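The scoping rules can be traced with a small Python sketch, assuming xml.etree.ElementTree as the parser. Both URIs and the note, text, and author elements are hypothetical additions that show what happens when a prefix is redefined on a nested element.

```python
import xml.etree.ElementTree as ET

# Both URIs are placeholders for this sketch.
doc = """
<books:book xmlns:books="http://example.com/books" books:hardcover="true">
  <books:title>Tom Sawyer</books:title>
  <books:note xmlns:books="http://example.com/other">
    <books:text>prefix rebound here</books:text>
  </books:note>
  <books:author>Mark Twain</books:author>
</books:book>
"""

def split_name(tag):
    """Split ElementTree's '{uri}local' form into (uri, local)."""
    uri, local = tag[1:].split("}", 1)
    return uri, local

# Map each element's local name to the URI its 'books' prefix resolved to.
resolved = {}
for el in ET.fromstring(doc).iter():
    uri, local = split_name(el.tag)
    resolved[local] = uri

for local, uri in resolved.items():
    print(local, "->", uri)
```

The redefinition applies only inside the note element; the author element that follows it picks up the outer binding again.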


</div>
<span class='text_page_counter'>(185)</span><div class='page_container' data-page=185>



Figure 6-10. Default Namespaces XM3014.1


<i><b>Notes:</b></i>



Once you have specified the default namespace, all unprefixed elements in the scope of the default declaration are assumed to be in the namespace specified as the default. It is very important to note that default namespace declarations only apply to element names, not attribute names.

In our example, we set the books namespace to the default and get rid of all the prefixes on element names. We still need the prefix on the attribute names because default namespaces don't apply to attributes.




<b>Default Namespaces</b>



For situations where a majority of elements are associated with the same namespace, a default namespace may be declared.



<b> Syntax:</b>

<b> <elementName xmlns='URI'/></b>



QNames are used to identify nested elements that are from a different namespace.

The default may be respecified for each element scope.

Nesting is respected; that is, respecification does not influence the outer scope containing the nested elements.
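A minimal Python sketch of these default-namespace rules follows, assuming xml.etree.ElementTree as the parser. The two URIs and the review, rating, and author elements are invented for illustration.

```python
import xml.etree.ElementTree as ET

BOOKS = "http://example.com/books"      # placeholder URIs for this sketch
REVIEWS = "http://example.com/reviews"

doc = f"""
<book xmlns="{BOOKS}" hardcover="true">
  <title>Tom Sawyer</title>
  <review xmlns="{REVIEWS}">
    <rating>5</rating>
  </review>
  <author>Mark Twain</author>
</book>
"""

root = ET.fromstring(doc)
element_names = [el.tag for el in root.iter()]  # expanded names in document order
attribute_names = list(root.attrib)             # the unprefixed attribute stays bare

for name in element_names:
    print(name)
print(attribute_names)
```

The nested respecification on review changes the default only for review and rating; author, which follows it in the outer scope, is still in the books namespace, and the unprefixed hardcover attribute is in no namespace at all.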



</div>
<span class='text_page_counter'>(186)</span><div class='page_container' data-page=186>



Figure 6-11. Example - Default Namespaces XM3014.1


<i><b>Notes:</b></i>



The result of these apparent duplications is to put the hardcover attribute inside a
namespace.




<b>Example - Default Namespaces</b>



<b><book xmlns=' /></b>
<b> xmlns:books=' /></b>
<b> books:hardcover='true'></b>
<b> <title>Tom Sawyer</title></b>
<b></book></b>



</div>
<span class='text_page_counter'>(187)</span><div class='page_container' data-page=187>



Figure 6-12. Documents with Multiple Namespaces XM3014.1


<i><b>Notes:</b></i>




All that we did to enable this was add two more namespace declarations, and then add the
new elements and use the appropriate namespace prefix.

In the case of the isbn element, we declared the namespace that it needed on the element
itself -- you can declare namespaces on any element that you like, not just the root
element. When you do this, the declaration applies only to that element and its descendants. You
can also change the default namespace for a particular element by redefining the default
namespace on that element. Again, the scope is the element that the declaration is
attached to, together with its descendants.




<b>Documents with Multiple Namespaces</b>



Document with three namespaces:


<b><book</b>
<b> xmlns=' /></b>
<b> xmlns:amazon=' /></b>
<b> <title>Tom Sawyer</title></b>
<b> <isbn xmlns=' /></b>
<b> 0140390839</b>
<b> </isbn></b>
<b> <amazon:skuNo>A25</amazon:skuNo></b>
<b></book></b>
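One way an application might read such a document is sketched below in Python, assuming xml.etree.ElementTree and its prefix-map argument to find(). All three URIs are invented stand-ins for the ones elided above, and the lookup prefixes (bk, id, shop) are deliberately different from those in the document.

```python
import xml.etree.ElementTree as ET

# All three URIs are invented for this sketch.
doc = """
<book xmlns="http://example.com/books"
      xmlns:amazon="http://example.com/amazon">
  <title>Tom Sawyer</title>
  <isbn xmlns="http://example.com/isbn">0140390839</isbn>
  <amazon:skuNo>A25</amazon:skuNo>
</book>
"""

# The prefixes used for lookup are local to this code; only the URIs must match.
ns = {
    "bk": "http://example.com/books",
    "id": "http://example.com/isbn",
    "shop": "http://example.com/amazon",
}

root = ET.fromstring(doc)
title = root.find("bk:title", ns).text
isbn = root.find("id:isbn", ns).text.strip()
sku = root.find("shop:skuNo", ns).text
print(title, isbn, sku)
```

The lookups succeed even though the document uses a default namespace and the amazon prefix, because matching is done on the namespace URI rather than on the prefix text.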



</div>
<span class='text_page_counter'>(188)</span><div class='page_container' data-page=188>



Figure 6-13. Elements with No Namespace XM3014.1


<i><b>Notes:</b></i>



The unprefixed <title> element is in no namespace, because no default namespace is declared.


In order to repair this example, we need to prefix title with the books namespace prefix
again.


WRONG!




<b>Elements with No Namespace</b>



What happens to the previous example with no default namespace?



<b><book</b>
<b> xmlns=' /></b>
<b> xmlns:amazon=' /></b>
<b> <title>Tom Sawyer</title></b>
<b> <isbn xmlns=""></b>
<b> 0140390839</b>
<b> </isbn></b>
<b> <amazon:skuNo>A25</amazon:skuNo></b>
<b></book></b>


The xmlns="" syntax resets the default namespace for the scope in which it occurs. The <b><isbn></b> element is not in a namespace.
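The effect of xmlns="" can be observed with a short Python sketch, assuming xml.etree.ElementTree as the parser. The books URI is a placeholder for the one elided above.

```python
import xml.etree.ElementTree as ET

# 'http://example.com/books' is a placeholder default namespace.
doc = """
<book xmlns="http://example.com/books">
  <title>Tom Sawyer</title>
  <isbn xmlns="">0140390839</isbn>
</book>
"""

# Elements in a namespace appear as '{uri}local'; others appear as a bare name.
tags = [el.tag for el in ET.fromstring(doc).iter()]
for tag in tags:
    print(tag)
```

Only isbn loses its namespace; book and title keep the default declared on the root element.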



</div>
<span class='text_page_counter'>(189)</span><div class='page_container' data-page=189>



Figure 6-14. Attributes and Namespaces XM3014.1


<i><b>Notes:</b></i>



There are two interacting rules that affect attributes and namespaces:
<b> • Attributes are not affected by a default namespace declaration.</b>
<b> • Attributes on a single element must be unique.</b>


In the example above, the first <invalid> element is not well-formed because it has two unprefixed att
attributes. In the second <invalid> element, the two attributes are the same because ns1 and
ns2 are two prefixes for the same namespace URI. Therefore, the two attribute names are
identical.

It should be obvious that the first <valid> element is valid -- a and b are unprefixed, and a is
not the same as b. The second <valid> element is valid because the unprefixed attribute a
is in no namespace (remember that default namespace declarations don't affect attributes),
and the ns1:a attribute is in the namespace bound to ns1 -- the two names are different.




<b>Attributes and Namespaces</b>



Attributes are not affected by a default namespace declaration.


Attributes on a single element must be unique.



<b><bad xmlns:ns1=""</b>
<b> xmlns:ns2=""></b>
<b> <invalid att="1" att="2" /></b>
<b> <invalid ns1:att="1" ns2:att="2" /></b>
<b></bad></b>
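The two attribute rules can be exercised with a namespace-aware parser. The Python sketch below assumes xml.etree.ElementTree (which is built on expat) and a single placeholder URI bound to both prefixes; the element names r and e are invented for brevity.

```python
import xml.etree.ElementTree as ET

# Both prefixes are bound to the same placeholder URI.
NS = 'xmlns:ns1="http://example.com/ns" xmlns:ns2="http://example.com/ns"'

def is_accepted(xml_text):
    """Return True if the namespace-aware parser accepts the document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

results = {
    "two unprefixed att": is_accepted(f'<r {NS}><e att="1" att="2"/></r>'),
    "ns1:att and ns2:att": is_accepted(f'<r {NS}><e ns1:att="1" ns2:att="2"/></r>'),
    "a and b": is_accepted(f'<r {NS}><e a="1" b="2"/></r>'),
    "a and ns1:a": is_accepted(f'<r {NS}><e a="1" ns1:a="2"/></r>'),
}

for case, accepted in results.items():
    print(case, "->", "accepted" if accepted else "rejected")
```

The second case is rejected only because the parser compares expanded names; a parser without namespace support would see ns1:att and ns2:att as two different attribute names.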



</div>
<span class='text_page_counter'>(190)</span><div class='page_container' data-page=190>



Figure 6-15. Namespace Processing XM3014.1


<i><b>Notes:</b></i>





How does an XML parser deal with namespaces?
 • It needs a namespace-aware API, such as SAX2 or DOM Level 2.
 • The parser simply reports the prefix, localName, and URI associated with the element or attribute.
 • It's up to your application to decide what to do.
 • There are no validation rules associated with namespaces; validation depends on XML Schema, DTD, or whatever grammar description language you are using.
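As a rough illustration of what a SAX2-style parser reports, the sketch below uses Python's xml.sax module with namespace processing switched on. The handler class, the report function, and the URI used to call it are assumptions made for this example. Prefix bindings are collected through startPrefixMapping, and element names arrive as (URI, localName) pairs.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces

class NamespaceReporter(ContentHandler):
    """Hypothetical handler that records what the parser reports."""

    def __init__(self):
        super().__init__()
        self.prefixes = []  # (prefix, uri) pairs, in the order they are declared
        self.elements = []  # (uri, localName) pairs, in document order

    def startPrefixMapping(self, prefix, uri):
        self.prefixes.append((prefix, uri))

    def startElementNS(self, name, qname, attrs):
        self.elements.append(name)  # name is a (uri, localName) tuple

def report(xml_text):
    """Parse xml_text with namespace processing on and return the filled handler."""
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    handler = NamespaceReporter()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_text.encode("utf-8")))
    return handler
```

The handler only records what it is told. Deciding what a given namespace means is left to the application, as the slide says.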



</div>
<span class='text_page_counter'>(191)</span><div class='page_container' data-page=191>



Figure 6-16. Example: Use of Namespaces XM3014.1


<i><b>Notes:</b></i>



Here's an example of namespaces in use:


Here we have an imaginary record that might be used in an airline's airplane fleet inventory.
For each airplane, we want to know which manufacturer provided each major part of the airplane.


This example shows how we could use namespaces to identify which components came
from which manufacturers.


An application that processed this document could then use the namespaces to determine
which manufacturer's diagnostic equipment would be needed to perform a full maintenance
cycle on a particular airplane.


While not required, it is a best practice to collect all the namespace definitions in one place,
especially in large, composite files.
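The slide's own airplane document is not reproduced in these notes, so the Python sketch below uses an invented stand-in to show the idea. The manufacturer URIs, element names, and the manufacturer_namespaces helper are all hypothetical; the namespace declarations are collected on the root element, as the best practice above suggests.

```python
import xml.etree.ElementTree as ET

# Manufacturer URIs and element names are all invented for illustration.
doc = """
<airplane xmlns="http://example.com/airline/fleet"
          xmlns:eng="http://example.com/engine-maker"
          xmlns:av="http://example.com/avionics-maker">
  <tailNumber>N12345</tailNumber>
  <eng:engine position="left"/>
  <eng:engine position="right"/>
  <av:radar/>
</airplane>
"""

def manufacturer_namespaces(xml_text, own_namespace):
    """Collect the namespace URIs of all elements outside the airline's own namespace."""
    found = set()
    for el in ET.fromstring(xml_text).iter():
        if el.tag.startswith("{"):
            uri = el.tag[1:].split("}", 1)[0]
            if uri != own_namespace:
                found.add(uri)
    return sorted(found)

suppliers = manufacturer_namespaces(doc, "http://example.com/airline/fleet")
print(suppliers)
```

An application could map each returned URI to the diagnostic equipment of the corresponding manufacturer.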




<b>Example: Use of Namespaces</b>



<b>Composition of a particular airplane in an airline fleet:</b>



</div>
<span class='text_page_counter'>(192)</span><div class='page_container' data-page=192>



Figure 6-17. Problems with Namespaces XM3014.1


<i><b>Notes:</b></i>



Namespace recommendation after XML 1.0 - because the Namespaces recommendation
came after XML 1.0, it is not part of the XML 1.0 specification itself. This means there are places where
namespaces and XML 1.0 don't fit together.

DTDs don't really integrate well - we've shown you an ad hoc solution for using a fixed
set of namespaces with a DTD, but that solution doesn't satisfy many of the needs that
users have for namespaces.

Testing equality of namespaces is a pain - there's no easy way to test equality of two
namespaces except to get the two namespace URIs and compare them character by
character.




<b>Problems with Namespaces</b>



Namespace recommendation after XML 1.0.


DTDs don't integrate well.



Must use QNames as element names in DTDs (remember, ":" is legal in an element name)



If the prefix changes, the DTD must also change
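The prefix problem can be made concrete with a small Python sketch. It assumes xml.etree.ElementTree for the namespace-aware view and uses a plain string comparison to stand in for a DTD, which matches literal names. The URI and both prefixes are placeholders.

```python
import xml.etree.ElementTree as ET

URI = "http://example.com/books"  # placeholder URI for this sketch

doc_1 = f'<books:book xmlns:books="{URI}"><books:title>Tom Sawyer</books:title></books:book>'
doc_2 = f'<bk:book xmlns:bk="{URI}"><bk:title>Tom Sawyer</bk:title></bk:book>'

def expanded_names(xml_text):
    """Return the '{uri}local' names of all elements in document order."""
    return [el.tag for el in ET.fromstring(xml_text).iter()]

# Namespace-aware code sees the same names in both documents.
same_expanded_names = expanded_names(doc_1) == expanded_names(doc_2)

# A DTD matches the literal names, so 'books:book' and 'bk:book' look unrelated to it.
same_literal_root_name = doc_1.split(" ")[0] == doc_2.split(" ")[0]

print(same_expanded_names, same_literal_root_name)
```

This is why a DTD written for the books prefix has to be edited if a document author chooses a different prefix for the same namespace.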



</div>
<span class='text_page_counter'>(193)</span><div class='page_container' data-page=193>



Figure 6-18. Best Practices XM3014.1



<i><b>Notes:</b></i>



When to use namespaces:


<b> • When you think your DTD/Schema will be used outside your organization. </b>


<b> • When you think you will need to combine your DTD/Schema with other grammars.</b>
<b> • As a practical note, this means that anybody doing serious grammar work really ought to be using namespaces.</b>

Performance implications:


<b> • Namespace processing slows down the parser and increases memory usage. The </b>
parser needs to look at all the namespace declarations and QNames. Even if you turn
off namespace processing in your parser, there will still be a performance impact
because your input document will still be larger (because of namespace declarations
and QNames) than if you were not using namespaces.


Don't use relative URIs for namespace identifiers; they were deprecated after the
Namespaces recommendation was published.




<b>Best Practices</b>



When to use namespaces:
 • When the data requires uniqueness for application processing.
 • When a schema [TBD] must be combined with other grammars.

Performance implications:
 • Namespace processing may slow down the parser and increase memory use.

Don't use relative URIs for namespace identifiers.

Pick the default namespace carefully.

Don't declare more than one prefix for a namespace URI.

Be careful with attributes when using namespaces.



</div>
<span class='text_page_counter'>(194)</span><div class='page_container' data-page=194>



choose carefully.


Don't declare more than one prefix for a namespace URI - there's no reason to do it and it
will confuse other readers of the document.


</div>
<span class='text_page_counter'>(195)</span><div class='page_container' data-page=195>



Figure 6-19. Status of Namespaces XM3014.1



<i><b>Notes:</b></i>



Namespaces in XML Recommendation 1/1999 - it is a stable recommendation.
Supported by most parsers relative to DTDs.


Much better support with XML Schema.


Namespaces are ready for use, especially now that XML Schema has reached
recommendation status.




<b>Status of Namespaces</b>



XML namespaces became a recommendation of the W3C on January 14, 1999.



</div>
<span class='text_page_counter'>(196)</span><div class='page_container' data-page=196>



Figure 6-20. More Information XM3014.1


<i><b>Notes:</b></i>





<b>More Information </b>




<b>Reference</b> <b>Description</b>

 • NamespacesFAQ.htm: XML Namespaces FAQ
 • namespaces/index.html: XML.com article about Namespace Myths
 • James Clark's notes on XML Namespaces


</div>
<span class='text_page_counter'>(197)</span><div class='page_container' data-page=197>



Figure 6-21. Checkpoint Questions XM3014.1


<i><b>Notes:</b></i>





<b>Checkpoint Questions</b>



1. Which is true of XML namespaces? (Select all that apply)
   a. They are stored in an Internet-based registry.
   b. They are associated with URIs.
   c. They are integrated with DTDs.
   d. They are integrated with XML Schema.

2. An XML namespace prefix (Select all that apply):
   a. Links to a schema definition.
   b. Is scoped to the element where it is defined.
   c. Is shorthand for a URI.
   d. Can stand for more than one namespace.

3. Default namespaces apply to:
   a. Elements
   b. Attributes
   c. Elements and attributes



</div>
<span class='text_page_counter'>(198)</span><div class='page_container' data-page=198>



Figure 6-22. Unit Summary XM3014.1



<i><b>Notes:</b></i>





<b>Unit Summary</b>



Having completed this unit, you should understand:
 • The reasons for using namespaces
 • The syntax used in namespaces
 • The use of default namespaces
 • The interaction between namespaces and attributes
 • Problems with namespaces



</div>
<span class='text_page_counter'>(199)</span><div class='page_container' data-page=199>



<b>© Copyright IBM Corp. 2001, 2004</b> <b>Unit 7. XML Schema</b> <b>7-1</b>



<b><sub>Unit 7. XML Schema</sub></b>



<b>What This Unit is About</b>



This unit presents an introduction to the essential features of the W3C
XML Schema language.


<b>What You Should Be Able to Do</b>



After completing this unit, you should be able to:



• List and describe the reasons for using XML Schemas
• List the key new features of Schemas
• Define the grammar rules of an XML document using the syntax of XML Schemas
• List and define the best practices to use when using XML Schemas
• Describe the status of XML Schemas in the industry


<b>How You Will Check Your Progress</b>


Accountability:


</div>
<span class='text_page_counter'>(200)</span><div class='page_container' data-page=200>



Figure 7-1. Unit Objectives XM3014.1


<i><b>Notes:</b></i>





<b>Unit Objectives</b>



After completing this unit, you should be able to:
 • Understand what an XML Schema represents
 • List and describe the reasons for using XML Schema
 • List the key features of the XML Schema definition language
 • Define the grammar rules of an XML document using the syntax of the XML Schema definition language



</div>
