

(1)<div class='page_container' data-page=1>

Introduction to XML

and Related Technologies

(Course Code XM301)

Student Notebook

ERC 4.1

IBM Certified Course Material


</div>
(2)<div class='page_container' data-page=2>

July 2004 Edition

The information contained in this document has not been submitted to any formal IBM test and is distributed on an “as is” basis without
any warranty either express or implied. The use of this information or the implementation of any of these techniques is a customer
responsibility and depends on the customer’s ability to evaluate and integrate them into the customer’s operational environment. While
each item may have been reviewed by IBM for accuracy in a specific situation, there is no guarantee that the same or similar results will
result elsewhere. Customers attempting to adapt these techniques to their own environments do so at their own risk.

This document may not be reproduced in whole or in part without the prior written permission of IBM.

Note to U.S. Government Users — Documentation related to restricted rights — Use, duplication or disclosure is subject to restrictions
set forth in GSA ADP Schedule Contract with IBM Corp.

IBM® is a registered trademark of International Business Machines Corporation.

The following are trademarks of International Business Machines Corporation in the United
States, or other countries, or both:

AFS                 AIX                                           alphaWorks
AS/400              CICS                                          ClearCase
Database 2          DB2                                           DB2 Universal Database
DFS                 Distributed Relational Database Architecture  Domino
DRDA                Encina                                        Everyplace
IMS                 Lotus Enterprise Integrator                   Lotus Notes
Lotus               MQSeries                                      MVS
NetRexx             Network Station                               Notes
Open Blueprint      OS/2                                          OS/390
RACF                RDN                                           RS/6000
S/390               SecureWay                                     Tivoli
Tivoli Enterprise   Tivoli Management Environment                 TME
TME 10              TXSeries                                      VisualAge

Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the
United States, other countries, or both.

Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft
Corporation in the United States, other countries, or both.

UNIX is a registered trademark of The Open Group in the United States and other
countries.

SET and the SET Logo are trademarks owned by SET Secure Electronic Transaction LLC.
Other company, product and service names may be trademarks or service marks of others.

</div>
(3)<div class='page_container' data-page=3>


Course materials may not be reproduced in whole or in part 
without the prior written permission of IBM.

© Copyright IBM Corp. 2001, 2004 Contents iii

Contents

Trademarks . . . xi

Course Description . . . xiii

Agenda . . . xv

</div>
(4)<div class='page_container' data-page=4>


</div>
(5)<div class='page_container' data-page=5>


</div>
(6)<div class='page_container' data-page=6>


</div>
(7)<div class='page_container' data-page=7>


</div>
(8)<div class='page_container' data-page=8>


</div>
(9)<div class='page_container' data-page=9>


</div>
(10)<div class='page_container' data-page=10>


<xsl:choose> Element . . . 9-38
<xsl:choose> Example . . . 9-39
Elements to Generate Output (XML to XML) . . . 9-40
<xsl:element> Element . . . 9-42
<xsl:attribute> . . . 9-43
XML to XML Example (1 of 2) . . . 9-44
XML to XML Example (2 of 2) . . . 9-45
Numbers, Sorting, and Functions . . . 9-46
Working with Numbering in XSLT . . . 9-47
<xsl:number> Element format Attribute Values . . . 9-49
<xsl:number> Example . . . 9-50
<xsl:sort> Element . . . 9-51
<xsl:sort> Attributes . . . 9-52
Sort Example . . . 9-53
XPath/XSLT Functions . . . 9-54
Other Elements . . . 9-56
Attribute Value Templates . . . 9-57
Attribute Value Templates Example . . . 9-58
XSLT Processors . . . 9-59
Xalan . . . 9-60
XSL Resources from IBM . . . 9-61
XSL References . . . 9-62
Checkpoint Questions . . . 9-63
Unit Summary . . . 9-64

Appendix A. Introduction to Databases and XML . . . A-1

Appendix B. Additional Information for XML Schema . . . B-1

Appendix C. What’s New in WebSphere Studio V5.1.1 . . . C-1

Appendix D. Additional Information and Examples . . . D-1

Appendix E. Bibliography and References . . . E-1

Appendix F. Acronyms and Abbreviations . . . F-1

Appendix G. Glossary . . . G-1

</div>
(11)<div class='page_container' data-page=11>


Trademarks

The reader should recognize that the following terms, which appear in the content of this
training document, are official trademarks of IBM or other companies:


AFS®                 AIX®                                           alphaWorks®
AS/400®              CICS®                                          ClearCase®
Database 2™          DB2®                                           DB2 Universal Database™
DFS™                 Distributed Relational Database Architecture™  Domino®
DRDA®                Encina®                                        Everyplace®
IMS™                 Lotus Enterprise Integrator®                   Lotus Notes®
Lotus®               MQSeries®                                      MVS™
NetRexx™             Network Station®                               Notes®
Open Blueprint®      OS/2®                                          OS/390®
RACF®                RDN™                                           RS/6000®
S/390®               SecureWay®                                     Tivoli®
Tivoli Enterprise™   Tivoli Management Environment®                 TME®
TME 10™              TXSeries®                                      VisualAge®

</div>
(12)<div class='page_container' data-page=12>


</div>
(13)<div class='page_container' data-page=13>


Course Description

Introduction to XML and Related Technologies

Duration: 2.5 days

Purpose

This course provides an introduction to XML (eXtensible Markup
Language) and related technologies. Students will gain the conceptual
and practical knowledge required to work with XML. The course builds
the basic skills that enable architects, designers, analysts,
developers, testers, and administrators to use XML and its related
technologies in the context of building e-business applications. It is
a 2.5-day classroom course with hands-on lab exercises that reinforce
the lecture material.

Audience

This course is designed for information technology individuals,
including enterprise application architects, designers, developers, and
content modelers and creators.

Prerequisites

Knowledge of Internet technologies is required. Some experience with
using HTML would be helpful, but is not necessary.

Objectives

After completing this course, you should be able to:

• Describe the important XML standards and recommend their use in
business applications
• Define XML documents using namespaces, DTD, or Schema
• Develop and test XML processing applications
• Use XSLT to transform XML documents as necessary
• Identify open areas in XML, such as security, and emerging
technologies such as DB support, XHTML, Web Services, XLink,
and so forth. Plan for their incorporation into XML processing
applications

</div>
(14)<div class='page_container' data-page=14>


</div>
(15)<div class='page_container' data-page=15>


Agenda

Day 1

Unit 1 - Introduction to XML and Related Technologies
Unit 2 - Issues in Electronic Information Exchange
Unit 3 - What Is XML?

XML Basics Lab

Unit 4 - WebSphere Studio Application Developer Overview
Introduction to WebSphere Studio Application Developer Lab
Unit 5 - Document Type Definition (DTD)

DTD Lab

Unit 6 - XML Namespaces
XML Namespaces Lab

Day 2

Unit 7 - XML Schema
XML Schema Lab

Unit 8 - XPath - XML Path Language
XPath Lab

Unit 9 - XSL - eXtensible Stylesheet Language Part 1
XSLT Lab Part 1 - Simple Transforms

Day 3

Unit 9 - XSL - eXtensible Stylesheet Language Part 2
XSLT Lab Part 2 - Conditional Transforms

</div>
(16)<div class='page_container' data-page=16>


</div>
(17)<div class='page_container' data-page=17>


Unit 1. Introduction to XML and Related Technologies

What This Unit is About

This unit describes the audience, prerequisites, and overall objectives
for XM301. The overall agenda for the course is also covered.

What You Should Be Able to Do

</div>
(18)<div class='page_container' data-page=18>


Figure 1-1. Introduction XM3014.1

Notes:

Introduction

XM301 Introduction to XML and Related Technologies

Instructor:

Please introduce yourself and provide your:

Name and organization

Job Role

</div>
(19)<div class='page_container' data-page=19>


Figure 1-2. Course Description XM3014.1

Notes:

Course Description

This course is designed to introduce students to the fundamentals of XML and its
significant derivative companion technologies: XML Schema, Namespaces, XPath, and
XSL Transformations.

Document Type Definitions (DTDs) are also introduced.

The focus of the course is on the creation, specification, and processing of XML
documents.

The course is 2.5 days in length and provides extensive hands-on labs throughout.

</div>
(20)<div class='page_container' data-page=20>


Figure 1-3. Audience XM3014.1

Notes:

Audience

</div>
(21)<div class='page_container' data-page=21>


Figure 1-4. Prerequisites XM3014.1

Notes:

Prerequisites

Prerequisites:

</div>
(22)<div class='page_container' data-page=22>


Figure 1-5. Course Objectives (1 of 2) XM3014.1

Notes:

Course Objectives (1 of 2)

After completing this course, you should be able to:

• Describe/differentiate the use of HTML and XML
• Enumerate the rules of a well-formed XML document
• Create and maintain XML documents
• Describe the purpose and use of Document Type Definitions (DTDs)
• Create DTDs describing the validation rules for specific XML instances*
• Describe the purpose and use of XML Schema
• Enumerate the benefits of XML Schema over DTDs
• Create XML Schemas describing the validation rules for specific XML instances*

</div>
(23)<div class='page_container' data-page=23>


Figure 1-6. Course Objectives (2 of 2) XM3014.1

Notes:

Course Objectives (2 of 2)

After completing this course, you should be able to:

• Describe the purpose of XML Namespaces
• Declare and use XML Namespaces in an XML document*
• Describe the use of XPath in the context of XSLT and XML Schema
• Create XPath expressions that locate specific information in an XML instance*
• Describe the use of XSL in the processing of XML documents
• Create an XSL Transformation to transform an XML document into some other instance*

</div>
(24)<div class='page_container' data-page=24>


Figure 1-7. Agenda - Day 1 XM3014.1

Notes:

Agenda - Day 1

Welcome and Introductions

Issues in Information Exchange

What is XML?

Lab Exercise

Overview of IBM WebSphere Studio Application Developer

Lab Exercise

Document Type Definitions

Lab Exercise

</div>
(25)<div class='page_container' data-page=25>


Figure 1-8. Agenda - Day 2 XM3014.1

Notes:

Agenda - Day 2

XML Schema

Lab Exercise

XPath

Lab Exercise

</div>
(26)<div class='page_container' data-page=26>


Figure 1-9. Agenda - Day 3 XM3014.1

Notes:

Agenda - Day 3

</div>
(27)<div class='page_container' data-page=27>


Figure 1-10. Unit Summary XM3014.1

Notes:

Unit Summary

We've looked at the overall course objectives and a day-by-day agenda.

</div>
(28)<div class='page_container' data-page=28>


</div>
(29)<div class='page_container' data-page=29>


Unit 2. Issues in Electronic Information Exchange

What This Unit is About

This unit examines the different ways in which information is
exchanged in modern computer systems, identifying issues in each
case. The discussion is restricted to what is exchanged (the content),
not how it is exchanged (the mechanism). A set of messaging criteria
is developed that, if met, will reduce the impact of the issues
identified.

This unit shows some of the business drivers for XML, and gives
examples of how XML is being used by businesses today.

What You Should Be Able to Do

After completing this unit, you should be able to:

• Describe the types of information exchange that occur in modern
computer systems

• Describe information exchange issues that exist in modern
computer systems

• Describe what is needed to address many of the issues that exist in
information exchange

How You Will Check Your Progress

Accountability:

</div>
(30)<div class='page_container' data-page=30>


Figure 2-1. Unit Objectives XM3014.1

Notes:

Unit Objectives

After completing this unit, you should be able to:

• Describe the types of information exchange that occur in modern computer systems
• Describe information exchange issues that exist in modern computer systems

</div>
(31)<div class='page_container' data-page=31>


Figure 2-2. Electronic Information Exchange (1 of 2) XM3014.1

Notes:

Electronic Information Exchange (1 of 2)

Electronic information exchange is a simple concept:

Electronically encoded information of one sort or another moves among software units
during the execution of some domain- (business) related function.

There are several contexts for information exchange:

• Intra-application - information movement among the parts of an application.
• Inter-application - information movement between applications in the same company system.
• Intercompany - information movement between companies.
• Inter-system - information movement between systems in the same company.

</div>
(32)<div class='page_container' data-page=32>


Figure 2-3. Electronic Information Exchange (2 of 2) XM3014.1

Notes:

Electronic Information Exchange (2 of 2)

[Diagram labels: Company 1, Company 2, Company 3; System 1 (Sales); System 2 (Accounting);
Application (Ordering); Application (CRM). Exchange types shown: Intercompany, Inter-System,
Inter-Application, Intra-Application.]

</div>
(33)<div class='page_container' data-page=33>


Figure 2-4. Intra-Application Information Exchange XM3014.1

Notes:

Intra-Application Information Exchange

In a well-structured application, information flows between three different layers:

• The presentation layer (often called the View): presents information to the user
and collects information from the user. This layer is often coupled to a particular
presentation technology, for example, Presentation Manager, X-Windows, and
so forth. Therefore, it often must change significantly when the presentation
mode changes.

• The processing layer (often called the Controller): operates on the information
in accordance with the functional requirements of the application.

• The business layer (often called the Model or Business Model): maintains the
operational constraints that govern the business as a whole. It ensures that no
individual application contradicts those rules by performing an operation that is
inconsistent with those constraints.

[Diagram labels: Presentation Layer (View), Process Layer (Controller), Business Layer (Model)]

</div>
(34)<div class='page_container' data-page=34>


Figure 2-5. Agile Views - Multiple Client/Device Support XM3014.1

Notes:

Agile Views - Multiple Client/Device Support

Prior to the arrival of the World Wide Web, applications were largely presented via
workstations or dumb terminals, and required (relatively) infrequent modification of their
presentation layer. The World Wide Web has changed this.

Now, the addition of the mobile work force and use of handheld devices presents new
opportunities for business and new challenges for application developers.

Applications must be presented via:

• Cell phones and handhelds (Wireless Markup Language, WML)
• Web browsers (HTML, Style Sheets, JavaScript)
• PDF
• And so forth

Many Web applications suffer from coupling issues: they habitually generate output
that combines presentation information (font, color, and so forth) with business
information (bank balance, product information, and so forth), making it difficult
to reuse the data stream.

Ideally, the presentation layer would emit/consume a generic, structured information
stream that can be filtered for the target device.

An external rendering engine worries about how it looks, while the application worries 
about what should be viewed.
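
The separation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `<account>` element names, the sample values, and the two rendering functions are hypothetical and do not come from the course material. The point is that the application emits one structured stream that carries business information only, and each target device gets its own renderer.

```python
import xml.etree.ElementTree as ET

# Hypothetical view-independent stream: business information only,
# with no fonts, colors, or layout mixed in.
ACCOUNT_XML = """\
<account>
  <owner>Pat Example</owner>
  <balance currency="USD">1250.75</balance>
</account>
"""


def render_html(xml_text: str) -> str:
    """Render the stream for a Web browser (presentation is added here)."""
    account = ET.fromstring(xml_text)
    balance = account.find("balance")
    return "<p><b>{}</b>: {} {}</p>".format(
        account.findtext("owner"), balance.text, balance.get("currency")
    )


def render_plain_text(xml_text: str) -> str:
    """Render the same stream for a small text-only device."""
    account = ET.fromstring(xml_text)
    balance = account.find("balance")
    return "{}: {} {}".format(
        account.findtext("owner"), balance.text, balance.get("currency")
    )
```

Supporting a new device in this sketch means adding another rendering function; the stream that the application produces does not change.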

This enables speedy, low-cost support for new client devices.
Need a View-independent, structured information stream.

</div>
(35)<div class='page_container' data-page=35>


Figure 2-6. Inter-Application Information Exchange XM3014.1

Notes:

Inter-Application Information Exchange

Ideally, the design of a system takes into account all the operations that it will
perform and the applications that will perform them.

It is rare that enough information exists to perform such an analysis and rarer still
that the design remains stable as the applications that compose the system are
constructed (typically at disparate points in time).

Technology does not stand still; it is common to see applications built late in
the life of a system using technology that is completely different from that
used by the initial ones, for example, COBOL versus Java.

Experience has shown that it is best to focus on the application at hand and allow
the plans for a system to evolve as further applications are built based on new
knowledge of the problem and new technologies.

The way that applications communicate

should not make assumptions about

</div>
(36)<div class='page_container' data-page=36>


Figure 2-7. Context-free Communication XM3014.1

Notes:

Context-free Communication

As much as possible, eliminate assumptions from the way in which information is
exchanged.

This means that the information that flows between applications should not be
coupled to a particular technology or to an assumption about how it will be used.

When possible, send a complete application domain entity (for example, a Purchase
Order) rather than its individual pieces (for example, a total, an item description,
and so forth).

Don't use a message that is bound to an implementation technology, for
example, a Serialized Java Object (a Java-specific bit stream).

Ideally, the communication medium would be based on simple, ubiquitous
technology, for example, straight text.

The message should be structured and self-describing, to eliminate the need for
context awareness in the receiver.
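
As a minimal sketch of these guidelines (not taken from the course material), the Python fragment below sends a whole purchase order as structured text. The element names (`purchaseOrder`, `item`, `description`, `quantity`, `unitPrice`) and both function names are assumptions made for illustration. The receiver needs only the text and the element names; it does not need the sender's classes or programming language.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal


def build_purchase_order(order_id, items):
    """Sender side: serialize one whole purchase order as self-describing text.

    `items` is a list of (description, quantity, unit_price_text) tuples.
    """
    order = ET.Element("purchaseOrder", {"id": order_id})
    for description, quantity, unit_price in items:
        item = ET.SubElement(order, "item")
        ET.SubElement(item, "description").text = description
        ET.SubElement(item, "quantity").text = str(quantity)
        ET.SubElement(item, "unitPrice").text = unit_price
    return ET.tostring(order, encoding="unicode")


def order_total(message):
    """Receiver side: works from the text alone, with no shared object model."""
    order = ET.fromstring(message)
    total = Decimal("0")
    for item in order.iter("item"):
        total += Decimal(item.findtext("unitPrice")) * int(item.findtext("quantity"))
    return total
```

Because the message is plain text with named parts, a receiver written in any technology could compute the same total, which is the contrast with a serialized Java object.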

Requires a structured information (text) format that supports the expression of semantics.

</div>
(37)<div class='page_container' data-page=37>


Figure 2-8. B2B Intercompany Information Exchange XM3014.1

Notes:

B2B Intercompany Information Exchange

In this case, the presentation is focused on the Business to Business (B2B)
relationships that exist in e-business.

In such cases, the systems involved often talk to multiple business partners,
sometimes for the same service (for example, Credit Transaction Validation), where
selection is based on price, availability, and so forth.

Scenario 1: Communicate directly with business partners; potential for 'n'
communication protocols.

Scenario 2: Communicate with business partners through an intermediate 'Marketplace'
vendor. Forced to evolve at the rate of the intermediary.

[Diagram labels: C1, C2, C3, Cn, M]

</div>
(38)<div class='page_container' data-page=38>


Figure 2-9. Need to Establish Common Ground for Communication XM3014.1

Notes:

Need to Establish Common Ground for Communication

Technology issues aside, it is clear that successful, unfettered B2B information
exchange depends greatly on the creation of implementation-independent, vendor-neutral
languages in which to conduct business.

Markup languages have existed as a means to embed semantics in electronic documents
(for example, SGML).

SGML was created as a language for describing documents. B2B communication may benefit
from a similar solution, that is, use a markup language to describe information.

Such a language could be used to describe documents that whole industries agree on as
a means to exchange the information they need to conduct business.

Requires an implementation-independent, vendor-neutral markup language for describing
information, enabling the creation of domain-specific business languages.

</div>
(39)<div class='page_container' data-page=39>


Figure 2-10. Inter-system Information Exchange XM3014.1

Notes:

Inter-system Information Exchange

The exchange of information between systems is subject to most of the problems
discussed so far except, perhaps, view coupling. Typically, this sort of
communication does not involve a presentation layer.

When laying out the infrastructure in which systems will reside, it is wise to
establish a means of insulating systems from one another with a layer that is
devoid of implementation and process coupling ... let's call this the Interface Layer
(it's also known as an Abstraction of the System).

The role of the interface layer is to capture the semantics of a system as seen
from an external point of view, and to represent it as a dialog, with messages
providing the units of communication in the dialog.

As long as the definition of the system doesn't change, the dialog (the interface to
the system) should remain stable. The implementation may change significantly.
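
A small Python sketch may make the idea of a stable dialog concrete. Everything named here is hypothetical: the `balanceRequest` and `balanceReply` messages, the account number, and the two handler functions are invented for illustration. Both handlers honor the same dialog, so a calling system cannot tell which implementation sits behind the interface.

```python
import xml.etree.ElementTree as ET

# The dialog (the interface): a request message and the reply it produces.
REQUEST = "<balanceRequest><account>A-100</account></balanceRequest>"


def handle_balance_request_v1(message: str) -> str:
    """First implementation: balances held in an in-memory dictionary."""
    balances = {"A-100": "250.00"}
    account = ET.fromstring(message).findtext("account")
    reply = ET.Element("balanceReply")
    ET.SubElement(reply, "account").text = account
    ET.SubElement(reply, "amount").text = balances[account]
    return ET.tostring(reply, encoding="unicode")


def handle_balance_request_v2(message: str) -> str:
    """Reworked implementation: balances stored as (account, amount) rows."""
    rows = [("A-100", "250.00")]
    account = ET.fromstring(message).findtext("account")
    amount = next(amt for acct, amt in rows if acct == account)
    return (
        "<balanceReply><account>{}</account><amount>{}</amount></balanceReply>"
        .format(account, amount)
    )
```

The internal storage changed between the two versions, but the messages exchanged did not, so systems on the other side of the interface layer are unaffected.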

[Diagram labels: System1, System2]

</div>
(40)<div class='page_container' data-page=40>


Figure 2-11. Exchanging Messages XM3014.1

Notes:

Exchanging Messages

Exchanging messages between systems has a lot in common with exchanging messages
between B2B business partners.

The exception is that, although inter-system information exchange requires an
established protocol (the interface), the system does not necessarily benefit from
that protocol being an accepted standard for B2B communication.

There are other differences, for example, the likely use of Message Oriented
Middleware (MOM) in system integration, but this presentation is focused on
the information being exchanged, not on the exchange mechanism.

So, in common with B2B communication:

</div>
(41)<div class='page_container' data-page=41>


Figure 2-12. The Semantic Web XM3014.1

Notes:

The Semantic Web

Requires self-describing information decoupled from View details.

• An extension of the WWW.
• The Web becomes an active (rather than passive) information space.
• Separation of content from presentation is necessary.
• That is, Model-View separation.
• HTML doesn't have this.
• Look at the browser compatibility problem as evidence for the need for this.
• In order for the Web to reason, it must be possible to identify the units that are
going to be reasoned about.

</div>
(42)<div class='page_container' data-page=42>


Figure 2-13. A Common Solution? XM3014.1

Notes:

A Common Solution?

Collecting all the observations together, an information solution addressing each of
these issues would be:

a. A view-independent, structured information stream.
b. A structured information (text) format that supports the expression of semantics.
c. An implementation-independent, vendor-neutral markup language for describing
information, enabling the creation of domain-specific business languages.
d. Self-describing, decoupled from view details.

In short:

"A text-based, vendor-neutral markup language that supports the expression of semantics."

</div>
(43)<div class='page_container' data-page=43>


Figure 2-14. Checkpoint Questions (1 of 2) XM3014.1

Notes:

Checkpoint Questions (1 of 2)

1. Which of the following will reduce the coupling related to Electronic Information
Exchange? (Select all that apply)

a. Create messages that are context-free.
b. Use system interfaces to hide implementation details.
c. Combine view information and data in each message.
d. Use messages that are vendor-neutral and implementation-independent.
e. All of the above.

2. Text-based messages are preferred because: (Select all that apply)

a. They are implementation-neutral.
b. All software technologies can read/write them.
c. It's easier to debug messaging problems.
d. They can be spell checked.

</div>
(44)<div class='page_container' data-page=44>


Figure 2-15. Checkpoint Questions (2 of 2) XM3014.1

Notes:

Checkpoint Questions (2 of 2)

3. In general, the properties a message should exhibit are:

(Select all that apply)

a. Self-describing

b. Predictable structure

</div>
(45)<div class='page_container' data-page=45>


Figure 2-16. Unit Summary XM3014.1

Notes:

Unit Summary

Having completed this unit, you should be able to:

• Describe the types of information exchange that occur in modern computer systems
• Describe the information exchange issues that exist in modern computer systems

</div>
(46)<div class='page_container' data-page=46>


</div>
(47)<div class='page_container' data-page=47>


Unit 3. What Is XML?

What This Unit is About

In this unit, the basic elements of XML are explained.

What You Should Be Able to Do

After completing this unit, you should be able to:
• Describe the basic rules of XML

• Identify what makes XML well-formed

• List the components that make up an XML document
• Differentiate between XML and HTML

• Describe the internationalization support in XML
• Define some best practices for XML

How You Will Check Your Progress

Accountability:

• Checkpoint

</div>
(48)<div class='page_container' data-page=48>


Figure 3-1. Unit Objectives XM3014.1

Notes:

Although XML is stable and mature, the supporting technologies are evolving rapidly.
Keep up with the changes at: />

Unit Objectives

After completing this unit, you should be able to:

Describe the basic rules of XML

Describe what it means for an XML document to be well-formed

List the components that make up an XML document

Differentiate between XML and HTML

</div>
(49)<div class='page_container' data-page=49>


Figure 3-2. What Is XML? XM3014.1

Notes:

People often talk about this 'XML' and that 'XML', or this 'XML file'; what they are
really referring to is XML markup text encapsulating specific data.

As long as the XML text follows the set of syntax rules, any data can be
represented.

What Is XML?

At its core XML is text formatted to follow a well-defined set of rules.

XML documents consist primarily of tags and text.

If you've ever seen the source to an HTML document, then the XML structure should look familiar.

This text may be stored/represented in:

A normal file stored on disk

A message being sent over HTTP

A character string in a programming language

A CLOB (character large object) in a database

Any other way textual data can be used

XML documents do not need to exist as documents; they may be:

Byte streams sent between applications

Fields in a database record

Collections of XML Infoset information items
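
As a minimal sketch of the point above, the following Python (standard library only) parses the same XML text whether it is held as a character string or arrives as a byte stream. The sample markup and the variable names are illustrative assumptions, not part of the course material.

```python
import io
import xml.etree.ElementTree as ET

# Illustrative XML text; it could equally come from a file, an HTTP body, or a CLOB.
xml_text = "<book><title>The Right Stuff</title></book>"

# Parse from a character string held in the program.
from_string = ET.fromstring(xml_text)

# Parse from a byte stream, as if received from another application.
from_stream = ET.parse(io.BytesIO(xml_text.encode("utf-8"))).getroot()

print(from_string.find("title").text)
print(from_stream.find("title").text)
```

Either way the application ends up with the same element tree; where the text was stored does not matter to the parser.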

</div>
(50)<div class='page_container' data-page=50>


Figure 3-3. Example Tree Representation of XML XM3014.1

Notes:

This example shows a typical XML document and how it is represented as a tree of nodes.
This conceptual depiction of XML is important to understand.

book is the root element but ROOT is the highest point in the tree or hierarchy: think of
ROOT as the location of a pointer used to keep track of where you are.

XML documents should be thought of as a hierarchical tree structure.

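Example Tree Representation of XML

[Figure: tree of nodes. ROOT is at the top with the single child <book>; <book> has three children: <author> ("Tom Wolfe"), <title> ("The Right Stuff"), and <price> ("$6.00"). The tree is equivalent to the document that follows.]

<?xml version="1.0"?>
<book>
 <author>
  Tom Wolfe
 </author>
 <title>
  The Right Stuff
 </title>
 <price>
  $6.00
 </price>
</book>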
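
The tree view described on this chart can be sketched in a few lines of Python. This is only an illustration using the standard library's ElementTree; the `outline` helper is a hypothetical name introduced here, and the book document is retyped from the chart.

```python
import xml.etree.ElementTree as ET

doc = """<?xml version="1.0"?>
<book>
  <author>Tom Wolfe</author>
  <title>The Right Stuff</title>
  <price>$6.00</price>
</book>"""

def outline(element, depth=0):
    """Return one indented line per element, visiting the tree depth-first."""
    text = (element.text or "").strip()
    lines = ["  " * depth + element.tag + (f": {text}" if text else "")]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines

tree_lines = outline(ET.fromstring(doc))
print("\n".join(tree_lines))
```

The root element book appears at depth 0 and its three children at depth 1, mirroring the node diagram.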

</div>
(51)<div class='page_container' data-page=51>


Figure 3-4. A Simple XML Document - Basic Structure XM3014.1

Notes:

Textual data between tags is also referred to as content.
Tagged elements of any sort are also known as markup.

Sometimes the term body is used to refer to anything between a start tag and an end tag.

<?xml version="1.0"?>                 "Optional" first line; only required if
                                      encoding IS NOT UTF-8 or UTF-16*
<book>                                Root element start tag
  <title>                             First child element with data
    Alphabet from A to Z
  </title>
  <isbn number="1112-23-4356" />      Empty element (no data)
  <author>                            Begin element tag
    <firstName>Boreng</firstName>     Nested child elements
    <lastName>Riter</lastName>
  </author>                           End element tag
  <chapter title="Letter A">          Element containing an attribute and
    The letter A is the first in      parsed character data (PCDATA) [TBD]
    the alphabet. It is also the
    first of five vowels.
  </chapter>
  <!-- The rest of the letter         Comment
       chapters are missing -->
  <chapter title="Letter Z">          Last element in document
    The letter Z is the last
    letter in the alphabet.
  </chapter>
</book>                               Root element end tag

</div>
(52)<div class='page_container' data-page=52>


Figure 3-5. A Simple XML Document - Basic Nomenclature XM3014.1

Notes:

These definitions will be important when we discuss the XML Schema definition language
in a later chapter.

We introduce these terms here in preparation for their use then.

A Simple XML Document -

Basic Nomenclature

The XML instance on the previous page consists of:
One main element book

Subelements title, isbn, author, and chapter, plus one comment

author contains other subelements firstName and lastName

isbn and chapter carry the attributes number and title, respectively

title, firstName, and lastName contain only strings:

Elements that contain numbers, strings, dates, and so forth (TBD) but no
subelements (or attributes) are said to have simple types

isbn and chapter carry attributes; author has subelements:

Elements that contain subelements or carry attributes are said to have complex types

Attributes always have simple types (that is, they are numbers, strings,
dates, and so forth).
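
The simple versus complex distinction can be illustrated with a short Python sketch. It applies only the informal rule stated on this chart (subelements or attributes make an element complex); the `kind` helper is a hypothetical name, and real XML Schema typing is richer than this.

```python
import xml.etree.ElementTree as ET

book = ET.fromstring(
    "<book>"
    "<title>Alphabet from A to Z</title>"
    '<isbn number="1112-23-4356" />'
    "<author><firstName>Boreng</firstName><lastName>Riter</lastName></author>"
    "</book>"
)

def kind(element):
    """'complex' if the element has subelements or attributes, else 'simple'."""
    return "complex" if len(element) or element.attrib else "simple"

kinds = {element.tag: kind(element) for element in book.iter()}
print(kinds)
```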

TBD -- In a later chapter we describe XML Schemas which have access to

</div>
(53)<div class='page_container' data-page=53>


Figure 3-6. Basics of Well-formed XML (1 of 2) XM3014.1

Notes:

As you can see, creating an XML instance will be a rather straightforward task.

Basics of Well-formed XML (1 of 2)

XML documents are considered to be well-formed when they adhere to a set of five rules that define basic XML syntax and structure, plus a sixth for worldwide conformity.

1. There must be a single root element:

All other elements are nested inside the root element

2. Elements must be properly terminated:

For every opening tag "<...>" there must be a matching closing tag 
"</...>"

The exception is an empty (no content or body) tag "<.../>"

3. Elements must be properly nested underneath a parent tag (except for the single, root element):

A nested tag-pair may not overlap another tag
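
The first three rules can be checked mechanically by any XML parser. The sketch below is one way to do so in Python, assuming the standard library's ElementTree parser; `is_well_formed` and the sample strings are hypothetical names and data introduced for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False on a syntax error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

samples = {
    "single root, properly nested": "<colors><color>red</color></colors>",
    "two root elements (rule 1)": "<color>red</color><color>green</color>",
    "missing end tag (rule 2)": "<colors><color>red</colors>",
    "overlapping tags (rule 3)": "<a><b>text</a></b>",
    "empty-element shorthand": "<record key='123'/>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {ok}")
```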

</div>
(54)<div class='page_container' data-page=54>


Figure 3-7. Basics of Well-formed XML (2 of 2) XM3014.1

Notes:

Version 1.1 is about to emerge. Many of the current XML instances lack this declaration.
It is often useful to identify the processing instructions, of which the XML declaration is but
one, as the prolog; the actual XML instance material, that between the root element open
and closing tags, may then be referred to as the XML document.

Basics of Well-formed XML (2 of 2)

4. Tag names are case sensitive:

All tag and attribute names, attribute values, and data must comply
with XML naming rules.

5. Attributes, extra information that can be provided for elements,

must be properly quoted:

That is, all attribute values must be in quotes.

6. The first line should/must contain the special tag that identifies

the version of the XML specification to apply:

</div>
(55)<div class='page_container' data-page=55>


Figure 3-8. Element Rules - Rule 1. Single Root Element XM3014.1

Notes:

XML is a Mark Up language. Tags form the basis of all mark up languages.

The purpose of an Element tag is to identify the contents of the data and children tags held
within them.

The root element should have a name that provides a good definition of all the data
contained in the document.

The first physical line in this sample is there because of Rule 6, which we shall cover later.

Element Rules - Rule 1. Single Root Element

All XML documents must have a single root element.

Legal:

<?xml version="1.0"?>
<colors>
  <color>red</color>
  <color>green</color>
</colors>

colors is the root element for this XML.

Not legal:

<?xml version="1.0"?>
<color>red</color>
<color>green</color>

</div>
(56)<div class='page_container' data-page=56>


Figure 3-9. Element Rules - Rule 2. Element Tag Rules XM3014.1

Notes:

The empty element notation (< ... />) is unique to XML. The W3C is currently updating the
SGML recommendation to include this syntax.

Empty elements are practical and common when the only associated information is
enclosed within the element's attributes.

For Empty Element tags, a space before the tag's terminator (" />") is optional in XML; including it is a common convention.

Element Rules - Rule 2. Element Tag Rules

Elements consist of start and end tags.

End tag is identified by the /.

Example:

<color>red</color>

Elements may contain attributes within the start tag.

Example:

<book isbn="34323"></book>

Note: The attribute is isbn.

Empty elements contain no child elements or data.

These elements can be represented with a special shorthand notation.

Example:

<record key="123"></record>

Can be shortened to (preferred):

<record key="123" />
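
As a quick check of the shorthand, the following Python sketch (standard library ElementTree, variable names chosen here for illustration) parses the long form and the short form, with and without a space before "/>", and shows that the parser reports the same element each time.

```python
import xml.etree.ElementTree as ET

long_form = ET.fromstring('<record key="123"></record>')
short_form = ET.fromstring('<record key="123" />')
no_space = ET.fromstring('<record key="123"/>')

for element in (long_form, short_form, no_space):
    # No text and no children in any of the three forms.
    print(element.tag, element.attrib, element.text, len(element))
```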

</div>
(57)<div class='page_container' data-page=57>


Figure 3-10. Element Rules - Rule 3. Element Nesting XM3014.1

Notes:

There is no limit to the depth of children in XML, but an overly large number may indicate a
poor design.

If an XML document does not have an associated DTD or Schema, then all whitespace is
retained since a processor does not know if it is considered textual data or just for
aesthetics. DTDs and Schemas are covered in later sections.
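
The whitespace behavior can be observed directly. In this Python sketch (standard library ElementTree, sample markup assumed for illustration) no DTD or Schema is involved, so the parser hands every line break and indent to the application.

```python
import xml.etree.ElementTree as ET

shirt = ET.fromstring("""<shirt>
  <color>
    red
  </color>
</shirt>""")

color = shirt.find("color")
print(repr(color.text))          # line breaks and indentation are part of the data
print(repr(color.text.strip()))  # the application decides whether to trim
print(repr(shirt.text))          # whitespace between tags is reported too
```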

Element Rules - Rule 3. Element Nesting

Elements must be properly nested.

The end tags of inner elements must occur before the end tags of outer elements.

</div>
(58)<div class='page_container' data-page=58>


Figure 3-11. Element Nesting Example XM3014.1

Notes:

Indentation and other whitespace is only for human readability, but adds "fat" to a
document's size and processing requirements.

This is only an issue with huge XML documents.

It is important to realize that an XML instance is treated by its processor/parser as one,
continuous stream of characters, some of which are recognized by the parser as "special."

As a consequence, when the parser reports an error its location is where the parser
gave up, which may be far beyond where the actual error occurred.
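
The gap between where an error occurs and where it is reported can be seen in a small Python sketch. It assumes the standard library's ElementTree, whose ParseError carries a `position` attribute (line, column); the broken sample document is made up for illustration.

```python
import xml.etree.ElementTree as ET

# The </style> end tag is missing on line 2; the parser only notices later.
broken = """<shirt>
  <style>Polo
  <color>red</color>
  <size>large</size>
</shirt>"""

try:
    ET.fromstring(broken)
    error_position = None
except ET.ParseError as err:
    error_position = err.position  # (line, column) where the parser gave up
    print("parser stopped at line %d, column %d" % error_position)
```

The reported line is the one where the mismatch finally became undeniable to the parser, not line 2 where the end tag was forgotten.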

Element Nesting Example

Legal:

<?xml version="1.0"?>
<shirt>
  <style>
    Polo
  </style>
  <color>
    red
  </color>
  <size>
    large
  </size>
</shirt>

All elements are properly nested.

Not legal:

<?xml version="1.0"?>
<shirt>
  <style>
    <size>
      large
      <color>
        red
        Polo
  </style>
  </size></color>
</shirt>

The element tags are mixed up and not ordered.

Best Practice:

Use indentation to represent the document's hierarchy.

Important if your document will likely be read by humans.

</div>
(59)<div class='page_container' data-page=59>


Figure 3-12. Element Rules - Rule 4. XML Naming Rules XM3014.1

Notes:

Elements may not use a W3C-reserved Namespace prefix or the letters "XML" in any case.
Element names may not include words reserved by the XML specification. These include:
 • DOCTYPE
 • ELEMENT
 • ATTLIST
 • ENTITY

Colons (":"), while technically legal in tag names, should not be used as they are reserved
for use with Namespaces.

Element Rules - Rule 4. XML Naming Rules

XML name construction:

The first character must be A-Z, a-z, or _ (underscore)

Any number of subsequent letters, numbers, hyphens, periods, colons, and underscore characters.

XML names are case sensitive.

Names cannot contain spaces.

Names must not have a prefix of xml in any case combination (such names are reserved).

Best Practice:

Brevity in tag names is not necessary.

Use descriptive names for elements and attributes.

<Queue> or <que> is far better than <q>.

Best Practice:

Maintain standard naming conventions and quoting.
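
The name construction rules on this chart translate into a short check. The Python sketch below encodes only the simplified rules listed here (the full XML specification also admits many non-ASCII letters); `is_legal_name` and `NAME_PATTERN` are hypothetical names introduced for illustration.

```python
import re

# Simplified pattern taken from the chart: first character A-Z, a-z, or "_",
# then any number of letters, digits, hyphens, periods, colons, underscores.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9._:\-]*")

def is_legal_name(name):
    """Check a name against the simplified rules listed above."""
    if name.lower().startswith("xml"):
        return False  # names beginning with xml (any case) are reserved
    return NAME_PATTERN.fullmatch(name) is not None

candidates = ["title", "book.isdn", "_street", "name:first", "nameXML",
              "1name", "-street", "&name", "f name", "xmlName"]
verdicts = {name: is_legal_name(name) for name in candidates}
for name, ok in verdicts.items():
    print(name, ok)
```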

</div>
(60)<div class='page_container' data-page=60>


Figure 3-13. Rule 4... Tag Naming - Samples XM3014.1

Notes:

Rule 4. Tag Naming - Samples

Legal                        Not Legal                   Comments

title, book.isdn,            1name, -street,             Examples of legal and
lastName, _street,           &name                       illegal element names.
addrLine1, name:first

<color>red</color>           <color>red</COLOR>          Element names are case
<SIZE>small</SIZE>           <SIZE>small<SiZe>           sensitive and start and
                                                         end tags must match.

<fname>John</fname>          <f name>John</f name>       Element names must not
                                                         contain spaces.

<nameXML>John</nameXML>      <xmlName>John</xmlName>     Names must not have a
                                                         prefix of xml in any case
                                                         combination.

</div>
(61)<div class='page_container' data-page=61>


Figure 3-14. Rule 4... Element Content (1 of 2): General XM3014.1

Notes:

PCDATA is parsed character data.

A "snippet" is a piece of a larger, legitimate XML file.

Rule 4. Element Content (1 of 2): General

An XML instance is composed of elements expressed in tag pairs (except for empty tags) plus optional attributes that always have quoted values and optional data that appears between the element start tag and the element end tag.

Mixed content - element content that contains data (PCDATA is shown) and other elements.

Example (snippet):

<title><ref>XML</ref> Example</title>

<chapter>

Chapter information

</div>
(62)<div class='page_container' data-page=62>


Figure 3-15. Rule 4... Element Content (2 of 2): Data XM3014.1

Notes:

Rule 4. Element Content (2 of 2): Data

Element data content is handled in one of two ways:

1. Parsed Character Data (PCDATA): is examined by the XML parser to discover XML content embedded within it.

</div>
(63)<div class='page_container' data-page=63>


Figure 3-16. Rule 4... PCDATA - Parsed Character Data XM3014.1

Notes:

XML differentiates between markup characters and text characters by providing special
XML escape characters to be used in XML PCDATA.

Only regular parsed character data is allowed inside an attribute's value.

The special characters "<" and "&" must always be represented as escape
characters.

The others may appear non-escaped in some places in XML, but it is best to just use the
escape characters all the time.

These escape characters are independent of the encoding chosen.

Rule 4. PCDATA - Parsed Character Data

Predefined entities exist to address ambiguous syntax situations, situations where the literal would be interpreted as part of the XML document syntax rather than its content.

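Examples:

<range>&gt; 6 &amp; &lt; 20</range>

<quotes characters="&apos;&quot;&apos;"/>

Entity    Description       Character
&lt;      less than         <
&gt;      greater than      >
&amp;     ampersand         &
&quot;    quotation mark    "
&apos;    apostrophe        '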
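
In practice the escaping is usually left to a library rather than typed by hand. The sketch below uses `escape`, `unescape`, and `quoteattr` from Python's standard `xml.sax.saxutils` module; the sample values are assumptions chosen to mirror the examples on this chart.

```python
from xml.sax.saxutils import escape, unescape, quoteattr

raw = "> 6 & < 20"
escaped = escape(raw)            # replaces &, < and > with entity references
print(f"<range>{escaped}</range>")

# unescape() reverses the substitution for &amp;, &lt; and &gt;.
round_trip = unescape(escaped)

# quoteattr() picks a quote style and escapes the value for use as an attribute.
attribute = quoteattr("'\"'")
print(f"<quotes characters={attribute}/>")
```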

</div>
(64)<div class='page_container' data-page=64>


Figure 3-17. Rule 4... CDATA - Character Data XM3014.1

Notes:

The 5 XML escape characters will not be interpreted (that is, changed to the non-escaped
character) in CDATA sections, so they should not be used. If you put &lt; in the CDATA, you
will see &lt; in the output, not "<". So use the actual characters.

Encoding refers to the character set for the entire document, so it does apply to CDATA as
well.

CDATA sections cannot be nested.
CDATA will retain spaces.

While XML escape characters are not to be used in CDATA, you must be aware of how the
'down-line' applications of the XML will use the CDATA.

Common usage: JavaScript in the XML and specialized HTML

Browsers may have problems with some special characters, which must then be
represented in hex.

example: micro sign (µ) = &#xB5;

Rule 4. CDATA - Character Data

Syntax:

<![CDATA[ ...Anything can go here... ]]>

Note: Anything except the literal string "]]>"; to embed "]]>" use "]]&gt;"

CDATA is not parsed and is treated as-is.

Useful for embedding other languages within the XML.

HTML documents.

XML documents.

JavaScript source.

Or any other text with a lot of special characters.

Generally speaking the escaping rules inside a CDATA section are those of the embedded language.
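
To see what "not parsed and treated as-is" means, the following Python sketch (standard library ElementTree) compares a CDATA section with the equivalent escaped PCDATA. The script body is a hypothetical one-liner invented for the comparison.

```python
import xml.etree.ElementTree as ET

# Hypothetical script body containing characters that would otherwise need escaping.
script_body = "if (a < b && b > 0) { return 1 } else { return 0 }"

with_cdata = ET.fromstring(f"<script><![CDATA[{script_body}]]></script>")
with_entities = ET.fromstring(
    "<script>if (a &lt; b &amp;&amp; b &gt; 0) { return 1 } else { return 0 }</script>"
)

# Both forms deliver the same text to the application.
same_text = with_cdata.text == with_entities.text == script_body
print(same_text)

# An entity reference inside CDATA is NOT expanded; it stays as literal text.
literal = ET.fromstring("<note><![CDATA[&lt;]]></note>")
print(literal.text)
```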

</div>
(65)<div class='page_container' data-page=65>


example: ampersand (&) = &#x26;

</div>
(66)<div class='page_container' data-page=66>


Figure 3-18. Rule 4... CDATA Examples XM3014.1

Notes:

Both 'script' element examples are valid. Which one you would use would depend on the
behavior of the application/browser which will use the transformed XML and therefore the
CDATA.

This topic is important to XSLT processing.

Rule 4. CDATA Examples

These script elements contain JavaScript:

<script><![CDATA[
function matchwo(a,b) {
 if (a 
 then
  { return 1 }
 else
  { return 0 }
}
]]></script>

<script><![CDATA[
function matchwo(a,b) {
 if (a 
 then
  { return 1 }
 else
  { return 0 }
}
]]></script>

This nameXML element stores actual XML to be treated as text:

<nameXML>
 <![CDATA[
  <name common="freddy" breed="springer-spaniel">
   Sir Frederick of Ledyard's End
  </name>
 ]]>
</nameXML>

</div>
(67)<div class='page_container' data-page=67>


Figure 3-19. Element Rules - Rule 5. Element Attributes XM3014.1

Notes:

Attribute naming follows the same rules as element naming.

An element may contain zero or more attributes within its start tag.

Attributes provide extra information to the meaning of the element. This may include "key"
information or other identifying details.

Name collisions are common in XML as shown in the attributes of the first example. Using
Namespaces resolves this sort of issue.

You cannot use the same style quote in the value of the attribute, that is, style="monty's" is
valid, style='monty's' is invalid.

Element Rules - Rule 5. Element Attributes

Attributes are used to attach information to elements.

Attributes consist of a name="value" pair, where the name is a legal XML name. This is often referred to as a "key-value" pair.

Attributes are placed in the start tag of the element to which they apply.

An element may have several attributes, each uniquely named.

Examples:

<title type="section" number="1">XML overview</title>

<title type="boat" state="FL">Yacht</title>

Notice the different usage of the attribute "type" in the two elements; semantically they are not the same.

Attributes must have a value.
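
The attribute rules above can be exercised with a few lines of Python. This is a sketch using the standard library's ElementTree; the `parses` helper is a hypothetical name, and the sample tags are variations on the examples from this chart.

```python
import xml.etree.ElementTree as ET

title = ET.fromstring('<title type="section" number="1">XML overview</title>')
print(title.attrib)            # attribute names mapped to their string values
print(title.get("type"))
print(title.get("missing"))    # None when the attribute is absent

def parses(xml_text):
    """Return True if the text is accepted by the parser."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

duplicate_name = parses('<title type="a" type="b"/>')   # names must be unique
unquoted_value = parses('<title type=section/>')        # values must be quoted
missing_value = parses('<title type/>')                 # a value is required
mixed_quotes = parses("<shop style=\"monty's\"/>")      # other quote style inside
print(duplicate_name, unquoted_value, missing_value, mixed_quotes)
```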

</div>
(68)<div class='page_container' data-page=68>


Figure 3-20. Element Rules - Rule 6. XML Declaration (1 of 2) XM3014.1

Notes:

All XML documents should begin with this tag, and it MUST be at the first position of the file
(that is, no blank lines or comments or spaces before the tag).

The current version of all XML documents is "1.0" and must appear within the "<?xml" tag
if that tag is used. It indicates the version of XML to which the Document Entity must
conform.

"standalone" is included here for completeness: if it is not used, it is automatically set
to the correct value; most users do not include it. We will have more to say on this in our
discussions of the grammars we can apply to XML instances. "Yes" means the document
that follows can stand alone; that is, without requiring a grammar document to complete its
information.

Element Rules - Rule 6.

XML Declaration (1 of 2)

The XML Declaration is an optional first line in all XML documents:
<?xml version="1.0" ?>

<?xml version="1.0" encoding="UTF-8" ?>
<?xml version="1.0" standalone="yes"?>

If this declaration is used, the version attribute is mandatory.

The encoding attribute indicates the character encoding used in the 
document; if UTF-8 or UTF-16 is used it may be omitted.

ASCII is a subset of UTF-8 and need not be declared.
Comments are not allowed before this statement.

The XML Declaration follows the syntax of a Processing Instruction or PI, 
which is described on a subsequent chart, but it is considered to be
unique and is treated separately in the 1.0 XML specification.
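
The placement rules for the XML Declaration can be confirmed against a parser. The Python sketch below assumes the standard library's ElementTree (built on expat); `parses` and the `checks` dictionary are hypothetical names introduced for illustration.

```python
import xml.etree.ElementTree as ET

def parses(xml_text):
    """Return True if the text is accepted by the parser."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

declaration = '<?xml version="1.0" encoding="UTF-8" ?>'

checks = {
    "declaration first": parses(declaration + "<book/>"),
    "declaration omitted": parses("<book/>"),
    "blank line before it": parses("\n" + declaration + "<book/>"),
    "comment before it": parses("<!-- note -->" + declaration + "<book/>"),
    "version missing": parses('<?xml encoding="UTF-8"?><book/>'),
}
for label, ok in checks.items():
    print(f"{label}: {ok}")
```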

GENERAL NOTE OF CAUTION: You can not always rely on a browser or
tool to completely/correctly enforce the specifications. Nor are the

specifications always written in language that, to a particular reader, is 
unambiguous. Still, the best advice is when in doubt, refer to the

</div>
(69)<div class='page_container' data-page=69>


Figure 3-21. Element Rules - Rule 6. XML Declaration (2 of 2) XM3014.1

Notes:

The last point may be problematic if, say, the associated DTD file is not readily available for
inspection. You will see in later sections that we can override the attribute values in our
XML instance from within a DTD or XML Schema file.

This may not appear to be a problem at the outset, but over time we may forget that we are
overriding some values.

As XML instances grow in length and complexity this may become a serious source of
confusion.

A best practice is to design the XML instance data to contain ALL the data so that, from an
internal data perspective, it does stand alone.

The stand-alone attribute is included here for completeness: it is used to
indicate if this XML document depends on information declared externally to
this document (in a DTD or XSL file (TBD), for example); value may be yes or no.

A value of "yes" indicates there are no external markup declarations; if
there are no external markup declarations, the declaration has no

meaning.

A value of "no" indicates there are or may be such external markup
declarations; if there are such declarations but there is no standalone
declaration, "no" is assumed.

. . . so it is typically not used.

In any event, the inclusion in the XML instance of references to external
entities, such as those in an embedded DTD, does not change its

standalone status.

A bigger issue associated with the stand-alone attribute is that of defining or 
setting values in any entity that may be external to the XML instance.

Arguably, the principal reason for using XML is that it explicitly defines the
elements it includes. If attribute values are overridden then the XML
instance before us is no longer declarative.

</div>
(70)<div class='page_container' data-page=70>


Figure 3-22. Comments XM3014.1

Notes:

Comments can go anywhere in the XML except:

Before the XML Declaration

Inside the actual element tags
Comments are a good thing.

Use them just as you would in a program.
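
A parser that builds a full document model keeps comments as nodes of their own. The sketch below uses `xml.dom.minidom` from the Python standard library to show this, and to show that a comment placed inside an element tag is rejected; the sample document and the `parses` helper are assumptions made for illustration.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

doc = minidom.parseString(
    '<?xml version="1.0"?>'
    "<!-- This is a comment. -->"
    "<book><chapter>A is the first letter</chapter></book>"
)

# The comment is a node of its own, a sibling of the root element.
comment, root = doc.childNodes
print(comment.nodeType == comment.COMMENT_NODE, repr(comment.data))
print(root.tagName)

def parses(xml_text):
    """Return True if minidom accepts the text."""
    try:
        minidom.parseString(xml_text)
        return True
    except ExpatError:
        return False

inside_tag = parses("<book <!-- not allowed here -->></book>")
print(inside_tag)
```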

Comments

Defines a comment.

A space after the beginning and before the trailing hyphens is recommended but not required.

<?xml version="1.0"?>

<!-- This is a comment. They can go anywhere

inside an XML document except within an element

tag.

-->

<book>

<chapter>A is the first letter</chapter>

<chapter>Z is the last letter</chapter>

</book>

Improper usage:

</div>
(71)<div class='page_container' data-page=71>


Figure 3-23. Internationalization and Encoding (1 of 2) XM3014.1

Notes:

A good way to test that the encoding is correct is by viewing the XML file in IE 5.0 or later.
There are two error messages you may receive from IE or from a parser:

1. An invalid character was found in text content.

You will get this error message if a character in the document does not match the
encoding attribute.

2. Switch from current encoding to specified encoding not supported.

You will get this error message if there is a disconnect between the encoding used when
saving the file and the encoding specified in the declaration. The common problem is that
the file has been saved in a single-byte encoding and the encoding attribute specifies a
double-byte encoding, or vice versa.

Internationalization and Encoding (1 of 2)

Support for different character encodings is provided through the encoding attribute of the XML Declaration.

<?xml version="1.0" encoding="charset"?>

The encoding attribute indicates the set of characters that are permitted in the document.

In the absence of an encoding declaration, Unicode UTF-8 or UTF-16 characters may be used.

Documents exchanged via network may be presented to the

</div>
(72)<div class='page_container' data-page=72>


Figure 3-24. Internationalization and Encoding (2 of 2) XM3014.1

Notes:


Internationalization and Encoding (2 of 2)

It is very important that the editor and operating system used to write and save an XML document support the encoding specified in the XML Declaration.

Sample encoding declarations:

ASCII (subset of UTF-8)

<?xml version="1.0" encoding="ISO-8859-1"?>

16 bit UNICODE

<?xml version="1.0" encoding="UTF-16"?>

<?xml version="1.0" encoding="ISO-10646-UCS-2"?>

...

Japanese

<?xml version="1.0" encoding="ISO-2022-JP"?>

<?xml version="1.0" encoding="Shift_JIS"?>

...
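
The importance of saving a file in the encoding it declares can be demonstrated with a short Python sketch. It assumes the standard library's ElementTree; the element name and the sample text are made up, and the exact wording of the parser's error message may differ between tools.

```python
import xml.etree.ElementTree as ET

text = '<?xml version="1.0" encoding="ISO-8859-1"?><name>Jos\u00e9</name>'

# Saved with the declared encoding: the parser decodes the bytes correctly.
matching = text.encode("iso-8859-1")
decoded_name = ET.fromstring(matching).text
print(decoded_name)

# Declared as UTF-8 but saved as ISO-8859-1: the byte for the accented
# letter is not valid UTF-8, so the parser rejects the document.
mismatched = text.replace("ISO-8859-1", "UTF-8").encode("iso-8859-1")
try:
    ET.fromstring(mismatched)
    mismatch_detected = False
except ET.ParseError as err:
    mismatch_detected = True
    print("parser rejected the document:", err)
```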

</div>
(73)<div class='page_container' data-page=73>


Figure 3-25. Processing Instruction XM3014.1

Notes:

If a comment is inserted between the XML Declaration and a PI such as the one shown,
Studio will not consider it an error.

A demo file is available in the XM301 Lectures folder, Unit 3.

This PI, although useful, does NOT define a grammar for the XML document in which it
is used: we will talk about grammars in subsequent chapters.

To reemphasize: the XML Declaration, while it may look like a PI, is treated as special!

Processing Instruction

Syntax: <?target arg*?>

Processing Instruction is often abbreviated as PI in documentation.

A feature inherited from SGML.

Used to embed application-specific instructions in documents.

The target name immediately follows "<?" and is used to associate the PI with an application.

May include zero or more arguments.

May be preceded by comments.

For example,

<?xml-stylesheet href="common.css" type="text/css"?>

,
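
How an application sees a PI can be sketched with `xml.dom.minidom` from the Python standard library. The stylesheet PI is the one from this chart; the `<book/>` root element is an assumption added so that the document is complete.

```python
from xml.dom import minidom

doc = minidom.parseString(
    '<?xml version="1.0"?>'
    '<?xml-stylesheet href="common.css" type="text/css"?>'
    "<book/>"
)

# The XML Declaration is not exposed as a node; the first child is the PI.
pi = doc.firstChild
print(pi.nodeType == pi.PROCESSING_INSTRUCTION_NODE)
print(pi.target)   # the name that immediately follows "<?"
print(pi.data)     # everything after the target, passed through uninterpreted
```

Note that the parser does not interpret the arguments: `href` and `type` look like attributes, but they reach the application as a single string.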

</div>
(74)<div class='page_container' data-page=74>


Figure 3-26. Well-formed versus Valid XM3014.1

Notes:

All XML parsers must check XML documents for being well formed.
XML parsers are classified as being validating, or non-validating.

Well-formed versus Valid

A well-formed XML document:

Consists of XML elements that are nested within one another.

Has a unique root element.

Follows the XML naming conventions.

Follows the XML rules for quoting attributes.

Has tags that are properly terminated.

All XML parsers check for well-formedness.

A valid XML document has an associated vocabulary and obeys the structural rules specified by that vocabulary.

Associated vocabulary is typically defined by either a DTD or an XML Schema.

XML parsers may be validating or non-validating depending upon whether or not they can apply an associated grammar.
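
The difference between the two checks can be illustrated with a toy example. Python's standard library parser is non-validating, so the sketch below stands in for a DTD or XML Schema with a hand-written dictionary; `ALLOWED_CHILDREN` and `check` are hypothetical, and a real validating parser does considerably more than this.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary: each element name maps to the child names it may contain.
ALLOWED_CHILDREN = {
    "book": {"title", "chapter"},
    "title": set(),
    "chapter": set(),
}

def check(xml_text):
    """Return 'not well-formed', 'well-formed but not valid', or 'valid'."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return "not well-formed"
    for parent in root.iter():
        allowed = ALLOWED_CHILDREN.get(parent.tag)
        if allowed is None or any(child.tag not in allowed for child in parent):
            return "well-formed but not valid"
    return "valid"

first = check("<book><title>A to Z</title><chapter>A</chapter></book>")
second = check("<book><price>6.00</price></book>")
third = check("<book><title>A to Z</book>")
print(first, second, third, sep="\n")
```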

</div>
(75)<div class='page_container' data-page=75>


Figure 3-27. HTML versus XML (1 of 2) XM3014.1

Notes:

All markup tags in HTML are directed at visual composition. No consideration is given to
the actual semantics of the data.

XML markup tags are based solely on the data content.
Clean separation of data and presentation

XML is about structured information
interchange

HTML is about presentation and
browsing

HTML versus XML (1 of 2)

<name>Java Programming</name>
<department>EECS</department>
<teacher>
 <name>Paul Thompson</name>
</teacher>
<student>
 <name>Ron Jones</name>
</student>
<student>
 <name>Uma Abingdon</name>
</student>
<student>
 <name>Lindsay Garmon</name>
</student>
</div>
(76)<div class='page_container' data-page=76>


Figure 3-28. HTML versus XML (2 of 2) XM3014.1

Notes:

These two source listings really show fundamental differences between HTML and XML.
While both contain text marked up by tags, their meaning is entirely different.

Which would you rather parse and insert into a database?
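To make the question concrete, here is a hedged sketch of loading the XML version of the roster into a database. It uses Python's standard `xml.etree.ElementTree` and an in-memory `sqlite3` database; the function name `load_roster` and the one-table `roster` schema are assumptions made for illustration only.

```python
import sqlite3
import xml.etree.ElementTree as ET

COURSE_XML = """<?xml version="1.0"?>
<course>
  <name>Java Programming</name>
  <department>EECS</department>
  <teacher>
    <name>Paul Thompson</name>
  </teacher>
  <student>
    <name>Ron Jones</name>
  </student>
  <student>
    <name>Uma Abingdon</name>
  </student>
  <student>
    <name>Lindsay Garmon</name>
  </student>
</course>"""


def load_roster(xml_text):
    """Return (course name, department, teacher, [students]) from the document."""
    root = ET.fromstring(xml_text)
    return (
        root.findtext("name"),
        root.findtext("department"),
        root.findtext("teacher/name"),
        [s.findtext("name") for s in root.findall("student")],
    )


course, department, teacher, students = load_roster(COURSE_XML)

# Hypothetical schema: one row per student, repeating the course details.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE roster (course TEXT, department TEXT, teacher TEXT, student TEXT)"
)
db.executemany(
    "INSERT INTO roster VALUES (?, ?, ?, ?)",
    [(course, department, teacher, s) for s in students],
)
rows = db.execute("SELECT student FROM roster ORDER BY student").fetchall()
print(rows)
```

Because every value is labeled by its element name, no guessing about layout is needed. Doing the same for the HTML version would mean inferring meaning from headings and table cells.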

HTML versus XML (2 of 2)

HTML

<html>
<title>Course Roster</title>
<body>
<center>
 <h1>Course Roster</h1>
 <h2>XML Programming</h2>
 <h3>Department: EECS</h3>
 <table>
 <tr>
  <th>Teacher</th>
  <td>Paul Thompson</td>
 </tr><tr>
  <th>Student List</th>
  <td>Ron Jones
   Uma Abingdon
   Lindsay Garmon
  </td>
 </tr>
 </table>
</center>
</body>
</html>

XML

<?xml version="1.0"?>
<course>
 <name>Java Programming</name>
 <department>EECS</department>
 <teacher>
  <name>Paul Thompson</name>
 </teacher>
 <student>
  <name>Ron Jones</name>
 </student>
 <student>
  <name>Uma Abingdon</name>
 </student>
 <student>
  <name>Lindsay Garmon</name>
 </student>
</course>

</div>
(77)<div class='page_container' data-page=77>


Figure 3-29. HTML and XML Key Differences XM3014.1

Notes:

HTML has a fixed tag set. In XML there is no predefined tag set. The allowed tags in an
XML document are defined in its DTD or Schema.

XHTML is an effort to correct the sins of HTML's past. It is a new XML technology that
consists of an HTML-specific DTD that defines the valid HTML tags.

Unfortunately, many of today's browsers will not recognize XHTML documents properly!

HTML and XML Key Differences

HTML: Predefined tags define how to present data.
XML: Defines its own tags to identify data.

HTML: Allows missing end tags.
XML: Requires matching end tags.

HTML: Attributes do not require quotes.
XML: Attributes must be quoted.

HTML: Attributes do not require a value.
XML: Attributes must have a value.

HTML: Tolerates non-nested tags. <H1><center>Hello!</H1></center>
XML: Strict nesting and tag matching rules. <H1><center>Hello!</center></H1>

HTML: Browsers will almost always do a "best guess" on ill-formed HTML.
XML: XML parsers will generate a fatal exception for well-formedness violations.

HTML: Does not support empty elements, but allows single start tags, such as <hr>.
XML: Provides for empty elements.

HTML: Is not case sensitive.
XML: Is case sensitive.
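The contrast in error handling can be sketched with two parsers from the Python standard library. `html.parser.HTMLParser` is only a lenient tokenizer and stands in loosely for a browser's "best guess" behavior, while `xml.etree.ElementTree` raises an error on any well-formedness violation. The class `TagCollector` and the helpers `html_tags` and `xml_error` are names invented for this illustration.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class TagCollector(HTMLParser):
    """Record start tags in the order the HTML parser reports them."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def html_tags(text):
    """Return the start tags the lenient HTML parser sees in text."""
    collector = TagCollector()
    collector.feed(text)
    collector.close()
    return collector.tags


def xml_error(text):
    """Return the XML parser's error message, or None if text is well-formed."""
    try:
        ET.fromstring(text)
        return None
    except ET.ParseError as exc:
        return str(exc)


overlapping = "<H1><center>Hello!</H1></center>"
nested = "<H1><center>Hello!</center></H1>"

print(html_tags(overlapping))           # HTML parser reports tags without complaint
print(xml_error(overlapping))           # XML parser rejects the overlapping tags
print(xml_error(nested))                # properly nested: no error
print(xml_error("<option selected/>"))  # attribute without a value
print(xml_error("<Name>Tony</name>"))   # start and end tag differ in case
print(xml_error("<hr/>"))               # empty-element syntax is accepted
```

Note that the HTML parser also lowercases the tag names it reports, which reflects HTML's case insensitivity.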

</div>
(78)<div class='page_container' data-page=78>


Figure 3-30. Checkpoint Questions (1 of 3) XM3014.1

Notes:

Checkpoint Questions (1 of 3)

1. Basic XML can be described as:

A. A hierarchical structure of tagged elements, attributes and text.

B. All the HTML tags plus a set of new XML only tags.

C. Object-oriented structure of rows and columns.

D. Processing instructions (PIs) for text data.

E. Textual data with tags for visual presentation.

2. Which of these XML fragments is not well-formed?

A. <root><class>XML</class></root>

B. <class><root>XML</root></class>

C. <root><class id="XML"></root>

D. <root>XML<class id="XML"/>XML</root>

</div>
(79)<div class='page_container' data-page=79>


Figure 3-31. Checkpoint Questions (2 of 3) XM3014.1

Notes:

Checkpoint Questions (2 of 3)

3. XML Comments are allowed (Select all that apply):

A. Before the XML Declaration

B. Anywhere

C. Between element tags

D. Before the root element

E. All of the Above

4. Which of these XML elements with attributes is not well-formed?

A. <name first='Tony' LAST="Romeo" />

B. <name name="Tony" NAME="ROMEO" />

C. <_name_ first-name="Tony" last-name="Romeo"/>

D. <name="Tony Romeo" />

E. <name name="first='Tony' last='Romeo'" />

</div>
(80)<div class='page_container' data-page=80>


Figure 3-32. Checkpoint Questions (3 of 3) XM3014.1

Notes:

Checkpoint Questions (3 of 3)

5. Which of these comments regarding HTML and XML is not true?

A. HTML markup is focused on presentation.

B. XML markup is based on defining the data.

C. XML is based on HTML.

D. HTML tags are not case sensitive.

E. XML tags are case sensitive.

</div>
(81)<div class='page_container' data-page=81>


Figure 3-33. Unit Summary XM3014.1

Notes:

The status of various XML technologies (W3C Activities) can be found on the W3C Web site.

Unit Summary

Having completed this unit, you should be able to:

Describe the basic rules of XML

Describe what it means for an XML document to be well-formed

List the components that make up an XML document

</div>
(82)<div class='page_container' data-page=82>


</div>
(83)<div class='page_container' data-page=83>


Unit 4. WebSphere Studio Application Developer Overview

What This Unit is About

This unit describes IBM WebSphere Studio Application Developer.
This is an overview of the broad features and organization of this
application development tool.

What You Should Be Able to Do

After completing this unit, you should be able to:

• Describe role-based development

• Describe the WebSphere Studio family of tools

• State the role of WebSphere Studio Workbench in the WebSphere
Studio tools

• Describe basic features of WebSphere Studio Application
Developer

How You Will Check Your Progress

Accountability:

• Review

References

</div>
(84)<div class='page_container' data-page=84>


Figure 4-1. Unit Objectives XM3014.1

Notes:

After completing this unit, you should be able to:

Describe role-based development

Describe the WebSphere Studio family of tools

State the role of WebSphere Studio Workbench in the WebSphere Studio tools

Describe basic features of WebSphere Studio Application Developer

Describe the major sets of tooling provided by WebSphere Studio Application Developer

</div>
(85)<div class='page_container' data-page=85>


Figure 4-2. Roles-based Development XM3014.1

Notes:

There are four distinct development roles shown here:
 • Enterprise Integrator

• Bean Provider

• Application Assembler
 • Page Producer

Tooling needs to support each of these roles and permit easy management and integration
of the developed assets.

Developing Web applications requires more than just writing Java code.

Role: Enterprise Integrator. Workarea: Connection Data. Products: JavaBeans, EJBs.
Role: Bean Provider. Workarea: Business Logic Data. Products: JavaBeans, EJBs.
Role: Application Assembler. Workarea: Application Flow. Products: Servlets, JSPs, JavaBeans.
Role: Page Producer. Workarea: Page Layout and Content. Products: HTML, JSPs, MIME Types.
Role: Web Master. Workarea: Operational Environment. Products: Configuration Data, Site Usage Metrics.

Tool: WebSphere Studio Tooling (one tool, many user perspectives).

</div>
(86)<div class='page_container' data-page=86>


Figure 4-3. Development Environment Goals XM3014.1

Notes:

The development environment should support the tasks performed by the developers.
It should be configurable and customizable for each individual developer.

Tools need to accommodate the rapid change in available technologies.

Development Environment Goals

Create a new Development Environment that will:

Be based on a new open, highly pluggable platform

Unified by a new tooling platform
Provide multilevel vendor integration

Provide a role-based development model where the assets are the focus, not the tool

Provide a common repository solution for all assets and tools

Provide rapid support for new standards and technologies

</div>
(87)<div class='page_container' data-page=87>


Figure 4-4. IBM WebSphere Studio Family XM3014.1

Notes:

The IBM WebSphere Studio family name applies to a development platform, as opposed to a
set of development tools.

IBM WebSphere Studio Family

Provide a sturdy Web/Java development platform

Open tooling and run-time support

Open programming model

Provide in-depth Enterprise connectivity

EJB/J2EE Tooling

Enterprise Connectivity/Enterprise Access Builders

Provide integrated end-to-end development

Built-in Unit Test Environment

Incremental compilation

Flexible debugging support

</div>
(88)<div class='page_container' data-page=88>


Figure 4-5. Family Contents XM3014.1

Notes:

The flagship products in the WebSphere Studio brand (Version 5) are:
 • WebSphere Studio Application Developer

 • WebSphere Studio Enterprise Developer

Family Contents

WebSphere Studio Products (V5):

WebSphere Studio Application Developer (includes all of Site Developer functionality)

Focused on development of Web Services, JSPs, Servlets, XML and
J2EE and database applications in a team environment

WebSphere Studio Enterprise Developer

Includes all of Application Developer functionality

Focused on Enterprise Integration using the J2EE Connector
Architecture

</div>
(89)<div class='page_container' data-page=89>


Figure 4-6. WebSphere Studio Workbench XM3014.1

Notes:

The Workbench is not a tool, that is, it is not in itself a product that is for sale. It is an open
and portable tool platform providing an integration technology. The Workbench can be
thought of as a set of Java frameworks and a set of development tools geared for tool
builders.

WebSphere Studio Workbench

Workbench is:

Not a tool, not a product, not for sale

A portable, universal tool platform and integration technology

The basis for an open source project

Workbench has:

Frameworks and services that enable tool builders to focus on tool building

Tools to help tool builders build tools

Java Development Tools (JDT)

</div>
(90)<div class='page_container' data-page=90>


Figure 4-7. WebSphere Studio Workbench Rationale XM3014.1

Notes:

The Workbench offers its greatest support for tool builders; making it easy to add plug-ins
(tools) to the overall IDE. This allows quick "time-to-market" of tools supporting emerging
technologies.

The underlying framework, which adds to the tool builder's productivity, gives end users a
common look and feel.

WebSphere Studio Workbench Rationale

End-users (Web application developers)

No more on-site integration, tools just work together

Common, easy-to-use interface

Common code, project, file management system

Same tool platform regardless of development role

Same look and feel regardless of tool vendor

Tool Builders

Seamless integration and interoperability with IBM AD tools and WebSphere Software Platform

Seamless integration with other Workbench tools

Enterprise ready, off the shelf

Globalization, distributed debug, Team, SCM

Easy construction and deployment platform for tools

</div>
(91)<div class='page_container' data-page=91>


Figure 4-8. WebSphere Studio Application Developer XM3014.1

Notes:

To start WebSphere Studio Application Developer, select
Start -> Programs -> IBM WebSphere Studio -> Application Developer 5.1.

The Workbench opens when you launch Application Developer.

Within the Workbench, open the perspectives, views, and editors.

</div>
(92)<div class='page_container' data-page=92>


Figure 4-9. Terminology XM3014.1

Notes:

The workbench window displays one or more perspectives that contain views and editors.
You can quickly switch between perspectives and views using the shortcut buttons which
appear on the shortcut bar.

[Figure: the workbench window, with callouts for the Shortcut Bar, Navigator Pane, Editor, Source Pane, Outline Pane, Task Sheet, and Views.]

</div>
(93)<div class='page_container' data-page=93>


Figure 4-10. Perspectives XM3014.1

Notes:

Perspectives

A group of related views and editors

To open a Perspective:

Select via Window -> Open Perspective

Some Perspectives:

Java: to develop and test Java programs

Server: to configure, run, and manage test servers

</div>
(94)<div class='page_container' data-page=94>


Figure 4-11. Views XM3014.1

Notes:

Views support editors and provide alternative presentations or navigation of the information
in your workbench. For example, the Navigator displays projects and other resources you
are working with.

A view might appear by itself, or stacked with other views in a tabbed notebook.
On Windows platforms, views can be undocked from the main workbench window and
appear as floating windows on the desktop. Undocked views can also be docked back into
the main workbench window.

More info on the Application Developer menu: Help --> Navigating Workbench

Views

A view displays specialized information. For example:

Bookmarks view displays all bookmarks in workbench.

A view might appear alone in a single pane, or several views might be stacked within a single tabbed pane.

Views can be undocked/docked from the main workbench window.

Information updates on a view are saved immediately.

</div>
(95)<div class='page_container' data-page=95>


Figure 4-12. Editors XM3014.1

Notes:

The key thing to note about editors is the open-save-close life cycle. You must explicitly
save the corresponding resource after making changes.

Editors

An editor is used to edit or browse a resource.

Modifications made in the editor follow an open-save-close life cycle.

An editor can contribute to the Workbench menu bar.

Examples:

Java Source Editor

Web Deployment Descriptor Editor

Web Site Configuration Editor

JSP Editor

</div>
(96)<div class='page_container' data-page=96>


Figure 4-13. Online Help XM3014.1

Notes:

Tip: Press F1 for an information pop-up on a selected task.

To hide the navigation frame, click the Hide Navigation button on the Help view's toolbar.

Note: Your product may include more than one information set (a collection of
documentation topics). When you run a search, only the current information set is
searched. The current information set is shown in the drop-down list at the top of the Help
view. To search another information set, select it from the list, and run the search again.

Online Help

To learn more about the Workbench, select Help -> Help Contents

Select Application Developer information

Select Getting Started

</div>
(97)<div class='page_container' data-page=97>


Figure 4-14. Cheat Sheets XM3014.1

Notes:

Cheat Sheets

Guide developer through an application development process

Sequence of documented steps with relevant documentation

Displayed in workbench pane

Task-related tools are automatically launched or have launch icons in the cheat sheet

</div>
(98)<div class='page_container' data-page=98>


Figure 4-15. Application Developer Design Points XM3014.1

Notes:

Reduced learning curve through the consolidation of tooling to one platform. For example,
with customizable perspectives, one could customize Application Developer to look similar
to other Java IDEs.

Application Developer Design Points

Performance

Customizable Perspectives

Promote role-based development (Web Developer, Java Developer, DBA, and so forth)

Reduces the learning curve

Perspectives use the same project artifacts regardless of the perspective being used

Pluggable development environment

Java and ActiveX plug-in support

IBM and ISVs use the same plug-in architecture to extend the Workbench

Support for automated builds

Apache.org "Ant" support

</div>
(99)<div class='page_container' data-page=99>


Figure 4-16. Tooling XM3014.1

Notes:

Tooling

Java IDE

J2EE Tooling

Portlet Tooling

Data Tooling

Web Tooling

XML Tooling

</div>
(100)<div class='page_container' data-page=100>


Figure 4-17. Java IDE (1 of 3) XM3014.1

Notes:

A default JRE can be selected for the Workbench with Window -> Preferences. A
project-specific JRE is selected in the Launch Configuration dialog.

For more on hot method replace, refer to the foil at the end of the unit.

Java IDE (1 of 3)

Ships with SDK 1.3

Pluggable JRE Support

Defined at project and workbench level

Hot Method Replace

Dynamically replace Java classes during debug

Enabled when Application Server V5 runs in debug mode

Java Snippet Support (Scrapbook)

Task Sheet (All Problems Page)

Code Assist

Refactoring Support

Rename/move support for method/class/package

Fix all dependencies for renamed element

</div>
(101)<div class='page_container' data-page=101>


Figure 4-18. Java IDE (2 of 3) XM3014.1

Notes:

JDI: Java Debugging Interface. The JDI is a high-level Java API providing information
useful for debuggers and similar systems needing access to the running state of a Java
virtual machine.

Java IDE (2 of 3)

Faster IDE

Smart Compilation

No lengthy compile/build/run steps

Pluggable Framework, in-place tool launching

Running class/code with errors

Precise reference searching

Text and Java-based

JDI-based debugger for local/remote debugging

Run code with errors

</div>
(102)<div class='page_container' data-page=102>


Figure 4-19. Java IDE (3 of 3) XM3014.1

Notes:

Starting with V5.1, Application Developer adds support for UML visualization. You can
select existing components and have the system generate the UML diagrams, start with a
blank diagram and develop components from the diagram, or combine the two approaches.
These features help developers understand existing components by producing UML that
represents them, and assist developers in generating new components from UML diagrams.

The entire class diagram or portions may be exported in bmp, jpg, or gif image formats.

Java IDE (3 of 3)

UML Class Diagram Editing and Visualization

Support for Java classes and EJB components

Diagrams generated from existing classes/components

New diagrams built and used to develop the corresponding component

Typical Class Diagram Editor operations:

Create classes, packages, and interfaces

Create extends and implements relationships

Create methods and fields

Refactor components

Add EJB relationships

Add EJBQL queries

</div>
(103)<div class='page_container' data-page=103>


Figure 4-20. J2EE Tooling (1 of 2) XM3014.1

Notes:

J2EE Tooling (1 of 2)

J2EE 1.3

EJB 2.0 Support

Servlet 2.3, JSP 1.2 Support

J2EE Perspective provides views and editors for the EJB/Servlet/JSP Developer

Object-relational Mapping for EJBs

Top-down/Bottom-up/Meet-in-the-middle

All metadata exposed as XMI

No hidden metadata

EAR and WEB Deployment Descriptor Editors

Forms-based (no need to directly edit XML)

Source view also available

Struts Support

</div>
(104)<div class='page_container' data-page=104>


Figure 4-21. J2EE Tooling (2 of 2) XM3014.1

Notes:

WebSphere Studio provides a Web-based Universal Test Client where you can test your
Enterprise JavaBeans (EJBs) and other objects. Using this test client, you can test the
home and remote interface methods of your enterprise beans. By calling the methods and
passing user-defined arguments you can test methods to ensure that they work correctly.

J2EE Tooling (2 of 2)

Connector Projects

J2EE Connector Architecture (JCA) based

EJB Test Client – Universal Test Client

HTML-based

J2EE programming model

Built-in JNDI registry Browser

Unit Test Environment for J2EE

WebSphere Application Server V4 or V5 and Apache Tomcat

Create multiple projects with different Server configurations/instances

Allows for versioning of unit test environment

</div>
(105)<div class='page_container' data-page=105>


Figure 4-22. Portlet Tooling XM3014.1

Notes:

There are actually two related plug-ins. The first, WebSphere Portal Toolkit ships with all
offerings of WebSphere Portal V4.x. The second, WebSphere Everyplace Toolkit ships with
WebSphere Everyplace Server.

The test environment interacts with a developer configuration of WebSphere Portal Server
running on WebSphere Application Server Advanced Single Server Edition (AEs). This is
facilitated by the Remote WebSphere Server configuration.

Portlet Tooling

Wizards to create Portlet Application

Management of Deployment Descriptors

web.xml

portlet.xml

Multiple portlets per application

Integrated development and test environment

Full use of debugger

</div>
(106)<div class='page_container' data-page=106>


Figure 4-23. Data Tooling (1 of 2) XM3014.1

Notes:

Data Tooling (1 of 2)

Data Perspective

Provides views geared for DBAs to:

Create Databases

Create Tables/Views/Indexes/Keys
Generate DDL

Connect to and view existing relational database objects

Online and off-line support for working with databases

Metadata generated as XMI

SQL Query Builder and SQL Wizards

Visually construct SQL statements

SELECT, INSERT, UPDATE, DELETE supported
Metadata generated as XMI

</div>
(107)<div class='page_container' data-page=107>


Figure 4-24. Data Tooling (2 of 2) XM3014.1

Notes:

Data Tooling (2 of 2)

DB2 Stored Procedures

Create / Build and Register / Debug / Drop a stored procedure or User Defined Function (UDF)

SQL or Java-based

SQLJ Files

Create / Build / Debug SQLJ

</div>
(108)<div class='page_container' data-page=108>


Figure 4-25. Web Tooling (1 of 2) XM3014.1

Notes:

The Web Site Designer is new with 5.1. The configuration of the entire Web site is
maintained in the Web Site Configuration object. The choice of static or dynamic Web sites
and the Palette view are also newly introduced in release 5.1.

Examples of the drawer labels in the Palette view are: HTML, Free Layout, JSP, Java
Server pages, and Site Parts. The Site Parts include items such as Vertical and Horizontal
Navigation Bars, which help to maintain consistency in the look and feel of pages across
the site.

Web Tooling (1 of 2)

Web Site Designer

Provide site-level views of Web project

Graphical and detail tabular views of site structure

Page Designer

Provides page-level view of Web project components

HTML and JSP editing

WYSIWYG page design, source editing and page preview

Choice of static or dynamic Web project

Appropriate tool support loaded at project creation time

Palette View

</div>
(109)<div class='page_container' data-page=109>


Figure 4-26. Web Tooling (2 of 2) XM3014.1

Notes:

Web Tooling (2 of 2)

Multiple markup types (WML, cHTML) and pervasive device support

Built in Servlet, Database, and JavaBean Wizards

Built-in JSP Debugging

Site Style Sheet and Page Template Support

Links View

View HTML/JSP and all links referenced in the page

Parsing and link management update links when resources are renamed or moved

Jakarta JSP Taglibs

Specify in project Properties or New…Project to include

</div>
(110)<div class='page_container' data-page=110>


Figure 4-27. XML Tooling (1 of 3) XM3014.1

Notes:

XML Tooling (1 of 3)

XML Tooling provides integrated tools/perspectives to create XML-based components:

XML Source Editor

DTD/Schema validation

Code Assist for building XML documents

DTD Editor

Visual tooling for working with DTDs
Create DTDs from existing documents
Generate an XML Schema from a DTD

Generate JavaBeans for creating/manipulating XML documents
Generate an HTML form from a DTD

XML Schema Editor

</div>
(111)<div class='page_container' data-page=111>


Figure 4-28. XML Tooling (2 of 3) XM3014.1

Notes:

XML Tooling (2 of 3)

XSL Editor

Edit/create and validate XSL

XSL Debug and Transformation Tool

Trace XSL transformation

Examine relationships between the result node, the template rule, and the source node

XML to/from Relational Databases

Generate XML, XSL, XSD from an SQL Query

RDB/XML Mapping Editor

Map columns in a table to elements and attributes in an XML document
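The idea behind mapping table columns to elements and attributes can be sketched outside the tool. The Python example below is not the RDB/XML Mapping Editor and does not reproduce its output format; it only illustrates the concept with the standard `sqlite3` and `xml.etree.ElementTree` modules. The `student` table, its columns, and the function `rows_to_xml` are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical table used only for this illustration.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, department TEXT)")
db.executemany(
    "INSERT INTO student (id, name, department) VALUES (?, ?, ?)",
    [(1, "Ron Jones", "EECS"), (2, "Uma Abingdon", "EECS")],
)


def rows_to_xml(connection):
    """Map each row to a <student> element.

    The id column becomes an attribute; name and department become child elements.
    """
    root = ET.Element("students")
    query = "SELECT id, name, department FROM student ORDER BY id"
    for student_id, name, department in connection.execute(query):
        student = ET.SubElement(root, "student", id=str(student_id))
        ET.SubElement(student, "name").text = name
        ET.SubElement(student, "department").text = department
    return root


xml_text = ET.tostring(rows_to_xml(db), encoding="unicode")
print(xml_text)
```

Whether a column maps to an attribute or a child element is a design choice. Here the key column is treated as an attribute because it identifies the row rather than describing it.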

</div>
(112)<div class='page_container' data-page=112>


Figure 4-29. XML Tooling (3 of 3) XM3014.1

Notes:

XPath expressions can be used to search through XML documents, extracting information
from the nodes (such as an element or attribute).
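As a small illustration of searching a document with path expressions, the sketch below uses Python's `xml.etree.ElementTree`, which supports only a limited subset of XPath. The sample document and its `id` and `code` attributes are assumptions made for this example and are unrelated to the XPath Expressions Wizard itself.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<course code='XM301'>"
    "<name>Java Programming</name>"
    "<teacher><name>Paul Thompson</name></teacher>"
    "<student id='s1'><name>Ron Jones</name></student>"
    "<student id='s2'><name>Uma Abingdon</name></student>"
    "</course>"
)

# Element nodes: the name of every student, at any depth.
student_names = [n.text for n in doc.findall(".//student/name")]

# Predicate on an attribute: the student whose id is 's2'.
s2 = doc.find("./student[@id='s2']/name").text

# ElementTree paths select elements rather than attribute nodes,
# so attribute values are read from the selected elements.
ids = [s.get("id") for s in doc.findall("student")]

print(student_names, s2, ids)
```

A full XPath processor can also select attribute nodes directly and evaluate functions, which this subset does not cover.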

XML Tooling (3 of 3)

XPath Expressions Wizard

Create XPath expressions

XML to XML Mapping Editor

</div>
(113)<div class='page_container' data-page=113>


Figure 4-30. Performance/Trace Tooling XM3014.1

Notes:

Performance/Trace Tooling

Built-in tooling helps developers isolate and fix performance problems with their Web applications

Profiling and Logging Perspective allows developers to:

Attach to local/remote agents for capturing performance data

JVM Monitoring

Heap
Stack

Class/Method details
Object References

Resource Monitors

Execution patterns
CPU usage

</div>
(114)<div class='page_container' data-page=114>


Figure 4-31. Team Development XM3014.1

Notes:

Team Development

Workbench integration occurs through a pluggable, adapter-based design:

A published framework API allows any SCM provider to add an adapter to integrate their SCM into the Workbench

Application Developer ships with

CVS Plugin

</div>
(115)<div class='page_container' data-page=115>


Figure 4-32. Web Services Tooling (1 of 2) XM3014.1

Notes:

Web Services Tooling (1 of 2)

Tools to Construct Web Services:

Discover

Browse UDDI registry to locate Web Service (Web Services
Explorer)

Generate JavaBean proxy for existing Web Services

Create / Transform

Create new Web Services from JavaBeans, databases

Build

Wrap existing artifacts such as SOAP and HTTP GET/POST
accessible services

Generate Java client proxy to Web Services

Maintain Web Services Description Language (WSDL) files (WSDL Editor)

Create new WSDL files

Create ports, port types, messages, bindings, operations, types within
WSDL files

</div>
(116)<div class='page_container' data-page=116>


Figure 4-33. Web Services Tooling (2 of 2) XM3014.1

Notes:

Web Services Tooling (2 of 2)

Tools to Construct Web Services:

Deploy

Deploy Web Services to WebSphere or Tomcat Servers

Test

Built-in test client allows for immediate testing of local and remote
Web Services

Publish

</div>
(117)<div class='page_container' data-page=117>


Figure 4-34. Standards Support XM3014.1

Notes:

Standards Support

EJB 2.0

J2EE 1.2 and 1.3

Servlet 2.3

JSP 1.2

JRE 1.3

Web Services Description Language (WSDL) 1.1

Web Services Interoperability (WS-I) Basic Profile 1.0

Apache SOAP 2.3

XML DTD 1.0 10/2000 Revision

XML Namespaces 1/99 Version

XML Schema 5/2001 Version

</div>
(118)<div class='page_container' data-page=118>


Figure 4-35. Review XM3014.1

Notes:

Review

Name some of the roles in Web application development.

What is the name of the Application Developer perspective you would usually use for EJB development?

</div>
(119)<div class='page_container' data-page=119>


Figure 4-36. Unit Summary XM3014.1

Notes:

Unit Summary

Having completed this unit, you should understand:

The concept of Role-Based Development

The WebSphere Studio Family

The WebSphere Studio Workbench in the context of WebSphere Studio products

</div>
(120)<div class='page_container' data-page=120>


</div>
(121)<div class='page_container' data-page=121>


Unit 5. Document Type Definition (DTD)

What This Unit is About

This unit covers XML 1.0 DTDs, which provide a way to define the
structure of an XML document. DTDs provide an additional level of
syntactic checking.

What You Should Be Able to Do

After completing this unit, you should be able to:

• Describe the reasons for using DTDs
• Define well-formed versus valid documents
• Define the grammar rules for an XML document using a DTD
• Describe the difference between non-validating and validating processors
• Describe examples of DTDs being used in business
• Describe best practices used in DTDs
• Define the limitations of DTDs
• Describe the status of the DTD in the industry

How You Will Check Your Progress

Accountability:

• Checkpoint

</div>
(122)<div class='page_container' data-page=122>


Figure 5-1. Unit Objectives XM3014.1

Notes:

Unit Objectives

After completing this unit, you should be able to:

Describe the reasons for using DTDs

Define well-formed versus valid documents

Define the grammar rules for an XML document using DTDs

Describe the difference between non-validating and validating processors

Describe examples of DTDs being used in business

Describe best practices used in DTDs

Define the limitations of DTDs

</div>
(123)<div class='page_container' data-page=123>


Figure 5-2. Review: Well-Formed XML XM3014.1

Notes:

This is a quick review of the important rules for XML well-formedness. It's important to
recognize that the well-formedness rules are very simple.

Review: Well-Formed XML

Has the optional first line; required if encoding is not UTF-8 or UTF-16.

<?xml version="1.0"?>

Matching start and end element tags with correct syntax.

<tag>data</tag>

Defines attributes within start tag and quotes correctly.

<tag attribute="x">data</tag>

Correct nesting of elements.

<employee>
  <name>John Smith</name>
  <id>X04913</id>
</employee>

...and Single Root and XML naming constraints
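
As an illustration only (it is not part of the course material), these well-formedness rules can be checked with any XML parser. The sketch below assumes Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the sample strings are made up for this example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if text parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical samples, one per rule from the review list.
good = "<employee><name>John Smith</name><id>X04913</id></employee>"
bad_nesting = "<employee><name>John Smith</employee></name>"
bad_quotes = "<tag attribute=x>data</tag>"
two_roots = "<a/><b/>"

for sample in (good, bad_nesting, bad_quotes, two_roots):
    print(is_well_formed(sample), sample)
```

Only the first sample should pass; the others break the nesting, quoting and single-root rules respectively.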

</div>
(124)<div class='page_container' data-page=124>


Figure 5-3. Why Do We Need DTDs? XM3014.1

Notes:

The difficulty with well-formedness is that the rules are very simple.
Quite often we want to express more complicated constraints, such as:

The element <message> can have only two children, <greeting> and <farewell>, and
the two children must appear in that order.

The element <message> may have an optional urgent attribute.

What if we want the computer to be able to verify that an XML document meets these kinds
of constraints?

What if we want to have reusable pieces of text between two XML documents?

What if we want some additional constraints:

<message urgent="yes">
  <greeting>hi</greeting>
  <farewell>bye</farewell>
</message>

Can only have two specific children (greeting, farewell).

The greeting child must precede the farewell child.

Message may have an optional urgent attribute.

What if we want to define and publish the structure an XML document is to conform to?

What if we want the computer to be able to verify that an XML document meets these kinds of constraints?

What if we want to have reusable pieces of text between two XML documents?

</div>
(125)<div class='page_container' data-page=125>


Figure 5-4. What Is a DTD? XM3014.1

Notes:

A Document Type Definition is essentially the framework or skeleton of an XML document.
It defines which elements are allowed, which attributes are allowed for each element, and
whether such elements or attributes are required or optional. XML Schemas (often referred
to as Schemas) extend the functionality of the DTD by adding data typing and other
enhancements. An XML document that conforms to its specified DTD or XML Schema is
said to be valid.

The DTD can be a separate file or it can also be embedded in the XML file. In fact, the DTD
contents can be split across an external file and the XML file.

Blueprint of a document's structure.

Contains a series of declarations.

DTDs

Can be a separate file from the XML document.

Can be embedded within the XML file.

Can be split between a separate file and the XML file.

DTDs define:

The elements that can or must appear.

How often the elements can appear.

How the elements can be nested.

Allowable, required and default attributes.

But note: the use of DTDs is optional.

An XML document that obeys the rules in a DTD is said to be valid.
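
The difference between "well-formed" and "valid" can be seen with a non-validating parser. The sketch below is an assumption-laden illustration: the DTD for <message> is written here for the example (the course has not shown it at this point), and Python's xml.etree.ElementTree stands in for "a non-validating processor". The document embeds its DTD, breaks the declared child order, and still parses.

```python
import xml.etree.ElementTree as ET

# Hypothetical internal DTD subset for the <message> example; the
# document body deliberately puts farewell before greeting.
document = """<?xml version="1.0"?>
<!DOCTYPE message [
  <!ELEMENT message (greeting, farewell)>
  <!ELEMENT greeting (#PCDATA)>
  <!ELEMENT farewell (#PCDATA)>
  <!ATTLIST message urgent CDATA #IMPLIED>
]>
<message urgent="yes">
  <farewell>bye</farewell>
  <greeting>hi</greeting>
</message>"""

# ElementTree does not validate: it reads the declarations but only
# enforces well-formedness, so this invalid document is accepted.
root = ET.fromstring(document)
child_order = [child.tag for child in root]
print(child_order)
```

A validating processor given the same input would be expected to report the out-of-order children as a validity error.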

</div>
(126)<div class='page_container' data-page=126>


Figure 5-5. What is Allowed in a DTD? (1 of 2) XM3014.1

Notes:

Similar material can also be found in the WSAD IE 5.1 help file for DTD.
This page and the next list the declarations you may use in a DTD file.

What Is Allowed in a DTD? (1 of 2)

Element type declaration <!ELEMENT . . .>

A syntax for formally describing what an element type is and what type of data it can contain. Its basic format is: <!ELEMENT name (content-model)>, where name is the element type and (content-model) is the type of data the element can contain.

Many pages follow to more fully explain "content."

Attribute list <!ATTLIST . . .>

A list of attributes for an element. Attribute lists enable you to group together all related attributes for an element. For a document to be valid, every attribute an element uses must be declared in an attribute list.

</div>
(127)<div class='page_container' data-page=127>


Figure 5-6. What is Allowed in a DTD? (2 of 2) XM3014.1

Notes:

What Is Allowed in a DTD? (2 of 2)

Entity <!ENTITY . . .>

A shortcut used to represent complex strings or symbols that would otherwise be impossible, difficult or repetitive to include by hand.

There are built-in or predefined ENTITYs, too.

Notation <!NOTATION . . .>

A means of associating a binary description, typically stored external to the DTD or XML file, with an entity or attribute. For example: to include an image such as a GIF or JPEG image.

Comments: no change from the XML document syntax.

</div>
(128)<div class='page_container' data-page=128>


Figure 5-7. XML and DTD Example XM3014.1

Notes:

Here's a simple example of an XML document on the left, and the DTD rules that describe
it on the right. We're not going to go into the details of the rules right here -- that's what the
rest of this unit is about. We just wanted you to have an idea of how an XML file and its
related DTD might look.

XML and DTD Example

<?xml version='1.0'?>
<address>
  <name>
    <title>Mrs.</title>
    <first-name>Mary</first-name>
    <last-name>McGoon</last-name>
  </name>
  <street>1401 Main Street</street>
  <city>Sheboygan</city>

<!ELEMENT address (name, street+, city, state, zip?, country)>
<!ELEMENT name (title?, first-name,

</div>
(129)<div class='page_container' data-page=129>


Figure 5-8. What Is Allowed. . .Declaring Elements XM3014.1

Notes:

Here's our introduction to declaring elements. An element declaration begins with
<!ELEMENT followed by the name of the element being declared and then the content
model for the element.

Here's a sample declaration for an element called greeting that accepts #PCDATA (text),
along with two <greeting> elements that are valid according to this declaration. The second
<greeting> element is using a CDATA section to quote its contents.

Remember, element names must start with a letter or underscore. Names beginning with the
letters xml (in any combination of upper and lower case) are reserved by the W3C; xsl, xsi
and xsd are conventional namespace prefixes and are best avoided too, and future
development may reserve other "x--" prefixes. The colon character is also reserved (see the
unit on Namespaces). Letters, digits, periods, hyphens and underscores may follow the first
character (while technically legal, an underscore-period combination is not recommended).

#PCDATA (parsed character data) indicates that only text and entities can be included in
the element. This data will be examined by the parser for entities and markup. Parsed
character data cannot contain the characters "&" or "<", and ">" must not appear as part of
the sequence "]]>"; these characters need to be represented by their respective entities
(refer to the slide Built-in Entities).
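
A small sketch of the two ways to get special characters into #PCDATA content, entity references and a CDATA section. It assumes Python's standard xml.sax.saxutils.escape and xml.etree.ElementTree; the sample text is invented for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw_text = "Fish & Chips < 5 dollars"

# escape() replaces &, < and > with &amp;, &lt; and &gt;.
escaped = escape(raw_text)
as_pcdata = "<greeting>" + escaped + "</greeting>"

# A CDATA section quotes the same text without entity references.
as_cdata = "<greeting><![CDATA[" + raw_text + "]]></greeting>"

# Both forms hand the application the same character data.
print(escaped)
print(ET.fromstring(as_pcdata).text == ET.fromstring(as_cdata).text)
```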

Syntax:

<!ELEMENT elementName (contentModel)>

An element declaration in the DTD, and the corresponding element in the XML document:

Declaration (DTD):

<!ELEMENT greeting (#PCDATA)>

Corresponding valid XML fragments:

<greeting>Hello, World!</greeting>

<greeting><![CDATA[G'day!]]></greeting>

</div>
(130)<div class='page_container' data-page=130>


Figure 5-9. Element Content Models XM3014.1

Notes:

The content is the stuff in between the element's start and end tag.
There are four types of content models in XML 1.0 DTDs.

Types of DTD content models:

• EMPTY
• ANY
• Element only - this includes child elements
• Mixed - this includes child elements and text

Element Content Models

The content of an element is described by a content specification.

Types of DTD Content Models

EMPTY

ANY

Elements

Mixed

</div>
(131)<div class='page_container' data-page=131>


Figure 5-10. EMPTY Content Model XM3014.1

Notes:

The EMPTY content model is used for an element that will have no content whatsoever.
Note that such an element may have as many attributes as it likes.

To specify the EMPTY content model, provide the word EMPTY for the content model.
The two examples on the foil show two elements that are valid with an empty content
model.

Empty elements are not much use unless they have attributes. We'll learn more about
declaring attributes in a bit.

An EMPTY element can be very useful for testing snippets of XML. There is an example of
this later in this chapter.

EMPTY Content Model

Element is to have no data. It may have attributes.

Declaration (DTD):

<!ELEMENT placeholder EMPTY>

Valid XML:

<placeholder></placeholder>

or

</div>
(132)<div class='page_container' data-page=132>


Figure 5-11. ANY Content Model XM3014.1

Notes:

Contrary to what you might expect, the ANY content model does not allow you to put
anything you like between the start and end tag. When you use the ANY content model,
any markup you supply must be well-formed XML. Moreover, the
elements that you use must be declared in the DTD as well. So for the third example on the
foil, the <galaxy> element must be declared in the DTD for the document.

To specify the ANY content model provide the word ANY for the content model.

ANY Content Model

Can contain ANY data or well-formed XML.

Elements you use must be declared in DTD.

Declaration (DTD):

<!ELEMENT universe ANY>

<!ELEMENT galaxy (#PCDATA)>

Valid XML fragments:

<universe/>

or

<universe></universe>

<universe>the whole universe</universe>

<universe>

</div>
(133)<div class='page_container' data-page=133>


Figure 5-12. Elements Content Model XM3014.1

Notes:

If the content of an element consists solely of child elements, the element is said to have
element content.

The element content model is specified by content model particles that are combinations of
either element names or other content model particles.

The table describes the operators that can be used to form these combinations.
In the table, a or b can be either content particles or element names.

To create the content model of a followed by b, use the comma (,).
To create the content model of a or b, use the vertical bar (|).
To repeat a content particle at least once, use the (+).

To repeat a content particle zero or more times, use the (*).

To allow a content particle to be absent or present exactly once, use the (?).

Elements Content Model

The elements content model is specified by content model particles.

Content model particles are element names, as represented by a and b, and the occurrence indicators below.

<!ELEMENT name (particle structure)>

Note: a or b may be a composite particle, that is, a = (c,d)

Particle        Syntax
sequence        <!ELEMENT name (a,b)>
choice          <!ELEMENT name (a|b)>
one             <!ELEMENT name (a)>
one or more     <!ELEMENT name (a)+>
zero or more    <!ELEMENT name (a)*>

</div>
(134)<div class='page_container' data-page=134>


Figure 5-13. Elements Content Examples (1 of 3) XM3014.1

Notes:

The first example specifies that <person> has a content model that accepts an <fname>
followed by an <lname> or an <lname> followed by an <fname>. The matches show all the
possible permutations.

Declaration:

<!ELEMENT person ((fname,lname)|(lname,fname))>

<!ELEMENT fname (#PCDATA)>

<!ELEMENT lname (#PCDATA)>

Valid XML:

<person>
  <lname>Smith</lname>
  <fname>John</fname>
</person>

Also valid XML:

<lname>Smith</lname>
</person>

& also valid XML:

<lname>Smith</lname>
</person>

</div>
(135)<div class='page_container' data-page=135>


Figure 5-14. Elements Content Examples (2 of 3) XM3014.1

Notes:

The second example specifies that an <order> is a sequence of at least one <order-item>
followed by a <delivery-address>, followed by an optional <order-date>.

The valid XML shows:

1. One <order-item>, a <delivery-address> and no <order-date>.
2. Two <order-item> elements, a <delivery-address> and no <order-date>.
3. Two <order-item> elements, a <delivery-address> and an <order-date>.

Declaration:

<!ELEMENT order (order-item+,delivery-address,order-date?)>

Valid XML fragments:

<order>
  <order-item>item1</order-item>
  <delivery-address>123 State Street</delivery-address>
</order>

<order>
  <order-item>item3</order-item>
  <order-item>item4</order-item>
  <delivery-address>123 State Street</delivery-address>
</order>

<order>
  <order-item>item5</order-item>
  <order-item>item6</order-item>
  <delivery-address>123 State Street</delivery-address>
  <order-date>July 5, 2001</order-date>
</order>
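
One way to build intuition for element content models is to notice that they behave like regular expressions over the sequence of child element names. The sketch below is only an illustration of that idea, not how a validating parser is implemented: the regular expression is translated by hand from the <order> declaration, and the names ORDER_MODEL and matches_order_model are invented for the example.

```python
import re
import xml.etree.ElementTree as ET

# Hand-translated from (order-item+,delivery-address,order-date?);
# each child name is followed by one space so names cannot run together.
ORDER_MODEL = re.compile(r"(order-item )+(delivery-address )(order-date )?")


def matches_order_model(fragment):
    """Check the children of an <order> fragment against ORDER_MODEL."""
    order = ET.fromstring(fragment)
    child_names = "".join(child.tag + " " for child in order)
    return ORDER_MODEL.fullmatch(child_names) is not None


one_item = ("<order><order-item>item1</order-item>"
            "<delivery-address>123 State Street</delivery-address></order>")
with_date = ("<order><order-item>item5</order-item>"
             "<order-item>item6</order-item>"
             "<delivery-address>123 State Street</delivery-address>"
             "<order-date>July 5, 2001</order-date></order>")
# Hypothetical invalid case: the required order-item is missing.
no_items = ("<order>"
            "<delivery-address>123 State Street</delivery-address></order>")

print(matches_order_model(one_item))
print(matches_order_model(with_date))
print(matches_order_model(no_items))
```

The comma, +, and ? in the declaration map directly onto concatenation, "one or more", and "optional" in the regular expression.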

</div>
(136)<div class='page_container' data-page=136>


Figure 5-15. Elements Content Examples (3 of 3) XM3014.1

Notes:

This example says that a phone book is at least one <entry>, <column-heading> or

<page-number>, but that there may be more than one of any of these three, and that they
may appear in any order.

The valid XML shows:

1. Three <entry> elements.
2. Two <column-heading> elements.

The invalid example is invalid because page-number cannot have entry as a child.

Declaration:

<!ELEMENT phonebook (page)+>

<!ELEMENT page (heading, (entry|advert)+)>
<!ELEMENT heading (#PCDATA)>

<!ELEMENT entry (#PCDATA)>
<!ELEMENT advert (#PCDATA)>

Valid XML fragment:

<phonebook>
  <page>
    <heading>The whole town</heading>
    <entry>John Smith, 555-1212</entry>
    <advert>Fred's Fish n' Chips - 123-4567</advert>
  </page>
</phonebook>

Invalid XML fragments:

</div>
(137)<div class='page_container' data-page=137>


Figure 5-16. Mixed Content Model XM3014.1

Notes:

Elements that have the mixed content model can contain (parsed) character data. In
addition to the character data, mixed content models may also contain child elements
interspersed with the character data. If a mixed content model contains child elements, it
can specify which elements may appear, but the child elements can appear in any order,
and any number of times.

The valid XML shows:

1. An element with character data content only.

2. An element allowing a single child element in addition to the character data content.

Mixed Content Model

Mixed content: elements that contain character data optionally interspersed with child elements.

Two Cases of Declarations:

<!ELEMENT product (#PCDATA)>

<!ELEMENT review (#PCDATA | product)*>

Valid XML fragments:

<review>review text goes here</review>

<review>This is a review of some <product>car</product> that goes on for pages of regular text.</review>
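
To see what "character data interspersed with child elements" means to an application, the sketch below parses a mixed-content <review> with Python's standard xml.etree.ElementTree (an assumption; any parser API would do) and shows where each piece of text ends up.

```python
import xml.etree.ElementTree as ET

review = ET.fromstring(
    "<review>This is a review of some <product>car</product>"
    " that goes on for pages of regular text.</review>"
)

# ElementTree stores mixed content in three places: the text before
# the first child, the child's own text, and the child's tail.
product = review.find("product")
pieces = [review.text, product.text, product.tail]
print(pieces)

# itertext() walks all of the character data in document order.
full_text = "".join(review.itertext())
print(full_text)
```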

</div>
(138)<div class='page_container' data-page=138>


Figure 5-17. What Is Allowed. . .Declaring Attributes XM3014.1

Notes:

The syntax for declaring attributes looks like this:

<!ATTLIST followed by:

elementName - the name of the element we are declaring the attribute for.
attributeName - the name of the attribute being declared.
attributeType - specifies the data type (see Attribute Type table).
defaultDecl - specifies the attribute's default behavior.

To declare multiple attributes, you can write multiple ATTLIST declarations or repeat the
(attributeName attributeType defaultDecl) part as necessary.

Option 1:

<!ATTLIST elementName attributeName attributeType defaultDecl>

Option 2:

<!ATTLIST elementName
  attributeName attributeType defaultDecl
  ...
  attributeName attributeType defaultDecl>

What Is allowed. . .Declaring Attributes

Attribute-list declarations

</div>
(139)<div class='page_container' data-page=139>


Figure 5-18. Organizational Note XM3014.1

Notes:

Organizational Note

The next several charts identify possible choices for each syntactical piece on the previous chart

These charts are followed by examples

We then continue with the concepts identified in the "What is allowed in a DTD" chart:

ENTITY

ENTITIES

NOTATION

Our intent is to provide you with solid, tested examples you can use on your own projects

</div>
(140)<div class='page_container' data-page=140>


Figure 5-19. Attribute Types XM3014.1

Notes:

CDATA attributes contain character data. Whitespace crunching is not performed.
We covered this on previous charts.

The ID data type contains a string value that must be unique within the document: no two
elements may carry the same ID value. No element type may have more than one ID attribute
declared, and the declared ID attribute must be #IMPLIED or #REQUIRED. ID valued
attributes can be combined with IDREF and IDREFS valued attributes to create cross
referencing within an XML document.

An IDREF must contain a value that is specified in an ID-valued attribute elsewhere in the
document. IDREFS is a space separated list of ID values.

An ENTITY value is the name of an entity; an ENTITIES value is a space separated list of
entity names. (More on entities in a moment).

NMTOKENs are strings composed of the characters that are legal in an XML element name.
They are not the same as XML element names, because some characters that may begin an
NMTOKEN (a digit, period or hyphen, for example) may not begin an XML element name.
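
The difference between a name and a name token is easiest to see side by side. The sketch below uses two regular expressions that are deliberately simplified to ASCII (the real XML 1.0 productions also allow many non-ASCII characters); the helper names is_name and is_nmtoken are invented for the example.

```python
import re

# ASCII-only simplification of the XML 1.0 Name and Nmtoken
# productions; the real grammar admits many more characters.
NAME = re.compile(r"[A-Za-z_:][A-Za-z0-9._:-]*")
NMTOKEN = re.compile(r"[A-Za-z0-9._:-]+")


def is_name(value):
    """True if value looks like an XML name (ASCII subset)."""
    return NAME.fullmatch(value) is not None


def is_nmtoken(value):
    """True if value looks like an XML name token (ASCII subset)."""
    return NMTOKEN.fullmatch(value) is not None


for value in ("e00001", "00001", "-draft", "two words"):
    print(value, is_name(value), is_nmtoken(value))
```

Under this simplification, "00001" and "-draft" are acceptable NMTOKEN values but not names, which is also why they could not be used as ID values.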

Attribute Type    Description

String Type

CDATA             Used to declare an attribute whose value may contain arbitrary character data. Whitespace crunching is not done. Unlike the tokenized types, its values need not match the NAME or NMTOKEN productions in the XML 1.0 grammar.

Tokenized Type

NMTOKEN           Used to declare an attribute whose value must conform to the definition of an NMTOKEN (name token) in XML 1.0.
NMTOKENS          Allows multiple NMTOKENs separated by white space.
ID                Used to declare an attribute whose value must be unique within the XML document.
IDREF, IDREFS     The value of the attribute must refer to an ID value declared elsewhere in the document. IDREFS allows multiple IDREF values separated by white space.
ENTITY            Used to declare an attribute whose value must correspond to the name of a declared ENTITY.
ENTITIES          Allows multiple ENTITY names separated by whitespace.

Enumerated Type

NOTATION          References a <!NOTATION declaration in the DTD.
ENUMERATION       Attributes have a specified list of acceptable NMTOKEN values.

</div>
(141)<div class='page_container' data-page=141>


These were introduced earlier.

NOTATION valued attributes must contain the name of a NOTATION declared elsewhere in
the document. (More on NOTATION later).

</div>
(142)<div class='page_container' data-page=142>


Figure 5-20. Attribute Default Declarations XM3014.1

Notes:

Every attribute declaration must specify a default declaration. The possible values are:

#REQUIRED: Indicates that the attribute must occur; its type may still be an enumeration.

#IMPLIED: Indicates that the attribute or the attribute's value can remain unspecified.

#FIXED value: Indicates that this attribute, when used, has a single (fixed) value. This
value must appear immediately after the keyword and be in quotes.

enumerated list: Gives a list of choices in parentheses, each separated by an "or"
operator (|). A default value (from the enumerated list) may be given after the list and must
be in quotes. If a default value is declared, when the attribute is not present, the
element is treated as if the attribute were present with the declared default value.

Attribute Default Declarations

Default Declaration    Description
#REQUIRED              The attribute must be present
#IMPLIED               The attribute does not need to be present and no default value was supplied
attribute-value        If the attribute’s value is not present, "attribute-value" is supplied as a default value

</div>
(143)<div class='page_container' data-page=143>


Figure 5-21. Attribute Default Declaration Examples XM3014.1

Notes:

Here we've declared a few attributes with the various default types. Size has a default
value, type is required, and manufacturer is fixed.

Let's look at how the examples come out.

For the valid examples:

<shirt type="short"/> will also pick up the default value "large" for size, and the fixed value
"Levi" for manufacturer.

<shirt type="short" size="large"/> will pick up the fixed value "Levi" for manufacturer.

For the invalid examples:

<shirt/> is missing the required "type=" attribute.

<shirt type="short" size="medium large"/> is invalid because "medium large" isn't in the
enumerated value list for size.

<shirt type="short" manufacturer="Gap"/> is invalid because "Gap" isn't the fixed value for
manufacturer.
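
The "pick up the default value" behaviour can be observed directly. The sketch below places the shirt declarations from this example in an internal DTD subset and parses one element with Python's standard xml.etree.ElementTree; it assumes the expat-based parser that ships with CPython, which supplies declared defaults even though it does not validate.

```python
import xml.etree.ElementTree as ET

# The four ATTLIST declarations for shirt, placed in an internal
# DTD subset so that the parser reads them with the document.
document = """<!DOCTYPE shirt [
  <!ELEMENT shirt (#PCDATA)>
  <!ATTLIST shirt type CDATA #REQUIRED>
  <!ATTLIST shirt collar CDATA #IMPLIED>
  <!ATTLIST shirt size (small|medium|large) "large">
  <!ATTLIST shirt manufacturer CDATA #FIXED "Levi">
]>
<shirt type="short">cotton</shirt>"""

shirt = ET.fromstring(document)

# Only type was written in the document; size and manufacturer are
# supplied from the DTD, and the #IMPLIED collar stays absent.
attributes = dict(shirt.attrib)
print(attributes)
```

Because this parser is non-validating, it would not reject the invalid examples; it only applies the defaults.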

Attribute Default Declaration Examples

Declaration:

<!ELEMENT shirt (#PCDATA)>

<!ATTLIST shirt type CDATA #REQUIRED>
<!ATTLIST shirt collar CDATA #IMPLIED>

<!ATTLIST shirt size (small|medium|large) "large">
<!ATTLIST shirt manufacturer CDATA #FIXED "Levi">

Valid XML:

<shirt type="short">cotton</shirt>

<shirt type="short" manufacturer="Levi">denim</shirt>
<shirt type="short sleeve" collar="button-down"></shirt>

Invalid XML:

</div>
(144)<div class='page_container' data-page=144>


Figure 5-22. Attribute Alternate Declaration XM3014.1

Notes:

Attribute Alternate Declaration

Here is the same information presented using the second form of the attribute declaration statement.

Declaration:

<!ELEMENT shirt (#PCDATA)>

<!ATTLIST shirt size (small|medium|large) "large"
 collar CDATA #IMPLIED

type CDATA #REQUIRED

</div>
(145)<div class='page_container' data-page=145>


Figure 5-23. Attribute Types: Tokenized Types: IDREFS Example XM3014.1

Notes:

This foil shows a declaration for an implied attribute of type IDREFS.

According to the syntax rules for IDs, a value that begins with a digit cannot be an ID,
because ID values must be legal XML names. That is why the serialNumber values begin
with a letter.

Aside from naming rules, manager2 could have any values as long as, for each one, there is
an element whose ID attribute has that value.

Consequently, an employee could be self-managed!

The uniqueness constraint applies to IDs, not to IDREFs, so the employee could be
self-managed twice: both manager1 and manager2 could have the same value.

Attribute Types: Tokenized Types:

IDREFS Example

Syntax:

<!ATTLIST elementName attributeName IDREFS defaultDecl>

Declaration:

<!ELEMENT employee (#PCDATA)>

<!ATTLIST employee serialNumber ID #REQUIRED>
<!ATTLIST employee manager1 IDREF #IMPLIED>
<!ATTLIST employee manager2 IDREFS #IMPLIED>

Valid XML fragment:

<employee serialNumber="e00001">Joe Smith</employee>
<employee serialNumber="e00002">Bill Smith</employee>
<employee serialNumber="e00003" manager1="e00001">John Smith</employee>
<employee serialNumber="e00004" manager1="e00001" manager2="e00002 e00001">John Smith</employee>
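
The two validity rules at work here (ID values are unique, and every IDREF or IDREFS value points at an existing ID) can be checked by hand. The sketch below is an illustration using Python's standard xml.etree.ElementTree rather than a validating parser; the <staff> root element is hypothetical and is added only so that the fragment forms a single document.

```python
import xml.etree.ElementTree as ET

# The example fragment wrapped in a hypothetical <staff> root element.
staff = ET.fromstring("""<staff>
  <employee serialNumber="e00001">Joe Smith</employee>
  <employee serialNumber="e00002">Bill Smith</employee>
  <employee serialNumber="e00003" manager1="e00001">John Smith</employee>
  <employee serialNumber="e00004" manager1="e00001"
            manager2="e00002 e00001">John Smith</employee>
</staff>""")

ids = [e.get("serialNumber") for e in staff.iter("employee")]
duplicate_ids = {i for i in ids if ids.count(i) > 1}

# IDREF holds one reference; IDREFS holds a whitespace-separated list.
references = []
for e in staff.iter("employee"):
    references += e.get("manager1", "").split()
    references += e.get("manager2", "").split()
dangling = sorted(set(references) - set(ids))

print(duplicate_ids, dangling)
```

Both collections come out empty for this fragment; changing a manager value to an ID that no employee carries would make it show up in dangling.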

</div>
(146)<div class='page_container' data-page=146>


Figure 5-24. Attribute Types: Tokenized Types: ENTITY Example XM3014.1

Notes:

This foil shows a declaration for an implied attribute of type ENTITY.

As you can see, there are several concepts involved that we have yet to discuss,
not the least of which is "what is an 'entity'?"

You will find this and the next chart useful on the job when you need to create or
understand a DTD that uses these concepts.

The concepts themselves are described on subsequent charts.

Attribute Types: Tokenized Types:

ENTITY Example

Syntax:

<!ATTLIST elementName attributeName ENTITY defaultDecl>

Declaration:

<!ELEMENT employee (#PCDATA)>

<!ATTLIST employee companyName ENTITY #REQUIRED>
<!ENTITY company

SYSTEM /> NDATA txt>

<!NOTATION txt SYSTEM "file:///C:/Windows/System32/notepad.exe">

Valid XML fragment:

<employee companyName="company">Joe Smith</employee>

ENTITY is also used in its own right as another element of a DTD; this is
covered in subsequent charts. Here we focus on ENTITY as an attribute.
NDATA and NOTATION are concepts we have yet to discuss.

</div>
(147)<div class='page_container' data-page=147>


Figure 5-25. Attribute Types: Tokenized Types: ENTITIES Example XM3014.1

Notes:

ENTITIES provide a mechanism for including data from multiple sources.

As you can see there are several concepts involved that we have yet to discuss.
You will find this and the next chart useful on the job when you need to create or
understand a DTD that uses these concepts.

While DTDs may be lacking in several important aspects (listed later), they can still be very
complex!

Like the ENTITY example, we need to define several concepts for this chart to be
understood. The explanations follow.

Attribute Types: Tokenized Types:

ENTITIES Example

Syntax:

<!ATTLIST elementName attribName ENTITIES defaultDecl>

Declaration:

<!ELEMENT employee (#PCDATA)>

<!ATTLIST employee companyAtts ENTITIES #REQUIRED>
<!ENTITY company "IBM">

<!ENTITY division "19">

<!ENTITY branch " />

Valid XML fragment:

</div>
(148)<div class='page_container' data-page=148>


Figure 5-26. DTDs Part II XM3014.1

Notes:

DTDs Part II

</div>
(149)<div class='page_container' data-page=149>


Figure 5-27. Declaring ENTITYs: an Internal, Parsed ENTITYs Example XM3014.1

Notes:

Here is an example.

But we just told you that entities are related to separate storage units, and the entity
declaration that we just saw fit completely into the DTD. This kind of entity is called an
internal entity and is not associated with a separate physical storage unit. Let's look at how
to declare the same entity as an external entity, in a separate physical storage unit.

Declaring ENTITYs: an Internal, Parsed

ENTITYs Example

Syntax:

<!ENTITY entityName "replacementText">

Usage:

&entityName;

Declaration:

<!ENTITY xmlExpert "Ron Smith">

<!ENTITY topic "XML Documents">

Valid XML:

<response>For additional help with &topic;,
please contact &xmlExpert;.</response>

Processed XML:

<response>For additional help with XML Documents,
please contact Ron Smith.</response>
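The entity replacement on this chart can be checked with a short sketch. It assumes Python's standard xml.etree.ElementTree parser and places the two declarations in an internal DTD subset so that the parser can see them; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# Hypothetical test document: the two entity declarations from the chart,
# placed in an internal DTD subset so the parser can read them.
DOCUMENT = """<!DOCTYPE response [
  <!ENTITY xmlExpert "Ron Smith">
  <!ENTITY topic "XML Documents">
]>
<response>For additional help with &topic;,
please contact &xmlExpert;.</response>"""

response = ET.fromstring(DOCUMENT)

# The parser substitutes the replacement text before the application sees it.
processed_text = response.text
print(processed_text)
```

The application never sees &topic; or &xmlExpert;. It receives only the text after substitution.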

</div>
(150)<div class='page_container' data-page=150>


Figure 5-28. Declaring ENTITYs: an External, Parsed ENTITYs Example XM3014.1

Notes:

When the entity declaration supplies a public identifier, the parser must understand how to
handle the "publicURI" identifier. Traditionally this is used only when the parser has been
hard-coded to handle that identifier, or when you create your own parser to handle entity
replacement.

According to section 4.2.2 (External Entities) of the XML 1.0 specification: "Definition: In
addition to a system identifier, an external identifier may [emphasis added] include a
public identifier. An XML processor attempting to retrieve the entity's content may
[emphasis added] use the public identifier to try to generate an alternative URI
reference. If the processor is unable to do so, it must [emphasis added] use the URI
reference specified in the system literal...."
Here is the specification's example:

<!ENTITY open-hatch

SYSTEM " /><!ENTITY open-hatch

PUBLIC "-//Textuality//TEXT Standard open-hatch boilerplate//EN"

Declaring ENTITYs: an External, Parsed

ENTITYs Example

Syntax:

<!ENTITY entityName SYSTEM "systemURI">

<!ENTITY entityName PUBLIC "publicURI" "systemURI">*
*refer to the Notes.

Declaration:

<!ENTITY copyrightInfo SYSTEM "file:///c:/legal/boilerplate.txt">

boilerplate.txt file:

Valid XML:

<notices>This application was developed using WebSphere Studio.

&copyrightInfo;</notices>

Processed XML:

This application was developed using WebSphere Studio.

</div>
(151)<div class='page_container' data-page=151>


<!ENTITY hatch-pic

SYSTEM "../grafix/OpenHatch.gif"
NDATA gif >

Find out more at: />

Be aware that an external entity may not recursively reference itself, either directly or
indirectly.

</div>
(152)<div class='page_container' data-page=152>


Figure 5-29. Unparsed Entity Declarations: a Review XM3014.1

Notes:

Here's an example of unparsed entity use:

First we declare a notation called jpeg and associate it with photoshop.exe somewhere
on the local machine.

Then we declare an external unparsed entity called prod17792 and add the NDATA jpeg
clause to specify the notation.

The rest of the DTD declares an empty element item with an ENTITY valued attribute
called picture.

You can see in the XML instance document that we supply prod17792 (the name of the
entity) as the value of the picture attribute of item.

This is how you can associate a piece of unparsed/binary data with a portion of an XML
document.

Unparsed Entity Declarations: a Review

Syntax:

<!ENTITY entityName SYSTEM "URI" NDATA notationName>

Declaration:

<!NOTATION jpeg SYSTEM

"file:///c:/Program Files/Photoshop/photoshop.exe">
<!ENTITY prod17792 SYSTEM "prod17792.jpg" NDATA jpeg>

<!ELEMENT item EMPTY>

<!ATTLIST item picture ENTITY #REQUIRED>

Valid XML:

<item picture='prod17792'/>
Rules:

Unparsed entities can only be external entities. In order to declare an unparsed
entity, you start with a regular external entity declaration and before the closing
angle bracket you insert NDATA and the name of a notation. This associates a
notation name with the unparsed entity.
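The following sketch shows how an application might see these declarations. It assumes Python's standard xml.parsers.expat module and moves the DTD from this chart into an internal subset; the handler and dictionary names are made up for illustration. The parser only reports the notation and the unparsed entity. Deciding what to do with prod17792.jpg is left to the application.

```python
from xml.parsers import expat

# The declarations from the chart, placed in an internal subset (an assumption
# made so that the example is self-contained).
DOCUMENT = """<!DOCTYPE item [
  <!NOTATION jpeg SYSTEM "file:///c:/Program Files/Photoshop/photoshop.exe">
  <!ENTITY prod17792 SYSTEM "prod17792.jpg" NDATA jpeg>
  <!ELEMENT item EMPTY>
  <!ATTLIST item picture ENTITY #REQUIRED>
]>
<item picture='prod17792'/>"""

notations = {}          # notation name -> system identifier
unparsed_entities = {}  # entity name -> (system identifier, notation name)
pictures = []           # values of the picture attribute, in document order

def on_notation(name, base, system_id, public_id):
    notations[name] = system_id

def on_unparsed_entity(name, base, system_id, public_id, notation_name):
    unparsed_entities[name] = (system_id, notation_name)

def on_start_element(tag, attributes):
    if tag == "item":
        pictures.append(attributes["picture"])

parser = expat.ParserCreate()
parser.NotationDeclHandler = on_notation
parser.UnparsedEntityDeclHandler = on_unparsed_entity
parser.StartElementHandler = on_start_element
parser.Parse(DOCUMENT, True)

# The application, not the parser, resolves the attribute value to a file.
for entity_name in pictures:
    system_id, notation_name = unparsed_entities[entity_name]
    print(entity_name, "->", system_id, "handled by", notations[notation_name])
```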

</div>
(153)<div class='page_container' data-page=153>


Figure 5-30. Parameter ENTITYs XM3014.1

Notes:

The parameter entity replacement works like regular entity replacement. The parser will
substitute the replacement text, and then continue evaluating the DTD from the point of
replacement.

Parameter entities are entities that are meant to be used in the DTD. Parameter entities
are very useful if you want to reuse portions of an attribute list declaration or if you want to
reuse parts of a complex content model specification.

Parameter entities are the primary tool that is available to help you structure a complex
DTD.

Parameter ENTITYs

Parameter entities:

Can only be used in the DTD

Allow reuse of attribute lists and complex content model definitions

Syntax:

<!ENTITY % parameterEntityName "replacementText">

Usage:

%parameterEntityName;

Declaration:

<!ENTITY % commonAtts "make CDATA #IMPLIED
model CDATA #IMPLIED">
<!ELEMENT phone (#PCDATA)>

<!ATTLIST phone %commonAtts;

type (rotary | touch-tone) #IMPLIED>

Processed DTD:

<!ELEMENT phone (#PCDATA)>

<!ATTLIST phone make CDATA #IMPLIED

model CDATA #IMPLIED

type (rotary | touch-tone) #IMPLIED>

</div>
(154)<div class='page_container' data-page=154>


Figure 5-31. Parameter ENTITYs - Another Example XM3014.1

Notes:

In this example the commonAtts parameter entity is used to represent common attributes
for the three different elements: car, computer and phone.

<!ENTITY % commonAtts

"typeID ID #REQUIRED
 make CDATA #IMPLIED
 model CDATA #IMPLIED">

<!ELEMENT car (#PCDATA)>
<!ATTLIST car %commonAtts;>

<!ELEMENT computer (#PCDATA)>
<!ATTLIST computer %commonAtts;>

<!ELEMENT phone (#PCDATA)>
<!ATTLIST phone %commonAtts;

type (rotary|digital) #IMPLIED>
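The effect of the parameter entity can be observed with a small sketch. It assumes Python's standard xml.parsers.expat module. Because XML 1.0 allows a parameter entity reference inside a declaration only in the external subset, the DTD is served from memory as an external subset. The file name inventory.dtd, the instance document, and the handler names are hypothetical.

```python
from xml.parsers import expat

# The DTD from the example above, treated as an external subset.
EXTERNAL_DTD = """<!ENTITY % commonAtts
 "typeID ID #REQUIRED
  make CDATA #IMPLIED
  model CDATA #IMPLIED">
<!ELEMENT car (#PCDATA)>
<!ATTLIST car %commonAtts;>
<!ELEMENT computer (#PCDATA)>
<!ATTLIST computer %commonAtts;>
<!ELEMENT phone (#PCDATA)>
<!ATTLIST phone %commonAtts;
 type (rotary|digital) #IMPLIED>
"""

# Hypothetical instance document; "inventory.dtd" is never read from disk.
DOCUMENT = """<!DOCTYPE car SYSTEM "inventory.dtd">
<car typeID="c1">sedan</car>"""

declared = {}  # element name -> declared attribute names, in order

def on_attlist(element, attribute, attr_type, default, required):
    declared.setdefault(element, []).append(attribute)

def on_external_entity(context, base, system_id, public_id):
    # Serve the external subset from memory instead of fetching system_id.
    child = parser.ExternalEntityParserCreate(context)
    child.Parse(EXTERNAL_DTD, True)
    return 1

parser = expat.ParserCreate()
parser.SetParamEntityParsing(expat.XML_PARAM_ENTITY_PARSING_ALWAYS)
parser.AttlistDeclHandler = on_attlist
parser.ExternalEntityRefHandler = on_external_entity
parser.Parse(DOCUMENT, True)

for element, attributes in declared.items():
    print(element, attributes)
```

Each element ends up with the three shared attributes, and phone additionally receives type, which is what the processed DTD would contain.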

</div>
(155)<div class='page_container' data-page=155>


Figure 5-32. What Is Allowed. . . Declaring Comments XM3014.1

Notes:

To insert a comment in a DTD (or an XML document, for that matter), place the comment
text between <!-- and -->.

Comments cannot be nested. The space after <!-- and before --> is optional. The characters
"--" may not be used within the comment. This form of declaration is
also usable within HTML, XML and XSL documents.

What Is Allowed. . . Declaring Comments

Use comments to clarify the semantics of elements and attributes

for those who are using the DTD to define conforming XML

documents.

Syntax:

<!-- This is a comment -->

Whitespace may be used to format the comment:

<!--

This is also

a comment

-->

</div>
(156)<div class='page_container' data-page=156>


Figure 5-33. Joining a DTD to an XML Instance XM3014.1

Notes:

Overriding (changing) the data contained in an XML instance may cause confusion for
other users of the instance.

The application of an XSL transform or a processor program (for example, DOM, SAX,
or similar) may be a better alternative.

Joining a DTD to an XML Instance

Three ways to inform an XML instance that there is an associated

DTD:

1. Embed the DTD content inside the XML instance;

2. Provide the URI where the DTD file resides;

3. Use a combination of 1. and 2.

Best practice: if the DTD will override one or more attribute values

(not advised), set the 'standalone' attribute in the XML declaration to

'no' as a warning to users that they need to be aware.

Include a comment in the XML for each attribute whose value

may be changed by the DTD file.

If the DTD file is large, include a comment near the beginning for

each element that overrides a value in the associated XML

</div>
(157)<div class='page_container' data-page=157>


Figure 5-34. External DTD Subset XM3014.1

Notes:

Up until now we've described some of the contents of a DTD without showing how to
actually place those declarations in a file so that they can be used to validate a document.
Recall that the DTD may be in an external file, embedded directly in an XML file, or split
across an external file and the XML file. Let's look at placing the DTD declarations in an
external file. The part of the DTD that goes into the external file is called the external DTD
subset.

The external DTD subset is an external parsed entity, even though DTD declarations are
not elements. It should therefore begin with a text declaration (for example,
<?xml encoding="UTF-8"?>). This is especially important if the document and the DTD
use different character encodings.

In the example below, the file message.dtd contains the declarations of three elements,
message, greeting and farewell. The DTD may have its own encoding declaration (which
may be different from the encoding of documents that reference the DTD file).

The file hello.xml references this DTD using a DOCTYPE declaration. The DOCTYPE
declaration specifies the name of the root element of the document (message in this example).

DTD and XML as separate files:

Filename: hello.xml

<?xml version="1.0" encoding="UTF-8"?>

<!DOCTYPE message SYSTEM "message.dtd">

<message>

<greeting>

Hello, World!

</greeting>

<farewell>

Goodbye, World!

</farewell>

</message>

Filename: message.dtd

<!ELEMENT

message (greeting,farewell)

>

<!ELEMENT

greeting (#PCDATA)

>

<!ELEMENT

farewell (#PCDATA)

>

</div>
(158)<div class='page_container' data-page=158>


Because the DOCTYPE declaration names the root element, potentially any element declared
in the DTD can serve as the root element. It is up to the DOCTYPE writer to specify this.
Following the name of the root element is the keyword SYSTEM followed by a URI reference
that the local machine can use to locate the actual file containing the external DTD subset.
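The two pieces of information carried by the DOCTYPE declaration, the root element name and the system identifier, can be read back with a short sketch. It assumes Python's standard xml.dom.minidom, which does not validate and, as used here, does not need message.dtd to exist; the variable names are illustrative.

```python
from xml.dom import minidom

# hello.xml from the chart, held in memory as bytes so that the encoding
# declaration is honored.
HELLO_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE message SYSTEM "message.dtd">
<message>
<greeting>Hello, World!</greeting>
<farewell>Goodbye, World!</farewell>
</message>"""

document = minidom.parseString(HELLO_XML)
doctype = document.doctype

root_name_in_doctype = doctype.name            # name given after <!DOCTYPE
system_id = doctype.systemId                   # URI reference after SYSTEM
actual_root = document.documentElement.tagName

print(root_name_in_doctype, system_id, actual_root)
```

A validating processor would additionally check that the actual root element matches the name in the DOCTYPE declaration and that the content conforms to the declarations in message.dtd.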

</div>
(159)<div class='page_container' data-page=159>


Figure 5-35. Internal DTD Subset XM3014.1

Notes:

Up until now we've described some of the contents of a DTD without showing how to
actually place those declarations in a file so that they can be used to validate a document.
Recall that the DTD may be in an external file, embedded directly in an XML file, or split
across an external file and the XML file. Let's look at placing the DTD declarations
directly in the XML file. The part of the DTD that is embedded in the document is called
the internal DTD subset.

In the example below, the file hello.xml contains the declarations of three elements,
message, greeting and farewell, between the square brackets of its DOCTYPE declaration.
The DOCTYPE declaration also specifies the name of the root element of the document,
message in this example.

DTD and XML as a combined file:

Filename: hello.xml

<?xml version="1.0" encoding="UTF-8"?>

<!DOCTYPE message [

<!ELEMENT message (greeting,farewell)>

<!ELEMENT greeting (#PCDATA)>

<!ELEMENT farewell (#PCDATA)>

]>

<message>

<greeting>

Hello, World!

</greeting>

<farewell>

Goodbye, World!

</farewell>

</message>

</div>
(160)<div class='page_container' data-page=160>


</div>
(161)<div class='page_container' data-page=161>


Figure 5-36. Split DTD Subsets XM3014.1

Notes:

The example on this slide shows a DTD with an entity called destination in both the internal
and external subsets. The internal subset is processed first, and the first declaration of an
entity is the one that is used, so the declaration for destination in the internal subset
overrides the declaration in the external subset. After entity expansion the messages read
"Hello, cruel world" and "Goodbye, cruel world".

This allows local entity declarations in the internal subset to override entity declarations in
the external subset.

A best practice would be to include a comment drawing attention to the intent of this
internal subset to override a value set in the external subset.

Embedding DOCTYPE declarations and the DTD within the XML file:

Filename: hello.xml

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE message SYSTEM "message.dtd" [
 <!ENTITY destination "cruel world">

]>

<message>
<greeting>Hello, &destination;</greeting>
<farewell>Goodbye, &destination;</farewell>
</message>

Filename: message.dtd

<!ELEMENT message (greeting, farewell)>
 <!ELEMENT greeting (#PCDATA)>

<!ELEMENT farewell (#PCDATA)>
 <!ENTITY destination "World">
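The override relies on the rule that the first declaration of an entity is the binding one. The sketch below illustrates that rule with Python's standard xml.etree.ElementTree. As a simplifying assumption, both declarations are placed in a single internal subset, in the order a processor would encounter them (internal subset first), because this parser does not fetch message.dtd.

```python
import xml.etree.ElementTree as ET

# Assumption for illustration: the internal-subset declaration comes first,
# followed by the declaration that would normally live in message.dtd.
DOCUMENT = """<!DOCTYPE message [
  <!ENTITY destination "cruel world">
  <!ENTITY destination "World">
]>
<message>
<greeting>Hello, &destination;</greeting>
<farewell>Goodbye, &destination;</farewell>
</message>"""

message = ET.fromstring(DOCUMENT)
greeting = message.findtext("greeting")
farewell = message.findtext("farewell")

print(greeting)
print(farewell)
```

The second declaration is silently ignored, which is why a comment pointing out the intended override is a good idea.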

</div>
(162)<div class='page_container' data-page=162>


Figure 5-37. Whitespace and DTDs XM3014.1

Notes:

Whitespace is white space isn't it? Not if you are a validating XML processor. There are two
kinds of white space:

Whitespace in #PCDATA element content (between the same start and end tag pair) -
you only know this if you have a DTD

Whitespace in non-character data content

Whitespace not in #PCDATA data element content is ignorable

Parsers report whitespace and ignorable whitespace differently. The parser does not
actually discard the ignorable white space -- this is the application's job. But the parser can
use different data structures / callback routines in order to report ignorable versus not
ignorable whitespace.

Whitespace and DTDs

Whitespace is white space isn't it?

Not if you are a validating XML processor.

Whitespace in #PCDATA element content (between the same

start and end tag pair)

Only know this if you have a DTD

Whitespace in non-character data content

</div>
(163)<div class='page_container' data-page=163>


Figure 5-38. Ignorable Whitespace Example XM3014.1

Notes:

This slide shows an example XML document and DTD, and shows which whitespace is
ignorable and which whitespace is not.

Again, it is up to the application to decide what to do about ignorable whitespace. An XML
processor will report all of the whitespace and indicate whether or not it is ignorable.

<!DOCTYPE example [

<!ELEMENT example (source-code)>

<!ELEMENT source-code (#PCDATA)>

]>

<example> <-- ignorable

<source-code> <-- not ignorable

int i; <-- not ignorable

i = 0; <-- not ignorable

</source-code> <-- ignorable

</example>

</div>
(164)<div class='page_container' data-page=164>


Figure 5-39. Validating versus Non-validating Processors XM3014.1

Notes:

Validating processors are straightforward. The XML spec tells implementors exactly what a
validating processor must do (in fact, they must do everything).

Non-validating processors have options because the XML spec says that a non-validating
processor may do certain things, but is not required to do them. Unfortunately, parser
implementors have chosen different subsets of these optional items to implement, so
non-validating parsers do not all behave alike.

A non-validating processor must check the document entity, including the internal subset.
If there is an external DTD subset, it may or may not:

normalize attribute values from the external subset
replace internal entity text from the external subset
supply attribute defaults from the external subset

Since the behavior of non-validating processors is up to implementors, you need to be
careful when working with a non-validating processor if you have complicated attribute
values or use entities.
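As an illustration of non-validating behavior, the sketch below uses Python's standard xml.etree.ElementTree, which is built on a non-validating parser. The pets document is hypothetical. It declares an attribute default in the internal subset and deliberately violates its own content model.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the internal subset declares a default for "type",
# and the content breaks the declared content model on purpose.
DOCUMENT = """<!DOCTYPE pets [
  <!ELEMENT pets (pet)>
  <!ELEMENT pet EMPTY>
  <!ATTLIST pet type CDATA "dog">
]>
<pets><pet/><pet type="cat"/><unknown/></pets>"""

pets = ET.fromstring(DOCUMENT)  # parses without reporting a validity error

# The default from the internal subset is supplied for the first pet.
types = [pet.get("type") for pet in pets.findall("pet")]
undeclared = pets.find("unknown")

print(types)
print(undeclared is not None)
```

The internal subset is read, so the default value appears, but the extra pet and the undeclared element pass unnoticed. A validating processor would report both as validity errors.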

Validating versus Non-validating Processors

Validating processors will validate an XML document using the

DTD.

Processors will report validity errors.

Some behavior of parsers is up to implementors.

Parsers have options:

They check document entity including internal subset.

They report well-formedness errors.

If there is an external DTD subset, they may or may not:

Normalize attribute values.

</div>
(165)<div class='page_container' data-page=165>


Figure 5-40. Example DTDs XM3014.1

Notes:

Many organizations are producing DTDs for various applications. Here are some examples:
 • cXML -

- cXML is a streamlined protocol intended for consistent communication of business 
documents between procurement applications, e-commerce hubs and suppliers.
The current standard includes documents for setup (company details and

transaction profiles), catalogue content, application integration (including the
widely-used PunchOut feature), original, change and delete purchase orders and
responses to all of these requests, as well as new order confirmation and ship notice
documents (cXML analogues of EDI 855 and 856 transactions).

• RosettaNet -

- RosettaNet Partner Interface Processes (PIPs™) define business processes 
between trading partners. RosettaNet dictionaries provide a common set of
properties for PIPs™. The RosettaNet Business Dictionary designates the

properties used in basic business activities. RosettaNet Technical Dictionaries

Example DTDs

W3C XHTML

cXML

B2B between procurement applications, e-commerce hubs and

suppliers.

RosettaNet

Business processes between trading partners and properties for

defining products.

RDF Site Summary (RSS)

Syndicating news articles.

DocBook

Production of documentation which can be rendered into multiple

output formats.

Open Financial Exchange(OFX).

</div>
(166)<div class='page_container' data-page=166>


standards expedite the alignment of business processes between trading partners.
 • RSS -

- The RDF Site Summary format was originally developed by Netscape and is widely 
used across the World Wide Web for the purpose of syndicating news articles.
 • DocBook -

- DocBook is an XML version of the SGML DocBook DTDs that are widely used in the 
production of documentation which can be rendered into multiple output formats.
 • OFX -

</div>
(167)<div class='page_container' data-page=167>


Figure 5-41. What's Wrong with DTDs? XM3014.1

Notes:

There are a number of problems with DTDs, which are listed on the chart.

These problems have led to the creation of a number of alternate languages for defining
the structure of XML grammars. The two leading contenders are W3C's XML Schema and
OASIS's RELAX NG.

What's Wrong with DTDs?

No type support.

#PCDATA can be any string of characters (except tags)

DTD syntax is different from XML syntax.

<!ELEMENT zip (#PCDATA)>

There are some constraints DTDs cannot easily express:

Element x can occur from 4 to 17 times

XML schema addresses many of the limitations of DTDs.

XML schema is now a W3C recommendation.

Support for W3C Schema is new.

Features include:

XML syntax, strong typing, constraints
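To see why a constraint such as "element x can occur from 4 to 17 times" is awkward in a DTD, consider the sketch below. The function bounded_model is a made-up helper, not part of any library. It writes out the content model by hand, nesting the optional occurrences so that the model stays deterministic (a flat run of x? items would be ambiguous).

```python
def bounded_model(name, minimum, maximum):
    """Return a DTD content model allowing minimum..maximum occurrences of name."""
    if not (0 <= minimum <= maximum and maximum >= 1):
        raise ValueError("need 0 <= minimum <= maximum and maximum >= 1")
    optional_tail = ""
    for _ in range(maximum - minimum):
        # Wrap the tail built so far inside one more optional occurrence.
        inner = f"{name},{optional_tail}" if optional_tail else name
        optional_tail = f"({inner})?"
    parts = [name] * minimum
    if optional_tail:
        parts.append(optional_tail)
    return "(" + ",".join(parts) + ")"

# Small case, easy to read: one to three occurrences of x.
small_model = bounded_model("x", 1, 3)
print(small_model)

# The case from the chart: four to seventeen occurrences of x.
declaration = f"<!ELEMENT list {bounded_model('x', 4, 17)}>"
print(declaration)
```

The declaration for 4 to 17 occurrences names x seventeen times. In XML Schema the same constraint is expressed with the minOccurs and maxOccurs attributes.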

</div>
(168)<div class='page_container' data-page=168>


Figure 5-42. Status of DTDs XM3014.1

Notes:

DTDs are a part of the XML 1.0 recommendation. They are a stable technology and widely
adopted. As we noted earlier, XML processors vary because the behavior of non-validating
processors is only loosely defined. Most XML parsers available today come with the
capability to use DTDs to validate documents.

XML Schema is the W3C-approved replacement for DTDs, but this is a new technology
and has not reached broad usage at the time of this writing.

Part of XML 1.0

Widely adopted

Variations in XML processors in accordance with varying definition

of non-validating

</div>
(169)<div class='page_container' data-page=169>


Figure 5-43. Tooling XM3014.1

Notes:

The Tooling for DTDs is pretty simple at the base. You can use the same editor that you
use to edit an XML file to edit a DTD. They are the same kind of text.

There are also many tools for working with DTDs.

IBM's alphaWorks has a number of useful tools.

The commercially available XML Spy is a popular graphical tool for working with XML,
DTDs and XML Schema.

There are many parsers that perform validation using a DTD. This is true of all of the
parsers available from the Apache Software Foundation.

Can use any text editor

As long as the editor supports Unicode or the chosen encoding.

WebSphere Studio Application Developer

Provides guided editing for DTDs and documents that reference

them

Can generate a DTD from sample XML.

Write sample XML that illustrates all the ways you'll use the

data

Supports document validation

Free IBM Alphaworks tools to help you


Many validating parsers:

Apache's Xerces for Java, C++, Perl

Apache's Xerces Perl

JAXP, Java API for XML Processing

</div>
(170)<div class='page_container' data-page=170>


Figure 5-44. Checkpoint Questions (1 of 2) XM3014.1

Notes:

Checkpoint Questions (1 of 2)

1. Which DTD entry correctly depicts phone number, with optional

area code?

a.<!ELEMENT phone ((areaCode)*, prefix, body)>

b.<!ELEMENT phone (areaCode?, prefix, body )>

c.<!ELEMENT phone?(areaCode, prefix, body )>

d.<!ELEMENT phone (areaCode, (prefix, body)+)>

2. Which of the following is a limitation of DTD?

a. Non-XML syntax.

b. Does not easily allow range of values (that is, 5 to 1000

elements).

c. Does not provide proper typing of values (that is, integer versus

string).

</div>
(171)<div class='page_container' data-page=171>


Figure 5-45. Checkpoint Questions (2 of 2) XM3014.1

Notes:

Checkpoint Questions (2 of 2)

3. Which DTD entry correctly depicts an optional attribute named

type for a pet element, that defaults to the value "dog"?

a.<!ATTLIST pet type CDATA #IMPLIED>

b.<!ATTLIST type dog CDATA #FIXED "dog">

c.<!ATTLIST pet type CDATA "dog">

</div>
(172)<div class='page_container' data-page=172>


Figure 5-46. Unit Summary XM3014.1

Notes:

In this section you have learned about:
 • XML 1.0 DTDs
 • Element declarations
 • Attribute declarations
 • Comments
 • Entity declarations
    • General
    • Parameter
 • Notation declarations
 • The difference between validating and non-validating processors
 • Example DTDs
 • Best practices

Unit Summary

In this section you have learned:

XML 1.0 DTDs

Element declarations

Attribute declarations

Entity declarations

General
Parameter

Notation declarations

Comments

The difference between validating and non validating processors

Example DTDs

</div>
(173)<div class='page_container' data-page=173>


</div>
(174)<div class='page_container' data-page=174>


</div>
(175)<div class='page_container' data-page=175>


Unit 6. XML Namespaces

What This Unit is About

This unit describes the XML Namespaces Facility.

What You Should Be Able to Do

After completing this unit, you should be able to:
• Describe the reasons for using namespaces
• Describe the syntax used in namespaces

• Define and illustrate an example using namespaces
• Define myths about namespaces

• Define problems with namespaces

• List and define the best practices to use when using namespaces
• Describe the status of namespaces in the industry

How You Will Check Your Progress

Accountability:

• Checkpoint

</div>
(176)<div class='page_container' data-page=176>


Figure 6-1. Unit Objectives XM3014.1

Notes:

Unit Objectives

After completing this unit, you should be able to:

Describe the reasons for using namespaces

Describe the syntax used in namespaces

Define and illustrate an example using namespaces

Define problems with namespaces

</div>
(177)<div class='page_container' data-page=177>


Figure 6-2. Problem: Element and Attribute Names can be Ambiguous XM3014.1

Notes:

The double use of title in the example illustrates the need for a namespace solution in XML.
We need to be able to tell that the two title elements in this document are not the same
element. Even though the elements have the same name, they have different meanings to
the application. Using the context to disambiguate the two uses is not a generally

applicable solution.

Consider the following XML document:

How does an application know that:

The first occurrence of title is a book title.

The second occurrence of title is a person's title.

Need a way to eliminate the ambiguity for the purpose of

processing.

Problem: Element and Attribute Names

Can Be Ambiguous

<lastName>Expert</lastName>
 <firstName>Iman</firstName>
 </author>

</book>

</div>
(178)<div class='page_container' data-page=178>


Figure 6-3. Elaboration XM3014.1

Notes:

Namespace URIs are not actually used for lookup, only as identifiers. Their only purpose is
to give the namespace a unique name. Sometimes the URI is a pointer to a web page that provides
information about the namespace, but this is not required. The URI is not looked up as part
of XML parsing or processing.

The application is responsible for deciding what to do with the names.

Some possibilities:

Adopt industry standard document formats and naming conventions

This approach works at the document level, a good example is

ebXML, refer to

Problems:

No industry is an island, industries interact: who decides?

Naming standards down to the element/attribute level are too brittle

Use verbose element names, that is, bookTitle, courtesyTitle

Problem: naming becomes fundamentally difficult, there is no way to

know if a name is already in use, further, the data and/or its model

may not belong to the consuming application.

Solution

Use some name qualifier that is already established as unique, that is, a

domain-name-qualified URI (uniform resource identifier).

Domain names are already managed and maintained as unique. This

approach was developed into XML Namespaces.

</div>
(179)<div class='page_container' data-page=179>


Figure 6-4. Namespaces: The Big Idea XM3014.1

Notes:

URI, recall, is uniform resource identifier.

Namespaces: The Big Idea

In concept, each element name and attribute name could be

expressed as: URI+name, for example, <title> might become:

< />

There are two problems with this format:

1. It is not well-formed XML under the 1.0 specification.

2. It is a lot of typing.

If it were possible to create a synonym for the URI and replace

occurrences of the URI with that synonym, the amount of typing

would be reduced and, if handled correctly, the result would be

compatible with XML 1.0

For example, specify books=" and

code the element as <books:title>

This concept forms the basis of the XML Namespace specification.

</div>
(180)<div class='page_container' data-page=180>


Figure 6-5. XML Namespaces XM3014.1

Notes:

The application is responsible for deciding what to do with the names.

XML Namespaces

The Namespace specification

refers to these two-part names

as Qualified Names or QNames

For the purposes of XML namespaces, URIs are considered

identical when they match character for character. If URIs are

different, they represent different Namespaces.

Note: There is no network lookup associated with the use of URIs

in this specification, it is a lexical convention only.

URIs are not checked by the processor to ensure they exist.

The Namespace specification deals with the mechanics of

associating a URI qualifier (aka namespace) with element and

</div>
(181)<div class='page_container' data-page=181>


Figure 6-6. Qualified Names (QNames) XM3014.1

Notes:

You can think of a QName like <books:title> as being equivalent to the following Clark
Notation:

{ />

Qualified Names (QNames)

QNames are used in place of element and attribute names.

QNames have a prefix and a local part - they look like this:

prefix:localPart

At all times, the prefix should be thought of as shorthand for the

actual URI/namespace.

That is, the above is really < />
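The idea that a prefix is only shorthand for a URI can be seen in a short sketch. It assumes Python's standard xml.etree.ElementTree, which happens to report names in Clark notation ({URI}localPart). The two namespace URIs are hypothetical and differ only in the case of one letter.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs; they differ only in "books" versus "Books".
DOCUMENT = """<shelf xmlns:books="http://example.com/books"
       xmlns:other="http://example.com/Books">
  <books:title>Tom Sawyer</books:title>
  <other:title>Tom Sawyer</other:title>
</shelf>"""

shelf = ET.fromstring(DOCUMENT)

# Each prefix has been replaced by the URI it stands for.
tags = [child.tag for child in shelf]
print(tags)
```

The prefixes have disappeared from the reported names, and the two title elements are in different namespaces because their URIs do not match character for character. Nothing was fetched from either URI.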

prefix

</div>
(182)<div class='page_container' data-page=182>


Figure 6-7. Declaring Namespaces (1 of 2) XM3014.1

Notes:

Note that you can declare a namespace on any element that you like, not just the root
element.

The syntax of a namespace declaration is:

<prefix:elementName xmlns:prefix='URI'/>

The following example declares the namespace, assigns it a prefix of 'books',
and identifies the book element as a member of that namespace.

<books:book xmlns:books=' />

Attributes may also be assigned to a namespace. As with elements,
attributes are prefixed as follows:

<books:book xmlns:books=' />
books:hardcover='true'/>

Attributes are not automatically in a namespace

</div>
(183)<div class='page_container' data-page=183>


Figure 6-8. Declaring Namespaces (2 of 2) XM3014.1

Notes:

Now let's look at an example with nested elements.

Declaring Namespaces (2 of 2)

Suppose a document without namespaces looked like:

<title>Tom Sawyer</title>
</book>

One way to use a namespace is:

<books:book

xmlns:books=' />

books:hardcover='true'>

<books:title

xmlns:books=' /> Tom Sawyer

</books:title>
</books:book>

It is clear that declaring the namespace on every single element

becomes unwieldy (and error prone).

</div>
(184)<div class='page_container' data-page=184>


Figure 6-9. Namespace Scope XM3014.1

Notes:

Note that every element or attribute name that is in the namespace has the appropriate
namespace prefix in front of it.

Namespace Scope

When a namespace prefix is declared, it remains in scope for:

Attributes of the element where it is declared.

Child elements (and their attributes) of the element where it is

declared.

Unless the prefix is redefined on a nested element.

QNames are still required, the namespace is not assumed.

Applying this technique, the previous example becomes:

<books:book

xmlns:books=' />

books:hardcover='true'>

<books:title>

</div>
(185)<div class='page_container' data-page=185>


Figure 6-10. Default Namespaces XM3014.1

Notes:

Once you have specified the default namespace, all unprefixed elements in the scope of
the default declaration are assumed to be in the namespace specified as the default. It is
very important to note that default namespace declarations only apply to element names,
not attribute names.

In our example, we set the books namespace to the default and get rid of all the prefixes on
element names. We still need the prefix on the attribute names because default
namespaces don't apply to attributes.

Default Namespaces

For situations where a majority of elements are associated with the same namespace, a default namespace may be declared.

Syntax:

<elementName xmlns='URI' />

 • QNames are used to identify nested elements that are from a different namespace.
 • The default may be respecified for each element scope.
 • Nesting is respected; that is, respecification does not influence the outer scope containing the nested elements.

</div>
(186)<div class='page_container' data-page=186>


Figure 6-11. Example - Default Namespaces XM3014.1

Notes:

The result of these apparent duplications is to put the hardcover attribute inside a
namespace.

Example - Default Namespaces

<book xmlns=' />
      xmlns:books=' />
      books:hardcover='true'>
  <title>Tom Sawyer</title>
</book>
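
The following Python sketch shows why the same URI is declared twice in such a document. The URI `http://example.com/books` and the extra `edition` attribute are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

BOOKS_NS = "http://example.com/books"  # hypothetical URI

# The same URI is bound as the default namespace and to the 'books' prefix.
# 'edition' is a made-up unprefixed attribute used for contrast.
doc = f"""
<book xmlns='{BOOKS_NS}' xmlns:books='{BOOKS_NS}'
      books:hardcover='true' edition='2'>
  <title>Tom Sawyer</title>
</book>
"""

book = ET.fromstring(doc)
title = book[0]

print(book.tag)           # unprefixed element picks up the default namespace
print(title.tag)          # so does its unprefixed child
print(sorted(book.attrib))  # only the prefixed attribute is namespace-qualified
```

The unprefixed elements land in the default namespace, but the unprefixed attribute does not. Only the second, prefixed declaration lets the `hardcover` attribute be placed in the namespace.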

</div>
(187)<div class='page_container' data-page=187>


Figure 6-12. Documents with Multiple Namespaces XM3014.1

Notes:

All that we did to enable this was add two more namespace declarations, and then add the
new elements and use the appropriate namespace prefix.

In the case of the isbn element, we declared the namespace that it needed on the element
itself; you can declare namespaces on any element that you like, not just the root
element. When you do this, the declaration is only good for the element it was declared on
and that element's content. You can also change the default namespace for a particular
element by redefining the default namespace on that element. Again, the scope will be the
element that the declaration is attached to, together with its content.

Documents with Multiple Namespaces

Document with three namespaces:

<book xmlns=' />
      xmlns:amazon=' />
  <title>Tom Sawyer</title>
  <isbn xmlns=' />
    0140390839
  </isbn>
  <amazon:skuNo>A25</amazon:skuNo>
</book>
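
An application reading such a document looks elements up by namespace URI, not by the prefix the author happened to choose. Below is a rough Python sketch using ElementTree; all three URIs are hypothetical placeholders, and the lookup prefixes `bk` and `id` are deliberately different from anything used in the document.

```python
import xml.etree.ElementTree as ET

# All three URIs are hypothetical placeholders.
BOOKS_NS = "http://example.com/books"
ISBN_NS = "http://example.com/isbn"
AMAZON_NS = "http://example.com/amazon"

doc = f"""
<book xmlns='{BOOKS_NS}' xmlns:amazon='{AMAZON_NS}'>
  <title>Tom Sawyer</title>
  <isbn xmlns='{ISBN_NS}'>0140390839</isbn>
  <amazon:skuNo>A25</amazon:skuNo>
</book>
"""

# Prefixes used for lookup are local to this code; only the URIs must match.
lookup = {"bk": BOOKS_NS, "id": ISBN_NS, "amazon": AMAZON_NS}

root = ET.fromstring(doc)
title = root.find("bk:title", lookup).text
isbn = root.find("id:isbn", lookup).text
sku = root.find("amazon:skuNo", lookup).text

print(title, isbn, sku)

# An unqualified search finds nothing: every element here is in a namespace.
print(root.find("title"))
```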

</div>
(188)<div class='page_container' data-page=188>


Figure 6-13. Elements with No Namespace XM3014.1

Notes:

The unprefixed <title> element is in no namespace, because there is no default null
namespace.

In order to repair this example, we need to prefix title with the books namespace prefix
again.

WRONG!

Elements with No Namespace

What happens to the previous example with no default namespace?

<book xmlns=' />
      xmlns:amazon=' />
  <title>Tom Sawyer</title>
  <isbn xmlns="">
    0140390839
  </isbn>
  <amazon:skuNo>A25</amazon:skuNo>
</book>

The xmlns="" syntax resets the default namespace for the scope in which it occurs. The <isbn> element is not in a namespace.
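
The effect of `xmlns=""` can be observed with a small Python sketch. The namespace URI is a hypothetical stand-in for the one used on the slide.

```python
import xml.etree.ElementTree as ET

BOOKS_NS = "http://example.com/books"  # hypothetical URI

doc = f"""
<book xmlns='{BOOKS_NS}'>
  <title>Tom Sawyer</title>
  <isbn xmlns=''>0140390839</isbn>
</book>
"""

root = ET.fromstring(doc)

# Elements in a namespace show up as "{URI}name"; isbn shows up bare.
tags = [elem.tag for elem in root.iter()]
for tag in tags:
    print(tag)
```

Here `book` and `title` carry the default namespace, while `isbn` is reported with a bare name because the empty declaration removed the default for its scope.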

</div>
(189)<div class='page_container' data-page=189>


Figure 6-14. Attributes and Namespaces XM3014.1

Notes:

There are two interacting rules that affect attributes and namespaces:
 • Attributes are not affected by a default namespace declaration.
 • Attributes on a single element must be unique.

In the example above, the first <invalid> element is in error because it has two unprefixed
att attributes. In the second <invalid> element the two attributes are the same because ns1
and ns2 are two prefixes for the same namespace URI. Therefore, the two attribute names are
identical.

It should be obvious that the first <valid> element is valid: a and b are unprefixed, and a is
not the same as b. The second <valid> element is valid because the unprefixed attribute a
is in no namespace (remember that default namespace declarations don't affect attributes),
and the ns1:a attribute is in the namespace bound to ns1; they are in different
namespaces.

Attributes and Namespaces

Attributes are not affected by a default namespace declaration.

Attributes on a single element must be unique.

<bad xmlns:ns1=""
     xmlns:ns2="">
  <invalid att="1" att="2" />
  <invalid ns1:att="1" ns2:att="2" />
</bad>
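
Both rules can be exercised with a namespace-aware parser. The sketch below uses Python's ElementTree, which is built on the expat parser; the URI `http://example.com/ns`, the element name `e`, and the helper `parses` are all hypothetical names introduced for this illustration.

```python
import xml.etree.ElementTree as ET

NS_URI = "http://example.com/ns"  # hypothetical URI


def parses(xml_text: str) -> bool:
    """Return True if the namespace-aware parser accepts xml_text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# The default namespace, ns1, and ns2 are all bound to the same URI.
decls = f"xmlns='{NS_URI}' xmlns:ns1='{NS_URI}' xmlns:ns2='{NS_URI}'"

same_plain_name = f"<e {decls} att='1' att='2'/>"
same_expanded_name = f"<e {decls} ns1:att='1' ns2:att='2'/>"
different_local_names = f"<e {decls} a='1' b='2'/>"
plain_vs_qualified = f"<e {decls} a='1' ns1:a='2'/>"

for label, text in [
    ("same plain name", same_plain_name),
    ("same expanded name", same_expanded_name),
    ("different local names", different_local_names),
    ("plain vs qualified", plain_vs_qualified),
]:
    print(label, "->", "accepted" if parses(text) else "rejected")
```

The last case is accepted even though the default namespace uses the same URI as `ns1`, because the unprefixed attribute is not placed in the default namespace.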

</div>
(190)<div class='page_container' data-page=190>


Figure 6-15. Namespace Processing XM3014.1

Notes:

How does an XML parser deal with namespaces?

 • Needs the right API:
    • SAX2
    • DOM Level 2
 • The parser simply reports the prefix, localName, and URI associated with the element or attribute.
 • It's up to your application to decide what to do.
 • There are no validation rules associated with Namespaces - it depends on XML Schema, DTD, or whatever grammar description language you are using.
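
As one illustration of a namespace-aware API, Python's `xml.dom.minidom` exposes the DOM Level 2 properties `prefix`, `localName`, and `namespaceURI` on element and attribute nodes. The sketch below is only an example of reading those three values; the namespace URI is a hypothetical placeholder.

```python
from xml.dom import minidom

BOOKS_NS = "http://example.com/books"  # hypothetical URI

doc = minidom.parseString(
    f"<books:book xmlns:books='{BOOKS_NS}' books:hardcover='true'>"
    "<books:title>Tom Sawyer</books:title>"
    "</books:book>"
)

book = doc.documentElement
# Namespace-aware attribute lookup: by URI and local name, not by prefix.
attr = book.getAttributeNodeNS(BOOKS_NS, "hardcover")

# The parser only reports the three pieces; interpreting them is up to us.
for node in (book, book.firstChild, attr):
    print(node.prefix, node.localName, node.namespaceURI)
```

Nothing in this output says whether `title` is allowed inside `book`; that decision belongs to the application or to a grammar such as a DTD or XML Schema.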

</div>
(191)<div class='page_container' data-page=191>


Figure 6-16. Example: Use of Namespaces XM3014.1

Notes:

Here's an example of namespaces in use:

Here we have an imaginary record that might be used in an airline's airplane fleet inventory.
For each airplane, we want to know which manufacturer provided each major part of the
airplane.

This example shows how we could use namespaces to identify which components came
from which manufacturers.

An application that processed this document could then use the namespaces to determine
which manufacturer's diagnostic equipment would be needed to perform a full maintenance
cycle on a particular airplane.

While not required, it is a best practice to collect all the namespace definitions in one place,
especially in large, composite files.
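
To make the idea concrete, here is a rough Python sketch of how an application might group the parts of an airplane record by namespace. Every element name, attribute, and URI in it is invented for this illustration and is not taken from the course example. All namespace declarations are collected on the root element, following the practice described above.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical record: element names and manufacturer URIs are invented.
doc = """
<airplane xmlns='http://example.com/airline/fleet'
          xmlns:eng='http://example.com/engine-maker'
          xmlns:av='http://example.com/avionics-maker'>
  <tailNumber>N12345</tailNumber>
  <eng:engine position='left'/>
  <eng:engine position='right'/>
  <av:radar/>
</airplane>
"""


def namespace_of(tag: str) -> str:
    """Extract the URI from ElementTree's '{URI}localName' form."""
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


def local_name(tag: str) -> str:
    """Strip the '{URI}' part, leaving only the local name."""
    return tag.split("}", 1)[-1]


# Map each namespace URI to the local names of the elements found in it.
parts_by_namespace = defaultdict(list)
for elem in ET.fromstring(doc).iter():
    parts_by_namespace[namespace_of(elem.tag)].append(local_name(elem.tag))

for uri, names in parts_by_namespace.items():
    print(uri, names)
```

A maintenance application could then use the URIs that appear as keys to decide which manufacturer's diagnostic equipment is needed.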

Example: Use of Namespaces

Composition of a particular airplane in an airline fleet:

</div>
(192)<div class='page_container' data-page=192>


Figure 6-17. Problems with Namespaces XM3014.1

Notes:

Namespace recommendation after XML 1.0 - because the namespace recommendation
came after XML 1.0, it's not really part of the spec. This means there are places where
namespaces and XML 1.0 don't fit together.

DTDs don't really integrate well - we've shown you an ad hoc solution for using a fixed
set of namespaces with a DTD, but that solution doesn't really satisfy a lot of desires that
users have for namespaces.

Testing equality of namespaces is a pain - there's no easy way to test equality of two
namespaces except to get the two namespace URIs and compare them character by
character.
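
The character-by-character comparison can be seen in a short Python sketch. The URIs are hypothetical, and the helper `expanded_root_name` is a name introduced only for this example.

```python
import xml.etree.ElementTree as ET


def expanded_root_name(xml_text: str) -> str:
    """Return the root element's name in '{URI}localName' form."""
    return ET.fromstring(xml_text).tag


# Hypothetical URIs; the prefix differs in the first two documents,
# the URI differs slightly in the last two.
a = expanded_root_name("<p:book xmlns:p='http://example.com/books'/>")
b = expanded_root_name("<q:book xmlns:q='http://example.com/books'/>")
c = expanded_root_name("<p:book xmlns:p='http://example.com/Books'/>")
d = expanded_root_name("<p:book xmlns:p='http://example.com/books/'/>")

print(a == b)  # same URI, different prefix
print(a == c)  # URI differs only in letter case
print(a == d)  # URI differs only by a trailing slash
```

Only the first comparison is true. The prefix plays no part in the identity of a name, and two URIs that a web server might treat as equivalent still identify different namespaces if their characters differ.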

Problems with Namespaces

Namespace recommendation after XML 1.0.

DTDs don't integrate well.

 • Must use QNames as element names in DTDs (remember, ":" is legal in an element name).
 • If the prefix changes, the DTD must also change.

</div>
(193)<div class='page_container' data-page=193>


Figure 6-18. Best Practices XM3014.1

Notes:

When to use namespaces:

 • When you think your DTD/Schema will be used outside your organization.
 • When you think you will need to combine your DTD/Schema with other grammars.
 • As a practical note, this means that anybody doing serious grammar work really ought
   to be using namespaces.

Performance implications:

 • Namespace processing slows down the parser and increases memory usage. The
   parser needs to look at all the namespace declarations and QNames. Even if you turn
   off namespace processing in your parser, there will still be a performance impact
   because your input document will still be larger (because of namespace declarations
   and QNames) than if you were not using namespaces.

Don't use relative URIs for namespace identifiers; they have been deprecated since the
namespaces recommendation was published.

Best Practices

When to use namespaces

 • When the data requires uniqueness for application processing.
 • When a schema needs to be combined with other grammars.

Performance implications

 • Namespace processing may slow down the parser and/or increase memory use.

Don't use relative URIs for namespace identifiers.

Pick the default namespace carefully.

Don't declare more than one prefix for a namespace URI.

Be careful with attributes when using namespaces.

</div>
(194)<div class='page_container' data-page=194>


choose carefully.

Don't declare more than one prefix for a namespace URI - there's no reason to do it and it
will confuse other people who read the document.

</div>
(195)<div class='page_container' data-page=195>


Figure 6-19. Status of Namespaces XM3014.1

Notes:

Namespaces in XML Recommendation 1/1999 - it is a stable recommendation.
Supported by most parsers relative to DTDs.

Much better support with XML Schema.

Namespaces are ready for use, especially now that XML Schema has reached
recommendation status.

Status of Namespaces

XML namespaces became a recommendation of the W3C on January 14, 1999.

</div>
(196)<div class='page_container' data-page=196>


Figure 6-20. More Information XM3014.1

Notes:

More Information

Reference                 Description

NamespacesFAQ.htm         XML Namespaces FAQ

namespaces/index.html     XML.com article about Namespace Myths

                          James Clark's notes on XML Namespaces

</div>
(197)<div class='page_container' data-page=197>


Figure 6-21. Checkpoint Questions XM3014.1

Notes:

Checkpoint Questions

1. Which is true of XML namespaces?

(Select all that apply)

a. They are stored in an Internet-based registry.

b. They are associated with URIs.

c. They are integrated with DTDs.

d. They are integrated with XML Schema.

2. An XML namespace prefix (Select all that apply):

a. Links to a schema definition.

b. Is scoped to the element where it is defined.

c. Is shorthand for a URI.

d. Can stand for more than one namespace.

3. Default namespaces apply to:

a. Elements

b. Attributes

c. Elements and attributes

</div>
(198)<div class='page_container' data-page=198>


Figure 6-22. Unit Summary XM3014.1

Notes:

Unit Summary

Having completed this unit, you should understand:

The reasons for using namespaces

The syntax used in namespaces

The use of default namespaces

The interaction between namespaces and attributes

Problems with namespaces

</div>
(199)<div class='page_container' data-page=199>


Unit 7. XML Schema

What This Unit is About

This unit presents an introduction to the essential features of the W3C
XML Schema language.

What You Should Be Able to Do

After completing this unit, you should be able to:

• List and describe the reasons for using XML Schemas
• List the key new features of Schemas

• Define the grammar rules of an XML document using the syntax of
XML Schemas

• List and define the best practices to use when using XML Schemas
• Describe the status of XML Schemas in the industry

How You Will Check Your Progress

Accountability:

</div>
(200)<div class='page_container' data-page=200>

