CSC 330 E-Commerce
Teacher
Ahmed Mumtaz Mustehsan
GM-IT CIIT Islamabad
Virtual Campus, CIIT
COMSATS Institute of Information Technology
T2-Lecture-5
eXtensible Markup Language
(XML)
Part - III
For Lecture Material/Slides Thanks to: www.w3schools.com
Objectives
Part 1:
Review the basics of creating an XML document
Part 2:
Imposing structure on XML documents using Document Type Definitions (DTDs)
Part 3:
Strengthening the data-modeling capabilities of XML using XML Schemas
Part 1:
Review the basics of creating an XML document
Part 1: A review of XML
An Extensible Markup Language (XML) document describes the structure of data.
XML and HTML have a similar syntax; both are derived from SGML.
XML has no mechanism to specify the format for presenting data to the user.
An XML document resides in its own file with an ‘.xml’ extension.
Main Components of an XML Document
Elements: <hello>
Attributes: <item id="33905">
Entities: &lt; (<)
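The three components can be seen together in one small document. This is a minimal sketch: hello, item and the id value come from the slide, while the note element and its text are assumptions added only for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Element: hello is the root element and contains everything else -->
<hello>
  <!-- Attribute: id="33905" is a name/value pair on the item start tag -->
  <item id="33905">
    <!-- Entity: &lt; stands for the reserved character < inside text -->
    <note>price &lt; 100</note>
  </item>
</hello>
```

A parser that reads the note element reports its text as "price < 100", because the entity reference is replaced during parsing.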
The Basic Rules
XML is case sensitive
All start tags must have end tags
Elements must be properly nested
The XML declaration, if present, must be the first statement
Every document must contain exactly one root element
Attribute values must be enclosed in quotation marks
Certain characters (such as < and &) are reserved for parsing
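The rules above can be checked against a short well-formed document. This is only a sketch; the element names catalog, item and title are hypothetical and do not come from the lecture.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- The declaration above is the first statement in the file -->
<catalog>                                <!-- the single root element -->
  <item id="1">                          <!-- attribute value in quotation marks -->
    <title>Pens &amp; Pencils</title>    <!-- a reserved character written as an entity -->
  </item>                                <!-- every start tag has a matching end tag -->
  <Item id="2"/>                         <!-- Item and item are different names: XML is case sensitive -->
</catalog>
```

By contrast, a fragment such as <item><title>text</item></title> would be rejected by any XML parser, because the two elements overlap instead of nesting.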
XML is different from HTML
HTML is a HyperText Markup Language
◦Designed for a specific application, namely, displaying, viewing, presenting and linking hypertext documents
XML describes structure (organization of data) and content (“semantics”, the interpretation of data)
XML is a subset of SGML (Standard Generalized Markup Language)
An Address Book as an XML document
<addresses>
  <person>
    <name> Ahmed Mumtaz</name>
    <tel> 9251-2233-44 </tel>
    <email> </email>
  </person>
  <person>
    <name> Malik Riaz khan</name>
    <tel> 9251-123-4450 </tel>
    <email></email>
  </person>
</addresses>
Important Features of XML
No fixed set of tags
Users are allowed to introduce new tags
An already defined set of tags can also be used
Namespaces facilitate uniform and coherent descriptions of data
For example, a namespace for address books determines which tag to use, such as <tel> or <mobile>
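As a rough illustration of how a namespace ties tag names to one agreed vocabulary, the sketch below qualifies the address book tags with a prefix. The prefix ab and the URI http://example.org/addressbook are hypothetical; a real address book vocabulary would publish its own namespace name.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- The namespace URI below is a made-up identifier, used only for illustration -->
<ab:addresses xmlns:ab="http://example.org/addressbook">
  <ab:person>
    <ab:name>Ahmed Mumtaz</ab:name>
    <!-- ab:tel is the telephone tag as defined by this (assumed) vocabulary -->
    <ab:tel>9251-2233-44</ab:tel>
  </ab:person>
</ab:addresses>
```

Two applications that agree on the same namespace name can then exchange address books without ambiguity about what each tag means.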
Features of XML (cont’d)
XML supports internationalization through Unicode
Web services (e.g., e-commerce) require exchanging data between various applications that run on different platforms.
XML (with the support of namespaces) is the best option for data exchange on the Web.
XML is a data model
◦Similar to the semi-structured data model
The structure of XML documents can be specified with a DTD or with the more expressive XML Schema
XML family
Limited styling of XML can be done with CSS
Document Type Definitions (DTDs) impose structure on XML documents
XML Schemas strengthen the data-modeling capabilities of XML
XPath is a language for accessing parts of XML documents
XLink and XPointer support cross-references
XSLT is a language for transforming XML documents into other XML documents, such as XHTML for viewing XML files
XQuery is a language for querying XML documents
Part 2:
Imposing Structure on XML Documents using DTD (Document Type Definition)
XML defines the structure of the document
Some XML files contain only text documents, with tags that carry metadata and describe the structure
Example:
<book year="2011">
  <title> e-Commerce Business, Technology and Society </title>
  <author>
    <last>Laudon</last>
    <first>Kenneth</first>
  </author>
  <publisher>PEARSON</publisher>
  <price>78.99</price>
</book>
Document Type Definitions
Document Type Definitions (DTDs) impose structure on XML documents
There is some relationship between a DTD and a schema, but it is not close; hence the need for additional “typing” systems.
The DTD is a syntactic specification
Document Type Definitions
A description of legal, valid data further contributes to the interoperability and efficiency of using XML
A single DTD ensures a common format for each XML document that references it.
An application can use a standard DTD to verify that data it receives from the outside world is valid.
An XML document that conforms to the rules within a DTD is said to be a valid document.
If the XML document does not follow the rules contained within the DTD, a validating parser generates an error.
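To make the idea of validity concrete, here is a minimal sketch that attaches a DTD to the book example from the earlier slide. The DTD itself is an assumption written for this illustration (the lecture does not give one), and it uses an internal subset so that the rules and the data sit in one file.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Assumed DTD for the book example; declared inline via DOCTYPE -->
<!DOCTYPE book [
  <!ELEMENT book (title, author, publisher, price)>
  <!ATTLIST book year CDATA #REQUIRED>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT author (last, first)>
  <!ELEMENT last (#PCDATA)>
  <!ELEMENT first (#PCDATA)>
  <!ELEMENT publisher (#PCDATA)>
  <!ELEMENT price (#PCDATA)>
]>
<book year="2011">
  <title>e-Commerce Business, Technology and Society</title>
  <author>
    <last>Laudon</last>
    <first>Kenneth</first>
  </author>
  <publisher>PEARSON</publisher>
  <price>78.99</price>
</book>
```

This document is valid with respect to the assumed DTD. If the price element were removed, or placed before publisher, the document would still be well-formed but a validating parser would report an error.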
Motivation
A DTD adds syntactical requirements in addition to the well-formed requirement.
It helps in eliminating errors when creating or editing XML documents.
It clarifies the intended semantics.
It simplifies the processing of XML documents.
An Example
In an address book, where can a phone number appear?
Under <person>, under <name> or under both?
If we have to check for all possibilities, processing takes longer and it may not be clear to whom a phone number belongs.
Example: An Address Book
<person>
  <name> Homer Simpson </name>             <!-- Exactly one name -->
  <greet> Dr. H. Simpson </greet>          <!-- At most one greeting -->
  <addr>1234 Springwater Road </addr>      <!-- As many address lines as needed (in order) -->
  <addr> Springfield USA, 98765 </addr>
  <tel> (321) 786 2543 </tel>              <!-- Mixed telephones and faxes -->
  <fax> (051) 786 2544 </fax>
  <tel> (051) 786 2544 </tel>
  <email> </email>                         <!-- As many as needed -->
</person>
Specifying the Structure
name            to specify a name element
greet?          to specify an optional (0 or 1) greet element
name, greet?    to specify a name followed by an optional greet
addr*           to specify 0 or more address lines
tel | fax       a tel or a fax element
(tel | fax)*    0 or more repeats of tel or fax
email*          0 or more email elements
Specifying the Structure (cont’d)
So the whole structure of a person entry is specified by
name, greet?, addr*, (tel | fax)*, email*
This is known as a regular expression
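The regular expression above can be written directly as the content model of a person element. The sketch below assumes that every leaf element holds plain text, and it wraps the entry in an addresses root as in the earlier address book slide; both choices are assumptions, not part of the slide.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE addresses [
  <!-- Assumed root: any number of person entries -->
  <!ELEMENT addresses (person*)>
  <!-- The content model is the regular expression from the slide -->
  <!ELEMENT person (name, greet?, addr*, (tel | fax)*, email*)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT greet (#PCDATA)>
  <!ELEMENT addr (#PCDATA)>
  <!ELEMENT tel (#PCDATA)>
  <!ELEMENT fax (#PCDATA)>
  <!ELEMENT email (#PCDATA)>
]>
<addresses>
  <person>
    <name>Homer Simpson</name>
    <greet>Dr. H. Simpson</greet>
    <addr>1234 Springwater Road</addr>
    <addr>Springfield USA, 98765</addr>
    <tel>(321) 786 2543</tel>
    <fax>(051) 786 2544</fax>
    <tel>(051) 786 2544</tel>
  </person>
</addresses>
```

The entry has no email element, which is allowed because email* accepts zero occurrences. Moving a tel element above name, or adding a second name, would make the document invalid.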
Element Type Definition
For each element type E, a declaration of the form:
<!ELEMENT E P>
where P is a regular expression, i.e.,
P ::= EMPTY | ANY | #PCDATA | E’ | P1, P2 | P1 | P2 | P? | P+ | P*
◦ E’: element type
◦ P1 , P2: concatenation
◦ P1 | P2: disjunction
◦ P?: optional
◦ P+: one or more occurrences
◦ P*: 0 or more occurrences
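The following sketch exercises each form of P in one small DTD. All element names (order, customer, item, tag, note, cash, card) are hypothetical and chosen only for illustration. Note that in concrete DTD syntax #PCDATA is written inside parentheses.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- All element names here are hypothetical -->
<!DOCTYPE order [
  <!-- Sequence (P1, P2), P+, P*, P? and a choice (P1 | P2) in one content model -->
  <!ELEMENT order (customer, item+, tag*, note?, (cash | card))>
  <!ELEMENT customer (#PCDATA)>   <!-- character data only -->
  <!ELEMENT item (#PCDATA)>
  <!ELEMENT tag (#PCDATA)>
  <!ELEMENT note ANY>             <!-- any mixture of text and declared elements -->
  <!ELEMENT cash EMPTY>           <!-- no content at all -->
  <!ELEMENT card (#PCDATA)>
]>
<order>
  <customer>Ahmed</customer>
  <item>Book</item>
  <item>Pen</item>
  <cash/>
</order>
```

The instance is valid: it has one customer, two item elements (item+ needs at least one), no tag and no note (both may be absent), and exactly one of cash or card.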
Summary of Regular Expressions
A           The tag (i.e., element) A occurs
e1,e2       The expression e1 followed by e2
e*          0 or more occurrences of e
e?          Optional: 0 or 1 occurrences
e+          1 or more occurrences
e1 | e2     either e1 or e2
(e)         grouping
The Definition of an Element Consists of
Exactly One of the Following
A regular expression (as defined earlier)
EMPTY means that the element has no content
ANY means that content can be any mixture of PCDATA and elements defined in the DTD
Mixed content, which is defined as described on the next slide
(#PCDATA) means that the element contains only character data
The Definition of Mixed Content
Mixed content is described by a repeatable OR group
(#PCDATA | element-name | …)*
Inside the group, no regular expressions are allowed; just element names
#PCDATA must come first, followed by 0 or more element names, separated by |
The trailing * means the group can be repeated 0 or more times
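A short sketch of mixed content follows. The element names para, em and code are hypothetical; the point is only the shape of the declaration, with #PCDATA first and the whole group followed by *.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE para [
  <!-- Mixed content: text interleaved with em and code, in any order, any number of times -->
  <!ELEMENT para (#PCDATA | em | code)*>
  <!ELEMENT em (#PCDATA)>
  <!ELEMENT code (#PCDATA)>
]>
<para>A <em>valid</em> document conforms to its <code>DTD</code>, and text may appear between the child elements.</para>
```

Because the group only lists names, a mixed content model cannot require that em appears before code, nor limit how often either one occurs.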