Web Technology Book

1. Introduction to Internet
1.1. Introduction
By the turn of the century, information, including access to the Internet, will be the
basis for personal, economic, and political advancement. The popular name for the
Internet is the information superhighway. Whether you want to find the latest financial
news, browse through library catalogs, exchange information with colleagues, or join in a
lively political debate, the Internet is the tool that will take you beyond telephones, faxes,
and isolated computers to a burgeoning networked information frontier.
The Internet supplements the traditional tools you use to gather information, data,
graphics, and news, and to correspond with other people. Used skillfully, the Internet shrinks
the world and brings information, expertise, and knowledge on nearly every subject
imaginable straight to your computer.
1.2. What is the Internet?
The Internet links computer networks all over the world so that users can share
resources and communicate with each other. Some computers, such as those at
universities, have direct access to all the facilities on the Internet. Other computers,
e.g. privately owned ones, have indirect links through a commercial service provider,
which offers some or all of the Internet facilities. In order to be connected to the
Internet, you must go through a service provider. Many options are offered at monthly
rates. Depending on the option chosen, access time may vary.
The Internet is what we call a metanetwork, that is, a network of networks that spans
the globe. It's impossible to give an exact count of the number of networks or users that
comprise the Internet, but it is easily in the thousands and millions respectively. The
Internet employs a set of standardized protocols which allow for the sharing of resources
among different kinds of computers that communicate with each other on the network.
These standards, sometimes referred to as the Internet Protocol Suite, are the rules that
developers adhere to when creating new functions for the Internet.
The Internet is also what we call a distributed system; there is no central archive.
Technically, no one runs the Internet. Rather, the Internet is made up of thousands of
smaller networks. The Internet thrives and develops as its many users find new ways to
create, display and retrieve the information that constitutes the Internet.

1.3. History & Development of the Internet
In its infancy, the Internet was originally conceived by the Department of Defense
as a way to protect government communications systems in the event of a military
strike. The original network, dubbed ARPANet (for the Advanced Research Projects
Agency that developed it) evolved into a communications channel among contractors,
military personnel, and university researchers who were contributing to ARPA
projects.
The network employed a set of standard protocols to create an effective way for
these people to communicate and share data with each other.
ARPAnet's popularity continued to spread among researchers, and in the 1980s the
National Science Foundation, whose NSFNet linked several high-speed computers,
took charge of what had come to be known as the Internet.
By the late 1980s, thousands of cooperating networks were participating in the
Internet.
In 1991, the U.S. High Performance Computing Act established the NREN (National
Research & Education Network). NREN's goal was to develop and maintain high-
speed networks for research and education, and to investigate commercial uses for the
Internet.
The rest, as they say, is history in the making. The Internet has been improved
through the developments of such services as Gopher and the World Wide Web.
Even though the Internet is predominantly thought of as a research oriented network,
it continues to grow as an informational, creative, and commercial resource every day
and all over the world.
Birth of the Internet - Key dates:
• 1960s: Most universities and government agencies had individual mainframe
computers, which were not interconnected.
• 1968: The National Physical Laboratory in Great Britain was the first to set up a
test network.
• 1969: The Pentagon's Advanced Research Projects Agency (ARPA) brought up the
first infant network, with four nodes, which was named ARPANET.
• 1971: There were fifteen nodes in ARPANET.
• 1972: ARPANET had grown to thirty-seven nodes.
• 1980s: NSFNet (the National Science Foundation network).

1.4. Features of internet
a. Key Web Features
b. Key Usenet Newsgroups Features
c. Key Email Features
d. Key Mailing List Features
a) Key Web Features
• Ease Of Use
• Universal Access
• Search Capabilities
The web leverages the key features of the Internet and makes them widely accessible
to the public. Key features of the web in particular are its ease of use, universal
accessibility, and ability to be quickly searched:
• Ease of use. The web can be immediately used by anyone already familiar with a
computer window. The only special features are links, which are as natural and
intuitive to use as pressing a button. This ease of use enabled the rapid adoption of
the web in the 1990s, and led to the establishment of the Internet around the
world.
• Universal access. The open design of the web makes it easy to build web
browsers for a wide range of devices. Web browsers have been deployed on cell
phones and personal organizers, and the web is now the standard interface for
providing access to information.
• Search capabilities. The development of search sites greatly multiplied the power
and usefulness of the web by providing the capability to effectively search the
content of millions of web pages in seconds. Search sites significantly enabled the
web to realize Vannevar Bush's vision of an automated library system.
b) Key Usenet Newsgroups Features
• Group Communications
• Common Space
• Group communications. The Usenet is a powerful facilitator of group
communication across time and geographic space. One person can post a message
on the Usenet, another person can reply to it, and a third person can reply to either
message, no matter where they are in the world, and whenever is convenient to
them.
Usenet messages are organized in newsgroups and threads that are stored
in Usenet archives indefinitely for later retrieval by anyone who wishes to access
them, even years later, connecting people across generations.
Mailing lists, IRC, and MUDs also provide group communications,
although on a lesser scale.
• Common space. The Usenet is the second largest common public space in
existence, next to the Internet itself. Anyone can post anything they wish to any
newsgroup, and anyone can read any message they wish from any newsgroup.
Like most common spaces, the Usenet therefore reflects the best and worst
of human nature, from community newsgroups where people are focused on
selflessly helping each other, to less worthwhile groups where the postings are
filled with pointless and counterproductive information.
Like most common spaces, messages posted on the Usenet are public property
and can be freely copied and reused in other sources, although Usenet netiquette
mandates that credit should always be provided.
c) Key Email Features
• Email Is A Push Technology
• Email Waits For You
• Email Is One-To-Many
• Email Is A Push Technology
Email is delivered to the recipient so they don't have to work to get it; they just
open their Inbox and there it is.
Technologies are sometimes labeled push or pull as described below:
• Pull . These technologies require the user to actively go and retrieve the
information. A library, the Web, and the Usenet are pull technologies, requiring
active participation of a human being to retrieve the information.
• Push . These technologies deliver information to the user so all they have to do is
receive it. Radio, television, and email are push technologies.
One of the reasons email has been such a big success is that it is a push
technology. The person who sends the email writes it, SMTP transmits it to the
recipient's mail server, and all the recipient has to do is open their email program
(which retrieves it, for example via POP3) and double-click on the email to read it.
An advantage of push technologies is their ease of use: they require a minimum of
effort on the part of the recipient, which greatly supports adoption because they get used
more often. Partly because of this feature, the use of email has greatly outstripped all
other Internet applications since its creation, even after the explosive development of the
Web.
• Email Waits For You
Email is particularly convenient because it is asynchronous; it waits for you and fits
into your schedule instead of demanding that you structure your activities to synchronize
with those you communicate with.
For example, an email recipient doesn't have to be available when you compose and
send an email; you can send it at the time that is most convenient to you. Similarly, you
don't have to be available or even connected to the Internet when someone else sends you
email; it waits on your server until you log in and download it when most convenient to
you.
Email provides the convenience that voice mail later provided for telephones,
except that voice mail is more ephemeral, cannot be conveniently edited, and is usually
accompanied by a preference to talk to the other party in real time. With email you know
that the medium is inherently asynchronous, so you tend to write down all the
information the addressee needs so they can respond when they are able.
• Email Is One-To-Many
You can send an email to several people in one simple action. Communications can be
divided into four types depending on the number of parties participating in the
information transfer: (1) one-to-one, (2) one-to-many, (3) many-to-one, and (4) many-to-many.
Each type of communication has its own attributes and strengths. For example, the
typical phone call is one-to-one, and the typical meeting is many-to-many. Email is the
most successful one-to-many technology, with respect to both sending and receiving:
• Sending . You can send an email to more than one person at a time, for example to
everyone in your family, or to a group of friends.
• Receiving . You can receive information that has been mailed to more than one
person, for example an announcement sent to hundreds of people on a mailing
list.
The key advantage of this one-to-many communication is efficiency: instead of
sending emails individually, you can save large amounts of time by sending one email to
several people at once.
Similarly, when you receive an email from an Internet mailing list you are getting
information that would probably be impractical to receive any other way, since most
organizations don't have the time or resources to send out paper-based notices
individually to hundreds or even thousands of people.
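As a small illustration of sending one email to several people in a single action, the sketch below builds one message addressed to two recipients with Python's standard email package. The function name build_announcement and all addresses are made up for illustration, and the message is only constructed here, not actually sent.

```python
from email.message import EmailMessage


def build_announcement(sender, recipients, subject, text):
    """Build ONE message addressed to every recipient (one-to-many)."""
    msg = EmailMessage()
    msg["From"] = sender
    # A single To: header can carry several addresses, separated by commas.
    msg["To"] = ", ".join(recipients)
    msg["Subject"] = subject
    msg.set_content(text)
    return msg


# Hypothetical addresses, used only for illustration.
announcement = build_announcement(
    "alice@example.com",
    ["bob@example.com", "carol@example.com"],
    "Picnic",
    "Saturday at noon.",
)
```

The sender writes the text once; the number of recipients only changes the address list, not the amount of work.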
d) Key Mailing List Features
• One-to-many communication.
Mailing lists enable powerful one-to-many communications.
Because mailing lists are based on email, they share the key features of email.
Mailing lists also build on that technology to create a new capability called "one-to-many"
functionality, enabling one person to communicate with many people at the same
time. This feature is a virtual Internet extension of the real-world communication power
of a person speaking to a group, except that the members of the audience may be located
anywhere in the world.
The reverse is also true: mailing lists can give one person unique access to the
informed opinion of a diverse group of people on various subjects with little effort; the
email arrives from the groups they have subscribed to, and they then read the ones they
want.
Like so many of the Internet technologies, mailing lists are important primarily
because of their power to bring people around the world together in a single
communication setting.
Features of the Internet
• Decentralized
• Nonproprietary
• Platform independent
• Packet switching
• Self-maintenance
• Democracy
1.5. Types of Connection
• Dial-up connection (telephone line)
• DSL (Digital Subscriber Line) (broadband)
– Asymmetric Digital Subscriber Line (ADSL)
– Symmetric Digital Subscriber Line (SDSL)
– Depending upon the speed:
  • High-data-rate DSL (HDSL)
  • Very-high-bit-rate DSL (VDSL)
• RF link (radio frequency)
• ISDN (Integrated Services Digital Network)
• Cable modem (cable TV line)
1.6. What makes the internet work?
The unique thing about the Internet is that it allows many different computers
to connect and talk to each other. This is possible because of a set of standards,
known as protocols, that govern the transmission of data over the network: TCP/IP
(Transmission Control Protocol/Internet Protocol). Most people who use the Internet
aren't so interested in details related to these protocols. They do, however, want to
know what they can do on the Internet and how to do it effectively.
1.7. The Client/Server Model
The most popular Internet tools operate as client/server systems. When you browse the
Web, you're running a program called a Web client. This piece of software displays
documents for you and carries out your requests. If it becomes necessary to connect to
another type of service (say, to set up a Telnet session, or to download a file), your Web
client will take care of this, too. Your Web client connects (or "talks") to a Web server to
ask for information on your behalf.
The Web server is a computer running another type of Web software which provides
data, or "serves up" an information resource to your Web client.
All of the basic Internet tools including Telnet, FTP, Gopher, and the World Wide
Web are based upon the cooperation of a client and one or more servers. In each case,
you interact with the client program and it manages the details of how data is presented to
you or the way in which you can look for resources. In turn, the client interacts with one
or more servers where the information resides. The server receives a request, processes it,
and sends a result, without having to know the details of your computer system, because
the client software on your computer system is handling those details.
The advantage of the client/server model lies in distributing the work so that each tool
can focus or specialize on particular tasks: the server serves information to many users
while the client software for each user handles the individual user's interface and other
details of the requests and results.
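The division of work described above can be sketched with Python's standard socket module. This is a minimal illustration, not how any particular Internet tool is implemented: the function names serve_once, ask, and demo, the request string, and the reply format are all assumptions made for the example, and both ends run on the local machine.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Server side: wait for one client, read its request, send a reply."""
    conn, _addr = listener.accept()
    with conn:
        request = conn.recv(1024).decode("ascii")
        # The server only processes the request; it knows nothing about
        # how the client will present the result to the user.
        conn.sendall(f"You asked for: {request}".encode("ascii"))


def ask(host: str, port: int, request: str) -> str:
    """Client side: initiate the request, wait for the reply, return it."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(request.encode("ascii"))
        return sock.recv(1024).decode("ascii")


def demo() -> str:
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    worker = threading.Thread(target=serve_once, args=(listener,))
    worker.start()
    try:
        return ask("127.0.0.1", port, "index.html")
    finally:
        worker.join()
        listener.close()
```

Note that the client is the side that initiates the exchange, while the server only waits and replies, matching the characteristics listed for each role.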
Characteristics of a client
• Initiates requests
• Waits for replies
• Receives replies
• Usually connects to a small number of servers at one time
• Typically interacts directly with end-users using a graphical user interface
Characteristics of a server
• Never initiates requests or activities
• Waits for and replies to requests from connected clients
• A server can remotely install/uninstall applications and transfer data to the
intended clients
1.8. Electronic mail on the internet:
Electronic mail, or e-mail, is probably the most popular and widely used Internet
function. E-mail, email, or just mail, is a fast and efficient way to communicate with
friends or colleagues. You can communicate with one person at a time or thousands;
you can receive and send files and other information. You can even subscribe to
electronic journals and newsletters. You can send an e-mail message to a person in
the same building or on the other side of the world.
1.9. How does E-mail Work
E-mail is an asynchronous form of communication, meaning that the person whom
you want to read your message doesn't have to be available at the precise moment you
send your message. This is a great convenience for both you and the recipient.
On the other hand, the telephone, which is a synchronous communication medium,
requires that both you and your listener be on the line at the same time in order for
you to communicate (unless you leave a voice message). It is impossible to
discuss all the details of the many e-mail packages available to Internet users.
Fortunately, however, most of these programs share basic functionality which allows
you to:
• Send and receive mail messages
• Save your messages in a file
• Print mail messages
• Reply to mail messages
• Attach a file to a mail message

1.10. WWW (World Wide Web)
• The WWW was invented in 1989 by Tim Berners-Lee of CERN (the European
Organization for Nuclear Research).
• He devised the basic HTML to use on the web.
• In October 1994 Tim Berners-Lee founded the World Wide Web Consortium (W3C), an
organization for developing nonproprietary Web standards.
WWW (World Wide Web)-Key Terms
Web pages, Web Site, Portal, Web Servers,
Mail Server, File Server, News Server, DNS.
2. HTML 4 Protocols
Protocols Introduction
• A protocol is a set of rules, written as a mutually accepted standard, that two
computers use to communicate with each other.
• Computers use protocols to format their messages consistently so that other
computers can understand them.
A protocol determines the following:
• The type of error checking to be used.
• The data compression method, if any.
• How the sending device will indicate that it has finished sending a message.
• How the receiving device will indicate that it has received a message.
There are a variety of standard protocols from which programmers can choose.
Each has particular advantages and disadvantages; for example, some are simpler
than others, some are more reliable, and some are faster.
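To make the four decisions above concrete, here is a sketch of a made-up toy protocol in Python. It is not one of the standard Internet protocols: the frame layout (a 4-byte length, a zlib-compressed body, a CRC-32 checksum) and the ACK/NAK replies are assumptions chosen only to show where each decision appears.

```python
import struct
import zlib
from typing import Tuple


def encode_frame(message: bytes) -> bytes:
    """Sender side of the toy protocol."""
    body = zlib.compress(message)                   # data compression method
    header = struct.pack("!I", len(body))           # length marks where the message ends
    checksum = struct.pack("!I", zlib.crc32(body))  # error checking
    return header + body + checksum


def decode_frame(frame: bytes) -> Tuple[bytes, bytes]:
    """Receiver side: return (reply, message); reply is b"ACK" or b"NAK"."""
    (length,) = struct.unpack("!I", frame[:4])
    body = frame[4:4 + length]
    (expected,) = struct.unpack("!I", frame[4 + length:8 + length])
    if zlib.crc32(body) != expected:
        return b"NAK", b""                          # tell the sender the frame was damaged
    return b"ACK", zlib.decompress(body)            # acknowledge receipt


frame = encode_frame(b"hello, network")
reply, message = decode_frame(frame)
```

Both sides must agree on every one of these choices in advance; a receiver expecting a different checksum or end-of-message rule could not understand the frame.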
Some standard protocols
• Simple Mail Transfer protocol (SMTP).
• Post Office Protocol version 3 (POP3).
• Transmission Control Protocol/Internet Protocol (TCP/IP).
• Hyper Text Transfer Protocol (HTTP).
• File Transfer Protocol (FTP).
• Internet Message Access Protocol (IMAP).
• Multipurpose Internet Mail Extensions (MIME).
• Network News Transfer Protocol (NNTP).
2.1. Hyper Text Transfer Protocol (HTTP).
HTTP is the network protocol of the Web. It is both simple and powerful. Knowing
HTTP enables you to write Web browsers, Web servers, automatic page downloaders,
link-checkers, and other useful tools.
2.1.1. What is HTTP
HTTP stands for Hypertext Transfer Protocol. It's the network protocol used to deliver
virtually all files and other data (collectively called resources) on the World Wide Web,
whether they're HTML files, image files, query results, or anything else. Usually, HTTP
takes place through TCP/IP sockets (and this tutorial ignores other possibilities).
A browser is an HTTP client because it sends requests to an HTTP server (Web server),
which then sends responses back to the client. The standard (and default) port for HTTP
servers to listen on is 80, though they can use any port.
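The request and response roles can be tried out locally with Python's standard http.server and http.client modules. This is only a sketch under stated assumptions: the handler class HelloHandler, the function fetch_once, the path /hello.txt, and the response text are invented for the example, and the server listens on a free local port chosen by the operating system rather than on port 80.

```python
import http.client
import http.server
import threading


class HelloHandler(http.server.BaseHTTPRequestHandler):
    """A tiny HTTP server side: answers every GET with the same text."""

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def fetch_once():
    server = http.server.HTTPServer(("127.0.0.1", 0), HelloHandler)
    server.timeout = 5  # do not wait forever if no client connects
    port = server.server_address[1]
    worker = threading.Thread(target=server.handle_request)
    worker.start()
    try:
        # The HTTP client sends the request and reads the server's response.
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", "/hello.txt")
        response = conn.getresponse()
        result = (response.status, response.getheader("Content-Type"), response.read())
        conn.close()
        return result
    finally:
        worker.join()
        server.server_close()
```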
2.1.2. What are "Resources"
HTTP is used to transmit resources, not just files. A resource is some chunk of
information that can be identified by a URL (it's the R in URL). The most common kind
of resource is a file, but a resource may also be a dynamically-generated query result, the
output of a CGI script, a document that is available in several languages, or something
else.
While learning HTTP, it may help to think of a resource as similar to a file, but more
general. As a practical matter, almost all HTTP resources are currently either files or
server-side script output.
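A URL can be taken apart with Python's standard urllib.parse module to see how it identifies a resource. The helper name describe_resource and the example URL are assumptions for illustration; the fallback to port 80 assumes plain HTTP.

```python
from urllib.parse import parse_qs, urlsplit


def describe_resource(url: str) -> dict:
    """Split a URL into the pieces an HTTP client needs to locate a resource."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port or 80,          # assumption: plain HTTP default port
        "path": parts.path or "/",
        "query": parse_qs(parts.query),    # present for query-style resources
    }


# A hypothetical URL naming a dynamically generated query result, not a file.
info = describe_resource("http://www.example.com/cgi-bin/search?q=internet")
```

The same function handles a plain file URL and a script URL alike, which reflects the point that a resource is more general than a file.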
2.1.3. Structure of HTTP Transactions
Like most network protocols, HTTP uses the client-server model: An HTTP client opens
a connection and sends a request message to an HTTP server; the server then returns a
response message, usually containing the resource that was requested. After delivering
the response, the server closes the connection (making HTTP a stateless protocol, i.e. not
maintaining any connection information between transactions).
The format of the request and response messages is similar, and English-oriented. Both
kinds of messages consist of:
• an initial line,
• zero or more header lines,
• a blank line (i.e. a CRLF by itself), and
• an optional message body (e.g. a file, or query data, or query output).
Put another way, the format of an HTTP message is:
<initial line, different for request vs. response>
Header1: value1
Header2: value2
Header3: value3

<optional message body goes here, like file contents or query data;
it can be many lines long, or even binary data $&*%@!^$@>
Initial lines and headers should end in CRLF, though you should gracefully handle lines
ending in just LF. (More exactly, CR and LF here mean ASCII values 13 and 10, even
though some platforms may use different characters.)
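The message layout above can be sketched in Python as a pair of helper functions. The names build_message and parse_message are illustrative, and the parser is deliberately minimal (for example, it does not handle repeated header names); it does accept lines ending in a bare LF, as recommended above.

```python
def build_message(initial_line: str, headers: dict, body: bytes = b"") -> bytes:
    """Assemble initial line, header lines, a blank line, then the body."""
    lines = [initial_line] + [f"{name}: {value}" for name, value in headers.items()]
    head = "\r\n".join(lines) + "\r\n\r\n"   # every line ends in CRLF; blank line follows
    return head.encode("ascii") + body


def parse_message(raw: bytes):
    """Split a raw message into (initial_line, headers, body)."""
    # Find the blank line that ends the headers, accepting CRLF or bare LF.
    found = [(raw.find(sep), len(sep)) for sep in (b"\r\n\r\n", b"\n\n")]
    found = [(pos, size) for pos, size in found if pos != -1]
    if not found:
        raise ValueError("no blank line after the headers")
    pos, size = min(found)
    head, body = raw[:pos], raw[pos + size:]   # the body may be binary, so leave it as bytes
    lines = head.decode("ascii").splitlines()
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return lines[0], headers, body


raw = build_message(
    "HTTP/1.0 200 OK",
    {"Content-Type": "text/plain", "Content-Length": "5"},
    b"hello",
)
```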
HTTP-Key Terms
• HTTP is the underlying protocol used by the WWW.
• HTTP is a stateless protocol: the server maintains no information about past client
requests.
– Each visit is treated as if it were the only visit so far.
– Previous visits are not remembered by the server.
• HTTP defines how messages are formatted and transmitted, and what actions web
servers and browsers should take in response to various commands.
• HTTP is called a stateless protocol because each command is executed independently,
without any knowledge of the commands that came before it.
2.1.4. HTTP Versions: Past and Present
HTTP has evolved into multiple, mostly backwards-compatible protocol versions. RFC
2145 describes the use of HTTP version numbers. The client states the version it uses at the
beginning of the request, and the server uses the same or an earlier version in the response.
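The simplified rule stated above (the server answers with the client's version or an earlier one) can be sketched as follows. The function names and the server_max parameter are assumptions for illustration, and RFC 2145 itself contains more detailed rules than this sketch implements.

```python
def parse_version(token: str) -> tuple:
    """Turn a token such as "HTTP/1.1" into the tuple (1, 1)."""
    name, _, number = token.partition("/")
    if name != "HTTP":
        raise ValueError(f"not an HTTP version: {token!r}")
    major, _, minor = number.partition(".")
    return int(major), int(minor)


def response_version(request_line: str, server_max: tuple = (1, 1)) -> tuple:
    """Pick the version for the response: never newer than the client's."""
    parts = request_line.split()
    if len(parts) == 2:
        # HTTP/0.9 requests look like "GET /path" and carry no version at all.
        return (0, 9)
    client = parse_version(parts[2])
    return min(client, server_max)   # tuples compare major first, then minor
```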
2.1.4.1. HTTP/0.9 (1991)
Deprecated. Supports only one command, GET, which does not specify the HTTP
version. Does not support headers. Since this version does not support POST, the
information a client can pass to the server is limited by the URL length.
2.1.4.2. HTTP/1.0 (May 1996)
This is the first protocol revision to specify its version in communications and is still in
wide use, especially by proxy servers.
2.1.4.3. HTTP/1.1 (1997-1999)
Current version; persistent connections are enabled by default and it works well with proxies.
It also supports request pipelining, which allows multiple requests to be sent without waiting
for each response, so that the server can prepare for the workload and potentially transfer the
requested resources more quickly to the client.
2.1.4.4. HTTP/1.2
The initial 1995 working drafts of the document PEP – an Extension Mechanism for
HTTP (which proposed the Protocol Extension Protocol, abbreviated PEP) were prepared
by the World Wide Web Consortium and submitted to the Internet Engineering Task
Force. PEP was originally intended to become a distinguishing feature of HTTP/1.2. In
later PEP working drafts, however, the reference to HTTP/1.2 was removed. The
experimental RFC 2774, HTTP Extension Framework, largely subsumed PEP. It was
published in February 2000.
The major changes between HTTP/1.0 and HTTP/1.1 include the way HTTP handles
caching; how it optimizes bandwidth and network connection usage; how it manages error
notifications; how it transmits messages over the network; how Internet addresses are
conserved; and how it maintains security and integrity.
2.1.5. HTTP Methods
HTTP defines eight methods (sometimes referred to as "verbs") indicating the desired
action to be performed on the identified resource.
HEAD
Asks for the response identical to the one that would correspond to a GET
request, but without the response body. This is useful for retrieving meta-
information written in response headers, without having to transport the entire
content.
GET
Requests a representation of the specified resource. By far the most common
method used on the Web today. Should not be used for operations that cause side-
effects (using it for actions in web applications is a common misuse). See safe
methods below.
POST
Submits data to be processed (e.g. from an HTML form) to the identified
resource. The data is included in the body of the request. This may result in the
creation of a new resource or the updates of existing resources or both.
PUT
Uploads a representation of the specified resource.
DELETE
Deletes the specified resource.
TRACE
Echoes back the received request, so that a client can see what intermediate
servers are adding or changing in the request.
OPTIONS
Returns the HTTP methods that the server supports for the specified URL. This can
be used to check the functionality of a web server by requesting '*' instead of a
specific resource.
CONNECT
Converts the request connection to a transparent TCP/IP tunnel, usually to
facilitate SSL-encrypted communication (HTTPS) through an unencrypted HTTP
proxy.
HTTP servers are required to implement at least the GET and HEAD methods and,
whenever possible, also the OPTIONS method.
2.1.6. Safe & Unsafe Methods
Some methods (e.g. HEAD, GET, OPTIONS, and TRACE) are defined as safe,
which means they are intended only for information retrieval and should not change the
state of the server (in other words, they should not have side effects). Repetition of the
same GET request should therefore be harmless.
Unsafe methods (such as POST, PUT and DELETE) should be given special
treatment by the client, typically a dialog box requesting confirmation of the action. This is because
repeated requests can cause side effects, such as unwanted duplication of a transaction.
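A client could encode the safe/unsafe distinction as a simple lookup, sketched below. The names SAFE_METHODS, is_safe, and needs_confirmation are illustrative, and treating every method outside the safe set (including CONNECT) as needing confirmation is an assumption of this sketch rather than something stated above.

```python
# The four methods described above as safe (information retrieval only).
SAFE_METHODS = frozenset({"HEAD", "GET", "OPTIONS", "TRACE"})


def is_safe(method: str) -> bool:
    """True if repeating the request should not change server state."""
    return method in SAFE_METHODS   # method names are matched exactly, in upper case


def needs_confirmation(method: str) -> bool:
    """Assumption: a client asks the user before sending any non-safe method."""
    return not is_safe(method)
```

For example, a browser reloading a page fetched with GET can simply repeat the request, while resubmitting a POST is a case where it would first ask the user.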
2.2. Simple Mail Transfer protocol (SMTP).
2.2.1. Introduction
The objective of the Simple Mail Transfer Protocol (SMTP) is to transfer mail
reliably and efficiently.
SMTP is independent of the particular transmission subsystem and requires only
a reliable ordered data stream channel. While this document specifically discusses
transport over TCP, other transports are possible.
An important feature of SMTP is its capability to transport mail across
networks, usually referred to as "SMTP mail relaying". A network consists of the
mutually-TCP-accessible hosts on the public Internet, the mutually-TCP-accessible hosts
on a firewall-isolated TCP/IP Intranet, or hosts in some other LAN or WAN
environment utilizing a non-TCP transport-level protocol.
2.2.2. Structure of SMTP
The SMTP design is based on the following model of communication: as the
result of a user mail request, the sender-SMTP establishes a two-way transmission
channel to a receiver-SMTP. The receiver-SMTP may be either the ultimate destination
or an intermediate. SMTP commands are generated by the sender-SMTP and sent to the
receiver-SMTP. SMTP replies are sent from the receiver-SMTP to the sender-SMTP in
response to the commands.
Once the transmission channel is established, the SMTP-sender sends a MAIL
command indicating the sender of the mail. If the SMTP-receiver can accept mail it
responds with an OK reply. The SMTP-sender then sends a RCPT command identifying
a recipient of the mail. If the SMTP-receiver can accept mail for that recipient it
responds with an OK reply; if not, it responds with a reply rejecting that recipient (but not
the whole mail transaction). The SMTP-sender and SMTP-receiver may negotiate
several recipients. When the recipients have been negotiated the SMTP-sender sends the
mail data, terminating with a special sequence. If the SMTP-receiver successfully
processes the mail data it responds with an OK reply. The dialog is purposely lock-step,
one-at-a-time.
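The lock-step dialog described above can be simulated in memory, with no network involved. In this sketch the class ToyReceiver, the function send_mail, and all mailbox addresses are invented for illustration; the numeric reply codes (250 for OK, 354 to start mail input, 550 for a rejected recipient) follow RFC 821, and the parsing is far less strict than a real receiver-SMTP.

```python
class ToyReceiver:
    """In-memory stand-in for a receiver-SMTP."""

    def __init__(self, known_mailboxes):
        self.known = set(known_mailboxes)
        self.sender = None
        self.recipients = []
        self.delivered = []   # list of (sender, recipients, text)

    def command(self, line: str) -> str:
        verb, _, arg = line.partition(" ")
        verb = verb.upper()
        if verb == "MAIL":
            self.sender = arg[len("FROM:"):].strip("<>")
            self.recipients = []
            return "250 OK"
        if verb == "RCPT":
            mailbox = arg[len("TO:"):].strip("<>")
            if mailbox not in self.known:
                # Reject this recipient only, not the whole transaction.
                return "550 No such user here"
            self.recipients.append(mailbox)
            return "250 OK"
        if verb == "DATA":
            return "354 Start mail input; end with <CRLF>.<CRLF>"
        return "500 Command not recognized"

    def data(self, text_lines) -> str:
        body = []
        for line in text_lines:
            if line == ".":   # a single period on a line by itself ends the mail data
                break
            body.append(line)
        self.delivered.append((self.sender, list(self.recipients), "\n".join(body)))
        return "250 OK"


def send_mail(receiver, sender, recipients, text):
    """Sender-SMTP side: one command at a time, collecting each reply."""
    replies = [receiver.command(f"MAIL FROM:<{sender}>")]
    for mailbox in recipients:
        replies.append(receiver.command(f"RCPT TO:<{mailbox}>"))
    replies.append(receiver.command("DATA"))
    replies.append(receiver.data(text.split("\n") + ["."]))
    return replies


receiver = ToyReceiver({"jones@example.org"})
replies = send_mail(
    receiver,
    "smith@example.com",
    ["jones@example.org", "green@example.org"],
    "Hello",
)
```

In this run the second recipient is unknown, so it alone is rejected while the mail is still delivered to the first.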
2.2.3. History of SMTP
The Mail Transfer Protocol (MTP) was first defined in RFC 772 in September
1980, and then updated in RFC 780 in May 1981. MTP describes a set of commands and
procedures by which two devices can connect using TCP to exchange e-mail messages.
Its operation is described largely using elements borrowed from two early TCP/IP
application protocols that were already in use at that time: Telnet and FTP. The
commands of MTP are in fact based directly on those of FTP.
There wasn't anything inherently wrong with basing e-mail delivery on something
like FTP, but defining it this way made MTP somewhat of a “hack”. It was also restricted
to the capabilities defined by FTP, a general file transfer protocol, so it was not possible
to include features in the protocol that were specific to sending and receiving mail. Due
to the importance of e-mail, a specific protocol designed for the purpose of delivering e-
mail was warranted. This protocol was first defined in RFC 788, published in November
1981: the Simple Mail Transfer Protocol (SMTP).

The name suggests that SMTP is “simpler” than the “non-simple” MTP that it
replaced. Whether this is true or not is somewhat a matter of opinion; I do note that RFC
788 is 61 pages long, while the earlier RFC 780 was only 43 pages. What SMTP
definitely has over MTP is elegance; the protocol is designed specifically for the
transport of electronic mail. While it retains certain similarities to FTP, it is an
“independent” protocol running over TCP. So, from a conceptual standpoint, it can be
considered simpler than MTP. In terms of mechanics, the process SMTP uses to transfer
an e-mail message is indeed rather simple, especially compared to some other protocols.
RFC 788 described the operation of SMTP carrying e-mail messages
corresponding to the ARPAnet text message standard as described in RFC 733.
Development of both e-mail messages and the SMTP protocol continued, of course. In
August 1982, a milestone in TCP/IP e-mail was achieved when RFCs 821 and 822 were
published. RFC 821 revised SMTP, and became the defining standard for the protocol for
the next two decades. RFC 822, its companion standard, became the standard for TCP/IP
electronic mail messages carried by SMTP.
2.2.4. SMTP Commands
SMTP commands are sent as plain ASCII text over the TCP connection
established between the client and server in an SMTP connection.
2.2.4.1. SMTP Command Syntax
All SMTP commands are specified using a four-letter command code. Some commands
also either allow or require parameters to be specified.
The basic syntax of a command is:
<Command-code> <parameters>
When parameters are used, they follow the command code and are separated from it by
one or more space characters.
SMTP commands are not case sensitive.
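The syntax rules above (a four-letter code, optional parameters after one or more spaces, no case sensitivity) can be sketched as a small parser. The function name parse_command and the example addresses are assumptions for illustration.

```python
def parse_command(line: str):
    """Split an SMTP command line into (command_code, parameters)."""
    stripped = line.strip()                   # drop the trailing CRLF, if any
    code, _, params = stripped.partition(" ")
    if len(code) != 4 or not code.isalpha():
        raise ValueError(f"not a four-letter command code: {code!r}")
    # Commands are not case sensitive, so normalize the code to upper case.
    return code.upper(), params.strip()
```

Normalizing only the command code leaves the parameters untouched, so a mailbox name keeps whatever capitalization the sender used.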
Current SMTP Commands
Command Code | Command | Parameters | Description
HELO | Hello | The domain name of the sender. | The conventional instruction sent by an SMTP sender to an SMTP receiver to initiate the SMTP session.
EHLO | Extended Hello | The domain name of the sender. | Sent by an SMTP sender that supports SMTP extensions to greet an SMTP receiver and ask it to return a list of SMTP extensions the receiver supports. The domain name of the sender is supplied as a parameter.
MAIL | Initiate Mail Transaction | Must include a “FROM:” parameter specifying the originator of the message, and may contain other parameters as well. | Begins a mail transaction from the sender to the receiver.
RCPT | Recipient | Must include a “TO:” parameter specifying the recipient mailbox, and may also incorporate other optional parameters. | Specifies one recipient of the e-mail message being conveyed in the current transaction.
DATA | Mail Message Data | None | Tells the SMTP receiver that the SMTP sender is ready to transmit the e-mail message. The receiver normally replies with an intermediate “go ahead” message, and the sender then transmits the message one line at a time, indicating the end of the message by a single period on a line by itself.
RSET | Reset | None | Aborts a mail transaction in progress. This may be used if an error is received upon issuing a MAIL or RCPT command, if the SMTP sender cannot continue the transfer as a result.
VRFY | Verify | E-mail address of mailbox to be verified. | Asks the SMTP receiver to verify the validity of a mailbox.
EXPN | Expand | E-mail address of mailing list. | Requests that the SMTP server confirm that the address specifies a mailing list, and return a list of the addresses on the list.
HELP | Help | Optional command name. | Requests help information: general help if no parameter is supplied, otherwise information specific to the command code supplied.
NOOP | No Operation | None | Does nothing except for verifying communication with the SMTP receiver.
QUIT | Quit | None | Terminates the SMTP session.
Obsolete SMTP Commands
The commands in the preceding table are the ones that are most commonly used
in SMTP today. In addition to those, there are also certain commands that were originally
defined in RFC 821 but have since become obsolete.
These include the following:
SEND, SAML (“send and mail”) and SOML (“send or mail”): RFC 821 defined a
distinct mechanism for delivering mail directly to a user's terminal as opposed to a
mailbox, optionally in combination with conventional e-mail delivery. These commands
were rarely implemented and were made obsolete by RFC 2821.
TURN: Reverses the roles of the SMTP sender and receiver, as described in the
SMTP special features topic. This command had a number of implementation and security
issues and was removed from the standard in RFC 2821.
Key Terms
• SMTP is used for sending e-mail messages between servers.
• Simple Mail Transfer Protocol, or SMTP, is the universal standard for moving mail over the Net.
• Most e-mail systems that send mail over the Internet use SMTP to send messages from one server to another.
• SMTP is also generally used to send messages from a mail client to a mail server.
2.3. Post Office Protocol version 3 (POP3).
POP3 has made earlier versions of the protocol, informally called POP1 and
POP2, obsolete. In contemporary usage, the less precise term POP almost always means
POP3 in the context of e-mail protocols.
2.3.1. Introduction
Post Office Protocol 3 (POP3) is a standard interface between an e-mail client
program and the mail server, defined by IETF RFC 1939. The 3 at the end of POP
denotes that this is the third version of the mailbox access protocol. POP3 and IMAP4 are
the two common mailbox access protocols used for Internet e-mail. POP3 provides a
message store that holds incoming e-mail. Users may log on and download their
messages. Most POP3 servers allow downloading just the headers, the headers and a
specified number of lines from the body of the message, or the entire message. The mail
client may automatically delete downloaded messages, or offer the user the choice of
leaving any or all of the messages on the server.
2.3.2. POP3 Commands
Command | Syntax | Description
USER | user Username | Provides a username to the POP3 server. Must be followed by a PASS command.
PASS | pass Password | Provides a password to the POP3 server. Must follow a USER command.
STAT | stat | Returns the number of messages and the total size of the mailbox.
LIST | list or list MessageNumber | Lists the message number and size of each message. If a message number is specified, returns the size of the specified message.
LAST | last | Returns the message number of the last message not marked as read or deleted.
RETR | retr MessageNumber | Returns the full text of the specified message, and marks that message as read.
TOP | top MessageNumber lines | Returns the headers and the specified number of body lines from the specified message number.
DELE | dele MessageNumber | Marks the specified message for deletion.
RSET | rset | Resets any messages which have been marked as read or deleted to the standard unread state.
NOOP | noop | Returns a simple acknowledgement, without performing any function.
APOP | apop Username EncryptedKey | Allows for a secure method of POP3 authentication, in which a cleartext password does not have to be sent. Instead, the client computes an MD5 digest of the timestamp from the server greeting (which typically contains a process ID and a clock value) followed by the password, and sends that digest to the POP3 server.
QUIT | quit | Ends the POP3 session.
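The APOP row can be made concrete with a short sketch. It assumes a Node.js environment (for the built-in crypto module) and uses the sample greeting timestamp, user name and shared secret from the example in RFC 1939; the function name apopDigest is invented here. It only computes the value to send and does not contact a server.

```javascript
const crypto = require("crypto");

// Computes the digest argument of the APOP command: the MD5 hash, written in
// lowercase hexadecimal, of the greeting timestamp followed by the password.
// Hypothetical helper name, for illustration only.
function apopDigest(greetingTimestamp, password) {
  return crypto
    .createHash("md5")
    .update(greetingTimestamp + password, "utf8")
    .digest("hex");
}

// Sample values from the RFC 1939 example; the timestamp is the part of the
// server greeting in angle brackets, including the brackets themselves.
const greetingTimestamp = "<1896.697170952@dbc.mtview.ca.us>";
const apopValue = apopDigest(greetingTimestamp, "tanstaaf");
console.log("APOP mrose " + apopValue);
```

Because the timestamp changes with every connection, the digest is different each time, so a captured value cannot simply be replayed later. MD5 is considered weak today, so this should be read as an explanation of the mechanism rather than a security recommendation.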
2.3.3. Why POP 4
POP 3 is based on the principle of providing simple functionality on the server
and putting all of the intelligence on the client. This works well enough, because most of
the world is using POP 3 and it is working just fine.
POP 4 adds a few functions that allow the server to perform some more useful
functionality while adding very little complexity to the server. The design goal was to
create a server protocol based on POP 3 that had the minimum functionality required to
operate a useful web-based mail client. It was not intended to solve all of the
disconnected-mode type problems, but it certainly includes functionality to make them easier.
POP 4 is a superset of POP 3 and was styled around the interface that POP 3
currently supports. To this end, the protocol is every bit as simple as POP 3; very little
was added in terms of grammar to support the new commands. So, for example, the usual
dot-terminated list is used, as well as four-letter command names.
Please note that below is the complete list of additions for the POP 4 protocol. Some of
them are mandatory, some are optional. Please read the POP 4 spec in the link above for
more details about each command.
Key Terms
• POP is a protocol used to retrieve e-mail from a mail server.
• Provides centralized storage for e-mail messages.
• POP clients normally download mail and delete it from the server, although many can be set to leave messages on the server.
• There are two versions of the protocol:
– POP2 requires SMTP.
– POP3 can be used with or without SMTP.
2.4. Multipurpose Internet Mail Extensions (MIME).
MIME was first defined in 1992 by an Internet Engineering Task Force Working
Group and was later revised in RFCs 1521 and 1522.
2.4.1. Introduction
MIME is a specification for enhancing the capabilities of standard Internet
electronic mail. It offers a simple standardized way to represent and encode a wide
variety of media types for transmission via Internet mail.
When using the MIME standard, messages can contain the following types:
• Text messages in US-ASCII.
• Character sets other than US-ASCII.
• Multi-media: Image, Audio, and Video messages.
• Multiple objects in a single message.
• Multi-font messages.
• Messages of unlimited length.
• Binary files.
MIME is defined to be completely backwards compatible, yet flexible and open to
extensions. Therefore, it builds on the older standard by defining additional fields for the
mail message header that describe new content types, and a distinct organization of the
message body.
2.4.2. What is MIME
On the Internet, data is sent as 8-bit bytes. The receiving software collects these
bytes and assembles them in the proper order. Now the question becomes, "What do
these bytes represent?" Are they text? Are they a picture? Are they a sound? How is it
possible to know what they mean? Suppose that, in addition to the bytes that contain the
data itself, a few additional bytes are sent along saying what the data is. If this is done, the
receiving software or person knows what is in the bunch of bytes that make up the
message. This is what MIME does. It tells what is in a message so that the message
contents can be used in an appropriate way.
2.4.3. History
The first steps to extend e-mail were defined in 1985 in RFC 934. This document
proposed a standard for message encapsulation when replying to and forwarding messages.
A few years later, in 1988, a content-type header for e-mail was proposed in RFC 1049,
which supported message contents such as PostScript and troff. Two years later, RFC 1154
proposed an encoding header field for e-mail that permitted multi-structural
messages. This was a highly experimental document at that time.
In June 1992, RFCs 1341 and 1342 were introduced; they can be considered the
first version of MIME as we know it today. They introduced extensions for images,
audio and general application data, the encoding schemes that are used today, and the
representation of non-ASCII data. These documents were refined and expanded in 1993 in
RFCs 1521, 1522 and 1523. The last of these dealt only with enriched text in MIME. In
November 1996, the MIME-related RFCs were reworked once again, this time into a
group of five RFCs, 2045 through 2049. Since then, further extensions to MIME,
such as security, have been proposed, but they have remained in separate documents.
2.4.4. MIME message structure
MIME was designed to be in compliance with RFC 822. The newly introduced message
headers are themselves consistent with the message header syntax defined in RFC 822. In
fact, RFC 822 specifically states that unrecognized message headers should be ignored.
Therefore, a Mail User Agent should be able to receive MIME messages even if some of
the data is not understood.
MIME allows creating composite messages with one or more subparts, each of which
can itself contain subparts. There is no limit to the number of nested message parts in a message.
Each subpart is separated by a MIME boundary and has headers similar, but not
identical, to the mail message headers. MIME defines a number of new header fields in
compliance with the RFC 822 header syntax. They are used to describe the content of a MIME
message. MIME-specific headers can occur in at least two contexts:
1. top-level message headers
2. subpart headers in multipart messages.
2.4.5. What are some common MIME types
text/plain as in the usual mail or news message.
text/html HTML text as on the World Wide Web.
image/jpeg a common image format.
image/gif another common image format.
application/octet-stream unknown type, any kind of data as bytes.
audio/midi midi music format for synthesizers.
audio/x-midi an alternate for the above.
application/postscript indicates a PostScript document.
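Software often has to pick one of these types from a file name. The sketch below shows one simple way to do that, assuming a small hand-written table; the names MIME_TYPES and guessMimeType and the chosen file extensions are assumptions made for this example, and real programs rely on much longer lists.

```javascript
// Maps a few file extensions to MIME types from the list above.
// The table is deliberately small and only meant as an illustration.
const MIME_TYPES = {
  txt: "text/plain",
  html: "text/html",
  jpg: "image/jpeg",
  jpeg: "image/jpeg",
  gif: "image/gif",
  mid: "audio/midi"
};

function guessMimeType(fileName) {
  const dot = fileName.lastIndexOf(".");
  const extension = dot === -1 ? "" : fileName.slice(dot + 1).toLowerCase();
  // Unknown extensions fall back to the generic "any kind of bytes" type.
  return Object.prototype.hasOwnProperty.call(MIME_TYPES, extension)
    ? MIME_TYPES[extension]
    : "application/octet-stream";
}

console.log(guessMimeType("photo.JPG"));
console.log(guessMimeType("archive.xyz"));
```

Falling back to application/octet-stream mirrors its role in the list as the type for data whose format is not known.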
Key Terms
• MIME allows for the transmission of multimedia electronic mail.
• A MIME message generally consists of header fields followed by data.
• The header fields specify, among other things, the MIME version.
• MIME allows you to send files as attachments to e-mail easily.
• Permits nontextual data to be sent in e-mail:
– Graphics image
– Voice or video clip
2.5. Internet Mail Access Protocol (IMAP).
IMAP is short for Internet Message Access Protocol, a protocol for retrieving e-mail messages.
The latest version, IMAP4, is similar to POP3 but supports some additional features. For
example, with IMAP4, you can search through your e-mail messages for keywords while
the messages are still on the mail server. You can then choose which messages to download
to your machine.
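The search-then-download pattern described above might look roughly like the following sequence of IMAP4 command lines. This is a sketch that only builds the text of the commands: the helper name createImapCommandBuilder, the tag prefix, the keyword and the message number 7 are assumptions for illustration, and the login step and all server replies are left out.

```javascript
// Produces tagged IMAP4 command lines. Each command is prefixed with a unique
// tag so that the server's replies can be matched to the command they answer.
// Hypothetical helper, for illustration only.
function createImapCommandBuilder(prefix) {
  let counter = 0;
  return function (command) {
    counter += 1;
    const tag = prefix + String(counter).padStart(3, "0");
    return tag + " " + command;
  };
}

const imap = createImapCommandBuilder("a");
const imapCommands = [
  imap("SELECT INBOX"),           // open the mailbox on the server
  imap('SEARCH TEXT "invoice"'),  // keyword search carried out by the server
  imap("FETCH 7 BODY[]"),         // download only one chosen message
  imap("LOGOUT")                  // end the session
];
console.log(imapCommands.join("\r\n"));
```

In practice the message number passed to FETCH would be taken from the server's answer to SEARCH rather than written into the program.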
2.5.1. History
IMAP was designed by Mark Crispin at Stanford University in 1986 as a remote mailbox
protocol, in contrast to the widely used POP, a protocol for retrieving the contents of a
mailbox.
Original IMAP
The original Interim Mail Access Protocol was implemented as a Xerox Lisp
machine client and a TOPS-20 server.
No copies of the original interim protocol or its software exist; all known
installations of the original protocol were updated to IMAP2. Although some of its
commands and responses were similar to IMAP2, the interim protocol lacked
command/response tagging and thus its syntax was incompatible with all other versions
of IMAP.
IMAP2
The interim protocol was quickly replaced by the Interactive Mail Access
Protocol (IMAP2), defined in RFC 1064 and later updated by RFC 1176. IMAP2
introduced command/response tagging and was the first publicly distributed version.
IMAP2bis
With the advent of MIME, IMAP2 was extended to support MIME body
structures and add mailbox management functionality (create, delete, rename, message
upload) that was absent in IMAP2. This experimental revision was called IMAP2bis; its
specification was never published in non-draft form. Early versions of Pine were widely
distributed with IMAP2bis support (Pine 4.00 and later supports IMAP4rev1).
IMAP4
An IMAP Working Group formed in the IETF in the early 1990s and took over
responsibility for the IMAP2bis design. The IMAP WG decided to rename IMAP2bis to
IMAP4 to avoid confusion with a competing IMAP3 proposal from another group that
never got off the ground. The expansion of the IMAP acronym also changed to the
Internet Message Access Protocol.
Some design flaws in the original IMAP4 (defined by RFC 1730) that came out in
implementation experience led to its revision and replacement by IMAP4rev1 two years
later. There were very few IMAP4 client or server implementations due to its short
lifetime.
IMAP4rev1
The current version of IMAP since 1996, IMAP version 4 revision 1
(IMAP4rev1), is defined by RFC 3501 which revised the earlier RFC 2060.
IMAP4rev1 is upwards compatible with IMAP2 and IMAP2bis; and is largely
upwards-compatible with IMAP4. However, the older versions are either extinct or
nearly so.
Unlike many older Internet protocols, IMAP4 natively supports encrypted login
mechanisms. Plain-text transmission of passwords in IMAP4 is also possible. Because
the encryption mechanism to be used must be agreed between the server and client,
plain-text passwords are used in some combinations of clients and servers (typically Microsoft
Windows clients and non-Windows servers). It is also possible to encrypt IMAP4 traffic
using SSL, either by tunneling IMAP4 communications over SSL on port 993, or by
issuing STARTTLS within an established IMAP4 session (see RFC 2595).
IMAP4 works over a TCP/IP connection using network port 143.
2.5.2. IMAP advantages
The functional areas where POP is weak, with respect to online/disconnected
operation, are strengths for IMAP, since online access was its original design center.
Specific advantages of IMAP over POP (for online/disconnected use) include:
• Remote folder manipulation
• Multiple folder support
• Online performance optimization
• In addition, IMAP has provision for negotiated extensions, and therefore
its capabilities can grow incrementally
• Support for simultaneous update and update discovery in shared folders
2.5.3. Disadvantages of IMAP
IMAP has two disadvantages when compared to POP:
• The protocol is more complex, and requires more effort to implement.
• There is currently less IMAP software available than POP software.
While IMAP remedies many of the shortcomings of POP, this inherently
introduces additional complexity. Much of this complexity (e.g., multiple clients
accessing the same mailbox at the same time) is compensated for by server-side
workarounds such as maildir or database backends.
Unless the mail store and searching algorithms on the server are carefully
implemented, a client can potentially consume large amounts of server resources when
searching massive mailboxes.
IMAP4 clients need to explicitly request new email message content potentially
causing additional delays on slow connections such as those commonly used by mobile
devices. A private proposal, push IMAP, would extend IMAP to implement push e-mail
by sending the entire message instead of just a notification. However, push IMAP has not
been generally accepted and current IETF work has addressed the problem in other ways
(see the Lemonade Profile for more information).
Unlike some proprietary protocols which combine sending and retrieval
operations, sending a message and saving a copy in a server-side folder with a base-level
IMAP client requires transmitting the message content twice, once to SMTP for delivery
and a second time to IMAP to store in a sent mail folder. This is remedied by a set of
extensions defined by the IETF LEMONADE Working Group for mobile devices:
URLAUTH (RFC 4467) and CATENATE (RFC 4469) in IMAP and BURL (RFC
4468) in SMTP-SUBMISSION. POP3 servers don't support server-side folders, so clients
have no choice but to store sent items on the client. Many IMAP clients can be
configured to store sent mail in a client-side folder. In addition to the LEMONADE
"trio", Courier Mail Server offers a non-standard method of sending using IMAP by
copying an outgoing message to a dedicated outbox folder.
Key Terms
• Better security than POP because it natively supports
encrypted login mechanisms.
• Allows the user to manage their email on the
server.
• Allows the user to access their email from
multiple computers.
• The mail always stays on the server.
3. Introduction to JavaScript
JavaScript is a programming language that can be included on web pages to make
them more interactive. You can use it to check or modify the contents of forms, change
images, open new windows and write dynamic page content. You can even use it with
CSS to make DHTML (Dynamic Hyper Text Markup Language).
This allows you to make parts of your web pages appear or disappear or move around
on the page. Scripts only execute on the page(s) that are in your browser window at
any given time. When the user stops viewing that page, any scripts that were running on it
are immediately stopped. The only exception is a cookie, which can be used by many
pages to pass information between them, even after the pages have been closed.
Before we go any further, let me say this: JavaScript has nothing to do with Java. If we
are honest, JavaScript, originally code-named Mocha and then LiveScript when it was
created by Netscape, should in fact be called ECMAScript, as it was renamed when
Netscape passed it to ECMA for standardization.
JavaScript is a client-side, interpreted, object-oriented, high-level scripting language,
while Java is a client-side, compiled, object-oriented, high-level language.
JavaScript is a scripting language developed by Netscape to enable Web authors to design
interactive sites. Although it shares many of the features and structures of the full Java
language, it was developed independently. JavaScript can interact with HTML source
code, enabling Web authors to spice up their sites with dynamic content. JavaScript is
endorsed by a number of software companies and is an open language that anyone can
use without purchasing a license. It is supported by recent browsers from Netscape and
Microsoft, though Internet Explorer supports only a subset, which Microsoft calls JScript.
3.1. What is JavaScript
• JavaScript was designed to add interactivity to HTML pages
• JavaScript is a scripting language
• A scripting language is a lightweight programming language
• JavaScript is usually embedded directly into HTML pages
• JavaScript is an interpreted language (means that scripts execute without
preliminary compilation)
• Everyone can use JavaScript without purchasing a license
3.2. What can a JavaScript Do
• JavaScript gives HTML designers a programming tool - HTML authors
are normally not programmers, but JavaScript is a scripting language with a
very simple syntax! Almost anyone can put small "snippets" of code into
their HTML pages
• JavaScript can put dynamic text into an HTML page - A JavaScript
statement like this: document.write("<h1>" + name + "</h1>") can write a
variable text into an HTML page
• JavaScript can react to events - A JavaScript can be set to execute when
something happens, like when a page has finished loading or when a user
clicks on an HTML element
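The two points about dynamic text and events can be combined in a small sketch. The function name headingMarkup and the sample text are made up for this example. The check for document lets the same file run outside a browser, where only the string is printed.

```javascript
// Builds the same markup as the document.write statement quoted above.
function headingMarkup(name) {
  return "<h1>" + name + "</h1>";
}

const markup = headingMarkup("Welcome");
console.log(markup);

// Reacting to an event: in a browser, add the heading once the page has
// finished loading. Outside a browser this part is skipped.
if (typeof document !== "undefined") {
  window.addEventListener("load", function () {
    document.body.insertAdjacentHTML("beforeend", markup);
  });
}
```

The sketch inserts the markup with insertAdjacentHTML instead of document.write because calling document.write after the page has loaded would replace the whole page.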