
Principles of Digital Communication and Coding

Andrew J. Viterbi
Jim K. Omura





PRINCIPLES OF DIGITAL COMMUNICATION AND CODING


McGraw-Hill Series in Electrical Engineering

Consulting Editor
Stephen W. Director, Carnegie-Mellon University


Networks and Systems
Communications and Information Theory
Control Theory
Electronics and Electronic Circuits

Power and Energy
Electromagnetics

Computer Engineering and Switching Theory
Introductory and Survey

Radio, Television, Radar, and Antennas

Previous Consulting Editors

Ronald N. Bracewell, Colin Cherry, James F. Gibbons, Willis W. Harman,
Hubert Heffner, Edward W. Herold, John G. Linvill, Simon Ramo, Ronald A. Rohrer,
Anthony E. Siegman, Charles Susskind, Frederick E. Terman, John G. Truxal,
Ernst Weber, and John R. Whinnery


Communications and Information Theory

Consulting Editor
Stephen W. Director, Carnegie-Mellon University


Abramson: Information Theory and Coding
Angelakos and Everhart: Microwave Communications
Antoniou: Digital Filters: Analysis and Design
Bennett: Introduction to Signal Transmission

Berlekamp: Algebraic Coding Theory
Carlson: Communication Systems

Davenport: Probability and Random Processes: An Introduction for Applied Scientists and Engineers

Davenport and Root: Introduction to Random Signals and Noise
Drake: Fundamentals of Applied Probability Theory

Gold and Rader: Digital Processing of Signals
Guiasu: Information Theory with New Applications
Hancock: An Introduction to Principles of Communication Theory

Melsa and Cohn: Decision and Estimation Theory
Papoulis: Probability, Random Variables, and Stochastic Processes
Papoulis: Signal Analysis

Schwartz: Information Transmission, Modulation, and Noise
Schwartz, Bennett, and Stein: Communication Systems and Techniques
Schwartz and Shaw: Signal Processing
Shooman: Probabilistic Reliability: An Engineering Approach
Taub and Schilling: Principles of Communication Systems
Viterbi: Principles of Coherent Communication
Viterbi and Omura: Principles of Digital Communication and Coding



PRINCIPLES OF DIGITAL COMMUNICATION AND CODING

Andrew J. Viterbi
LINKABIT Corporation

Jim K. Omura
University of California, Los Angeles

McGraw-Hill, Inc.
New York  St. Louis  San Francisco  Auckland  Bogota  Caracas  Lisbon  London  Madrid  Mexico City  Milan  Montreal  New Delhi  San Juan  Singapore  Sydney  Tokyo  Toronto


PRINCIPLES OF DIGITAL COMMUNICATION AND CODING

Copyright © 1979 by McGraw-Hill, Inc. All rights reserved.
Printed in the United States of America.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher.

9101112 KPKP 976543

This book was set in Times Roman. The editors were Frank J. Cerra and J. W. Maisel; the cover was designed by Albert M. Cetta; the production supervisor was Charles Hess. The drawings were done by Santype Ltd. Kingsport Press was printer and binder.

Library of Congress Cataloging in Publication Data

Viterbi, Andrew J
  Principles of digital communication and coding.
  (McGraw-Hill electrical engineering series: Communications and information theory section)
  Includes bibliographical references and index.
  1. Digital communications.  2. Coding theory.
  I. Omura, Jim K., joint author.  II. Title.  III. Series.
  TK5103.7.V57    621.38    78-13951
  ISBN 0-07-067516-3


CONTENTS

Preface

Part One  Fundamentals of Digital Communication and Block Coding

Chapter 1  Digital Communication Systems: Fundamental Concepts and Parameters
  1.1  Sources, Entropy, and the Noiseless Coding Theorem
  1.2  Mutual Information and Channel Capacity
  1.3  The Converse to the Coding Theorem
  1.4  Summary and Bibliographical Notes
  Appendix 1A  Convex Functions
  Appendix 1B  Jensen Inequality for Convex Functions
  Problems

Chapter 2  Channel Models and Block Coding
  2.1  Block-coded Digital Communication on the Additive Gaussian Noise Channel
  2.2  Minimum Error Probability and Maximum Likelihood Decoder
  2.3  Error Probability and a Simple Upper Bound
  2.4  A Tighter Upper Bound on Error Probability
  2.5  Equal Energy Orthogonal Signals on the AWGN Channel
  2.6  Bandwidth Constraints, Intersymbol Interference, and Tracking Uncertainty
  2.7  Channel Input Constraints
  2.8  Channel Output Quantization: Discrete Memoryless Channels
  2.9  Linear Codes
  *2.10  Systematic Linear Codes and Optimum Decoding for the BSC
  *2.11  Examples of Linear Block Code Performance on the AWGN Channel and Its Quantized Reductions
  2.12  Other Memoryless Channels
  2.13  Bibliographical Notes and References
  Appendix 2A  Gram-Schmidt Orthogonalization and Signal Representation
  Problems

Chapter 3  Block Code Ensemble Performance Analysis
  3.1  Code Ensemble Average Error Probability: Upper Bound
  3.2  The Channel Coding Theorem and Error Exponent Properties for Memoryless Channels
  3.3  Expurgated Ensemble Average Error Probability: Upper Bound at Low Rates
  3.4  Examples: Binary-Input, Output-Symmetric Channels, and Very Noisy Channels
  3.5  Chernoff Bounds and the Neyman-Pearson Lemma
  3.6  Sphere-Packing Lower Bounds
  *3.7  Zero Rate Lower Bounds
  *3.8  Low Rate Lower Bounds
  *3.9  Conjectures and Converses
  *3.10  Ensemble Bounds for Linear Codes
  3.11  Bibliographical Notes and References
  Appendix 3A  Useful Inequalities and the Proofs of Lemma 3.2.1 and Corollary 3.3.2
  Appendix 3B  Kuhn-Tucker Conditions and Proofs of Theorems 3.2.2 and 3.2.3
  Appendix 3C  Computational Algorithm for Capacity
  Problems

Part Two  Convolutional Coding for Digital Communication

Chapter 4  Convolutional Codes
  4.1  Introduction and Basic Structure
  4.2  Maximum Likelihood Decoder for Convolutional Codes - The Viterbi Algorithm
  4.3  Distance Properties of Convolutional Codes for Binary-Input Channels
  4.4  Performance Bounds for Specific Convolutional Codes on Binary-Input, Output-Symmetric Memoryless Channels
  4.5  Special Cases and Examples
  4.6  Structure of Rate 1/n Codes and Orthogonal Convolutional Codes
  4.7  Path Memory Truncation, Metric Quantization, and Code Synchronization in Viterbi Decoders
  *4.8  Feedback Decoding
  *4.9  Intersymbol Interference Channels
  *4.10  Coding for Intersymbol Interference Channels
  4.11  Bibliographical Notes and References
  Problems

Chapter 5  Convolutional Code Ensemble Performance
  5.1  The Channel Coding Theorem for Time-varying Convolutional Codes
  5.2  Examples: Convolutional Coding Exponents for Very Noisy Channels
  5.3  Expurgated Upper Bound for Binary-Input, Output-Symmetric Channels
  5.4  Lower Bound on Error Probability
  *5.5  Critical Lengths of Error Events
  5.6  Path Memory Truncation and Initial Synchronization Errors
  5.7  Error Bounds for Systematic Convolutional Codes
  *5.8  Time-varying Convolutional Codes on Intersymbol Interference Channels
  5.9  Bibliographical Notes and References
  Problems

Chapter 6  Sequential Decoding of Convolutional Codes
  6.1  Fundamentals and a Basic Stack Algorithm
  6.2  Distribution of Computation: Upper Bound
  6.3  Error Probability Upper Bound
  6.4  Distribution of Computations: Lower Bound
  6.5  The Fano Algorithm and Other Sequential Decoding Algorithms
  6.6  Complexity, Buffer Overflow, and Other System Considerations
  6.7  Bibliographical Notes and References
  Problems

Part Three  Source Coding for Digital Communication

Chapter 7  Rate Distortion Theory: Fundamental Concepts for Memoryless Sources
  7.1  The Source Coding Problem
  7.2  Discrete Memoryless Sources - Block Codes
  7.3  Relationships with Channel Coding
  7.4  Discrete Memoryless Sources - Trellis Codes
  7.5  Continuous Amplitude Memoryless Sources
  *7.6  Evaluation of R(D) - Discrete Memoryless Sources
  *7.7  Evaluation of R(D) - Continuous Amplitude Memoryless Sources
  7.8  Bibliographical Notes and References
  Appendix 7A  Computational Algorithm for R(D)
  Problems

Chapter 8  Rate Distortion Theory: Memory, Gaussian Sources, and Universal Coding
  8.1  Memoryless Vector Sources
  8.2  Sources with Memory
  8.3  Bounds for R(D)
  8.4  Gaussian Sources with Squared-Error Distortion
  8.5  Symmetric Sources with Balanced Distortion Measures and Fixed Composition Sequences
  8.6  Universal Coding
  8.7  Bibliographical Notes and References
  Appendix 8A  Chernoff Bounds for Distortion Distributions
  Problems

Bibliography
Index

* May be omitted without loss of continuity.


PREFACE

Digital communication is a much used term with many shades of meaning, widely varying and strongly dependent on the user's role and requirements. This book is directed to the communication theory student and to the designer of the channel, link, terminal, modem, or network used to transmit and receive digital messages. Within this community, digital communication theory has come to signify the body of knowledge and techniques which deal with the two-faceted problem of (1) minimizing the number of bits which must be transmitted over the communication channel so as to provide a given printed, audio, or visual record within a predetermined fidelity requirement (called source coding); and (2) ensuring that bits transmitted over the channel are received correctly despite the effects of interference of various types and origins (called channel coding). The foundations of the theory which provides the solution to this twofold problem were laid by Claude Shannon in one remarkable series of papers in 1948. In the intervening decades, the evolution and application of this so-called information theory have had ever-expanding influence on the practical implementation of digital communication systems, although their widespread application has required the evolution of electronic-device and system technology to a point which was hardly conceivable in 1948. This progress was accelerated by the development of the large-scale integrated-circuit building block and the economic incentive of communication satellite applications.

We have not attempted in this book to cover peripheral topics related to digital communication theory when they involve a major deviation from the basic concepts and techniques which lead to the solution of this fundamental problem. For this reason, constructive algebraic techniques, though valuable for developing code structures and important theoretical results of broad interest, are specifically avoided in this book. Similarly, the peripheral, though practically important, problems of carrier phase and frequency tracking, and time synchronization are not treated here. These have been covered adequately elsewhere. On the other hand, the equally practical subject of intersymbol interference in digital communication, which is fundamentally similar to the problem of convolutional coding, is covered and new insights are developed through connections with the mainstream topics of the text.

This book was developed over approximately a dozen years of teaching a sequence of graduate courses at the University of California, Los Angeles, and later at the University of California, San Diego, with partial notes being distributed over the past few years. Our goal in the resulting manuscript has been to provide the most direct routes to achieve an understanding of this field for a variety of goals and needs. All readers require some fundamental background in probability and random processes and preferably their application to communication problems; one year's exposure to any of a variety of engineering or mathematics courses provides this background and the resulting maturity required to start.

Given this preliminary knowledge, there are numerous approaches to utilization of this text to achieve various individual goals, as illustrated graphically by the prerequisite structure of Fig. P.1. A semester or quarter course for the beginning graduate student may involve only Part One, consisting of the first three chapters (omitting starred sections), which provide, respectively, the fundamental concepts and parameters of sources and channels, a thorough treatment of channel models based on physical requirements, and an undiluted initiation into the evaluation of code capabilities based on ensemble averages.

[Figure P.1 Organization and prerequisite structure: Part One, Fundamentals of digital communication and block coding (introductory), is prerequisite to Part Two, Convolutional coding for digital communication, and Part Three, Source coding for digital communication (advanced).]

The advanced student or specialist can then proceed with Part Two, an equally detailed exposition of convolutional coding and decoding. These techniques are most effective in exploiting the capabilities of the channel toward approaching virtually error-free communications. It is possible in a one-year course to cover Part Three as well, which demonstrates how optimal source coding techniques are derived essentially as the duals of the channel coding techniques of Parts One and Two.

The applications-oriented engineer or student can obtain an understanding of channel coding for physical channels by tackling only Chapters 2, 4, and about half of 6. Avoiding the intricacies of ensemble-average arguments, the reader can learn how to code for noisy channels without making the additional effort to understand the complete theory. At the opposite extreme, students with some background in digital communications can be guided through the channel-coding material in Chapters 3 through 6 in a one-semester or one-quarter course, and advanced students, who already have channel-coding background, can cover Part Three on source coding in a course of similar duration. Numerous problems are provided to furnish examples, to expand on the material or indicate related results, and occasionally to guide the reader through the steps of lengthy alternate proofs and derivations.

Aside from the obvious dependence of any course in this field on Shannon's work, two important textbooks have had notable effect on the development and organization of this book. These are Wozencraft and Jacobs [1965], which first emphasized the physical characteristics of digital communication channels as a basis for the development of coding theory fundamentals, and Gallager [1968], which is the most complete and expert treatment of this field to date.

Collaboration with numerous university colleagues and students helped establish the framework for this book. But the academic viewpoint has been tempered in the book by the authors' extensive involvement with industrial applications. A particularly strong influence has been the close association of the first author with the design team at LINKABIT Corporation, led by I. M. Jacobs, J. A. Heller, A. R. Cohen, and K. S. Gilhousen, which first implemented high-speed reliable versions of all the convolutional decoding techniques treated in this book. The final manuscript also reflects the thorough and complete reviews and critiques of the entire text by J. L. Massey, many of whose suggested improvements have been incorporated to the considerable benefit of the prospective reader.

Finally, those discouraged by the seemingly lengthy and arduous route to a thorough understanding of communication theory might well recall the ancient words attributed to Lao Tzu of twenty-five centuries ago: "The longest journey starts with but a single step."

Andrew J. Viterbi
Jim K. Omura



PART ONE

FUNDAMENTALS OF DIGITAL COMMUNICATION AND BLOCK CODING



CHAPTER ONE

DIGITAL COMMUNICATION SYSTEMS: FUNDAMENTAL CONCEPTS AND PARAMETERS

In the field of communication system engineering, the second half of the twentieth century is destined to be recognized as the era of the evolution of digital communication, as indeed the first half was the era of the evolution of radio communication to the point of reliable transmission of messages, speech, and television, mostly in analog form.

The development of digital communication was given impetus by three prime driving needs:

1. Greatly increased demands for data transmission of every form, from computer data banks to remote-entry data terminals for a variety of applications, with ever-increasing accuracy requirements
2. Rapid evolution of synchronous artificial satellite relays which facilitate worldwide communications at very high data rates, but whose launch costs, and consequent power and bandwidth limitations, impose a significant economic incentive on the efficient use of the channel resources
3. Data communication networks which must simultaneously service many different users with a variety of rates and requirements, in which simple and efficient multiplexing of data and multiple access of channels is a primary economic concern

These requirements and the solid-state electronic technology needed to support the development of efficient, flexible, and error-free digital communication systems evolved simultaneously and in parallel throughout the third quarter of this century, but the theoretical foundations were laid just before mid-century by the celebrated "Mathematical Theory of Communication" papers of C. E. Shannon [1948]. With unique intuition, Shannon perceived that the goals of approaching error-free digital communication on noisy channels and of maximally efficient conversion of analog signals to digital form were dual facets of the same problem, that they share a common framework and virtually a common solution. For the most part, this solution is presented in the original Shannon papers. The refinement, embellishment, and reduction to practical form of the theory occupied many researchers for the next two decades in efforts which paralleled in time the technology development required to implement the techniques and algorithms which the theory dictated.

[Figure 1.1 Basic model of a digital communication system: source, source encoder, channel encoder, channel, channel decoder, source decoder, destination.]

The dual problem formulated and solved by Shannon is best described in terms of the block diagram of Fig. 1.1. The source is modeled as a random generator of data or a stochastic signal to be transmitted. The source encoder performs a mapping from the source output into a digital sequence (usually binary). If the source itself generates a digital output, the encoder mapping can be one-to-one. Ignore for the moment the channel with its encoder and decoder (within the dashed contour in Fig. 1.1) and replace it by a direct connection called a noiseless channel. Then if the source encoder mapping is one-to-one, the source decoder can simply perform the inverse mapping and thus deliver to the destination the same data as was generated by the source. The purpose of the source encoder-decoder pair is then to reduce the source output to a minimal representation. The measure of the "data compression" achieved is the rate in symbols (usually binary) required per unit time to fully represent and, ultimately at the source decoder, to reconstitute the source output sequence. This minimum rate at which the stochastic digital source sequence can be transmitted over a noiseless channel and reconstructed perfectly is related to a basic parameter of stochastic sources called entropy.
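To give this parameter concrete form ahead of its formal definition in Sec. 1.1, the following Python sketch, an illustration added here with a hypothetical four-symbol source, computes the entropy of a discrete memoryless source from its symbol probabilities.

```python
import math

def entropy(probs, base=2):
    """H = -sum p*log(p); base 2 gives binary symbols (bits) per source symbol."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# Hypothetical four-symbol source: the more skewed the distribution,
# the lower the minimum rate needed to represent the source.
print(entropy([0.5, 0.25, 0.125, 0.125]))  # 1.75 bits/symbol
print(entropy([0.25, 0.25, 0.25, 0.25]))   # 2.0 bits/symbol, the maximum for 4 symbols
```

The noiseless coding theorem of Sec. 1.1 makes this precise: the first source above can be represented, and perfectly reconstructed, at any rate exceeding 1.75 binary symbols per source symbol, and at no lower rate.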
When the source is analog, it cannot be represented perfectly by a digital sequence because the source output sequence takes on values from an uncountably infinite set, and thus obviously cannot be mapped one-to-one into a discrete set, i.e., a digital alphabet. The best that can be done in mapping the source into a digital sequence¹ is to tolerate some distortion at the destination after the source decoder operation, which now only approximates the inverse mapping. In this case, the distortion (appropriately defined) is set at a fixed maximum, and the goal is to minimize the rate, again defined in digital symbols per unit time, subject to the distortion limit. The solution to this problem requires the generalization of the entropy parameter of the source to a quantity called the rate distortion function. This function of distortion represents the minimum rate at which the source output can be transmitted over a noiseless channel and still be reconstructed within the given distortion.

¹ The simplest example of an analog source encoder is an analog-to-digital converter, also called a quantizer, for which the source decoder is a digital-to-analog converter.
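The footnote's quantizer illustrates this rate-versus-distortion trade directly. The sketch below is an added illustration, not a construction from the text: a hypothetical uniform scalar quantizer over [-1, 1] whose rate is b binary symbols per sample, with distortion measured as mean squared error.

```python
import random

def quantize(x, levels, lo=-1.0, hi=1.0):
    """A/D then D/A: map x to the midpoint of one of `levels` equal cells."""
    step = (hi - lo) / levels
    index = min(int((x - lo) / step), levels - 1)  # the digital (encoder) output
    return lo + (index + 0.5) * step               # the analog (decoder) output

rng = random.Random(1)
samples = [rng.uniform(-1.0, 1.0) for _ in range(100000)]
for b in (2, 4, 6):  # rate, in binary symbols per source sample
    mse = sum((x - quantize(x, 2 ** b)) ** 2 for x in samples) / len(samples)
    print(f"{b} bits/sample: mean squared distortion = {mse:.6f}")
```

Each added bit of rate here cuts the distortion by roughly a factor of four; the rate distortion function characterizes the best trade that any encoder, however elaborate, can achieve.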

The dual to this first problem is the accurate transmission of the digital (source encoder output) sequence over a noisy channel. Considering now only the blocks within the dashed contour in Fig. 1.1, the noisy channel is to be regarded as a random mapping of its input defined over a given discrete set (digital alphabet) into an output defined over an arbitrary set which is not necessarily the same as the input set. In fact, for most physical channels the output space is often continuous (uncountable), although discrete channel models are also commonly considered.

The goal of the channel encoder and decoder is to map the input digital sequence into a channel input sequence and conversely the channel output sequence into an output digital sequence such that the effect of the channel noise is minimized; that is, such that the number of discrepancies (errors) between the output and input digital sequences is minimized. The approach is to introduce redundancy in the channel encoder and to use this redundancy at the decoder to reconstitute the input sequence as accurately as possible. Thus in a simplistic sense the channel coding is dual to the source coding in that the latter eliminates or reduces redundancy while the former introduces it for the purpose of minimizing errors.
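As a toy illustration of redundancy used this way (added here; the text's serious constructions begin in Chap. 2), the sketch below encodes each bit three times, passes the result through a binary symmetric channel that flips each symbol with probability p, and decodes by majority vote.

```python
import random

def encode(bits):              # channel encoder: rate 1/3 repetition code
    return [b for b in bits for _ in range(3)]

def channel(bits, p, rng):     # random mapping: flip each symbol with probability p
    return [b ^ (rng.random() < p) for b in bits]

def decode(received):          # channel decoder: majority vote on each triple
    return [int(sum(received[i:i + 3]) >= 2) for i in range(0, len(received), 3)]

rng = random.Random(0)
data = [rng.randint(0, 1) for _ in range(100000)]
out = decode(channel(encode(data), 0.05, rng))
errors = sum(a != b for a, b in zip(data, out))
print(errors / len(data))  # about 3p^2(1-p) + p^3 = 0.0073, versus 0.05 uncoded
```

The price is a threefold rate expansion; the remarkable fact developed below is that far better trades are possible.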

As will be shown to the reader who completes this book, this duality can be established in a much more quantitative and precise sense. Without further evolution of the concepts at this point, we can state the single most remarkable conclusion of the Shannon channel coding theory: namely, that with sufficient but finite redundancy properly introduced at the channel encoder, it is possible for the channel decoder to reconstitute the input sequence to any degree of accuracy desired. The measure of redundancy introduced is established by the rate of digital symbols per unit time input to the channel encoder and output from the channel decoder. This rate, which is the same as the rate at the source encoder output and source decoder input, must be less than the rate of transmission over the noisy channel because of the redundancy introduced. Shannon's main result here is that, provided the input rate to the channel encoder is less than a given value established by the channel capacity (a basic parameter of the channel which is a function of the random mapping distribution which defines the channel), there exist encoding and decoding operations which asymptotically for arbitrarily long sequences can lead to error-free reconstruction of the input sequence.
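Capacity is defined formally in Sec. 1.2 and its converse role is proved in Sec. 1.3. As a preview, the sketch below (an added illustration) evaluates the standard closed form for the capacity of the binary symmetric channel of the previous example, C = 1 - H(p) bits per channel use, where H is the binary entropy function.

```python
import math

def binary_entropy(p):
    """H(p) = -p log2 p - (1 - p) log2 (1 - p), in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_capacity(p):
    """Capacity of the binary symmetric channel with crossover probability p."""
    return 1.0 - binary_entropy(p)

# Coding can drive the error probability to zero at any rate below C,
# and at no rate above it.
for p in (0.0, 0.01, 0.05, 0.5):
    print(f"p = {p}: C = {bsc_capacity(p):.4f} bits/channel use")
```

At p = 0.05, for instance, capacity is about 0.71 bits per channel use, so the rate 1/3 repetition code sketched above falls far short of what the coding theorem promises.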
As an immediate consequence of the source coding and channel coding theories, it follows that if the minimum rate at which a digital source sequence can be uniquely represented by the source encoder is less than the maximum rate for which the channel output can be reconstructed error-free by the channel decoder, then the system of Fig. 1.1 can transfer digital data with arbitrarily high accuracy from source to destination. For analog sources the same holds, but only within a predetermined (tolerable) distortion which determines the source encoder's minimum rate, provided this rate is less than the channel maximum rate mentioned above.

This text aims to present quantitatively these fundamental concepts of digital

communication system theory and to demonstrate their applicability to existing
channels and sources.
In this first chapter, two of the basic parameters, source entropy and channel capacity, are defined and a start is made toward establishing their significance. Entropy is shown to be the key parameter in the noiseless source coding theorem, proved in Sec. 1.1. The similar role of the capacity parameter for channels is partially established by the proof in Sec. 1.3 of the converse to the channel coding theorem, which establishes that for no rate greater than the maximum determined by capacity can error-free reconstruction be effected by any channel encoder-decoder pair. The full significance of capacity is established only in the next two chapters. Chap. 2 defines and derives the models of the channels of greatest interest to the communication system designer and introduces the rudimentary concepts of channel encoding and decoding. In Chap. 3 the proof of the channel coding theorem is completed in terms of a particular class of channel codes called block codes, and thus the full significance of capacity is established.

However, while the theoretical capabilities and limitations of channel coding are well established by the end of Chap. 3, their practical applicability and manner of implementation is not yet clear. This situation is for the most part remedied by Chap. 4, which describes a more practical and powerful class of codes, called convolutional codes, for which the channel encoding operation is performed by a digital linear filter, and for which the channel decoding operation arises in a natural manner from the simple properties of the code. Chap. 5 establishes further properties and limitations of these codes and compares them with those of block codes established in Chap. 3. Then Chap. 6 explores an alternative decoding procedure, called sequential decoding, which permits under some circumstances and with some limitations the use of extremely powerful convolutional codes. Finally, Chap. 7 returns to the source coding problem, considering analog sources for the first time and developing the fundamentals of rate distortion theory for memoryless sources. Both block and convolutional source coding techniques are treated and thereby the somewhat remarkable duality between channel and source coding problems and solutions is established. Chap. 8 extends the concepts of Chap. 7 to sources with memory and presents more advanced topics in rate distortion theory.

Shannon's mathematical theory of communication almost from the outset became known as information theory. While indeed one aspect of the theory is to define information and establish its significance in practical engineering terms, the main contribution of the theory has been in establishing the ultimate capabilities and limitations of digital communication systems. Nevertheless, a natural starting point is the quantitative definition of information as required by the communication engineer. This will lead us in Sec. 1.1 to the definition of entropy and the development of its key role as the basic parameter of digital source coding.

1.1 SOURCES, ENTROPY, AND THE NOISELESS CODING THEOREM

"

The weather today in Los Angeles is sunny with moderate amounts of smog is a
news event that, though not surprising, contains some information, since our
"

previous uncertainty about the weather in Los Angeles is now resolved. On the

other hand, the news event, Today there was a devastating earthquake in Cali
"

much of downtown Los Angeles," is more unexpected and
more information than the first report. But what is informa
meant by the "information" contained in the above two events?

fornia which leveled
certainly contains

tion?

What

is

Certainly if we are formally to define a quantitative measure of information contained in such events, this measure should have some intuitive properties such as:

1. Information contained in events ought to be defined in terms of some measure of the uncertainty of the events.
2. Less certain events ought to contain more information than more certain events.

In addition, assuming that weather conditions and earthquakes are unrelated events, if we were informed of both news events we would expect that the total amount of information in the two news events be the sum of the information contained in each. Hence we have a third desired property:

3. The information of unrelated events taken as a single event should equal the sum of the information of the unrelated events.

A natural measure of the uncertainty of an event $\alpha$ is the probability of $\alpha$, denoted $P(\alpha)$. The formal term for "unrelatedness" is independence; two events $\alpha$ and $\beta$ are said to be independent if

$$P(\alpha \cap \beta) = P(\alpha)P(\beta) \tag{1.1.1}$$

Once we agree to define the information of an event $\alpha$ in terms of the probability of $\alpha$, the properties (2) and (3) will be satisfied if the information in event $\alpha$ is defined as

$$I(\alpha) = -\log P(\alpha) \tag{1.1.2}$$

from which it follows that, if $\alpha$ and $\beta$ are independent,

$$I(\alpha \cap \beta) = -\log P(\alpha)P(\beta) = -\log P(\alpha) - \log P(\beta) = I(\alpha) + I(\beta)$$

The base of the logarithm merely specifies the scaling and hence the unit of information we wish to use. This is conventionally base 2, giving information in bits, or base e, giving information in nats.
