Principles of Digital Communication and Coding

Andrew J. Viterbi
Jim K. Omura
McGraw-Hill Series in Electrical Engineering

Consulting Editor: Stephen W. Director, Carnegie-Mellon University
Networks and Systems
Communications and Information Theory
Control Theory
Electronics and Electronic Circuits
Power and Energy
Electromagnetics
Computer Engineering and Switching Theory
Introductory and Survey
Radio, Television, Radar, and Antennas
Previous Consulting Editors
Ronald N. Bracewell, Colin Cherry, James F. Gibbons, Willis W. Harman,
Hubert Heffner, Edward W. Herold, John G. Linvill, Simon Ramo, Ronald A. Rohrer,
Anthony E. Siegman, Charles Susskind, Frederick E. Terman, John G. Truxal,
Ernst Weber, and John R. Whinnery
Communications and Information Theory

Consulting Editor: Stephen W. Director, Carnegie-Mellon University
Abramson: Information Theory and Coding
Angelakos and Everhart: Microwave Communications
Antoniou: Digital Filters: Analysis and Design
Bennett: Introduction to Signal Transmission
Berlekamp: Algebraic Coding Theory
Carlson: Communications Systems
Davenport: Probability and Random Processes: An Introduction for Applied Scientists and Engineers
Davenport and Root: Introduction to Random Signals and Noise
Drake: Fundamentals of Applied Probability Theory
Gold and Rader: Digital Processing of Signals
Guiasu: Information Theory with New Applications
Hancock: An Introduction to Principles of Communication Theory
Melsa and Cohn: Decision and Estimation Theory
Papoulis: Probability, Random Variables, and Stochastic Processes
Papoulis: Signal Analysis
Schwartz: Information Transmission, Modulation, and Noise
Schwartz, Bennett, and Stein: Communication Systems and Techniques
Schwartz and Shaw: Signal Processing
Shooman: Probabilistic Reliability: An Engineering Approach
Taub and Schilling: Principles of Communication Systems
Viterbi: Principles of Coherent Communication
Viterbi and Omura: Principles of Digital Communication and Coding
PRINCIPLES OF DIGITAL COMMUNICATION AND CODING

Andrew J. Viterbi
LINKABIT Corporation

Jim K. Omura
University of California, Los Angeles

McGraw-Hill, Inc.
New York  St. Louis  San Francisco  Auckland  Bogotá  Caracas  Lisbon  London  Madrid  Mexico City  Milan  Montreal  New Delhi  San Juan  Singapore  Sydney  Tokyo  Toronto
PRINCIPLES OF DIGITAL COMMUNICATION AND CODING

Copyright © 1979 by McGraw-Hill, Inc. All rights reserved. Printed in the United States of America. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher.

9 10 11 12 KPKP 9 7 6 5 4 3

This book was set in Times Roman. The editors were Frank J. Cerra and J. W. Maisel; the cover was designed by Albert M. Cetta; the production supervisor was Charles Hess. The drawings were done by Santype Ltd. Kingsport Press was printer and binder.
Library of Congress Cataloging in Publication Data

Viterbi, Andrew J
Principles of digital communication and coding.
(McGraw-Hill electrical engineering series: Communications and information theory section)
Includes bibliographical references and index.
1. Digital communications.  2. Coding theory.
I. Omura, Jim K., joint author.  II. Title.  III. Series.
TK5103.7.V57    621.38    78-13951
ISBN 0-07-067516-3
CONTENTS

Preface  xi

Part One  Fundamentals of Digital Communication and Block Coding  1

Chapter 1  Digital Communication Systems: Fundamental Concepts and Parameters  3
1.1  Sources, Entropy, and the Noiseless Coding Theorem  7
1.2  Mutual Information and Channel Capacity  19
1.3  The Converse to the Coding Theorem  28
1.4  Summary and Bibliographical Notes  34
Appendix 1A  Convex Functions  35
Appendix 1B  Jensen Inequality for Convex Functions  40
Problems  42

Chapter 2  Channel Models and Block Coding  47
2.1  Block-coded Digital Communication on the Additive Gaussian Noise Channel  47
2.2  Minimum Error Probability and Maximum Likelihood Decoder  54
2.3  Error Probability and a Simple Upper Bound  58
2.4  A Tighter Upper Bound on Error Probability  64
2.5  Equal Energy Orthogonal Signals on the AWGN Channel  65
2.6  Bandwidth Constraints, Intersymbol Interference, and Tracking Uncertainty  69
2.7  Channel Input Constraints  76
2.8  Channel Output Quantization: Discrete Memoryless Channels  78
2.9  Linear Codes  82
*2.10  Systematic Linear Codes and Optimum Decoding for the BSC  89
*2.11  Examples of Linear Block Code Performance on the AWGN Channel and Its Quantized Reductions  96
2.12  Other Memoryless Channels  102
2.13  Bibliographical Notes and References  116
Appendix 2A  Gram-Schmidt Orthogonalization and Signal Representation  117
Problems  119

Chapter 3  Block Code Ensemble Performance Analysis  128
3.1  Code Ensemble Average Error Probability: Upper Bound  128
3.2  The Channel Coding Theorem and Error Exponent Properties for Memoryless Channels  133
3.3  Expurgated Ensemble Average Error Probability: Upper Bound at Low Rates  143
3.4  Examples: Binary-Input, Output-Symmetric Channels, and Very Noisy Channels  151
3.5  Chernoff Bounds and the Neyman-Pearson Lemma  158
3.6  Sphere-Packing Lower Bounds  164
*3.7  Zero Rate Lower Bounds  173
*3.8  Low Rate Lower Bounds  178
*3.9  Conjectures and Converses  184
*3.10  Ensemble Bounds for Linear Codes  189
3.11  Bibliographical Notes and References  194
Appendix 3A  Useful Inequalities and the Proofs of Lemma 3.2.1 and Corollary 3.3.2  194
Appendix 3B  Kuhn-Tucker Conditions and Proofs of Theorems 3.2.2 and 3.2.3  202
Appendix 3C  Computational Algorithm for Capacity  207
Problems  212

Part Two  Convolutional Coding for Digital Communication

Chapter 4  Convolutional Codes  227
4.1  Introduction and Basic Structure  227
4.2  Maximum Likelihood Decoder for Convolutional Codes: The Viterbi Algorithm  235
4.3  Distance Properties of Convolutional Codes for Binary-Input Channels  239
4.4  Performance Bounds for Specific Convolutional Codes on Binary-Input, Output-Symmetric Memoryless Channels  242
4.5  Special Cases and Examples  246
4.6  Structure of Rate 1/n Codes and Orthogonal Convolutional Codes  253
4.7  Path Memory Truncation, Metric Quantization, and Code Synchronization in Viterbi Decoders  258
*4.8  Feedback Decoding  262
*4.9  Intersymbol Interference Channels  272
*4.10  Coding for Intersymbol Interference Channels  284
4.11  Bibliographical Notes and References  286
Problems  287

Chapter 5  Convolutional Code Ensemble Performance  301
5.1  The Channel Coding Theorem for Time-varying Convolutional Codes  301
5.2  Examples: Convolutional Coding Exponents for Very Noisy Channels  313
5.3  Expurgated Upper Bound for Binary-Input, Output-Symmetric Channels  315
5.4  Lower Bound on Error Probability  318
*5.5  Critical Lengths of Error Events  322
5.6  Path Memory Truncation and Initial Synchronization Errors  327
5.7  Error Bounds for Systematic Convolutional Codes  328
*5.8  Time-varying Convolutional Codes on Intersymbol Interference Channels  331
5.9  Bibliographical Notes and References  341
Problems  342

Chapter 6  Sequential Decoding of Convolutional Codes  349
6.1  Fundamentals and a Basic Stack Algorithm  349
6.2  Distribution of Computation: Upper Bound  355
6.3  Error Probability Upper Bound  361
6.4  Distribution of Computations: Lower Bound  365
6.5  The Fano Algorithm and Other Sequential Decoding Algorithms  370
6.6  Complexity, Buffer Overflow, and Other System Considerations  374
6.7  Bibliographical Notes and References  378
Problems  379

Part Three  Source Coding for Digital Communication

Chapter 7  Rate Distortion Theory: Fundamental Concepts for Memoryless Sources  385
7.1  The Source Coding Problem  385
7.2  Discrete Memoryless Sources: Block Codes  388
7.3  Relationships with Channel Coding  404
7.4  Discrete Memoryless Sources: Trellis Codes  411
7.5  Continuous Amplitude Memoryless Sources  423
*7.6  Evaluation of R(D): Discrete Memoryless Sources  431
*7.7  Evaluation of R(D): Continuous Amplitude Memoryless Sources  445
7.8  Bibliographical Notes and References  453
Appendix 7A  Computational Algorithm for R(D)  454
Problems  459

Chapter 8  Rate Distortion Theory: Memory, Gaussian Sources, and Universal Coding  468
8.1  Memoryless Vector Sources  468
8.2  Sources with Memory  479
8.3  Bounds for R(D)  494
8.4  Gaussian Sources with Squared-Error Distortion  498
8.5  Symmetric Sources with Balanced Distortion Measures and Fixed Composition Sequences  513
8.6  Universal Coding  526
8.7  Bibliographical Notes and References  534
Appendix 8A  Chernoff Bounds for Distortion Distributions  534
Problems  541

Bibliography  547
Index  553

* May be omitted without loss of continuity.
PREFACE
Digital communication is a much used term with many shades of meaning, widely varying and strongly dependent on the user's role and requirements. This book is directed to the communication theory student and to the designer of the channel, link, terminal, modem, or network used to transmit and receive digital messages. Within this community, digital communication theory has come to signify the body of knowledge and techniques which deal with the two-faceted problem of (1) minimizing the number of bits which must be transmitted over the communication channel so as to provide a given printed, audio, or visual record within a predetermined fidelity requirement (called source coding); and (2) ensuring that bits transmitted over the channel are received correctly despite the effects of interference of various types and origins (called channel coding). The foundations of the theory which provides the solution to this twofold problem were laid by Claude Shannon in one remarkable series of papers in 1948. In the intervening decades, the evolution and application of this so-called information theory have had ever-expanding influence on the practical implementation of digital communication systems, although their widespread application has required the evolution of electronic-device and system technology to a point which was hardly conceivable in 1948. This progress was accelerated by the development of the large-scale integrated-circuit building block and the economic incentive of communication satellite applications.
We have not attempted in this book to cover peripheral topics related to digital communication theory when they involve a major deviation from the basic concepts and techniques which lead to the solution of this fundamental problem. For this reason, constructive algebraic techniques, though valuable for developing code structures and important theoretical results of broad interest, are specifically avoided in this book. Similarly, the peripheral, though practically important, problems of carrier phase and frequency tracking, and time synchronization are not treated here. These have been covered adequately elsewhere. On the other hand, the equally practical subject of intersymbol interference in digital communication, which is fundamentally similar to the problem of convolutional coding, is covered and new insights are developed through connections with the mainstream topics of the text.
This book was developed over approximately a dozen years of teaching a sequence of graduate courses at the University of California, Los Angeles, and later at the University of California, San Diego, with partial notes being distributed over the past few years. Our goal in the resulting manuscript has been to provide the most direct routes to achieve an understanding of this field for a variety of goals and needs. All readers require some fundamental background in probability and random processes and preferably their application to communication problems; one year's exposure to any of a variety of engineering or mathematics courses provides this background and the resulting maturity required to start.

Given this preliminary knowledge, there are numerous approaches to utilization of this text to achieve various individual goals, as illustrated graphically by the prerequisite structure of Fig. P.1. A one-semester or one-quarter course for the beginning graduate student may involve only Part One, consisting of the first three chapters (omitting starred sections) which provide, respectively, the fundamental concepts and parameters of sources and channels, a thorough treatment of channel models based on physical requirements, and an undiluted initiation into the evaluation of code capabilities based on ensemble averages.
Figure P.1 Organization and prerequisite structure. (Introductory to advanced: Part One, Fundamentals of Digital Communication and Block Coding; Part Two, Convolutional Coding for Digital Communication; Part Three, Source Coding for Digital Communication.)

The advanced student or specialist can then proceed with Part Two, an equally detailed exposition of convolutional coding and decoding. These techniques are most effective in exploiting the capabilities of the channel toward approaching virtually error-free communications. It is possible in a one-year course to cover Part Three as well, which demonstrates how optimal source coding techniques are derived essentially as the duals of the channel coding techniques of Parts One and Two.

The applications-oriented engineer or student can obtain an understanding
of channel coding for physical channels by tackling only Chapters 2, 4, and about
half of 6. Avoiding the intricacies of ensemble-average arguments, the reader
can learn how to code for noisy channels without making the additional effort to understand the complete theory.
At the opposite extreme, students with some background in digital communications can be guided through the channel-coding material in Chapters 3 through 6 in a one-semester or one-quarter course, and advanced students, who already have channel-coding background, can cover Part Three on source coding in a course of similar duration. Numerous problems are provided to furnish examples, to expand on the material or indicate related results, and occasionally to guide the reader through the steps of lengthy alternate proofs and derivations.
Aside from the obvious dependence of any course in this field on Shannon's work, two important textbooks have had notable effect on the development and organization of this book. These are Wozencraft and Jacobs [1965], which first emphasized the physical characteristics of digital communication channels as a basis for the development of coding theory fundamentals, and Gallager [1968], which is the most complete and expert treatment of this field to date.

Collaboration with numerous university colleagues and students helped establish the framework for this book. But the academic viewpoint has been tempered in the book by the authors' extensive involvement with industrial applications. A particularly strong influence has been the close association of the first author with the design team at LINKABIT Corporation, led by I. M. Jacobs, J. A. Heller, A. R. Cohen, and K. S. Gilhousen, which first implemented high-speed reliable versions of all the convolutional decoding techniques treated in this book. The final manuscript also reflects the thorough and complete reviews and critiques of the entire text by J. L. Massey, many of whose suggested improvements have been incorporated to the considerable benefit of the prospective reader.

Finally, those discouraged by the seemingly lengthy and arduous route to a thorough understanding of communication theory might well recall the ancient words attributed to Lao Tzu of twenty-five centuries ago: "The longest journey starts with but a single step."

Andrew J. Viterbi
Jim K. Omura
PART ONE
FUNDAMENTALS OF DIGITAL COMMUNICATION AND BLOCK CODING

CHAPTER ONE
DIGITAL COMMUNICATION SYSTEMS: FUNDAMENTAL CONCEPTS AND PARAMETERS
In the field of communication system engineering, the second half of the twentieth century is destined to be recognized as the era of the evolution of digital communication, as indeed the first half was the era of the evolution of radio communication to the point of reliable transmission of messages, speech, and television, mostly in analog form.

The development of digital communication was given impetus by three prime driving needs:

1. Greatly increased demands for data transmission of every form, from computer data banks to remote-entry data terminals for a variety of applications, with ever-increasing accuracy requirements
2. Rapid evolution of synchronous artificial satellite relays which facilitate worldwide communications at very high data rates, but whose launch costs, and consequent power and bandwidth limitations, impose a significant economic incentive on the efficient use of the channel resources
3. Data communication networks which must simultaneously service many different users with a variety of rates and requirements, in which simple and efficient multiplexing of data and multiple access of channels is a primary economic concern
Figure 1.1 Basic model of a digital communication system. (Source through source encoder, channel encoder, channel, channel decoder, and source decoder to destination.)

These requirements and the solid-state electronic technology needed to support the development of efficient, flexible, and error-free digital communication systems evolved simultaneously and in parallel throughout the third quarter of this century, but the theoretical foundations were laid just before mid-century by the celebrated "Mathematical Theory of Communication" papers of C. E. Shannon [1948]. With unique intuition, Shannon perceived that the goals of approaching error-free digital communication on noisy channels and of maximally efficient conversion of analog signals to digital form were dual facets of the same problem, that they share a common framework and virtually a common solution. For the most part, this solution is presented in the original Shannon papers. The refinement, embellishment, and reduction to practical form of the theory occupied many researchers for the next two decades in efforts which paralleled in time the technology development required to implement the techniques and algorithms which the theory dictated.

The dual problem formulated and solved by Shannon is best described in
terms of the block diagram of Fig. 1.1. The source is modeled as a random
generator of data or a stochastic signal to be transmitted. The source encoder
performs a mapping from the source output into a digital sequence (usually
binary). If the source itself generates a digital output, the encoder mapping can be
one-to-one. Ignore for the moment the channel with its encoder and decoder
(within the dashed contour in Fig. 1.1) and replace it by a direct connection called
a noiseless channel. Then if the source encoder mapping is one-to-one, the source decoder can simply perform the inverse mapping and thus deliver to the destination the same data as was generated by the source. The purpose of the source encoder-decoder pair is then to reduce the source output to a minimal representation. The measure of the "data compression" achieved is the rate in symbols (usually binary) required per unit time to fully represent and, ultimately at the source decoder, to reconstitute the source output sequence. This minimum rate at which the stochastic digital source sequence can be transmitted over a noiseless channel and reconstructed perfectly is related to a basic parameter of stochastic sources called entropy.
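To make the noiseless-channel picture concrete, the following sketch (ours, not the authors'; the codebook and message are made up for illustration) pairs a one-to-one source encoder with its inverse decoder and checks perfect reconstruction. The average number of binary symbols per source symbol is exactly the rate referred to above.

```python
# Hypothetical illustration: a one-to-one (prefix-free) source code for a
# four-letter source; the source decoder performs the inverse mapping.
codebook = {"a": "0", "b": "10", "c": "110", "d": "111"}
inverse = {v: k for k, v in codebook.items()}

def encode(symbols):
    """Source encoder: map each source symbol to its binary codeword."""
    return "".join(codebook[s] for s in symbols)

def decode(bits):
    """Source decoder: invert the mapping by scanning for valid codewords."""
    out, word = [], ""
    for b in bits:
        word += b
        if word in inverse:          # prefix-free, so this match is unambiguous
            out.append(inverse[word])
            word = ""
    return "".join(out)

message = "abacabad"
received = decode(encode(message))   # the "channel" here is a direct connection
assert received == message           # perfect reconstruction over a noiseless channel
rate = len(encode(message)) / len(message)
print(f"rate = {rate:.2f} binary symbols per source symbol")
```

For this message the rate is 1.75 binary symbols per source symbol; entropy, developed in Sec. 1.1, is the fundamental limit on how small such a rate can be made.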
When the source is analog, it cannot be represented perfectly by a digital sequence because the source output sequence takes on values from an uncountably infinite set and thus obviously cannot be mapped one-to-one into a discrete set, i.e., a digital alphabet. The best that can be done in mapping the source into a digital sequence¹ is to tolerate some distortion at the destination after the source decoder operation which now only approximates the inverse mapping.

¹ The simplest example of an analog source encoder is an analog-to-digital converter, also called a quantizer, for which the source decoder is a digital-to-analog converter.
In this case, the distortion (appropriately defined) is set at a fixed maximum, and
the goal is to minimize the rate, again defined in digital symbols per unit time, subject to the distortion limit. The solution to this problem requires the generalization of the entropy parameter of the source to a quantity called the rate distortion function. This function of distortion represents the minimum rate at which the source output can be transmitted over a noiseless channel and still be reconstructed within the given distortion.
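As a small numerical illustration of the footnoted example (our sketch, not the book's), a uniform scalar quantizer makes the rate-versus-distortion tradeoff tangible: more levels mean more bits per sample but smaller mean squared reconstruction error.

```python
import random

def quantize(x, levels, lo=-1.0, hi=1.0):
    """Analog-to-digital: map x in [lo, hi] to one of `levels` cell indices."""
    step = (hi - lo) / levels
    return min(int((x - lo) / step), levels - 1)

def reconstruct(idx, levels, lo=-1.0, hi=1.0):
    """Digital-to-analog: map an index back to the midpoint of its cell."""
    step = (hi - lo) / levels
    return lo + (idx + 0.5) * step

random.seed(0)
samples = [random.uniform(-1, 1) for _ in range(10000)]
for bits in (1, 2, 4, 8):
    levels = 2 ** bits
    mse = sum((x - reconstruct(quantize(x, levels), levels)) ** 2
              for x in samples) / len(samples)
    print(f"{bits} bits/sample -> mean squared distortion {mse:.2e}")
```

Each added bit per sample cuts the squared distortion roughly fourfold; the rate distortion function makes precise the best such tradeoff any encoder can achieve.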
The dual to this first problem is the accurate transmission of the digital (source encoder output) sequence over a noisy channel. Considering now only the blocks within the dashed contour in Fig. 1.1, the noisy channel is to be regarded as a random mapping of its input defined over a given discrete set (digital alphabet) into an output defined over an arbitrary set which is not necessarily the same as the input set. In fact, for most physical channels the output space is often continuous (uncountable), although discrete channel models are also commonly considered.
The goal of the channel encoder and decoder is to map the input digital sequence into a channel input sequence and conversely the channel output sequence into an output digital sequence such that the effect of the channel noise is minimized; that is, such that the number of discrepancies (errors) between the output and input digital sequences is minimized. The approach is to introduce redundancy in the channel encoder and to use this redundancy at the decoder to reconstitute the input sequence as accurately as possible. Thus in a simplistic sense the channel coding is dual to the source coding in that the latter eliminates or reduces redundancy while the former introduces it for the purpose of minimizing errors.
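The simplest possible instance of this redundancy idea (our illustration; the book develops far stronger codes in Parts One and Two) is a rate 1/3 repetition code with majority-vote decoding on a binary symmetric channel that flips each bit with probability p.

```python
import random

def channel(bits, p):
    """Binary symmetric channel: each bit is flipped independently with probability p."""
    return [b ^ (random.random() < p) for b in bits]

def encode(bits):
    """Channel encoder: introduce redundancy by repeating each bit three times."""
    return [b for b in bits for _ in range(3)]

def decode(bits):
    """Channel decoder: majority vote over each group of three received bits."""
    return [int(sum(bits[i:i + 3]) >= 2) for i in range(0, len(bits), 3)]

random.seed(1)
p = 0.05
data = [random.randint(0, 1) for _ in range(100000)]
uncoded = sum(a != b for a, b in zip(data, channel(data, p)))
coded = sum(a != b for a, b in zip(data, decode(channel(encode(data), p))))
print(f"uncoded bit error rate: {uncoded / len(data):.4f}")  # about p = 0.05
print(f"coded bit error rate:   {coded / len(data):.4f}")    # about 3p^2 + p^3 = 0.0073
```

The price of this improvement is a threefold reduction in rate; Shannon's theorem, described next, shows that far better tradeoffs between rate and reliability are possible.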
As will be shown to the reader who completes this book, this duality can be established in a much more quantitative and precise sense. Without further evolution of the concepts at this point, we can state the single most remarkable conclusion of the Shannon channel coding theory: namely, that with sufficient but finite redundancy properly introduced at the channel encoder, it is possible for the channel decoder to reconstitute the input sequence to any degree of accuracy desired. The measure of redundancy introduced is established by the rate of digital symbols per unit time input to the channel encoder and output from the channel decoder. This rate, which is the same as the rate at the source encoder output and source decoder input, must be less than the rate of transmission over the noisy channel because of the redundancy introduced. Shannon's main result here is that, provided the input rate to the channel encoder is less than a given value established by the channel capacity (a basic parameter of the channel which is a function of the random mapping distribution which defines the channel), there exist encoding and decoding operations which asymptotically for arbitrarily long sequences can lead to error-free reconstruction of the input sequence.
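As a preview of the capacity parameter (it is defined formally in Sec. 1.2; the closed form below for the binary symmetric channel is standard information theory, introduced here by us and not yet derived in the text), the sketch evaluates C = 1 - H(p) bits per channel use for the channel of the previous example.

```python
from math import log2

def binary_entropy(p):
    """H(p) in bits, with H(0) = H(1) = 0 by continuity."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def bsc_capacity(p):
    """Capacity of the binary symmetric channel with crossover probability p."""
    return 1.0 - binary_entropy(p)

for p in (0.0, 0.05, 0.11, 0.5):
    print(f"p = {p:4.2f}: C = {bsc_capacity(p):.3f} bits/channel use")
```

For p = 0.05 this gives roughly 0.71 bits per channel use: any encoder input rate below that value can, per Shannon's theorem, be transmitted with error probability approaching zero, whereas the repetition code above spent a full two-thirds of the channel uses on redundancy.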
As an immediate consequence of the source coding and channel coding theories, it follows that if the minimum rate at which a digital source sequence can be uniquely represented by the source encoder is less than the maximum rate for which the channel output can be reconstructed error-free by the channel decoder, then the system of Fig. 1.1 can transfer digital data with arbitrarily high accuracy from source to destination. For analog sources the same holds, but only within a predetermined (tolerable) distortion which determines the source encoder's minimum rate, provided this rate is less than the channel maximum rate mentioned above.
This text aims to present quantitatively these fundamental concepts of digital
communication system theory and to demonstrate their applicability to existing
channels and sources.
In this first chapter, two of the basic parameters, source entropy and channel
capacity, are defined and a start is made toward establishing their significance.
Entropy is shown to be the key parameter in the noiseless source coding theorem,
proved in Sec. 1.1. The similar role of the capacity parameter for channels is
partially established by the proof in Sec. 1.3 of the converse to the channel coding
theorem, which establishes that for no rate greater than the maximum determined
by capacity can error-free reconstruction be effected by any channel encoder-decoder pair. The full significance of capacity is established only in the next two
chapters. Chap. 2 defines and derives the models of the channels of greatest inter
est to the communication system designer and introduces the rudimentary con
cepts of channel encoding and decoding. In Chap. 3 the proof of the channel
coding theorem is completed in terms of a particular class of channel codes called
block codes, and thus the full significance of capacity is established.
However, while the theoretical capabilities and limitations of channel coding
are well established by the end of Chap. 3, their practical applicability and manner
of implementation is not yet clear. This situation is for the most part remedied by
Chap. 4 which describes a more practical and powerful class of codes, called
convolutional codes, for which the channel encoding operation is performed by a
digital linear filter, and for which the channel decoding operation arises in a
natural manner from the simple properties of the code. Chap. 5 establishes further
properties and limitations of these codes and compares them with those of block
codes established in Chap. 3. Then Chap. 6 explores an alternative decoding
procedure, called sequential decoding, which permits under some circumstances
and with some limitations the use of extremely powerful convolutional codes.
Finally Chap. 7 returns to the source coding problem, considering analog
sources for the first time and developing the fundamentals of rate distortion
theory for memoryless sources. Both block and convolutional source coding
techniques are treated and thereby the somewhat remarkable duality between
channel and source coding problems and solutions is established. Chap. 8 extends
the concepts of Chap. 7 to sources with memory and presents more advanced topics in rate distortion theory.
Shannon's mathematical theory of communication almost from the outset
became known as information theory. While indeed one aspect of the theory is to
define information and establish its significance in practical engineering terms, the
main contribution of the theory has been in establishing the ultimate capabilities
and limitations of digital communication systems. Nevertheless, a natural starting
point is the quantitative definition of information as required by the communica
tion engineer. This will lead us in Sec. 1.1 to the definition of entropy and the
development of its key role as the basic parameter of digital source coding.
1.1 SOURCES, ENTROPY, AND THE NOISELESS CODING THEOREM
"
The weather today in Los Angeles is sunny with moderate amounts of smog is a
news event that, though not surprising, contains some information, since our
"
previous uncertainty about the weather in Los Angeles is now resolved. On the
other hand, the news event, Today there was a devastating earthquake in Cali
"
much of downtown Los Angeles," is more unexpected and
more information than the first report. But what is informa
meant by the "information" contained in the above two events?
fornia which leveled
certainly contains
tion?
What
is
Certainly if we are formally to define a quantitative measure of information con
tained in such events, this measure should have some intuitive properties such as
1.
2.
Information contained
in events
ought to be defined
of the uncertainty of the events.
Less certain events ought to contain
in
:
terms of some measure
more information than more
certain
events.
In addition, assuming that weather conditions and earthquakes are unrelated
if we were informed of both news events we would expect that the total
amount of information in the two news events be the sum of the information
events,
contained
3.
in each.
Hence we have a
The information
sum
third desired property:
of unrelated events taken as a single event should equal the
of the information of the unrelated events.
A natural measure of the uncertainty of an event $\alpha$ is the probability of $\alpha$, denoted $P(\alpha)$. The formal term for "unrelatedness" is independence; two events $\alpha$ and $\beta$ are said to be independent if

$$P(\alpha \cap \beta) = P(\alpha)P(\beta) \qquad (1.1.1)$$
Once we agree to define the information of an event $\alpha$ in terms of the probability of $\alpha$, the properties (2) and (3) will be satisfied if the information in event $\alpha$ is defined as

$$I(\alpha) = -\log P(\alpha) \qquad (1.1.2)$$

from which it follows that, if $\alpha$ and $\beta$ are independent,

$$I(\alpha \cap \beta) = -\log P(\alpha)P(\beta) = -\log P(\alpha) - \log P(\beta) = I(\alpha) + I(\beta)$$
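A quick numerical check of definition (1.1.2) and its additivity property (our illustration, with made-up probabilities for the two news events above):

```python
from math import log2

def information(p):
    """Self-information I = -log2(p), in bits, of an event with probability p."""
    return -log2(p)

p_weather, p_quake = 0.25, 0.001   # hypothetical probabilities of the two events
print(information(p_weather))      # 2.0 bits: a likely event carries little information
print(information(p_quake))        # ~9.97 bits: a rare event carries much information

# For independent events, information adds (property 3):
assert abs(information(p_weather * p_quake)
           - (information(p_weather) + information(p_quake))) < 1e-9
```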
The base of the logarithm merely specifies the scaling and hence the unit of information we wish to use. This