
ECONOMETRICS
STATISTICAL FOUNDATIONS
AND APPLICATIONS

PHOEBUS J. DHRYMES
Professor of Economics
Columbia University

Springer-Verlag New York · Heidelberg · Berlin

1974


Library of Congress Cataloging in Publication Data

Dhrymes, Phoebus J., 1932-
  Econometrics: statistical foundations and applications.
  Corrected reprint of the 1970 ed. published by Harper & Row, New York.
  1. Econometrics. I. Title.
  [HB139.D48 1974]   330'.01'8   74-10898

Second printing: July, 1974.



First published 1970, by Harper & Row, Publishers, Inc.

Design: Peter Klemke, Berlin.

All rights reserved.
No part of this book may be translated or reproduced in any form without
written permission from Springer-Verlag.

© 1970 by Phoebus J. Dhrymes and 1974 by Springer-Verlag New York Inc.

ISBN-13: 978-0-387-90095-7    e-ISBN-13: 978-1-4613-9383-2
DOI: 10.1007/978-1-4613-9383-2


PREFACE TO SECOND PRINTING

The main difference between this edition by Springer-Verlag and the earlier one by Harper & Row lies in the elimination of the inordinately high number of misprints found in the latter. A few minor errors of exposition have also been eliminated. The material, however, is essentially similar to that found in the earlier version.

I wish to take this opportunity to express my thanks to all those who pointed out misprints to me and especially to H. Tsurumi, Warren Dent, and J. D. Khazzoom.

New York
February, 1974

PHOEBUS J. DHRYMES


PREFACE TO FIRST PRINTING

This book was written, primarily, for the graduate student in econometrics. Its purpose is to provide a reasonably complete and rigorous exposition of the techniques frequently employed in econometric research, beyond what one is likely to encounter in an introductory mathematical statistics course. It does not aim at teaching how one can do successful original empirical research. Unfortunately, no one has yet discovered how to communicate this skill impersonally. Practicing econometricians may also find the integrated presentation of simultaneous equations estimation theory and spectral analysis a convenient reference.

I have tried, as far as possible, to begin the discussion of the various topics from an elementary stage so that little prior knowledge of the subject will be necessitated. It is assumed that the potential reader is familiar with the elementary aspects of calculus and linear algebra. Additional mathematical material is to be found in the Appendix. Statistical competence, approximately at the level of a first-year course in elementary mathematical statistics, is also assumed on the part of the reader.

The discussion, then, develops certain elementary aspects of multivariate analysis, the theory of estimation of simultaneous equations systems, and elementary aspects of spectral and cross-spectral analysis, and shows how such techniques may be applied, by a number of examples.

It is often said that econometrics deals with the quantification of economic relationships, perhaps as postulated by an abstract model.


As such, it is a blend of economics and statistics, both presupposing a substantial degree of mathematical sophistication. Thus, to practice econometrics competently, one has to be well-versed in both economic and statistical theory. Pursuant to this, I have attempted in all presentations to point out clearly the assumptions underlying the discussion, their role in establishing the conclusions, and hence the consequences of departures from such assumptions. Indeed, this is a most crucial aspect of the student's training and one that is rather frequently neglected. This is unfortunate since competence in econometrics entails, inter alia, a very clear perception of the limitations of the conclusions one may obtain from empirical analysis.

A number of specialized results from probability theory that are crucial for establishing, rigorously, the properties of simultaneous equations estimators have been collected in Chapter 3. This is included only as a convenient reference, and its detailed study is not essential in understanding the remainder of the book. It is sufficient that the reader be familiar with the salient results presented in Chapter 3, but it is not essential that he master their proofs in detail. I have used various parts of the book, in the form of mimeographed notes, as the basis of discussion for graduate courses in econometrics at Harvard University and, more recently, at the University of Pennsylvania.

The material in Chapters 1 through 6 could easily constitute a one-semester course, and the remainder may be used in the second semester. The instructor who may not wish to delve into spectral analysis quite so extensively may include alternative material, e.g., the theory of forecasting.

Generally, I felt that empirical work is easily accessible in journals and similar publications, and for this reason the number of empirical examples is small. By now, the instructor has at his disposal a number of publications on econometric models and books of readings in empirical econometric research, from which he can easily draw in illustrating the possible application of various techniques.

I have tried to write this book in a uniform style and notation and to preserve maximal continuity of presentation. For this reason explicit references to individual contributions are minimized; on the other hand, the great cleavage between the Dutch and Cowles Foundation notation is bridged so that one can follow the discussion of 2SLS, 3SLS, and maximum likelihood estimation in a unified notational framework. Of course, the absence of references from the discussion is not meant to ignore individual contributions, but only to insure the continuity and unity of exposition that one commonly finds in scientific, mathematical, or statistical textbooks.

Original work relevant to the subjects covered appears in the references at the end of each chapter; in several instances a brief comment on the work is inserted. This is only meant to give the reader an indication of the coverage and does not pretend to be a review of the contents.

Finally, it is a pleasure for me to acknowledge my debt to a number of


individuals who have contributed directly or indirectly in making this book
what it is.
I wish to express my gratitude to H. Theil for first introducing me to the rigorous study of econometrics, and to I. Olkin, from whose lucid lectures I first learned about multivariate analysis. T. Amemiya, L. R. Klein, J. Kmenta, B. M. Mitchell, and A. Zellner read various parts of the manuscript and offered useful suggestions. V. Pandit and A. Basu are chiefly responsible for compiling the bibliography. Margot Keith and Alix Ryckoff have lightened my burden by their expert typing.

PHOEBUS J. DHRYMES
January, 1970


CONTENTS

1. ELEMENTARY ASPECTS OF MULTIVARIATE ANALYSIS
   1.1 Preliminaries
   1.2 Joint, Marginal, and Conditional Distributions
   1.3 A Mathematical Digression
   1.4 The Multivariate Normal Distribution
   1.5 Correlation Coefficients and Related Topics
   1.6 Estimators of the Mean Vector and Covariance Matrix and their Distribution
   1.7 Tests of Significance

2. APPLICATIONS OF MULTIVARIATE ANALYSIS
   2.1 Canonical Correlations and Canonical Variables
   2.2 Principal Components
   2.3 Discriminant Analysis
   2.4 Factor Analysis

3. PROBABILITY LIMITS, ASYMPTOTIC DISTRIBUTIONS, AND PROPERTIES OF MAXIMUM LIKELIHOOD ESTIMATORS
   3.1 Introduction
   3.2 Estimators and Probability Limits
   3.3 Convergence to a Random Variable: Convergence in Distribution and Convergence of Moments
   3.4 Central Limit Theorems and Related Topics
   3.5 Miscellaneous Useful Convergence Results
   3.6 Properties of Maximum Likelihood (ML) Estimators
   3.7 Estimation for Distributions Admitting of Sufficient Statistics
   3.8 Minimum Variance Estimation and Sufficient Statistics

4. ESTIMATION OF SIMULTANEOUS EQUATIONS SYSTEMS
   4.1 Review of Classical Methods
   4.2 Asymptotic Distribution of Aitken Estimators
   4.3 Two-Stage Least Squares (2SLS)
   4.4 2SLS as Aitken and as OLS Estimator
   4.5 Asymptotic Properties of 2SLS Estimators
   4.6 The General k-Class Estimator
   4.7 Three-Stage Least Squares (3SLS)

5. APPLICATIONS OF CLASSICAL AND SIMULTANEOUS EQUATIONS TECHNIQUES AND RELATED PROBLEMS
   5.1 Estimation of Production and Cost Functions and Specification Error Analysis
   5.2 An Example of Efficient Estimation of a Set of General Linear (Regression) Models
   5.3 An Example of 2SLS and 3SLS Estimation
   5.4 Measures of Goodness of Fit in Multiple Equations Systems: Coefficient of (Vector) Alienation and Correlation
   5.5 Canonical Correlations and Goodness of Fit in Econometric Systems
   5.6 Applications of Principal Component Theory in Econometric Systems
   5.7 Alternative Asymptotic Tests of Significance for 2SLS Estimated Parameters

6. ALTERNATIVE ESTIMATION METHODS; RECURSIVE SYSTEMS
   6.1 Introduction
   6.2 Indirect Least Squares (ILS)
   6.3 The Identification Problem
   6.4 Instrumental Variables Estimation
   6.5 Recursive Systems

7. MAXIMUM LIKELIHOOD METHODS
   7.1 Formulation of the Problem and Assumptions
   7.2 Reduced Form (RF) and Full Information Maximum Likelihood (FIML) Estimation
   7.3 Limited Information (LIML) Estimation

8. RELATIONS AMONG ESTIMATORS; MONTE CARLO METHODS
   8.1 Introduction
   8.2 Relations Among Double k-Class Estimators
   8.3 I.V., ILS, and Double k-Class Estimators
   8.4 Limited Information Estimators and Just Identification
   8.5 Relationships Among Full Information Estimators
   8.6 Monte Carlo Methods

9. SPECTRAL ANALYSIS
   9.1 Stochastic Processes
   9.2 Spectral Representation of Covariance Stationary Series
   9.3 Estimation of the Spectrum

10. CROSS-SPECTRAL ANALYSIS
   10.1 Introduction
   10.2 Cross Spectrum: Cospectrum, Quadrature Spectrum, and Coherency
   10.3 Estimation of the Cross Spectrum
   10.4 An Empirical Application of Cross-Spectral Analysis

11. APPROXIMATE SAMPLING DISTRIBUTIONS AND OTHER STATISTICAL ASPECTS OF SPECTRAL ANALYSIS
   11.1 Aliasing
   11.2 "Prewhitening," "Recoloring," and Related Issues
   11.3 Approximate Asymptotic Distributions; Considerations of Design and Analysis

12. APPLICATIONS OF SPECTRAL ANALYSIS TO SIMULTANEOUS EQUATIONS SYSTEMS
   12.1 Generalities
   12.2 Lag Operators
   12.3 An Operator Representation of the Final Form
   12.4 Dynamic Multipliers and the Final Form
   12.5 Spectral Properties of the Final Form
   12.6 An Empirical Application

MATHEMATICAL APPENDIX
   A.1 Complex Numbers and Complex-Valued Functions
   A.2 The Riemann-Stieltjes Integral
   A.3 Monotonic Functions and Functions of Bounded Variation
   A.4 Fourier Series
   A.5 Systems of Difference Equations with Constant Coefficients
   A.6 Matrix Algebra

INDEX




1  ELEMENTARY ASPECTS OF MULTIVARIATE ANALYSIS

1.1 PRELIMINARIES

In elementary mathematical statistics, one studies in some detail various characteristics of the distribution of a scalar random variable. Thus its density and various parameters are considered and statements are made regarding this variable. For example, given the information above, we can compute the probability that the variable will exceed some value, say α, or that it will assume a value in the interval (α, β), and so on.

Still, in this elementary context the correlation of two variables was introduced and interpreted as a measure of the degree to which the two variables tend to move in the same or opposite direction.

In econometric work, however, it is often necessary to deal with a number of relations simultaneously. Thus we typically have an econometric model containing more than one equation. Such a system may be simultaneously determined in the sense that the interaction of all variables as specified by the model determines simultaneously the behavior of the entire set of (jointly) dependent variables.

Generally, the equations of an econometric model are, except for identities, stochastic ones, and hence the problem arises of how to specify the (joint) stochastic character of a number of random variables simultaneously. This leads us to consider the problem of the distribution of vector random variables, that is, the characteristics


of the joint distribution of a number of random variables simultaneously and not "one at a time."

When the problem is considered in this context, it is apparent that there are more complexities than in the study of the distribution of scalar random variables. For example, if we are dealing with the (joint) distribution of m variables, we might wish to say something about the distribution of a subset of k (k < m) variables given the (m − k) remaining ones. This is a problem that cannot arise in the study of univariate distributions.

In the following material we shall study in some detail certain elementary aspects of multivariate distributions, confining ourselves to the special case of the multivariate normal distribution.

Let us now set forth the notational framework and conventions for this topic and obtain some simple but useful results.
Definition 1: Let {x_ij : i = 1, 2, ..., m; j = 1, 2, ..., n} be a set of random variables. Then the matrix

    X = (x_ij)    (1.1.1)

is said to be a random matrix and its expectation (mean) is defined by

    E[X] = [E(x_ij)]    (1.1.2)

Consider now any column, say the jth, of (1.1.1). Denote it by x_.j. It is clear from (1.1.1) and (1.1.2) that the expectation of the vector x_.j is given by

    E(x_.j) = μ_.j    (1.1.3)

where

    μ_.j = (μ_1j, μ_2j, ..., μ_mj)'    (1.1.4)

Definition 2: Let z = (z_1, z_2, ..., z_m)' be a random vector; then the covariance matrix of z is defined by

    E[(z − E(z))(z − E(z))'] = Σ    (1.1.5)

and as a matter of notation we write

    Cov(z) = Σ    (1.1.6)

where

    σ_ij = E[(z_i − E(z_i))(z_j − E(z_j))]    i, j = 1, 2, ..., m    (1.1.7)

Remark 1: By convention, the mean of a random vector is denoted by the lowercase Greek letter μ and its covariance matrix by the capital Greek letter Σ.



This is, of course, not a universal practice; it is, however, used quite widely. Notice, further, that in (1.1.5), (z − E(z))(z − E(z))' is an m × m random matrix, and thus the meaning of the expectation operator there is given by (1.1.2) of Definition 1.

In general, we shall identify vectors and matrices by their dimensions. Thus the statement "x is m × 1" will mean that x is a column vector with m elements, while "A is m × n" will mean that A is a matrix with m rows and n columns.
A simple consequence of the preceding is

Lemma 1: Let x be m × 1 and random; suppose that

    E(x) = μ    Cov(x) = Σ    (1.1.8)

Define

    y = Ax + b    (1.1.9)

where A is m × m nonsingular, b is m × 1, both nonrandom. Then

    E(y) = Aμ + b    (1.1.10)

    Cov(y) = AΣA'    (1.1.11)

PROOF: By definition, the ith element of y is given by

    y_i = Σ_{j=1}^m a_ij x_j + b_i    i = 1, 2, ..., m    (1.1.12)

Hence

    E(y_i) = Σ_{j=1}^m a_ij E(x_j) + b_i = Σ_{j=1}^m a_ij μ_j + b_i    i = 1, 2, ..., m    (1.1.13)

Since

    μ = (μ_1, μ_2, ..., μ_m)'    (1.1.14)

we conclude

    E(y) = Aμ + b    (1.1.15)

Furthermore, we have

    Cov(y) = E[(y − E(y))(y − E(y))'] = E[A(x − μ)(x − μ)'A']    (1.1.16)

The (i, j) element of (x − μ)(x − μ)' is given by

    x*_ij = (x_i − μ_i)(x_j − μ_j)    (1.1.17)

Notice that

    E(x*_ij) = σ_ij    (1.1.18)

The (s, t) element of the matrix whose expectation we are taking in (1.1.16) is

    Σ_{i=1}^m Σ_{r=1}^m a_si x*_ir a_tr

It follows, therefore, that the (s, t) element of the covariance matrix of y is given by

    Σ_{i=1}^m Σ_{r=1}^m a_si σ_ir a_tr    (1.1.19)

From (1.1.19) we conclude

    Cov(y) = AΣA'    (1.1.20)

Q.E.D.
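As a quick numerical sanity check of Lemma 1 (our own sketch, not part of the text; the particular μ, Σ, A, and b are arbitrary illustrative choices), one can simulate draws of x and verify (1.1.10) and (1.1.11):

```python
# Simulation check of Lemma 1: for y = Ax + b, E(y) = Aμ + b and
# Cov(y) = AΣA'. All numerical values here are arbitrary choices.
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

mu = np.array([1.0, -2.0, 0.5])
L = np.array([[1.0, 0.0, 0.0],
              [0.4, 0.9, 0.0],
              [0.1, 0.2, 0.8]])
sigma = L @ L.T                                  # a positive definite Σ
x = rng.multivariate_normal(mu, sigma, size=n)   # E(x) = μ, Cov(x) = Σ

A = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 3.0],
              [1.0, 0.0, 1.0]])                  # m × m, nonsingular
b = np.array([5.0, -1.0, 2.0])
y = x @ A.T + b

print(np.allclose(y.mean(axis=0), A @ mu + b, atol=0.05))               # (1.1.10)
print(np.allclose(np.cov(y, rowvar=False), A @ sigma @ A.T, atol=0.1))  # (1.1.11)
```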

Before considering the multivariate normal distribution in some detail, we should point out two useful facts. One, implicit in the preceding, is this: if X is a random matrix and if A, B, C are conformable nonrandom matrices, then

    E(Y) = AE(X)B + C    (1.1.21)

where

    Y = AXB + C    (1.1.22)

The second is given by

Lemma 2: Let x be m × 1, random; then

    Σ = Cov(x)    (1.1.23)

is at least a positive semidefinite, symmetric matrix.

PROOF: The symmetry of Σ is obvious; thus its (i, j) element is

    σ_ij = E[(x_i − μ_i)(x_j − μ_j)]    (1.1.24)

while its (j, i) element is

    σ_ji = E[(x_j − μ_j)(x_i − μ_i)]    (1.1.25)

It is clear that the right-hand sides of (1.1.24) and (1.1.25) are identical, which establishes symmetry.

To show positive semidefiniteness, let α be m × 1, nonrandom and nonnull but otherwise arbitrary. We shall establish that

    α'Σα ≥ 0

Define the (scalar) random variable

    y = α'x    (1.1.26)

From elementary mathematical statistics we know that the variance of y is nonnegative. Thus

    0 ≤ Var(y) = E[α'(x − μ)(x − μ)'α] = α'Σα    (1.1.27)

Q.E.D.

Remark 2: Notice that if Σ is not strictly positive definite (i.e., if it is a singular matrix), then there will exist a nonnull constant vector, say γ, such that

    γ'Σγ = 0    (1.1.28)

This means that there exists a scalar random variable, say y, which is a linear combination of the elements of x and whose variance is zero. The latter fact means that y is a constant. Hence a singular covariance matrix Σ in (1.1.27) implies that the elements of x are linearly dependent in the sense that there exists a nonnull set of constants (γ_1, γ_2, γ_3, ..., γ_m) such that y = Σ_{i=1}^m γ_i x_i is nonrandom. If this is so, then the distribution of the random vector x is said to be degenerate.

If this is not so, that is, if Σ is strictly positive definite, symmetric, then the distribution of x is said to be proper. In this textbook we shall only deal with proper distributions; thus the term "proper" will be suppressed everywhere except when the context clearly requires it.
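A small simulated illustration of Remark 2 (ours, not the book's; the specific linear dependence is an arbitrary choice): when one element of x is an exact linear combination of the others, the covariance matrix is singular and a suitable γ'x is constant.

```python
# Degenerate distribution: x3 = 2*x1 - x2 + 4 exactly, so Σ is singular
# and γ = (2, -1, -1)' gives γ'x = -4, a constant with zero variance.
import numpy as np

rng = np.random.default_rng(2)
n = 50_000
x1 = rng.standard_normal(n)
x2 = rng.standard_normal(n)
x3 = 2.0 * x1 - x2 + 4.0
x = np.column_stack([x1, x2, x3])

sigma = np.cov(x, rowvar=False)
print(np.linalg.matrix_rank(sigma))   # 2, not 3: the covariance matrix is singular
gamma = np.array([2.0, -1.0, -1.0])   # a nonnull γ with γ'Σγ = 0
print((x @ gamma).var())              # ≈ 0: γ'x is (numerically) constant
```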

1.2 JOINT, MARGINAL, AND CONDITIONAL DISTRIBUTIONS

In this section we shall define what we mean by the joint distribution of a number of random variables, derive certain features of it, and study its associated marginal and conditional distributions. The following convention, which is often employed in mathematical statistics, will be adhered to.

Convention: We shall denote random variables by capital Roman letters.¹ We shall denote by lowercase letters the values assumed by the random variables. Thus Pr{X ≤ x} indicates the probability that the random variable X will assume a value equal to or less than (the real number) x.
Definition 3: Let X be m × 1, random; by its joint (cumulative) distribution function we mean a function F(·, ·, ..., ·) such that

    i.   0 ≤ F ≤ 1.
    ii.  F is monotonic nondecreasing in all its arguments.
    iii. F(−∞, −∞, ..., −∞) = 0;  F(∞, ∞, ..., ∞) = 1.
    iv.  Pr{X_1 ≤ x_1, X_2 ≤ x_2, ..., X_m ≤ x_m} = F(x_1, x_2, ..., x_m)

¹ Recall that a random variable is a real-valued function defined on the relevant sample space.


In this textbook we shall always assume that F(·, ·, ...) is absolutely continuous, so that the derivative

    ∂^m F / (∂x_1 ∂x_2 ··· ∂x_m)

exists almost everywhere. Therefore we have

Definition 4: Let F(·, ·, ...) be the joint cumulative distribution function of the m × 1 random variable X. Suppose that F is absolutely continuous; then

    f(x_1, x_2, ..., x_m) = ∂^m F(x_1, x_2, ..., x_m) / (∂x_1 ∂x_2 ··· ∂x_m)    (1.2.1)

is said to be the joint density function of (the elements of) X.

In the following material we shall always assume that the density function exists.

Remark 2: It is clear from (1.2.1) and statement iv of Definition 3 that

    F(x_1, x_2, ..., x_m) = ∫_{−∞}^{x_1} ··· ∫_{−∞}^{x_m} f(ξ_1, ..., ξ_m) dξ_1 ··· dξ_m    (1.2.2)

and

    ∫_{−∞}^{∞} ··· ∫_{−∞}^{∞} f(ξ_1, ..., ξ_m) dξ_1 ··· dξ_m = 1    (1.2.3)
Definition 5: The marginal density of (X_1, X_2, ..., X_k), k < m, is defined by

    g(x_1, x_2, ..., x_k) = ∫_{−∞}^{∞} ··· ∫_{−∞}^{∞} f(x_1, x_2, ..., x_k, ξ_{k+1}, ξ_{k+2}, ..., ξ_m) dξ_{k+1} dξ_{k+2} ··· dξ_m    (1.2.4)

the integration extending over the m − k variables ξ_{k+1}, ..., ξ_m.

Remark 3: It is apparent that the marginal (cumulative) distribution function of (X_1, X_2, ..., X_k) is given by

    G(x_1, x_2, ..., x_k) = F(x_1, ..., x_k, ∞, ..., ∞)    (1.2.5)

the last m − k arguments of F being set equal to ∞.


Remark 4: It should also be clear that the marginal density of any element of X, say X_1, is given by

    g(x_1) = ∫_{−∞}^{∞} ··· ∫_{−∞}^{∞} f(x_1, ξ_2, ..., ξ_m) dξ_2 ··· dξ_m    (1.2.6)

and is simply the density function of X_1 as studied in elementary mathematical statistics courses.

The marginal density of a subset of the elements of X simply characterizes their probability structure after the effects of all other variables have been allowed for ("averaged out" or "integrated out"). In particular, notice that the marginal density of X_1, X_2, ..., X_k does not depend on x_{k+1}, x_{k+2}, ..., x_m.

In contradistinction to this, we have another associated density, namely, the conditional one. Recall from elementary probability that if A and B are two events, then the conditional probability of A given B, denoted by P(A | B), is defined as

    P(A | B) = P(A ∩ B) / P(B)    (1.2.7)

In the multivariate context, it is often useful to ask: What is the probability structure of one group of variables given another? Thus we are led to

Definition 6: The conditional density of (X_1, X_2, ..., X_k) given (X_{k+1}, X_{k+2}, ..., X_m) is defined by

    f(x¹ | x²) = f(x) / g(x²)    (1.2.8)

where

    x¹ = (x_1, x_2, ..., x_k)',    x² = (x_{k+1}, x_{k+2}, ..., x_m)',    g(x²) ≠ 0    (1.2.9)

g(·) is the marginal density of x², and f(·) is the density of x = (x¹', x²')'.

Remark 5: As the notation in (1.2.8) makes clear, the conditional density of x¹ given x² does depend on x². Whereas in the case of the marginal density of x¹ the effects of the variables in x² were "averaged" or "integrated" out, in the case of the conditional density of x¹ the effects of the variables in x² are allowed for explicitly by "holding them constant." The meaning of this distinction will become clearer when we study the multivariate normal distribution.
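The contrast between "integrating out" and "holding constant" can be seen numerically. The sketch below is our own illustration, not the book's; the joint density f(x_1, x_2) = x_1 + x_2 on the unit square is an arbitrary textbook-style choice. It computes a marginal density as in (1.2.4) and a conditional density as in (1.2.8):

```python
# Marginal vs. conditional density on a grid for f(x1, x2) = x1 + x2 on
# [0,1]^2. Integrating out x2 gives the marginal g(x1) = x1 + 1/2;
# fixing x2 and dividing by g(x2) gives the conditional density of x1.
import numpy as np

grid = np.linspace(0.0, 1.0, 2001)
dx = grid[1] - grid[0]
X1, X2 = np.meshgrid(grid, grid, indexing="ij")
f = X1 + X2                              # joint density (integrates to 1)

g1 = f.sum(axis=1) * dx                  # integrate out x2 -> marginal of x1
print(np.allclose(g1, grid + 0.5, atol=1e-2))   # matches x1 + 1/2

j = np.searchsorted(grid, 0.25)          # condition on x2 = 0.25
g2 = f[:, j].sum() * dx                  # g(0.25) = 0.75, integrating out x1
cond = f[:, j] / g2                      # f(x1 | x2) = f(x1, x2) / g(x2)
print(cond.sum() * dx)                   # ≈ 1: a proper density in x1
```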
Moments are defined with respect to the densities above in the usual way. Thus let h(·) be a function of the random variable X. The expectation of h(X) is defined by

    E[h(X)] = ∫ h(x) f(x) dx    (1.2.10)


where the integral sign indicates the m-fold integral with respect to x_1, x_2, ..., x_m.

If it is specified that h(·) depends only on x¹, then we can define two expectations for h(X¹), one marginal and one conditional. The marginal expectation of h(X¹) is defined by

    E[h(X¹)] = ∫ h(x¹) g(x¹) dx¹    (1.2.11)

The conditional expectation of h(X¹) given x² is defined by

    E[h(X¹) | x²] = ∫ h(x¹) f(x¹ | x²) dx¹    (1.2.12)

Notice that in (1.2.11) we expect with respect to the marginal density of x¹, while in (1.2.12) we expect with respect to the conditional density of x¹ given x².
Example 1: Suppose that we wish to obtain the mean of one of the elements of X, say X_1. In this case, take the function h(·) as

    h(X) = X_1    (1.2.13)

The marginal² mean of X_1 is thus given by

    E(X_1) = ∫ x_1 f(x) dx = μ_1    (1.2.14)

The conditional mean of X_1 is given by

    E(X_1 | x_2, ..., x_m) = ∫ x_1 f(x_1 | x_2, ..., x_m) dx_1 = μ_{1·2,...,m}    (1.2.15)

The notation μ_{1·2,3,...,m} is a very informative one; the number appearing to the left of the dot indicates the random variable being expected, while the numbers appearing to the right of the dot indicate the conditioning variables. Thus μ_{2·1,3,4,...,m} indicates the conditional mean of X_2 given X_1, X_3, X_4, ..., X_m. Of course, we can define quantities such as μ_{1·5,6,7,...,m}, which would indicate the conditional mean of X_1 given X_5, X_6, ..., X_m. What we mean by this is the following: Obtain the marginal density of X_1, X_5, X_6, ..., X_m by "integrating out" X_2, X_3, and X_4. Then, in terms of this density, determine the conditional density of X_1 given X_5, X_6, ..., X_m as

    f(x_1 | x_5, ..., x_m) = f(x_1, x_5, x_6, ..., x_m) / g(x_5, x_6, ..., x_m)    (1.2.16)

where the numerator is the joint (marginal) density of X_1, X_5, X_6, ..., X_m and the denominator is the joint (marginal) density of X_5, X_6, ..., X_m. Finally, expect X_1 with respect to f(x_1 | x_5, ..., x_m).

² The term "marginal" is usually omitted; one speaks only of the mean of X_1.
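A conditional mean such as μ_{1·2} can also be approximated by simulation, crudely "holding the conditioning variable constant" by restricting attention to draws in a narrow band. This is our own sketch, not the book's; the linear dependence of X_1 on X_2 is an arbitrary assumption chosen so that the answer, E(X_1 | X_2 = c) = 0.8c, is known in advance:

```python
# Rough simulation of a conditional mean: average X1 over draws whose X2
# falls in a narrow band around the conditioning value c = 1.
import numpy as np

rng = np.random.default_rng(3)
n = 1_000_000
x2 = rng.standard_normal(n)
x1 = 0.8 * x2 + 0.6 * rng.standard_normal(n)   # so E(X1 | X2 = c) = 0.8c

c = 1.0
band = np.abs(x2 - c) < 0.02                   # crude "holding X2 constant"
print(x1[band].mean())                         # ≈ 0.8, estimating μ_{1·2} at c
```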


Example 2: Suppose that h(·) is such that it depends only on two variables. Thus, say,

    h(X) = (X_1 − μ_1)(X_2 − μ_2)    (1.2.17)

The marginal³ expectation of h(X) is given by

    E[h(X)] = ∫ (x_1 − μ_1)(x_2 − μ_2) f(x) dx = σ_12    (1.2.18)

The expectation here simply yields the covariance between the first and second elements of X.

As before, we can again define the conditional covariance between X_1 and X_2 given X_3, X_4, ..., X_m. Hence we have

    E[(X_1 − μ_1)(X_2 − μ_2) | X_3, ..., X_m]
        = ∫ (x_1 − μ_1)(x_2 − μ_2) f(x_1, x_2 | x_3, ..., x_m) dx_1 dx_2
        = σ_{12·3,4,...,m}    (1.2.19)

where again the numbers to the left of the dot indicate the variables whose covariance is obtained, while the numbers to the right indicate the conditioning variables.

We leave it to the reader to compute a number of different conditional variances and covariances. The preceding discussion should be sufficient to render the meaning of the notation, say σ_{55·1,7,12,13,...,m} or σ_{77·1,2,19,20,21,...,m}, quite obvious.

³ Again the term "marginal" is suppressed; we simply speak of variances and covariances.

Finally, let us conclude with

Definition 7: Let X be m × 1 and random; then its elements are said to be mutually (statistically) independent if and only if their (joint) density can be expressed as the product of the marginal densities of the individual elements.

Remark 6: Suppose X is partitioned into X¹ and X² as above; then X¹ and X² are said to be mutually independent if and only if the joint density of X can be expressed as the product of the marginal densities of X¹ and X².

We shall now abandon the convention whereby random variables and the values assumed by them are distinguished by the use of capital and lowercase letters respectively. Henceforth no such distinction will be made. The meaning will usually be clear from the context.

1.3 A MATHEMATICAL DIGRESSION

Let E_n be the n-dimensional Euclidean space and let

    h: E_n → E_n


be a transformation of E_n into itself. Thus

    y = h(x)    x, y ∈ E_n    (1.3.1)

where the ith element of y is the function

    y_i = h_i(x)    i = 1, 2, ..., n    (1.3.2)

and x and y are not necessarily random.

Suppose that the inverse transformation also exists; that is, suppose there exists a function g(·) such that

    g[h(x)] = x    or    x = g(y)    (1.3.3)

Definition 8: The Jacobian of the transformation in (1.3.1), denoted by J(x → y) or simply J, is defined by

    J(x → y) = |∂x_i/∂y_j|    i, j = 1, 2, ..., n    (1.3.4)

Here the determinant, written out explicitly, is

    |∂x_i/∂y_j| = | ∂x_1/∂y_1  ∂x_1/∂y_2  ···  ∂x_1/∂y_n |
                  | ∂x_2/∂y_1  ∂x_2/∂y_2  ···  ∂x_2/∂y_n |
                  |     ⋮          ⋮               ⋮     |
                  | ∂x_n/∂y_1  ∂x_n/∂y_2  ···  ∂x_n/∂y_n |    (1.3.5)

and thus it is expressed solely in terms of the y_i, i = 1, 2, ..., n.
Suppose now that x is random, having density f(·), and consider the problem of determining the density of y in terms of f(·) and the transformation in (1.3.1). To this effect, we prove:

Lemma 3: Let x be an m × 1 random variable having the density f(·). Define

    y = h(x)    (1.3.6)

such that the inverse transformation

    x = g(y)    (1.3.7)

exists; thus, by (1.3.6), to each x corresponds a unique y, and by (1.3.7), to each y corresponds a unique x.


Moreover, suppose that h(·) and g(·) are differentiable. Then the density, Φ(·), of y is given by

    Φ(y) = f[g(y)] |J|    (1.3.8)

where |J| is the absolute value of the Jacobian of the transformation.⁴

PROOF: The cumulative distribution of x is given by

    F(x) = ∫_A f(ξ) dξ    (1.3.9)

Notice that F(x) in (1.3.9) gives the probability assigned by f(·) to the set

    A = {ξ : ξ_i ≤ x_i, i = 1, 2, ..., m}    (1.3.10)

which is the Cartesian product of the intervals (−∞, x_i], i = 1, 2, ..., m. This accounts for the notation employed in the last member of (1.3.9). Now, if in (1.3.9) we make the transformation

    ζ = h(ξ)    (1.3.11)

we may, by the assumption in (1.3.7), solve it to obtain

    ξ = g(ζ)    (1.3.12)

and hence the Jacobian J(ξ → ζ). Thus (1.3.9) may now be written as⁵

    F[g(y)] = ∫_B f[g(ζ)] |J(ξ → ζ)| dζ    (1.3.13)

where B is the transform of A under h.

The integral in (1.3.13) gives the probability assigned to the set B by the function f[g(·)] |J|. Moreover, the set B is of the form

    B = {ζ : ζ_i ≤ y_i, i = 1, 2, ..., m}    (1.3.14)

and corresponds to the "joint event" {Y_i ≤ y_i : i = 1, 2, ..., m}, where Y_i indicates the random variable and y_i the values assumed by it. Since the integrand of (1.3.13) is nonnegative and its integral over the entire space is unity, we conclude that

    Φ(y) = f[g(y)] |J(x → y)|    (1.3.15)

is the joint density of the elements of y. Q.E.D.

⁴ We must add, of course, the restriction that J not vanish on any (nondegenerate) subset of E_m. We should also note that in standard mathematical terminology J is the inverse of the Jacobian of (1.3.6). In the statistical literature it is referred to as the Jacobian of (1.3.6). We shall adhere to this latter usage because it is more convenient for our purposes.

⁵ The validity of this representation follows from the theorems dealing with change of variables in multiple integrals. See, for example, R. C. Buck, Advanced Calculus, New York: McGraw-Hill, 1956, p. 242.
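Lemma 3 is easy to check numerically in one dimension. In the sketch below (ours, not the book's), x ~ N(0, 1) and y = h(x) = exp(x); then x = g(y) = log y, the Jacobian is J = dx/dy = 1/y, and (1.3.8) gives Φ(y) = f(log y)/y, the lognormal density, which a histogram of simulated draws should reproduce:

```python
# Check of (1.3.8): the density of y = exp(x), x ~ N(0,1), should equal
# f[g(y)] |J| = f(log y) / y, i.e. the lognormal density.
import numpy as np

rng = np.random.default_rng(4)
y = np.exp(rng.standard_normal(500_000))       # draws of y = h(x)

edges = np.linspace(0.05, 5.0, 100)
counts, _ = np.histogram(y, bins=edges)
hist = counts / (y.size * np.diff(edges))      # empirical density of y

mid = 0.5 * (edges[:-1] + edges[1:])
f_at_g = np.exp(-0.5 * np.log(mid) ** 2) / np.sqrt(2.0 * np.pi)  # f[g(y)]
phi = f_at_g / mid                             # times |J| = 1/y
print(np.max(np.abs(hist - phi)))              # small (≈ 0.01 at this sample size)
```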


1.4 THE MULTIVARIATE NORMAL DISTRIBUTION

It is assumed that the reader is familiar with the univariate normal distribution. This being the case, perhaps the simplest way of introducing the multivariate normal distribution is as follows.

Let x_i : i = 1, 2, ..., m be random variables identically and independently distributed as N(0, 1); that is, they are each normal with mean zero and unit variance. Since they are independent, the density of the vector x = (x_1, x_2, ..., x_m)' is given by

    f(x) = (2π)^{−m/2} exp(−½x'x)    (1.4.1)

Consider now the transformation

    y = Ax + b    (1.4.2)

where A is an m × m nonsingular matrix of constants and b is m × 1 and nonrandom. The Jacobian of the transformation is simply

    J(x → y) = |A|^{−1}    (1.4.3)

It follows, therefore, by Lemma 3 that the joint density of the elements of y is given by

    φ(y) = f[A^{−1}(y − b)] |J|
         = (2π)^{−m/2} |A|^{−1} exp[−½(y − b)'A'^{−1}A^{−1}(y − b)]    (1.4.4)

For notational simplicity, we have assumed in (1.4.4) that |A| > 0. We know that

    E(x) = 0    Cov(x) = I    (1.4.5)

and thus, from Lemma 1, we conclude that

    E(y) = b    Cov(y) = AA'    (1.4.6)

To conform to standard usage, put as a matter of notation

    Σ = AA'    (1.4.7)
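In computational terms, the construction above is exactly how multivariate normal draws are usually generated: pick a factor A with AA' = Σ (a Cholesky factor is the common choice) and set y = Ax + b. The sketch below is ours; the target Σ and the vector b are arbitrary illustrative choices. It verifies (1.4.6) and (1.4.7) by simulation:

```python
# Construct y = Ax + b from i.i.d. N(0,1) components x, with A a Cholesky
# factor of a target Σ, and check E(y) = b and Cov(y) = AA' = Σ.
import numpy as np

rng = np.random.default_rng(5)
m, n = 3, 500_000

sigma = np.array([[2.0, 0.6, 0.3],
                  [0.6, 1.5, 0.4],
                  [0.3, 0.4, 1.0]])      # target covariance matrix (positive definite)
A = np.linalg.cholesky(sigma)            # Σ = AA'
b = np.array([1.0, 2.0, 3.0])

x = rng.standard_normal((n, m))          # i.i.d. N(0,1): density (1.4.1)
y = x @ A.T + b                          # the transformation (1.4.2)

print(np.allclose(y.mean(axis=0), b, atol=0.02))               # E(y) = b
print(np.allclose(np.cov(y, rowvar=False), sigma, atol=0.05))  # Cov(y) = Σ
```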