Greene-50240
gree50240˙FM
July 10, 2002
12:51
FIFTH EDITION
ECONOMETRIC ANALYSIS
William H. Greene
New York University
Upper Saddle River, New Jersey 07458
CIP data to come
Executive Editor: Rod Banister
Editor-in-Chief: P. J. Boardman
Managing Editor: Gladys Soto
Assistant Editor: Marie McHale
Editorial Assistant: Lisa Amato
Senior Media Project Manager: Victoria Anderson
Executive Marketing Manager: Kathleen McLellan
Marketing Assistant: Christopher Bath
Managing Editor (Production): Cynthia Regan
Production Editor: Michael Reynolds
Production Assistant: Dianne Falcone
Permissions Supervisor: Suzanne Grappi
Associate Director, Manufacturing: Vinnie Scelta
Cover Designer: Kiwi Design
Cover Photo: Anthony Bannister/Corbis
Composition: Interactive Composition Corporation
Printer/Binder: Courier/Westford
Cover Printer: Coral Graphics
Credits and acknowledgments borrowed from other sources and reproduced, with
permission, in this textbook appear on the appropriate page within the text (or on page XX).
Copyright © 2003, 2000, 1997, 1993 by Pearson Education, Inc., Upper Saddle River,
New Jersey, 07458. All rights reserved. Printed in the United States of America. This
publication is protected by Copyright and permission should be obtained from the
publisher prior to any prohibited reproduction, storage in a retrieval system, or
transmission in any form or by any means, electronic, mechanical, photocopying,
recording, or likewise. For information regarding permission(s), write to: Rights and
Permissions Department.
Pearson Education LTD.
Pearson Education Australia PTY, Limited
Pearson Education Singapore, Pte. Ltd
Pearson Education North Asia Ltd
Pearson Education, Canada, Ltd
Pearson Educación de Mexico, S.A. de C.V.
Pearson Education–Japan
Pearson Education Malaysia, Pte. Ltd
10 9 8 7 6 5 4 3 2 1
ISBN 0-13-066189-9
BRIEF CONTENTS
Chapter 1 Introduction
Chapter 2 The Classical Multiple Linear Regression Model
Chapter 3 Least Squares
Chapter 4 Finite-Sample Properties of the Least Squares Estimator
Chapter 5 Large-Sample Properties of the Least Squares and Instrumental Variables Estimators
Chapter 6 Inference and Prediction
Chapter 7 Functional Form and Structural Change
Chapter 8 Specification Analysis and Model Selection
Chapter 9 Nonlinear Regression Models
Chapter 10 Nonspherical Disturbances—The Generalized Regression Model
Chapter 11 Heteroscedasticity
Chapter 12 Serial Correlation
Chapter 13 Models for Panel Data
Chapter 14 Systems of Regression Equations
Chapter 15 Simultaneous-Equations Models
Chapter 16 Estimation Frameworks in Econometrics
Chapter 17 Maximum Likelihood Estimation
Chapter 18 The Generalized Method of Moments
Chapter 19 Models with Lagged Variables
Chapter 20 Time-Series Models
Chapter 21 Models for Discrete Choice
Chapter 22 Limited Dependent Variable and Duration Models
Appendix A Matrix Algebra
Appendix B Probability and Distribution Theory
Appendix C Estimation and Inference
Appendix D Large Sample Distribution Theory
Appendix E Computation and Optimization
Appendix F Data Sets Used in Applications
Appendix G Statistical Tables
References
Author Index
Subject Index
CONTENTS
CHAPTER 1 Introduction
1.1 Econometrics
1.2 Econometric Modeling
1.3 Data and Methodology
1.4 Plan of the Book
CHAPTER 2 The Classical Multiple Linear Regression Model
2.1 Introduction
2.2 The Linear Regression Model
2.3 Assumptions of the Classical Linear Regression Model
    2.3.1 Linearity of the Regression Model
    2.3.2 Full Rank
    2.3.3 Regression
    2.3.4 Spherical Disturbances
    2.3.5 Data Generating Process for the Regressors
    2.3.6 Normality
2.4 Summary and Conclusions
CHAPTER 3 Least Squares
3.1 Introduction
3.2 Least Squares Regression
    3.2.1 The Least Squares Coefficient Vector
    3.2.2 Application: An Investment Equation
    3.2.3 Algebraic Aspects of the Least Squares Solution
    3.2.4 Projection
3.3 Partitioned Regression and Partial Regression
3.4 Partial Regression and Partial Correlation Coefficients
3.5 Goodness of Fit and the Analysis of Variance
    3.5.1 The Adjusted R-Squared and a Measure of Fit
    3.5.2 R-Squared and the Constant Term in the Model
    3.5.3 Comparing Models
3.6 Summary and Conclusions
CHAPTER 4 Finite-Sample Properties of the Least Squares Estimator
4.1 Introduction
4.2 Motivating Least Squares
    4.2.1 The Population Orthogonality Conditions
    4.2.2 Minimum Mean Squared Error Predictor
    4.2.3 Minimum Variance Linear Unbiased Estimation
4.3 Unbiased Estimation
4.4 The Variance of the Least Squares Estimator and the Gauss Markov Theorem
4.5 The Implications of Stochastic Regressors
4.6 Estimating the Variance of the Least Squares Estimator
4.7 The Normality Assumption and Basic Statistical Inference
    4.7.1 Testing a Hypothesis About a Coefficient
    4.7.2 Confidence Intervals for Parameters
    4.7.3 Confidence Interval for a Linear Combination of Coefficients: The Oaxaca Decomposition
    4.7.4 Testing the Significance of the Regression
    4.7.5 Marginal Distributions of the Test Statistics
4.8 Finite-Sample Properties of Least Squares
4.9 Data Problems
    4.9.1 Multicollinearity
    4.9.2 Missing Observations
    4.9.3 Regression Diagnostics and Influential Data Points
4.10 Summary and Conclusions
CHAPTER 5 Large-Sample Properties of the Least Squares and Instrumental Variables Estimators
5.1 Introduction
5.2 Asymptotic Properties of the Least Squares Estimator
    5.2.1 Consistency of the Least Squares Estimator of β
    5.2.2 Asymptotic Normality of the Least Squares Estimator
    5.2.3 Consistency of s^2 and the Estimator of Asy. Var[b]
    5.2.4 Asymptotic Distribution of a Function of b: The Delta Method
    5.2.5 Asymptotic Efficiency
5.3 More General Cases
    5.3.1 Heterogeneity in the Distributions of x_i
    5.3.2 Dependent Observations
5.4 Instrumental Variable and Two Stage Least Squares Estimation
5.5 Hausman’s Specification Test and an Application to Instrumental Variable Estimation
5.6 Measurement Error
    5.6.1 Least Squares Attenuation
    5.6.2 Instrumental Variables Estimation
    5.6.3 Proxy Variables
    5.6.4 Application: Income and Education and a Study of Twins
5.7 Summary and Conclusions
CHAPTER 6 Inference and Prediction
6.1 Introduction
6.2 Restrictions and Nested Models
6.3 Two Approaches to Testing Hypotheses
    6.3.1 The F Statistic and the Least Squares Discrepancy
    6.3.2 The Restricted Least Squares Estimator
    6.3.3 The Loss of Fit from Restricted Least Squares
6.4 Nonnormal Disturbances and Large Sample Tests
6.5 Testing Nonlinear Restrictions
6.6 Prediction
6.7 Summary and Conclusions
CHAPTER 7 Functional Form and Structural Change
7.1 Introduction
7.2 Using Binary Variables
    7.2.1 Binary Variables in Regression
    7.2.2 Several Categories
    7.2.3 Several Groupings
    7.2.4 Threshold Effects and Categorical Variables
    7.2.5 Spline Regression
7.3 Nonlinearity in the Variables
    7.3.1 Functional Forms
    7.3.2 Identifying Nonlinearity
    7.3.3 Intrinsic Linearity and Identification
7.4 Modeling and Testing for a Structural Break
    7.4.1 Different Parameter Vectors
    7.4.2 Insufficient Observations
    7.4.3 Change in a Subset of Coefficients
    7.4.4 Tests of Structural Break with Unequal Variances
7.5 Tests of Model Stability
    7.5.1 Hansen’s Test
    7.5.2 Recursive Residuals and the CUSUMS Test
    7.5.3 Predictive Test
    7.5.4 Unknown Timing of the Structural Break
7.6 Summary and Conclusions
CHAPTER 8 Specification Analysis and Model Selection
8.1 Introduction
8.2 Specification Analysis and Model Building
    8.2.1 Bias Caused by Omission of Relevant Variables
    8.2.2 Pretest Estimation
    8.2.3 Inclusion of Irrelevant Variables
    8.2.4 Model Building—A General to Simple Strategy
8.3 Choosing Between Nonnested Models
    8.3.1 Testing Nonnested Hypotheses
    8.3.2 An Encompassing Model
    8.3.3 Comprehensive Approach—The J Test
    8.3.4 The Cox Test
8.4 Model Selection Criteria
8.5 Summary and Conclusions
CHAPTER 9 Nonlinear Regression Models
9.1 Introduction
9.2 Nonlinear Regression Models
    9.2.1 Assumptions of the Nonlinear Regression Model
    9.2.2 The Orthogonality Condition and the Sum of Squares
    9.2.3 The Linearized Regression
    9.2.4 Large Sample Properties of the Nonlinear Least Squares Estimator
    9.2.5 Computing the Nonlinear Least Squares Estimator
9.3 Applications
    9.3.1 A Nonlinear Consumption Function
    9.3.2 The Box–Cox Transformation
9.4 Hypothesis Testing and Parametric Restrictions
    9.4.1 Significance Tests for Restrictions: F and Wald Statistics
    9.4.2 Tests Based on the LM Statistic
    9.4.3 A Specification Test for Nonlinear Regressions: The P_E Test
9.5 Alternative Estimators for Nonlinear Regression Models
    9.5.1 Nonlinear Instrumental Variables Estimation
    9.5.2 Two-Step Nonlinear Least Squares Estimation
    9.5.3 Two-Step Estimation of a Credit Scoring Model
9.6 Summary and Conclusions
CHAPTER 10 Nonspherical Disturbances—The Generalized Regression Model
10.1 Introduction
10.2 Least Squares and Instrumental Variables Estimation
    10.2.1 Finite-Sample Properties of Ordinary Least Squares
    10.2.2 Asymptotic Properties of Least Squares
    10.2.3 Asymptotic Properties of Nonlinear Least Squares
    10.2.4 Asymptotic Properties of the Instrumental Variables Estimator
10.3 Robust Estimation of Asymptotic Covariance Matrices
10.4 Generalized Method of Moments Estimation
10.5 Efficient Estimation by Generalized Least Squares
    10.5.1 Generalized Least Squares (GLS)
    10.5.2 Feasible Generalized Least Squares
10.6 Maximum Likelihood Estimation
10.7 Summary and Conclusions
CHAPTER 11 Heteroscedasticity
11.1 Introduction
11.2 Ordinary Least Squares Estimation
    11.2.1 Inefficiency of Least Squares
    11.2.2 The Estimated Covariance Matrix of b
    11.2.3 Estimating the Appropriate Covariance Matrix for Ordinary Least Squares
11.3 GMM Estimation of the Heteroscedastic Regression Model
11.4 Testing for Heteroscedasticity
    11.4.1 White’s General Test
    11.4.2 The Goldfeld–Quandt Test
    11.4.3 The Breusch–Pagan/Godfrey LM Test
11.5 Weighted Least Squares When Ω Is Known
11.6 Estimation When Ω Contains Unknown Parameters
    11.6.1 Two-Step Estimation
    11.6.2 Maximum Likelihood Estimation
    11.6.3 Model Based Tests for Heteroscedasticity
11.7 Applications
    11.7.1 Multiplicative Heteroscedasticity
    11.7.2 Groupwise Heteroscedasticity
11.8 Autoregressive Conditional Heteroscedasticity
    11.8.1 The ARCH(1) Model
    11.8.2 ARCH(q), ARCH-in-Mean and Generalized ARCH Models
    11.8.3 Maximum Likelihood Estimation of the GARCH Model
    11.8.4 Testing for GARCH Effects
    11.8.5 Pseudo-Maximum Likelihood Estimation
11.9 Summary and Conclusions
CHAPTER 12 Serial Correlation
12.1 Introduction
12.2 The Analysis of Time-Series Data
12.3 Disturbance Processes
    12.3.1 Characteristics of Disturbance Processes
    12.3.2 AR(1) Disturbances
12.4 Some Asymptotic Results for Analyzing Time Series Data
    12.4.1 Convergence of Moments—The Ergodic Theorem
    12.4.2 Convergence to Normality—A Central Limit Theorem
12.5 Least Squares Estimation
    12.5.1 Asymptotic Properties of Least Squares
    12.5.2 Estimating the Variance of the Least Squares Estimator
12.6 GMM Estimation
12.7 Testing for Autocorrelation
    12.7.1 Lagrange Multiplier Test
    12.7.2 Box and Pierce’s Test and Ljung’s Refinement
    12.7.3 The Durbin–Watson Test
    12.7.4 Testing in the Presence of a Lagged Dependent Variable
    12.7.5 Summary of Testing Procedures
12.8 Efficient Estimation When Ω Is Known
12.9 Estimation When Ω Is Unknown
    12.9.1 AR(1) Disturbances
    12.9.2 AR(2) Disturbances
    12.9.3 Application: Estimation of a Model with Autocorrelation
    12.9.4 Estimation with a Lagged Dependent Variable
12.10 Common Factors
12.11 Forecasting in the Presence of Autocorrelation
12.12 Summary and Conclusions
CHAPTER 13 Models for Panel Data
13.1 Introduction
13.2 Panel Data Models
13.3 Fixed Effects
    13.3.1 Testing the Significance of the Group Effects
    13.3.2 The Within- and Between-Groups Estimators
    13.3.3 Fixed Time and Group Effects
    13.3.4 Unbalanced Panels and Fixed Effects
13.4 Random Effects
    13.4.1 Generalized Least Squares
    13.4.2 Feasible Generalized Least Squares When Σ Is Unknown
    13.4.3 Testing for Random Effects
    13.4.4 Hausman’s Specification Test for the Random Effects Model
13.5 Instrumental Variables Estimation of the Random Effects Model
13.6 GMM Estimation of Dynamic Panel Data Models
13.7 Nonspherical Disturbances and Robust Covariance Estimation
    13.7.1 Robust Estimation of the Fixed Effects Model
    13.7.2 Heteroscedasticity in the Random Effects Model
    13.7.3 Autocorrelation in Panel Data Models
13.8 Random Coefficients Models
13.9 Covariance Structures for Pooled Time-Series Cross-Sectional Data
    13.9.1 Generalized Least Squares Estimation
    13.9.2 Feasible GLS Estimation
    13.9.3 Heteroscedasticity and the Classical Model
    13.9.4 Specification Tests
    13.9.5 Autocorrelation
    13.9.6 Maximum Likelihood Estimation
    13.9.7 Application to Grunfeld’s Investment Data
    13.9.8 Summary
13.10 Summary and Conclusions
CHAPTER 14 Systems of Regression Equations
14.1 Introduction
14.2 The Seemingly Unrelated Regressions Model
    14.2.1 Generalized Least Squares
    14.2.2 Seemingly Unrelated Regressions with Identical Regressors
    14.2.3 Feasible Generalized Least Squares
    14.2.4 Maximum Likelihood Estimation
    14.2.5 An Application from Financial Econometrics: The Capital Asset Pricing Model
    14.2.6 Maximum Likelihood Estimation of the Seemingly Unrelated Regressions Model with a Block of Zeros in the Coefficient Matrix
    14.2.7 Autocorrelation and Heteroscedasticity
14.3 Systems of Demand Equations: Singular Systems
    14.3.1 Cobb–Douglas Cost Function
    14.3.2 Flexible Functional Forms: The Translog Cost Function
14.4 Nonlinear Systems and GMM Estimation
    14.4.1 GLS Estimation
    14.4.2 Maximum Likelihood Estimation
    14.4.3 GMM Estimation
14.5 Summary and Conclusions
CHAPTER 15 Simultaneous-Equations Models
15.1 Introduction
15.2 Fundamental Issues in Simultaneous-Equations Models
    15.2.1 Illustrative Systems of Equations
    15.2.2 Endogeneity and Causality
    15.2.3 A General Notation for Linear Simultaneous Equations Models
15.3 The Problem of Identification
    15.3.1 The Rank and Order Conditions for Identification
    15.3.2 Identification Through Other Nonsample Information
    15.3.3 Identification Through Covariance Restrictions—The Fully Recursive Model
15.4 Methods of Estimation
15.5 Single Equation: Limited Information Estimation Methods
    15.5.1 Ordinary Least Squares
    15.5.2 Estimation by Instrumental Variables
    15.5.3 Two-Stage Least Squares
    15.5.4 GMM Estimation
    15.5.5 Limited Information Maximum Likelihood and the k Class of Estimators
    15.5.6 Two-Stage Least Squares in Models That Are Nonlinear in Variables
15.6 System Methods of Estimation
    15.6.1 Three-Stage Least Squares
    15.6.2 Full-Information Maximum Likelihood
    15.6.3 GMM Estimation
    15.6.4 Recursive Systems and Exactly Identified Equations
15.7 Comparison of Methods—Klein’s Model I
15.8 Specification Tests
15.9 Properties of Dynamic Models
    15.9.1 Dynamic Models and Their Multipliers
    15.9.2 Stability
    15.9.3 Adjustment to Equilibrium
15.10 Summary and Conclusions
CHAPTER 16 Estimation Frameworks in Econometrics
16.1 Introduction
16.2 Parametric Estimation and Inference
    16.2.1 Classical Likelihood Based Estimation
    16.2.2 Bayesian Estimation
        16.2.2.a Bayesian Analysis of the Classical Regression Model
        16.2.2.b Point Estimation
        16.2.2.c Interval Estimation
        16.2.2.d Estimation with an Informative Prior Density
        16.2.2.e Hypothesis Testing
    16.2.3 Using Bayes Theorem in a Classical Estimation Problem: The Latent Class Model
    16.2.4 Hierarchical Bayes Estimation of a Random Parameters Model by Markov Chain Monte Carlo Simulation
16.3 Semiparametric Estimation
    16.3.1 GMM Estimation in Econometrics
    16.3.2 Least Absolute Deviations Estimation
    16.3.3 Partially Linear Regression
    16.3.4 Kernel Density Methods
16.4 Nonparametric Estimation
    16.4.1 Kernel Density Estimation
    16.4.2 Nonparametric Regression
16.5 Properties of Estimators
    16.5.1 Statistical Properties of Estimators
    16.5.2 Extremum Estimators
    16.5.3 Assumptions for Asymptotic Properties of Extremum Estimators
    16.5.4 Asymptotic Properties of Estimators
    16.5.5 Testing Hypotheses
16.6 Summary and Conclusions
CHAPTER 17 Maximum Likelihood Estimation
17.1 Introduction
17.2 The Likelihood Function and Identification of the Parameters
17.3 Efficient Estimation: The Principle of Maximum Likelihood
17.4 Properties of Maximum Likelihood Estimators
    17.4.1 Regularity Conditions
    17.4.2 Properties of Regular Densities
    17.4.3 The Likelihood Equation
    17.4.4 The Information Matrix Equality
    17.4.5 Asymptotic Properties of the Maximum Likelihood Estimator
        17.4.5.a Consistency
        17.4.5.b Asymptotic Normality
        17.4.5.c Asymptotic Efficiency
        17.4.5.d Invariance
        17.4.5.e Conclusion
    17.4.6 Estimating the Asymptotic Variance of the Maximum Likelihood Estimator
    17.4.7 Conditional Likelihoods and Econometric Models
17.5 Three Asymptotically Equivalent Test Procedures
    17.5.1 The Likelihood Ratio Test
    17.5.2 The Wald Test
    17.5.3 The Lagrange Multiplier Test
    17.5.4 An Application of the Likelihood Based Test Procedures
17.6 Applications of Maximum Likelihood Estimation
    17.6.1 The Normal Linear Regression Model
    17.6.2 Maximum Likelihood Estimation of Nonlinear Regression Models
    17.6.3 Nonnormal Disturbances—The Stochastic Frontier Model
    17.6.4 Conditional Moment Tests of Specification
17.7 Two-Step Maximum Likelihood Estimation
17.8 Maximum Simulated Likelihood Estimation
17.9 Pseudo-Maximum Likelihood Estimation and Robust Asymptotic Covariance Matrices
17.10 Summary and Conclusions
CHAPTER 18 The Generalized Method of Moments
18.1 Introduction
18.2 Consistent Estimation: The Method of Moments
    18.2.1 Random Sampling and Estimating the Parameters of Distributions
    18.2.2 Asymptotic Properties of the Method of Moments Estimator
    18.2.3 Summary—The Method of Moments
18.3 The Generalized Method of Moments (GMM) Estimator
    18.3.1 Estimation Based on Orthogonality Conditions
    18.3.2 Generalizing the Method of Moments
    18.3.3 Properties of the GMM Estimator
    18.3.4 GMM Estimation of Some Specific Econometric Models
18.4 Testing Hypotheses in the GMM Framework
    18.4.1 Testing the Validity of the Moment Restrictions
    18.4.2 GMM Counterparts to the Wald, LM, and LR Tests
18.5 Application: GMM Estimation of a Dynamic Panel Data Model of Local Government Expenditures
18.6 Summary and Conclusions
CHAPTER 19 Models with Lagged Variables
19.1 Introduction
19.2 Dynamic Regression Models
    19.2.1 Lagged Effects in a Dynamic Model
    19.2.2 The Lag and Difference Operators
    19.2.3 Specification Search for the Lag Length
19.3 Simple Distributed Lag Models
    19.3.1 Finite Distributed Lag Models
    19.3.2 An Infinite Lag Model: The Geometric Lag Model
19.4 Autoregressive Distributed Lag Models
    19.4.1 Estimation of the ARDL Model
    19.4.2 Computation of the Lag Weights in the ARDL Model
    19.4.3 Stability of a Dynamic Equation
    19.4.4 Forecasting
19.5 Methodological Issues in the Analysis of Dynamic Models
    19.5.1 An Error Correction Model
    19.5.2 Autocorrelation
    19.5.3 Specification Analysis
    19.5.4 Common Factor Restrictions
19.6 Vector Autoregressions
    19.6.1 Model Forms
    19.6.2 Estimation
    19.6.3 Testing Procedures
    19.6.4 Exogeneity
    19.6.5 Testing for Granger Causality
    19.6.6 Impulse Response Functions
    19.6.7 Structural VARs
    19.6.8 Application: Policy Analysis with a VAR
    19.6.9 VARs in Microeconomics
19.7 Summary and Conclusions
CHAPTER 20 Time-Series Models
20.1 Introduction
20.2 Stationary Stochastic Processes
    20.2.1 Autoregressive Moving-Average Processes
    20.2.2 Stationarity and Invertibility
    20.2.3 Autocorrelations of a Stationary Stochastic Process
    20.2.4 Partial Autocorrelations of a Stationary Stochastic Process
    20.2.5 Modeling Univariate Time Series
    20.2.6 Estimation of the Parameters of a Univariate Time Series
    20.2.7 The Frequency Domain
        20.2.7.a Theoretical Results
        20.2.7.b Empirical Counterparts
20.3 Nonstationary Processes and Unit Roots
    20.3.1 Integrated Processes and Differencing
    20.3.2 Random Walks, Trends, and Spurious Regressions
    20.3.3 Tests for Unit Roots in Economic Data
    20.3.4 The Dickey–Fuller Tests
    20.3.5 Long Memory Models
20.4 Cointegration
    20.4.1 Common Trends
    20.4.2 Error Correction and VAR Representations
    20.4.3 Testing for Cointegration
    20.4.4 Estimating Cointegration Relationships
    20.4.5 Application: German Money Demand
        20.4.5.a Cointegration Analysis and a Long Run Theoretical Model
        20.4.5.b Testing for Model Instability
20.5 Summary and Conclusions
CHAPTER 21 Models for Discrete Choice
21.1 Introduction
21.2 Discrete Choice Models
21.3 Models for Binary Choice
    21.3.1 The Regression Approach
    21.3.2 Latent Regression—Index Function Models
    21.3.3 Random Utility Models
21.4 Estimation and Inference in Binary Choice Models
    21.4.1 Robust Covariance Matrix Estimation
    21.4.2 Marginal Effects
    21.4.3 Hypothesis Tests
    21.4.4 Specification Tests for Binary Choice Models
        21.4.4.a Omitted Variables
        21.4.4.b Heteroscedasticity
        21.4.4.c A Specification Test for Nonnested Models—Testing for the Distribution
    21.4.5 Measuring Goodness of Fit
    21.4.6 Analysis of Proportions Data
21.5 Extensions of the Binary Choice Model
    21.5.1 Random and Fixed Effects Models for Panel Data
        21.5.1.a Random Effects Models
        21.5.1.b Fixed Effects Models
    21.5.2 Semiparametric Analysis
    21.5.3 The Maximum Score Estimator (MSCORE)
    21.5.4 Semiparametric Estimation
    21.5.5 A Kernel Estimator for a Nonparametric Regression Function
    21.5.6 Dynamic Binary Choice Models
21.6 Bivariate and Multivariate Probit Models
    21.6.1 Maximum Likelihood Estimation
    21.6.2 Testing for Zero Correlation
    21.6.3 Marginal Effects
    21.6.4 Sample Selection
    21.6.5 A Multivariate Probit Model
    21.6.6 Application: Gender Economics Courses in Liberal Arts Colleges
21.7 Logit Models for Multiple Choices
    21.7.1 The Multinomial Logit Model
    21.7.2 The Conditional Logit Model
    21.7.3 The Independence from Irrelevant Alternatives
    21.7.4 Nested Logit Models
    21.7.5 A Heteroscedastic Logit Model
    21.7.6 Multinomial Models Based on the Normal Distribution
    21.7.7 A Random Parameters Model
    21.7.8 Application: Conditional Logit Model for Travel Mode Choice
21.8 Ordered Data
21.9 Models for Count Data
    21.9.1 Measuring Goodness of Fit
    21.9.2 Testing for Overdispersion
    21.9.3 Heterogeneity and the Negative Binomial Regression Model
    21.9.4 Application: The Poisson Regression Model
    21.9.5 Poisson Models for Panel Data
    21.9.6 Hurdle and Zero-Altered Poisson Models
21.10 Summary and Conclusions
CHAPTER 22 Limited Dependent Variable and Duration Models
22.1 Introduction
22.2 Truncation
    22.2.1 Truncated Distributions
    22.2.2 Moments of Truncated Distributions
    22.2.3 The Truncated Regression Model
22.3 Censored Data
    22.3.1 The Censored Normal Distribution
    22.3.2 The Censored Regression (Tobit) Model
    22.3.3 Estimation
    22.3.4 Some Issues in Specification
        22.3.4.a Heteroscedasticity
        22.3.4.b Misspecification of Prob[y* < 0]
        22.3.4.c Nonnormality
        22.3.4.d Conditional Moment Tests
    22.3.5 Censoring and Truncation in Models for Counts
    22.3.6 Application: Censoring in the Tobit and Poisson Regression Models
22.4 The Sample Selection Model
    22.4.1 Incidental Truncation in a Bivariate Distribution
    22.4.2 Regression in a Model of Selection
    22.4.3 Estimation
    22.4.4 Treatment Effects
    22.4.5 The Normality Assumption
    22.4.6 Selection in Qualitative Response Models
22.5 Models for Duration Data
    22.5.1 Duration Data
    22.5.2 A Regression-Like Approach: Parametric Models of Duration
        22.5.2.a Theoretical Background
        22.5.2.b Models of the Hazard Function
        22.5.2.c Maximum Likelihood Estimation
        22.5.2.d Exogenous Variables
        22.5.2.e Heterogeneity
    22.5.3 Other Approaches
22.6 Summary and Conclusions
APPENDIX A Matrix Algebra
A.1 Terminology
A.2 Algebraic Manipulation of Matrices
    A.2.1 Equality of Matrices
    A.2.2 Transposition
    A.2.3 Matrix Addition
    A.2.4 Vector Multiplication
    A.2.5 A Notation for Rows and Columns of a Matrix
    A.2.6 Matrix Multiplication and Scalar Multiplication
    A.2.7 Sums of Values
    A.2.8 A Useful Idempotent Matrix
A.3 Geometry of Matrices
    A.3.1 Vector Spaces
    A.3.2 Linear Combinations of Vectors and Basis Vectors
    A.3.3 Linear Dependence
    A.3.4 Subspaces
    A.3.5 Rank of a Matrix
    A.3.6 Determinant of a Matrix
    A.3.7 A Least Squares Problem
A.4 Solution of a System of Linear Equations
    A.4.1 Systems of Linear Equations
    A.4.2 Inverse Matrices
    A.4.3 Nonhomogeneous Systems of Equations
    A.4.4 Solving the Least Squares Problem
A.5 Partitioned Matrices
    A.5.1 Addition and Multiplication of Partitioned Matrices
    A.5.2 Determinants of Partitioned Matrices
    A.5.3 Inverses of Partitioned Matrices
    A.5.4 Deviations from Means
    A.5.5 Kronecker Products
A.6 Characteristic Roots and Vectors
    A.6.1 The Characteristic Equation
    A.6.2 Characteristic Vectors
    A.6.3 General Results for Characteristic Roots and Vectors
    A.6.4 Diagonalization and Spectral Decomposition of a Matrix
    A.6.5 Rank of a Matrix
    A.6.6 Condition Number of a Matrix
    A.6.7 Trace of a Matrix
    A.6.8 Determinant of a Matrix
    A.6.9 Powers of a Matrix
    A.6.10 Idempotent Matrices
    A.6.11 Factoring a Matrix
    A.6.12 The Generalized Inverse of a Matrix
A.7 Quadratic Forms and Definite Matrices
    A.7.1 Nonnegative Definite Matrices
    A.7.2 Idempotent Quadratic Forms
    A.7.3 Comparing Matrices
A.8 Calculus and Matrix Algebra
    A.8.1 Differentiation and the Taylor Series
    A.8.2 Optimization
    A.8.3 Constrained Optimization
    A.8.4 Transformations
APPENDIX B Probability and Distribution Theory
B.1 Introduction
B.2 Random Variables
    B.2.1 Probability Distributions
    B.2.2 Cumulative Distribution Function
B.3 Expectations of a Random Variable
B.4 Some Specific Probability Distributions
    B.4.1 The Normal Distribution
    B.4.2 The Chi-Squared, t, and F Distributions
    B.4.3 Distributions With Large Degrees of Freedom
    B.4.4 Size Distributions: The Lognormal Distribution
    B.4.5 The Gamma and Exponential Distributions
    B.4.6 The Beta Distribution
    B.4.7 The Logistic Distribution
    B.4.8 Discrete Random Variables
B.5 The Distribution of a Function of a Random Variable
B.6 Representations of a Probability Distribution
B.7 Joint Distributions
    B.7.1 Marginal Distributions
    B.7.2 Expectations in a Joint Distribution
    B.7.3 Covariance and Correlation
    B.7.4 Distribution of a Function of Bivariate Random Variables
B.8 Conditioning in a Bivariate Distribution
    B.8.1 Regression: The Conditional Mean
    B.8.2 Conditional Variance
    B.8.3 Relationships Among Marginal and Conditional Moments
    B.8.4 The Analysis of Variance
B.9 The Bivariate Normal Distribution
B.10 Multivariate Distributions
    B.10.1 Moments
    B.10.2 Sets of Linear Functions
    B.10.3 Nonlinear Functions
B.11 The Multivariate Normal Distribution
    B.11.1 Marginal and Conditional Normal Distributions
    B.11.2 The Classical Normal Linear Regression Model
    B.11.3 Linear Functions of a Normal Vector
    B.11.4 Quadratic Forms in a Standard Normal Vector
    B.11.5 The F Distribution
    B.11.6 A Full Rank Quadratic Form
    B.11.7 Independence of a Linear and a Quadratic Form
APPENDIX C Estimation and Inference
C.1 Introduction
C.2 Samples and Random Sampling
C.3 Descriptive Statistics
C.4 Statistics as Estimators—Sampling Distributions
C.5 Point Estimation of Parameters
    C.5.1 Estimation in a Finite Sample
    C.5.2 Efficient Unbiased Estimation
C.6 Interval Estimation
C.7 Hypothesis Testing
    C.7.1 Classical Testing Procedures
    C.7.2 Tests Based on Confidence Intervals
    C.7.3 Specification Tests
APPENDIX D Large Sample Distribution Theory
D.1 Introduction
D.2 Large-Sample Distribution Theory
    D.2.1 Convergence in Probability
    D.2.2 Other Forms of Convergence and Laws of Large Numbers
    D.2.3 Convergence of Functions
    D.2.4 Convergence to a Random Variable
    D.2.5 Convergence in Distribution: Limiting Distributions
    D.2.6 Central Limit Theorems
    D.2.7 The Delta Method
D.3 Asymptotic Distributions
    D.3.1 Asymptotic Distribution of a Nonlinear Function
    D.3.2 Asymptotic Expectations
D.4 Sequences and the Order of a Sequence
APPENDIX E Computation and Optimization
E.1 Introduction
E.2 Data Input and Generation
    E.2.1 Generating Pseudo-Random Numbers
    E.2.2 Sampling from a Standard Uniform Population
    E.2.3 Sampling from Continuous Distributions
    E.2.4 Sampling from a Multivariate Normal Population
    E.2.5 Sampling from a Discrete Population
    E.2.6 The Gibbs Sampler
E.3 Monte Carlo Studies
E.4 Bootstrapping and the Jackknife
E.5 Computation in Econometrics
    E.5.1 Computing Integrals
    E.5.2 The Standard Normal Cumulative Distribution Function
    E.5.3 The Gamma and Related Functions
    E.5.4 Approximating Integrals by Quadrature
    E.5.5 Monte Carlo Integration
    E.5.6 Multivariate Normal Probabilities and Simulated Moments
    E.5.7 Computing Derivatives
E.6 Optimization
    E.6.1 Algorithms
    E.6.2 Gradient Methods
    E.6.3 Aspects of Maximum Likelihood Estimation
    E.6.4 Optimization with Constraints
    E.6.5 Some Practical Considerations
    E.6.6 Examples
APPENDIX F
Data Sets Used in Applications
APPENDIX G
Statistical Tables
References
959
Author Index
000
Subject Index
000
953
946
xxv
926
PREFACE
1. THE FIFTH EDITION OF ECONOMETRIC ANALYSIS
Econometric Analysis is intended for a one-year graduate course in econometrics for
social scientists. The prerequisites for this course should include calculus, mathematical
statistics, and an introduction to econometrics at the level of, say, Gujarati’s Basic Econometrics (McGraw-Hill, 1995) or Wooldridge’s Introductory Econometrics: A Modern
Approach [South-Western (2000)]. Self-contained (for our purposes) summaries of the
matrix algebra, mathematical statistics, and statistical theory used later in the book are
given in Appendices A through D. Appendix E contains a description of numerical
methods that will be useful to practicing econometricians. The formal presentation of
econometrics begins with discussion of a fundamental pillar, the linear multiple regression model, in Chapters 2 through 8. Chapters 9 through 15 present familiar extensions
of the single linear equation model, including nonlinear regression, panel data models,
the generalized regression model, and systems of equations. The linear model is rarely
the sole technique used in the contemporary literature. In view of this, the
(expanding) second half of this book is devoted to topics that will extend the linear
regression model in many directions. Chapters 16 through 18 present the techniques
and underlying theory of estimation in econometrics, including GMM and maximum
likelihood estimation methods and simulation-based techniques. We end in the last four
chapters, 19 through 22, with discussions of current topics in applied econometrics, including time-series analysis and the analysis of discrete choice and limited dependent
variable models.
This book has two objectives. The first is to introduce students to applied econometrics, including basic techniques in regression analysis and some of the rich variety
of models that are used when the linear model proves inadequate or inappropriate.
The second is to present students with sufficient theoretical background that they will
recognize new variants of the models learned about here as merely natural extensions
that fit within a common body of principles. Thus, I have spent what might seem to be
a large amount of effort explaining the mechanics of GMM estimation, nonlinear least
squares, maximum likelihood estimation, and GARCH models. To meet the second
objective, this book also contains a fair amount of theoretical material, such as that on
maximum likelihood estimation and on asymptotic results for regression models. Modern software has made complicated modeling very easy to do, which makes an understanding of
the underlying theory all the more important.
I had several purposes in undertaking this revision. As in the past, readers continue
to send me interesting ideas for my “next edition.” It is impossible to use them all, of
course. Because the five volumes of the Handbook of Econometrics and two of the
Handbook of Applied Econometrics already run to over 4,000 pages, it is also unnecessary. Nonetheless, this revision is appropriate for several reasons. First, there are new
and interesting developments in the field, particularly in the areas of microeconometrics
(panel data, models for discrete choice) and, of course, in time series, which continues
its rapid development. Second, I have taken the opportunity to continue fine-tuning the
text as the experience and shared wisdom of my readers accumulate in my files. For this
revision, that adjustment has entailed a substantial rearrangement of the material,
mainly so that I could add the new material in a more compact and orderly way than
the table of contents of the fourth edition would have allowed. The literature in econometrics has continued to evolve, and my third objective is to grow with it.
This purpose is inherently difficult to accomplish in a textbook. Most of the literature is
written by professionals for other professionals, and this textbook is written for students
who are in the early stages of their training. But I do hope to provide a bridge to that
literature, both theoretical and applied.
This book is a broad survey of the field of econometrics. This field grows continually, and such an effort becomes increasingly difficult. (A partial list of journals
devoted at least in part, if not completely, to econometrics now includes the Journal
of Applied Econometrics, Journal of Econometrics, Econometric Theory, Econometric
Reviews, Journal of Business and Economic Statistics, Empirical Economics, and Econometrica.) Still, my view has always been that the serious student of the field must start
somewhere, and one can successfully seek that objective in a single textbook. This text
attempts to survey, at an entry level, enough of the fields in econometrics that a student
can comfortably move from here to practice or more advanced study in one or more
specialized areas. At the same time, I have tried to present the material in sufficient
generality that the reader is also able to appreciate the important common foundation
of all these fields and to use the tools that they all employ.
There are now quite a few recently published texts in econometrics. Several have
gathered, in compact, elegant treatises, the increasingly advanced and advancing theoretical background of econometrics. Others, such as this book, focus more attention on
applications of econometrics. One feature that distinguishes this work from its predecessors is its greater emphasis on nonlinear models. [Davidson and MacKinnon (1993)
is a noteworthy, but more advanced, exception.] Computer software now in wide use
has made estimation of nonlinear models as routine as estimation of linear ones, and the
recent literature reflects that progression. My purpose is to provide a textbook treatment that is in line with current practice. The book concludes with four lengthy chapters
on time-series analysis, discrete choice models, and limited dependent variable models.
These nonlinear models are now the staples of the applied econometrics literature. This
book also contains a fair amount of material that will extend beyond many first courses
in econometrics, including, perhaps, the aforementioned chapters on limited dependent
variables, the section in Chapter 22 on duration models, and some of the discussions
of time series and panel data models. Once again, I have included these in the hope of
providing a bridge to the professional literature in these areas.
I have had one overriding purpose that has motivated all five editions of this work.
For the vast majority of readers of books such as this, whose ambition is to use, not
develop, econometrics, I believe that it is simply not sufficient to recite the theory of
estimation, hypothesis testing, and econometric analysis. Understanding the often subtle
background theory is extremely important. But, at the end of the day, my purpose in
writing this work, and in my continuing efforts to update it through what is now its fifth edition,
is to show readers how to do econometric analysis. I unabashedly accept the unflattering assessment of a correspondent who once likened this book to a “user’s guide to
econometrics.”
2. SOFTWARE AND DATA
There are many computer programs that are widely used for the computations described
in this book. All were written by econometricians or statisticians, and in general, all
are regularly updated to incorporate new developments in applied econometrics. A
sampling of the most widely used packages, with the Internet home pages where you can
find information about them, is:
E-Views   www.eviews.com       (QMS, Irvine, Calif.)
Gauss     www.aptech.com       (Aptech Systems, Kent, Wash.)
LIMDEP    www.limdep.com       (Econometric Software, Plainview, N.Y.)
RATS      www.estima.com       (Estima, Evanston, Ill.)
SAS       www.sas.com          (SAS, Cary, N.C.)
Shazam    shazam.econ.ubc.ca   (Ken White, UBC, Vancouver, B.C.)
Stata     www.stata.com        (Stata, College Station, Tex.)
TSP       www.tspintl.com      (TSP International, Stanford, Calif.)
Programs vary in size, complexity, cost, the amount of programming required of the user,
and so on. Journals such as The American Statistician, The Journal of Applied Econometrics, and The Journal of Economic Surveys regularly publish reviews of individual
packages and comparative surveys of packages, usually with reference to particular
functionality such as panel data analysis or forecasting.
With only a few exceptions, the computations described in this book can be carried
out with any of these packages. We hesitate to link this text to any of them in particular. We have placed for general access a customized version of LIMDEP, which was
also written by the author, on the website for this text, />∼wgreene/Text/econometricanalysis.htm. LIMDEP programs used for many of
the computations are posted on the sites as well.
The data sets used in the examples are also on the website. Throughout the text,
these data sets are referred to as “Table Fn.m,” for example, Table F4.1. The F refers to
Appendix F at the back of the text, which contains descriptions of the data sets. The
actual data are posted on the website with the other supplementary materials for the
text. (The data sets are also replicated in the system format of most of the commonly
used econometrics computer programs, including, in addition to LIMDEP, SAS, TSP,
SPSS, E-Views, and Stata, so that you can easily import them into whatever program
you might be using.)
I should also note that there are now thousands of interesting websites containing software, data sets, papers, and commentary on econometrics. It would be hopeless to
attempt any kind of a survey here. But I do note one that is particularly well structured and well targeted for readers of this book, the data archive for the
Journal of Applied Econometrics. This journal publishes many papers that are precisely
at the right level for readers of this text. They have archived all the nonconfidential
data sets used in their publications since 1994. This useful archive can be found at
/>
3. ACKNOWLEDGEMENTS
It is a pleasure to express my appreciation to those who have influenced this work. I am
grateful to Arthur Goldberger and Arnold Zellner for their encouragement, guidance,
and always interesting correspondence. Dennis Aigner and Laurits Christensen were
also influential in shaping my views on econometrics. Some collaborators on the earlier
editions whose contributions remain in this one include Aline Quester, David Hensher,
and Donald Waldman. The number of students and colleagues whose suggestions have
helped to produce what you find here is far too large to allow me to thank them all
individually. I would like to acknowledge the many reviewers of my work whose careful reading has vastly improved the book: Badi Baltagi, University of Houston; Neal
Beck, University of California at San Diego; Diane Belleville, Columbia University;
Anil Bera, University of Illinois; John Burkett, University of Rhode Island; Leonard
Carlson, Emory University; Frank Chaloupka, City University of New York; Chris
Cornwell, University of Georgia; Mitali Das, Columbia University; Craig Depken II,
University of Texas at Arlington; Edward Dwyer, Clemson University; Michael Ellis,
Wesleyan University; Martin Evans, New York University; Ed Greenberg, Washington
University at St. Louis; Miguel Herce, University of North Carolina; K. Rao Kadiyala,
Purdue University; Tong Li, Indiana University; Lubomir Litov, New York University;
William Lott, University of Connecticut; Edward Mathis, Villanova University; Mary
McGarvey, University of Nebraska-Lincoln; Ed Melnick, New York University; Thad
Mirer, State University of New York at Albany; Paul Ruud, University of California at
Berkeley; Sherrie Rhine, Chicago Federal Reserve Board; Terry G. Seaks, University
of North Carolina at Greensboro; Donald Snyder, California State University at Los
Angeles; Steven Stern, University of Virginia; Houston Stokes, University of Illinois
at Chicago; Dimitrios Thomakos, Florida International University; Paul Wachtel, New
York University; Mark Watson, Harvard University; and Kenneth West, University
of Wisconsin. My numerous discussions with B. D. McCullough have improved Appendix E and at the same time increased my appreciation for numerical analysis. I
am especially grateful to Jan Kiviet of the University of Amsterdam, who subjected
my third edition to a microscopic examination and provided literally scores of suggestions, virtually all of which appear herein. Chapters 19 and 20 have also benefited from
previous reviews by Frank Diebold, B. D. McCullough, Mary McGarvey, and Nagesh
Revankar. I would also like to thank Rod Banister, Gladys Soto, Cindy Regan, Mike
Reynolds, Marie McHale, Lisa Amato, and Torie Anderson at Prentice Hall for their
contributions to the completion of this book. As always, I owe the greatest debt to my
wife, Lynne, and to my daughters, Lesley, Allison, Elizabeth, and Julianna.
William H. Greene