Modern Engineering Statistics
THOMAS P. RYAN
Acworth, Georgia

A JOHN WILEY & SONS, INC., PUBLICATION



Copyright © 2007 by John Wiley & Sons, Inc. All rights reserved.

Published by John Wiley & Sons, Inc., Hoboken, New Jersey.
Published simultaneously in Canada.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form
or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted
under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of
the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance
Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400, fax 978-750-4470, or on the web at
www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions
Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, 201-748-6011, fax 201-748-6008
or online at […].
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in
preparing this book, they make no representations or warranties with respect to the accuracy or completeness of
the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a
particular purpose. No warranty may be created or extended by sales representatives or written sales materials.
The advice and strategies contained herein may not be suitable for your situation. You should consult with a
professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other
commercial damages, including but not limited to special, incidental, consequential, or other damages.
For general information on our other products and services or for technical support, please contact our Customer
Care Department within the United States at 800-762-2974, outside the United States at
317-572-3993 or fax 317-572-4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be
available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com
Wiley Bicentennial Logo: Richard J. Pacifico
Library of Congress Cataloging-in-Publication Data:
Ryan, Thomas P., 1945–
Modern engineering statistics / Thomas P. Ryan.
p. cm.
Includes bibliographical references and index.
ISBN 978-0-470-08187-7
1. Engineering–Statistical methods. I. Title.
TA340.R93 2007
620.0072–dc22
20060521558
Printed in the United States of America
10 9 8 7 6 5 4 3 2 1



Contents

Preface

xvii

1. Methods of Collecting and Presenting Data

1

1.1 Observational Data and Data from Designed Experiments, 3
1.2 Populations and Samples, 5
1.3 Variables, 6
1.4 Methods of Displaying Small Data Sets, 7
1.4.1 Stem-and-Leaf Display, 8
1.4.2 Time Sequence Plot and Control Chart, 9
1.4.3 Lag Plot, 11
1.4.4 Scatter Plot, 12
1.4.5 Digidot Plot, 14
1.4.6 Dotplot, 14
1.5 Methods of Displaying Large Data Sets, 16
1.5.1 Histogram, 16
1.5.2 Boxplot, 20
1.6 Outliers, 22
1.7 Other Methods, 22
1.8 Extremely Large Data Sets: Data Mining, 23
1.9 Graphical Methods: Recommendations, 23
1.10 Summary, 24
References, 24
Exercises, 25

2. Measures of Location and Dispersion

45

2.1 Estimating Location Parameters, 46
2.2 Estimating Dispersion Parameters, 50
2.3 Estimating Parameters from Grouped Data, 55
2.4 Estimates from a Boxplot, 57
2.5 Computing Sample Statistics with MINITAB, 58
2.6 Summary, 58

Reference, 58
Exercises, 58

3. Probability and Common Probability Distributions

68

3.1 Probability: From the Ethereal to the Concrete, 68
3.1.1 Manufacturing Applications, 70
3.2 Probability Concepts and Rules, 70
3.2.1 Extension to Multiple Events, 73
3.2.1.1 Law of Total Probability and Bayes’ Theorem, 74
3.3 Common Discrete Distributions, 76
3.3.1 Expected Value and Variance, 78
3.3.2 Binomial Distribution, 80
3.3.2.1 Testing for the Appropriateness of the Binomial Model, 86
3.3.3 Hypergeometric Distribution, 87
3.3.4 Poisson Distribution, 88
3.3.4.1 Testing for the Appropriateness of the Poisson Model, 90
3.3.5 Geometric Distribution, 91
3.4 Common Continuous Distributions, 92
3.4.1 Expected Value and Variance, 92
3.4.2 Determining Probabilities for Continuous Random Variables, 92
3.4.3 Normal Distribution, 93
3.4.3.1 Software-Aided Normal Probability Computations, 97
3.4.3.2 Testing the Normality Assumption, 97
3.4.4 t-Distribution, 97
3.4.5 Gamma Distribution, 100
3.4.5.1 Chi-Square Distribution, 100
3.4.5.2 Exponential Distribution, 101
3.4.6 Weibull Distribution, 102
3.4.7 Smallest Extreme Value Distribution, 103
3.4.8 Lognormal Distribution, 104
3.4.9 F Distribution, 104
3.5 General Distribution Fitting, 106
3.6 How to Select a Distribution, 107
3.7 Summary, 108
References, 109
Exercises, 109


4. Point Estimation

121

4.1 Point Estimators and Point Estimates, 121
4.2 Desirable Properties of Point Estimators, 121
4.2.1 Unbiasedness and Consistency, 121
4.2.2 Minimum Variance, 122
4.2.3 Estimators Whose Properties Depend on the Assumed
Distribution, 124
4.2.4 Comparing Biased and Unbiased Estimators, 124
4.3 Distributions of Sampling Statistics, 125
4.3.1 Central Limit Theorem, 126
4.3.1.1 Illustration of Central Limit Theorem, 126

4.3.2 Statistics with Nonnormal Sampling Distributions, 128
4.4 Methods of Obtaining Estimators, 128
4.4.1 Method of Maximum Likelihood, 128
4.4.2 Method of Moments, 130
4.4.3 Method of Least Squares, 131
4.5 Estimating $\sigma_{\hat{\theta}}$, 132
4.6 Estimating Parameters Without Data, 133
4.7 Summary, 133
References, 134
Exercises, 134

5. Confidence Intervals and Hypothesis Tests—One Sample

140

5.1 Confidence Interval for µ: Normal Distribution, σ Not Estimated from Sample Data, 140
5.1.1 Sample Size Determination, 142
5.1.2 Interpretation and Use, 143
5.1.3 General Form of Confidence Intervals, 145
5.2 Confidence Interval for µ: Normal Distribution, σ Estimated from Sample Data, 146
5.2.1 Sample Size Determination, 146
5.3 Hypothesis Tests for µ: Using Z and t, 147
5.3.1 Null Hypotheses Always False?, 147
5.3.2 Basic Hypothesis Testing Concepts, 148
5.3.3 Two-Sided Hypothesis Tests Vis-à-Vis Confidence Intervals, 152
5.3.4 One-Sided Hypothesis Tests Vis-à-Vis One-Sided Confidence Intervals, 153
5.3.5 Relationships When the t-Distribution is Used, 155
5.3.6 When to Use t or Z (or Neither)?, 155
5.3.7 Additional Example, 156
5.4 Confidence Intervals and Hypothesis Tests for a Proportion, 157
5.4.1 Approximate Versus Exact Confidence Interval for a Proportion, 158
5.5 Confidence Intervals and Hypothesis Tests for σ² and σ, 161
5.5.1 Hypothesis Tests for σ² and σ, 163
5.6 Confidence Intervals and Hypothesis Tests for the Poisson Mean, 164
5.7 Confidence Intervals and Hypothesis Tests When Standard
Error Expressions are Not Available, 166
5.8 Type I and Type II Errors, 168
5.8.1 p-Values, 170
5.8.2 Trade-off Between Error Risks, 172
5.9 Practical Significance and Narrow Intervals: The Role of n, 172
5.10 Other Types of Confidence Intervals, 173
5.11 Abstract of Main Procedures, 174
5.12 Summary, 175
Appendix: Derivation, 176
References, 176
Exercises, 177

6. Confidence Intervals and Hypothesis Tests—Two Samples

189

6.1 Confidence Intervals and Hypothesis Tests for Means:
Independent Samples, 189

6.1.1 Using Z , 190
6.1.2 Using t, 192
6.1.3 Using Neither t nor Z , 197
6.2 Confidence Intervals and Hypothesis Tests for Means:
Dependent Samples, 197
6.3 Confidence Intervals and Hypothesis Tests for Two Proportions, 200
6.3.1 Confidence Interval, 202
6.4 Confidence Intervals and Hypothesis Tests for Two Variances, 202
6.5 Abstract of Procedures, 204
6.6 Summary, 205
References, 205
Exercises, 205

7. Tolerance Intervals and Prediction Intervals

214

7.1 Tolerance Intervals: Normality Assumed, 215
7.1.1 Two-Sided Interval, 216
7.1.1.1 Approximations, 217
7.1.2 Two-Sided Interval, Possibly Unequal Tails, 218
7.1.3 One-Sided Bound, 218
7.2 Tolerance Intervals and Six Sigma, 219
7.3 Distribution-Free Tolerance Intervals, 219
7.3.1 Determining Sample Size, 221

7.4 Prediction Intervals, 221
7.4.1 Known Parameters, 222
7.4.2 Unknown Parameters with Normality Assumed
(Single Observation), 223
7.4.2.1 Sensitivity to Nonnormality, 223
7.4.2.2 Width of the Interval, 224
7.4.3 Nonnormal Distributions: Single Observation, 224
7.4.4 Nonnormal Distributions: Number of Failures, 225
7.4.5 Prediction Intervals for Multiple Future Observations, 225
7.4.6 One-Sided Prediction Bounds, 225
7.4.6.1 One-Sided Prediction Bounds for Certain Discrete
Distributions, 226
7.4.7 Distribution-Free Prediction Intervals, 226
7.5 Choice Between Intervals, 227
7.6 Summary, 227
References, 228
Exercises, 229

8. Simple Linear Regression, Correlation, and Calibration

232

8.1 Introduction, 232
8.2 Simple Linear Regression, 232
8.2.1 Regression Equation, 234
8.2.2 Estimating β₀ and β₁, 234
8.2.3 Assumptions, 237
8.2.4 Sequence of Steps, 237
8.2.5 Example with College Data, 239
8.2.5.1 Computer Output, 240
8.2.6 Checking Assumptions, 245
8.2.6.1 Testing for Independent Errors, 245
8.2.6.2 Testing for Nonconstant Error Variance, 246
8.2.6.3 Checking for Nonnormality, 247
8.2.7 Defect Escape Probability Example (Continued), 248
8.2.8 After the Assumptions Have Been Checked, 249
8.2.9 Fixed Versus Random Regressors, 249
8.2.10 Transformations, 249
8.2.10.1 Transforming the Model, 249
8.2.10.2 Transforming Y and/or X, 250
8.2.11 Prediction Intervals and Confidence Intervals, 250
8.2.12 Model Validation, 254
8.3 Correlation, 254
8.3.1 Assumptions, 256
8.4 Miscellaneous Uses of Regression, 256
8.4.1 Calibration, 257
8.4.1.1 Calibration Intervals, 262
8.4.2 Measurement Error, 263
8.4.3 Regression for Control, 263
8.5 Summary, 264
References, 264
Exercises, 265

9. Multiple Regression

276

9.1 How Do We Start?, 277
9.2 Interpreting Regression Coefficients, 278
9.3 Example with Fixed Regressors, 279
9.4 Example with Random Regressors, 281
9.4.1 Use of Scatterplot Matrix, 282
9.4.2 Outliers and Unusual Observations: Model Specific, 283
9.4.3 The Need for Variable Selection, 283
9.4.4 Illustration of Stepwise Regression, 284
9.4.5 Unusual Observations, 287
9.4.6 Checking Model Assumptions, 288
9.4.6.1 Normality, 289
9.4.6.2 Constant Variance, 290
9.4.6.3 Independent Errors, 290
9.4.7 Summary of Example, 291
9.5 Example of Section 8.2.4 Extended, 291
9.6 Selecting Regression Variables, 293
9.6.1 Forward Selection, 294
9.6.2 Backward Elimination, 295
9.6.3 Stepwise Regression, 295
9.6.3.1 Significance Levels, 295
9.6.4 All Possible Regressions, 296
9.6.4.1 Criteria, 296
9.7 Transformations, 299
9.8 Indicator Variables, 300
9.9 Regression Graphics, 300
9.10 Logistic Regression and Nonlinear Regression Models, 301
9.11 Regression with Matrix Algebra, 302
9.12 Summary, 302
References, 303
Exercises, 304

10. Mechanistic Models

314

10.1 Mechanistic Models, 315
10.1.1 Mechanistic Models in Accelerated Life Testing, 315
10.1.1.1 Arrhenius Model, 316
10.2 Empirical–Mechanistic Models, 316
10.3 Additional Examples, 324
10.4 Software, 325

10.5 Summary, 326
References, 326
Exercises, 327

11. Control Charts and Quality Improvement

330

11.1 Basic Control Chart Principles, 330
11.2 Stages of Control Chart Usage, 331
11.3 Assumptions and Methods of Determining Control Limits, 334
11.4 Control Chart Properties, 335
11.5 Types of Charts, 336
11.6 Shewhart Charts for Controlling a Process Mean and Variability (Without Subgrouping), 336
11.7 Shewhart Charts for Controlling a Process Mean and Variability (With Subgrouping), 344
11.7.1 X̄-Chart, 344
11.7.1.1 Distributional Considerations, 344
11.7.1.2 Parameter Estimation, 347
11.7.2 s-Chart or R-Chart?, 347
11.8 Important Use of Control Charts for Measurement Data, 349
11.9 Shewhart Control Charts for Nonconformities and Nonconforming Units, 349
11.9.1 p-Chart and np-Chart, 350
11.9.1.1 Regression-Based Limits, 350
11.9.1.2 Overdispersion, 351
11.9.2 c-Chart, 351
11.9.2.1 Regression-Based Limits, 352
11.9.2.2 Robustness Considerations, 354
11.9.3 u-Chart, 354
11.9.3.1 Regression-Based Limits, 355
11.9.3.2 Overdispersion, 355
11.10 Alternatives to Shewhart Charts, 356
11.10.1 CUSUM and EWMA Procedures, 357
11.10.1.1 CUSUM Procedures, 357
11.10.1.2 EWMA Procedures, 358
11.10.1.3 CUSUM and EWMA Charts with MINITAB, 359
11.11 Finding Assignable Causes, 359
11.12 Multivariate Charts, 362
11.13 Case Study, 362
11.13.1 Objective and Data, 362
11.13.2 Test for Nonnormality, 362

11.13.3 Control Charts, 362
11.14 Engineering Process Control, 364
11.15 Process Capability, 365
11.16 Improving Quality with Designed Experiments, 366
11.17 Six Sigma, 367
11.18 Acceptance Sampling, 368
11.19 Measurement Error, 368
11.20 Summary, 368
References, 369
Exercises, 370

12. Design and Analysis of Experiments

382

12.1 Processes Must be in Statistical Control, 383
12.2 One-Factor Experiments, 384
12.2.1 Three or More Levels, 385
12.2.1.1 Testing for Equality of Variances, 386
12.2.1.2 Example with Five Levels, 386
12.2.1.3 ANOVA Analogy to t-Test, 388
12.2.2 Assumptions, 389
12.2.3 ANOVA and ANOM, 390
12.2.3.1 ANOM with Unequal Variances, 391
12.3 One Treatment Factor and at Least One Blocking Factor, 392
12.3.1 One Blocking Factor: Randomized Block Design, 392
12.3.2 Two Blocking Factors: Latin Square Design, 395
12.4 More Than One Factor, 395
12.5 Factorial Designs, 396
12.5.1 Two Levels, 397
12.5.1.1 Regression Model Interpretation, 398
12.5.1.2 Large Interactions, 399
12.5.2 Interaction Problems: 2³ Examples, 400
12.5.3 Analysis of Unreplicated Factorial Experiments, 403
12.5.4 Mixed Factorials, 404
12.5.5 Blocking Factorial Designs, 404
12.6 Crossed and Nested Designs, 405
12.7 Fixed and Random Factors, 406
12.8 ANOM for Factorial Designs, 407
12.8.1 HANOM for Factorial Designs, 408
12.9 Fractional Factorials, 409
12.9.1 2ᵏ⁻¹ Designs, 409
12.9.2 Highly Fractionated Designs, 412
12.10 Split-Plot Designs, 413
12.11 Response Surface Designs, 414
12.12 Raw Form Analysis Versus Coded Form Analysis, 415
12.13 Supersaturated Designs, 416
12.14 Hard-to-Change Factors, 416
12.15 One-Factor-at-a-Time Designs, 417
12.16 Multiple Responses, 418
12.17 Taguchi Methods of Design, 419
12.18 Multi-Vari Chart, 420
12.19 Design of Experiments for Binary Data, 420
12.20 Evolutionary Operation (EVOP), 421
12.21 Measurement Error, 422

12.22 Analysis of Covariance, 422
12.23 Summary of MINITAB and Design-Expert® Capabilities for
Design of Experiments, 422
12.23.1 Other Software for Design of Experiments, 423
12.24 Training for Experimental Design Use, 423
12.25 Summary, 423
Appendix A Computing Formulas, 424
Appendix B Relationship Between Effect Estimates and
Regression Coefficients, 426
References, 426
Exercises, 428

13. Measurement System Appraisal

441

13.1 Terminology, 442
13.2 Components of Measurement Variability, 443
13.2.1 Tolerance Analysis for Repeatability and Reproducibility, 444
13.2.2 Confidence Intervals, 445
13.2.3 Examples, 445
13.3 Graphical Methods, 449
13.4 Bias and Calibration, 449
13.4.1 Gage Linearity and Bias Study, 450
13.4.2 Attribute Gage Study, 452
13.4.3 Designs for Calibration, 454
13.5 Propagation of Error, 454
13.6 Software, 455
13.6.1 MINITAB, 455
13.6.2 JMP, 455
13.7 Summary, 456
References, 456
Exercises, 457

14. Reliability Analysis and Life Testing

460

14.1 Basic Reliability Concepts, 461
14.2 Nonrepairable and Repairable Populations, 463
14.3 Accelerated Testing, 463
14.3.1 Arrhenius Equation, 464
14.3.2 Inverse Power Function, 465
14.3.3 Degradation Data and Acceleration Models, 465
14.4 Types of Reliability Data, 466
14.4.1 Types of Censoring, 467
14.5 Statistical Terms and Reliability Models, 467
14.5.1 Reliability Functions for Series Systems and Parallel Systems, 468
14.5.2 Exponential Distribution, 469
14.5.3 Weibull Distribution, 470
14.5.4 Lognormal Distribution, 471
14.5.5 Extreme Value Distribution, 471
14.5.6 Other Reliability Models, 471
14.5.7 Selecting a Reliability Model, 472
14.6 Reliability Engineering, 473
14.6.1 Reliability Prediction, 473
14.7 Example, 474
14.8 Improving Reliability with Designed Experiments, 474
14.8.1 Designed Experiments with Degradation Data, 477
14.9 Confidence Intervals, 477
14.10 Sample Size Determination, 478
14.11 Reliability Growth and Demonstration Testing, 479
14.12 Early Determination of Product Reliability, 480
14.13 Software, 480
14.13.1 MINITAB, 480
14.13.2 JMP, 480
14.13.3 Other Software, 481
14.14 Summary, 481
References, 481
Exercises, 482
15. Analysis of Categorical Data

487

15.1 Contingency Tables, 487
15.1.1 2 × 2 Tables, 491
15.1.2 Contributions to the Chi-Square Statistic, 492
15.1.3 Exact Analysis of Contingency Tables, 493

15.1.4 Contingency Tables with More than Two Factors, 497
15.2 Design of Experiments: Categorical Response Variable, 497
15.3 Goodness-of-Fit Tests, 498
15.4 Summary, 500
References, 500
Exercises, 501
16. Distribution-Free Procedures

507

16.1 Introduction, 507
16.2 One-Sample Procedures, 508
16.2.1 Methods of Detecting Nonrandom Data, 509
16.2.1.1 Runs Test, 509
16.2.2 Sign Test, 510
16.2.3 Wilcoxon One-Sample Test, 511
16.3 Two-Sample Procedures, 512
16.3.1 Mann–Whitney Two-Sample Test, 512
16.3.2 Spearman Rank Correlation Coefficient, 513
16.4 Nonparametric Analysis of Variance, 514
16.4.1 Kruskal–Wallis Test for One Factor, 514
16.4.2 Friedman Test for Two Factors, 516
16.5 Exact Versus Approximate Tests, 519
16.6 Nonparametric Regression, 519
16.7 Nonparametric Prediction Intervals and Tolerance Intervals, 521
16.8 Summary, 521
References, 521
Exercises, 522

17. Tying It All Together

525

17.1 Review of Book, 525
17.2 The Future, 527
17.3 Engineering Applications of Statistical Methods, 528
Reference, 528
Exercises, 528

Answers to Selected Exercises

533

Appendix: Statistical Tables

562

Table A Random Numbers, 562
Table B Normal Distribution, 564
Table C t-Distribution, 566
Table D F-Distribution, 567
Table E Factors for Calculating Two-Sided 99% Statistical Intervals for a Normal Population to Contain at Least 100p% of the Population, 570
Table F Control Chart Constants, 571

Author Index

573

Subject Index

579


Preface

Statistical methods are an important part of the education of any engineering student.
This was formally recognized by the Accreditation Board for Engineering and Technology
(ABET) when, several years ago, education in probability and statistics became an ABET
requirement for all undergraduate engineering majors. Specific topics within the broad field
of probability and statistics were not specified, however, so colleges and universities have
considerable latitude regarding the manner in which they meet the requirement. Similarly,
ABET’s Criteria for Accrediting Engineering Programs, which were to apply to evaluations
during 2001–2002, were not specific regarding the probability and statistics skills that
engineering graduates should possess.
Engineering statistics courses are offered by math and statistics departments, as well
as being taught within engineering departments and schools. An example of the latter is
The School of Industrial and Systems Engineering at Georgia Tech, whose list of course
offerings in applied statistics rivals that of many statistics departments.
Unfortunately, many engineering statistics courses have not differed greatly from mathematical statistics courses, and this is due in large measure to the manner in which many
engineering statistics textbooks have been written. This textbook makes no pretense of being
a “math stat book.” Instead, my objective has been to motivate an appreciation of statistical
techniques, and to do this as much as possible within the context of engineering, as many
of the datasets that are used in the chapters and chapter exercises are from engineering
sources. I have taught countless engineering statistics courses over a period of two decades
and I have formulated some specific ideas of what I believe should be the content of an
engineering statistics course. The contents of this textbook and the style of writing follow
accordingly.
Statistics books have been moving in a new direction for the past fifteen years, although
books that have beaten a new path have often been overshadowed by the sheer number of
books that are traditional rather than groundbreaking.
The optimum balance between statistical thinking and statistical methodology can certainly be debated. Hoerl and Snee’s book, Statistical Thinking, which is basically a book
on business statistics, stands at one extreme as a statistics book that emphasizes the “big
picture” and the use of statistical tools in a broad way rather than encumbering the student
with an endless stream of seemingly unrelated methods and formulas.
This book might be viewed as somewhat of an engineering statistics counterpart to the
Hoerl and Snee book, as statistical thinking is emphasized throughout, but there is also a
solid dose of contemporary statistical methodology.
This book has many novel features, including the connection that is frequently made (but
hardly ever illustrated) between hypothesis tests and confidence intervals. This connection
is illustrated in many places, as I believe that the point cannot be overemphasized.

I have also written the book under the assumption that statistical software will be used
(extensively). A somewhat unusual feature of the book is that computing equations are kept
to a minimum, although some have been put in chapter appendixes for readers interested
in seeing them. MINITAB is the most frequently used statistical software for college and
university courses. Minitab, Inc. has been a major software supplier to the Six Sigma
movement and has made additions to the MINITAB software to provide the necessary capabilities for Six Sigma work. Such work has much in common with the field of engineering
statistics and with the way that many engineers use statistics. Therefore, MINITAB is heavily relied on in this book for illustrating various statistical analyses, although JMP from
SAS Institute, Inc. is also used.
This is not intended, however, to be a book on how to use MINITAB or JMP, since books
have been written for that purpose. Nevertheless, some MINITAB code is given in certain
chapters and especially at the textbook Website to benefit users who prefer to use MINITAB
in command mode. Various books, including the MINITAB User’s Guide, have explained
how to use MINITAB in menu mode, but not in command mode. The use of menu mode is
of course appropriate for beginners and infrequent users of MINITAB, but command mode
is much faster for people who are familiar with MINITAB and there are many users who
still use command mode. Another advantage of command mode is that when the online
help facility is used to display a command, all of the subcommands are also listed, so the
reader sees all of the options, whereas this view is not available when menu mode is used.
Rather, the user has to navigate through the various screens and mentally paste everything
together in order to see the total capability relative to a particular command.
There are, however, some MINITAB routines for which menu mode is preferable, due in
part to the many subcommands that will generally be needed just to do a standard analysis.
Thus, menu mode does have its uses.
Depending on how fast the material is covered, the book could be used for a two-semester
course as well as for a one-semester course. If used for the latter, the core material would
likely be all or parts of Chapters 1–6, 8, 11, 12, and 17. Some material from Chapters 7 and
14 might also be incorporated, depending on time constraints and instructor tastes.
For the second semester of a two-semester course, Chapters 7, 9, 10, 13, 14, and 15 and/or
16 might be covered, perhaps with additional material from Chapters 11 and 12 that could not
be covered in the first semester. The material in Chapter 12 on Analysis of Means deserves
its place in the sun, especially since it was developed for the express purpose of fostering
communication with engineers on the subject of designed experiments. Although Chapter
10 on mechanistic models and Chapter 7 on tolerance intervals and prediction intervals
might be viewed as special topics material, it would be more appropriate to elevate these
chapters to “core material chapters,” as this is material that is very important for engineering
students. At least some of the material in Chapters 15 and 16 might be covered, as time
permits. Chapter 16 is especially important as it can help engineering students and others
realize that nonparametric (distribution-free) methods will often be viable alternatives to
the better-known parametric methods.
There are reasons for the selected ordering of the chapters. Standard material is covered
in the first six chapters and the sequence of those chapters is the logical one. Decisions
had to be made starting with Chapter 7, however. Although instructors might view this
as a special topics chapter as stated, there are many subject matter experts who believe
that tolerance intervals and prediction intervals should be taught in engineering statistics
courses. Having a chapter on tolerance intervals and prediction intervals follow a chapter on
confidence intervals is reasonable because of the relationships between the intervals and the
need for this to be understood. Chapter 9 is an extension of Chapter 8 into multiple linear
regression and it is reasonable to have these chapters followed by Chapter 10 since nonlinear
regression is used in this chapter. In some ways it would be better if the chapter followed
Chapter 14 since reliability models are used, but the need to have it follow Chapters 8 and
9 seems more important. The regression chapters should logically precede the chapter on
design of experiments, Chapter 12, since regression methods should be used in analyzing
data from designed experiments. Processes should ideally be in a state of statistical control
when designed experiments are performed, so the chapter on control chart methods, Chapter
11, should precede Chapter 12. Chapters 13 and 14 contain subject matter that is important
for engineering and Chapters 15 and 16 consider topics that are generally covered in a wide
variety of introductory type statistics texts. It is useful for students to be able to demonstrate
that they have mastered the tools they have learned in any statistics course by knowing
which tool(s) to use in a particular application after all of the material has been presented.
The exercises in Chapter 17 provide students with the opportunity to demonstrate that they
have acquired such skill.
The book might also be used for self-study, aided by the Answers to Selected Exercises,
which is sizable and detailed. A separate Solutions Manual with solutions to all of the
chapter exercises is also available. The data in the exercises, including data in MINITAB
files (i.e., the files with the .MTW extension), can be found at the website for the text: ftp://
ftp.wiley.com/public/ sci med/engineering statistics.
I wish to gratefully acknowledge the support and assistance of my editor, Steve Quigley,
associate editor Susanne Steitz, and production editor Rosalyn Farkas, plus various others,
including the very large number of anonymous reviewers who reviewed all or parts of the
manuscript at various stages and made helpful comments.
Thomas P. Ryan
Acworth, Georgia
May 2007


CHAPTER 1

Methods of Collecting
and Presenting Data

People make decisions every day, with decision-making logically based on some form of
data. A person who accepts a job and moves to a new city needs to know how long it will
take him/her to drive to work. The person could guess the time by knowing the distance
and considering the traffic likely to be encountered along the route that will be traveled, or
the new employee could drive the route at the anticipated regular time of departure for a
few days before the first day of work.
With the second option, an experiment is performed. If the test runs are made under normal
road and weather conditions, the experiment should give a better estimate of the typical
driving time than could be obtained by merely knowing the distance and the route to be traveled.
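As a rough illustration of the driving-time experiment, the following Python sketch compares a distance-based guess with an estimate obtained from trial runs. All of the numbers (the five trial times, the 18-mile distance, and the 40 mph average speed) are hypothetical values chosen only for illustration, and the helper name `summarize` is likewise an assumption rather than anything taken from the text.

```python
from statistics import mean, stdev

# Hypothetical commute times in minutes from five trial drives
# (illustrative values only, not real data).
trial_minutes = [34.0, 38.5, 31.0, 36.5, 35.0]

# Distance-based guess: an assumed 18-mile route at an assumed 40 mph average.
distance_miles = 18.0
assumed_speed_mph = 40.0
guess_minutes = distance_miles / assumed_speed_mph * 60.0


def summarize(times):
    """Return the sample mean and sample standard deviation of the times."""
    return mean(times), stdev(times)


avg, sd = summarize(trial_minutes)
print(f"Distance-based guess: {guess_minutes:.1f} min")
print(f"Trial-run estimate:   {avg:.1f} min (sample std. dev. {sd:.1f} min)")
```

With these made-up values, the trial runs suggest a typical drive of about 35 minutes, noticeably longer than the 27-minute guess from distance alone, and the sample standard deviation gives a sense of how much the time varies from day to day.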
Similarly, engineers conduct statistically designed experiments to obtain valuable information that will enable processes and products to be improved, and much space is devoted
to statistically designed experiments in Chapter 12.
Of course, engineering data are also available without having performed a designed
experiment, but this generally requires a more careful analysis than the analysis of data
from designed experiments. In his provocative paper, “Launching the Space-Shuttle Challenger—Disciplinary Deficiencies in the Analysis of Engineering Data,” F. F. Lighthall
(1991) contended that “analysis of field data and reasoning were flawed” and that “staff
engineers and engineering managers . . . were unable to frame basic questions of covariation among field variables, and thus unable to see the relevance of routinely gathered
field data to the issues they debated before the Challenger launch.” Lighthall then states
“Simple analyses of field data available to both Morton Thiokol and NASA at launch time
and months before the Challenger launch are presented to show that the arguments against
launching at cold temperatures could have been quantified. . . .” The author’s contention is
that there was a “gap in the education of engineers.” (Whether or not the Columbia disaster
will be similarly viewed by at least some authors as being a deficiency in data analysis
remains to be seen.)
Perhaps many would disagree with Lighthall, but the bottom line is that failure to
properly analyze available engineering data or failure to collect necessary data can endanger
lives—on a space shuttle, on a bridge that spans a river, on an elevator in a skyscraper, and
in many other scenarios.
Intelligent analysis of data requires much thought, however, and there are no shortcuts.
This is because analyzing data and solving associated problems in engineering and other
areas is more of an art than a science. Consequently, it would be impractical to attempt
to give a specific step-by-step guide to the use of the statistical methods presented in
succeeding chapters, although general guidelines can still be provided and are provided
in subsequent chapters. It is desirable to try to acquire a broad knowledge of the subject
matter and position oneself to be able to solve problems with powers of reasoning coupled
with subject matter knowledge.
The importance of avoiding the memorization of rules or steps for solving problems
is perhaps best stated by Professor Emeritus Herman Chernoff of the Harvard Statistics
Department in his online algebra text, Algebra 1 for Students Comfortable with Arithmetic
( Chernoff/
Herman Chernoff Algebra 1.pdf).
Memorizing rules for solving problems is usually a way of avoiding understanding. Without
understanding, great feats of memory are required to handle a limited class of problems, and there
is no ability to handle new types of problems.

My approach to this issue has always been to draw a rectangle on a blackboard and then
make about 15–20 dots within the rectangle. The dots represent specific types of problems;
the rectangle represents the body of knowledge that is needed to solve not only the types of
problems represented by the dots, but also any type of problem that would fall within the
rectangle. This is essentially the same as what Professor Chernoff is saying.
This is an important distinction that undoubtedly applies to any quantitative subject and
should be understood by students and instructors, in general.
Semiconductor manufacturing is one area in which statistics is used extensively. International SEMATECH (SEmiconductor MAnufacturing TECHnology), located in Austin,
Texas, is a nonprofit research and development consortium of the following 13 semiconductor manufacturers: Advanced Micro Devices, Conexant, Hewlett-Packard, Hyundai,
Infineon Technologies, IBM, Intel, Lucent Technologies, Motorola, Philips, STMicroelectronics, TSMC, and Texas Instruments. Intel, in particular, uses statistics extensively.
The importance of statistics in these and other companies is exemplified by the
NIST/SEMATECH e-Handbook of Statistical Methods (Croarkin and Tobias, 2002), a joint
effort of International SEMATECH and NIST (National Institute of Standards and Technology), with the assistance of various other professionals. The stated goal of the handbook,
which is the equivalent of approximately 3,000 printed pages, is to provide a Web-based
guide for engineers, scientists, businesses, researchers, and teachers who use statistical
techniques in their work. Because of its sheer size, the handbook is naturally much more
inclusive than this textbook, although there is some overlap of material. Of course, the former is not intended for use as a textbook and, for example, does not contain any exercises
or problems, although it does contain case studies. It is a very useful resource, however,
especially since it is almost an encyclopedia of statistical methods. It can be accessed at
www.itl.nist.gov/div898/handbook and will henceforth often be referred to as
the e-Handbook of Statistical Methods or simply as the e-Handbook.
There are also numerous other statistics references and data sets that are available on
the Web, including some general purpose Internet statistics textbooks. Much information,
including many links, can be found at the following websites: xas.
edu/cc/stat/world/softwaresites.html and />∼helberg/statistics.html. The Journal of Statistics Education is a free, online
statistics publication devoted to statistics education. It can be found at
http://www.amstat.org/publications/jse.
Statistical education is a two-way street, however, and much has been written about
how engineers view statistics relative to their work. At one extreme, Brady and Allen
(2002) stated: “There is also abundant evidence—for example, Czitrom (1999)—that most
practicing engineers fail to consistently apply the formal data collection and analysis
techniques that they have learned and in general see their statistical education as largely
irrelevant to their professional life.” (It is worth noting that the first author is an engineering
manager in industry.) The Accreditation Board for Engineering and Technology (ABET)
disagrees with this sentiment and several years ago decreed that all engineering majors must
have training in probability and statistics. Undoubtedly, many engineers would disagree
with Brady and Allen (2002), although historically this has been a common view.
One relevant question concerns the form in which engineers and engineering students
believe that statistical exposition should be presented to them. Lenth (2002), in reviewing a
book on experimental design that was written for engineers and engineering managers and
emphasizes hand computation, touches on two extremes by first stating that “. . . engineers
just will not believe something if they do not know how to calculate it . . .,” and then stating
“After more thought, I realized that engineers are quite comfortable these days—in fact,
far too comfortable—with results from the blackest of black boxes: neural nets, genetic
algorithms, data mining, and the like.”
So have engineers progressed past the point of needing to see how to perform all calculations that produce statistical results? (Of course, a world of black boxes is undesirable.)
This book was written with the knowledge that users of statistical methods no longer
perform hand computation to any appreciable extent, but many computing formulas are
nevertheless given for interested readers, with some formulas given in chapter appendices.

1.1 OBSERVATIONAL DATA AND DATA FROM DESIGNED EXPERIMENTS
Sports statistics are readily available from many sources and are frequently used in teaching
statistical concepts. Assume that a particular college basketball player has a very poor free
throw shooting percentage, and his performance is charted over a period of several games
to see if there is any trend. This would constitute observational data—we have simply
observed the numbers. Now assume that since the player’s performance is so poor, some
action is taken to improve his performance. This action may consist of extra practice,
visualization, and/or instruction from a professional specialist. If different combinations of
these tasks were employed, this could be in the form of a designed experiment. In general,
if improvement is to occur, there should be experimentation. Otherwise, any improvement
that seems to occur might be only accidental and not be representative of any real change.
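
The idea of trying "different combinations of these tasks" can be made concrete with a small sketch. Assuming, purely for illustration, that each of the three actions is either used or not used, the following Python code lists all eight combinations, which is a full factorial layout. The factor names and the on/off coding are assumptions and are not taken from the text. An actual experiment would also have to settle how many free throws to record for each run and the order in which the runs are made.

```python
from itertools import product

# Hypothetical factor names, based on the three actions mentioned in the text.
FACTORS = ("extra_practice", "visualization", "professional_instruction")


def full_factorial(factors):
    """Return every used/not-used combination of the factors (2**k runs)."""
    runs = []
    for levels in product((False, True), repeat=len(factors)):
        runs.append(dict(zip(factors, levels)))
    return runs


design = full_factorial(FACTORS)

if __name__ == "__main__":
    for number, run in enumerate(design, start=1):
        # Names of the actions that are switched on in this run.
        active = [name for name, used in run.items() if used]
        print(f"Run {number}: " + (", ".join(active) or "no intervention"))
```

The run with no intervention serves as a baseline against which the other seven combinations could be compared.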
Similarly, W. Edwards Deming (1900–1993) coined the terms analytic studies and
enumerative studies and often stated that “statistics is prediction.” He meant that statistical
methods should be used to improve future products, processes, and so on, rather than simply
“enumerating” the current state of affairs as is exemplified, for example, by the typical use
of sports statistics. If a baseball player’s batting average is .274, does that number tell
us anything about what the player should do to improve his performance? Of course not,