Undergraduate Texts in Mathematics
Editorial Board
S. Axler
K.A. Ribet
For other titles published in this series, go to
www.springer.com/series/666
James J. Callahan
Advanced Calculus
A Geometric View
James J. Callahan
Department of Mathematics and Statistics
Smith College
Northampton, MA 01063
USA
Editorial Board
S. Axler
Mathematics Department
San Francisco State University
San Francisco, CA 94132
USA
K.A. Ribet
Mathematics Department
University of California at Berkeley
Berkeley, CA 94720-3840
USA
ISSN 0172-6056
ISBN 978-1-4419-7331-3
e-ISBN 978-1-4419-7332-0
DOI 10.1007/978-1-4419-7332-0
Springer New York Dordrecht Heidelberg London
Library of Congress Control Number: 2010935598
Mathematics Subject Classification (2010): 26-01, 26B12, 26B15, 26B10, 26B20, 26A12
© Springer Science+Business Media, LLC 2010
All rights reserved. This work may not be translated or copied in whole or in part without the written
permission of the publisher (Springer Science+Business Media, LLC, 233 Spring Street, New York, NY
10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection
with any form of information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed is forbidden.
The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are
not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject
to proprietary rights.
Printed on acid-free paper
Springer is part of Springer Science+Business Media (www.springer.com)
To my teacher, Linus Richard Foy
Preface
A half-century ago, advanced calculus was a well-defined subject at the core
of the undergraduate mathematics curriculum. The classic texts of Taylor [19],
Buck [1], Widder [21], and Kaplan [9], for example, show some of the ways it
was approached. Over time, certain aspects of the course came to be seen as more
significant—those seen as giving a rigorous foundation to calculus—and they became the basis for a new course, an introduction to real analysis, that eventually
supplanted advanced calculus in the core.
Advanced calculus did not, in the process, become less important, but its role in
the curriculum changed. In fact, a bifurcation occurred. In one direction we got calculus on n-manifolds, a course beyond the practical reach of many undergraduates;
in the other, we got calculus in two and three dimensions but still with the theorems
of Stokes and Gauss as the goal.
The latter course is intended for everyone who has had a year-long introduction
to calculus; it often has a name like Calculus III. In my experience, though, it does
not manage to accomplish what the old advanced calculus course did. Multivariable
calculus naturally splits into three parts: (1) several functions of one variable, (2) one
function of several variables, and (3) several functions of several variables. The first
two are well-developed in Calculus III, but the third is really too large and varied
to be treated satisfactorily in the time remaining at the end of a semester. To put it
another way: Green’s theorem fits comfortably; Stokes’ and Gauss’ do not.
I believe the common view is that any such limitations of Calculus III are at
worst only temporary because a student will eventually progress to the study of
general k-forms on n-manifolds, the proper modern setting for advanced calculus.
But in the last half-century, undergraduate mathematics has changed in many ways,
not just in the flowering of rigor and abstraction. Linear algebra has been brought
forward in the curriculum, and with it an introduction to important multivariable
functions. Differential equations now have a larger role in the first calculus course,
too; students get to see something of their power and necessity. The computer vastly
expands the possibilities for computation and visualization.
The premise of this book is that these changes create the opportunity for a new
geometric and visual approach to advanced calculus.
*   *   *
More than forty years ago—and long before the curriculum had evolved to its
present state—Andrew Gleason outlined a modern geometric approach in a series
of lectures, “The Geometric Content of Advanced Calculus” [8]. (In a companion
piece [17], Norman Steenrod made a similar assessment of the earlier courses in
the calculus sequence.) Because undergraduate analysis bifurcated around the same
time, Gleason’s insights have not been implemented to the extent that they might
have been; nevertheless, they fit naturally into the approach I take in this book.
Let me try to describe this geometric viewpoint and to indicate how it hangs
upon recent curricular and technological developments. Geometry has always been
bound up with the teaching of calculus, of course. Everyone associates the derivative
of a function with the slope of its graph. But when the function becomes a map
f : Rn → Rp with n, p ≥ 2, we must ask: Where is the graph? What is its slope at a
point? Even in the simplest case n = p = 2, the graph (a two-dimensional surface)
lies in R4 and thus cannot be visualized directly. Nevertheless, we can get a picture if
we turn our attention from the graph to the image, because the image of f lies in the
R2 target. Computer algebra systems now make such pictures a practical possibility.
For example, the Mathematica command ParametricPlot produces a nonlinear
grid that is the image under a given map of a uniform coordinate grid from its source.
We can train ourselves to learn as much about a map from its image grid as we learn
about a function from its graph.
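The image grid the text describes can be generated with any plotting tool, not only Mathematica's ParametricPlot. The following minimal Python sketch (an illustrative stand-in, not from the text; the helper `image_grid` and its parameters are invented for this example) computes the image curves of a uniform coordinate grid under a plane map; handing the point lists to a plotting library then draws the curved image grid. The map used is the quadratic map q(x, y) = (x² − y², 2xy) discussed below.

```python
# Image of a uniform coordinate grid under a plane map f: R^2 -> R^2.
# The grid lines x = const and y = const map to curves in the target;
# plotting these curves gives the "image grid" discussed in the text.

def image_grid(f, xs, ys, samples=50):
    """Return the image curves of the grid lines over [xs[0], xs[-1]] x [ys[0], ys[-1]]."""
    def sample(fixed, var_range, vertical):
        lo, hi = var_range[0], var_range[-1]
        ts = [lo + i * (hi - lo) / samples for i in range(samples + 1)]
        return [f(fixed, t) if vertical else f(t, fixed) for t in ts]
    vert = [sample(x, ys, True) for x in xs]    # images of lines x = const
    horiz = [sample(y, xs, False) for y in ys]  # images of lines y = const
    return vert + horiz

# Example: the quadratic map q(x, y) = (x^2 - y^2, 2xy).
q = lambda x, y: (x * x - y * y, 2 * x * y)
curves = image_grid(q, xs=[0, 1, 2], ys=[0, 1, 2])
```

Each entry of `curves` is a polyline approximating the image of one grid line; the line x = 0, for instance, maps to the segment (−t², 0), a ray along the negative u-axis.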
How do we picture the derivative in this setting? When we dealt with graphs,
the derivative of a nonlinear function f at the point a was the linear function whose
graph was tangent to the graph of f at a. Tangency implies that, under progressive
magnification at the point (a, f (a)), the two graphs look more and more alike. At
some stage the nonlinear function becomes indistinguishable from the linear one.
There are two subtly different concepts at play here, depending on what we mean
by “indistinguishable.” One is local linearity (or differentiability): f (a + ∆x) − f (a)
and f ′ (a)∆x are indistinguishable in the technical sense that their difference vanishes to greater than first order in ∆x. The other is looking linear locally: the graphs
themselves are indistinguishable under sufficient magnification. For our function f ,
there is no difference: f is locally linear precisely where it looks linear locally.
There is a real and important difference, though, when we replace graphs by
image grids, as we must do to visualize a map f : R2 → R2 and its derivative dfa .
We say f is locally linear (or differentiable) at a if f(a + ∆x) − f(a) and dfa (∆x) are
indistinguishable in the sense that their difference vanishes to greater than first order
in ∆x . By contrast, we say f looks linear locally at a if the image grid of f near a
is indistinguishable from the image grid of dfa under sufficient magnification. To
make the difference clear, consider the quadratic map q and its derivative at a point
a = (a, b):
\[
q:\ \begin{cases} u = x^2 - y^2,\\ v = 2xy; \end{cases}
\qquad
dq_{a} = \begin{pmatrix} 2a & -2b \\ 2b & 2a \end{pmatrix}.
\]
Because the derivative exists everywhere, q is locally linear everywhere. Moreover,
q also looks like its derivative under sufficient magnification as long as a ≠ 0. But
at the origin, q doubles angles and squares distances, and continues to do so at any
magnification. No linear map does this. Thus in no open neighborhood of the origin
does q look like any linear map, and certainly not its derivative, which is the zero
map. (There is no contradiction, of course, because the difference between q and its
derivative vanishes to second order at the origin.)
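The angle-doubling and distance-squaring behavior is easy to check numerically. In complex notation q is just z ↦ z², since (x + iy)² = (x² − y²) + i(2xy). The following sketch (illustrative Python, not part of the text) confirms the polar description r e^{it} ↦ r² e^{2it} at a point very close to the origin:

```python
import cmath

# In complex notation the quadratic map q(x, y) = (x^2 - y^2, 2xy) is z -> z^2.
# Polar form: r e^{it} maps to r^2 e^{2it}, so angles double and distances
# square -- at every scale, which is why no magnification makes q look linear.

def q(z):
    return z * z

z = cmath.rect(0.001, 0.3)        # a point very close to the origin
w = q(z)
r_in, t_in = cmath.polar(z)
r_out, t_out = cmath.polar(w)
# r_out == r_in**2 and t_out == 2 * t_in (up to rounding)
```

Because the identity holds for every r, shrinking the neighborhood never removes the angle doubling, in line with the text's argument.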
Quite generally, a locally linear map f : Rn → Rn need not look linear at a point;
however (as our example suggests), if the derivative is invertible at that point, the
map will look linear there. In fact, this is the essential geometric content of the inverse function theorem. Here is why. By hypothesis, a linear coordinate change will
transform the derivative into the identity map. The local inverse for f that is provided
by the theorem can be viewed as another coordinate change, one that transforms f
itself into the identity map, at least locally. Thus f must look like its derivative locally because a suitable (composite) coordinate change will transform one into the
other. This leads us, in effect, to gather maps into geometric equivalence classes:
two maps are equivalent if a coordinate change transforms one into another. In other
words, a class consists of different coordinate descriptions of the same geometric
action. The invertible maps together make up a single class. (Geometrically, there is
only one invertible map!)
For parametrized surfaces f : R2 → R3 , or more generally for maps in which the
source and target have different dimensions, invertibility of the derivative is out of
the question. The appropriate notion here is maximal rank. Then, at a point where
the derivative has maximal rank, the implicit function theorem implies that the map
and its derivative once again look alike in a neighborhood of that point. Coordinate
changes convert both into the standard form of either a linear injection or a linear
projection. For each pair of source–target dimensions, maps whose derivatives have
maximal rank at a point make up a single local geometric class.
A nonlinear map can certainly have other local geometric forms; for example,
a plane map can fold the plane on itself or it can wrap it doubly on itself (like q,
above). The inverse and implicit function theorems imply that all such local geometric forms must therefore occur at points where the derivative fails to have maximal
rank. Such points are said to be singular. The analysis of the singularities of a differentiable map is an active area of current research that was initiated by Hassler
Whitney half a century ago [20] and guided to a mature form by René Thom in the
following decades. Although this book is not about map singularities, its geometric
approach reflects the way singularities are analyzed. There are further connections.
In 1975, I wrote a survey article on singularities of plane maps [2]; one of my aims
here is to provide more detailed background for that article.
We do analyze singularities in one familiar setting: a real-valued function f . The
target dimension is now 1, so only the zero derivative fails to have maximal rank.
This happens precisely at a critical point, where all the linear terms in the Taylor
expansion of f vanish. So we turn to the quadratic terms, that is, to the quadratic
form Q defined by the Hessian matrix of the function at that point. Taylor’s theorem
assures us that the Hessian form approximates f near the critical point (up to terms
that vanish to third order). We ask: does f also look like its Hessian form near that
point?
Some condition is needed; for example, f (x, y) = x2 − y4 does not look like its
quadratic part Q(x, y) = x2 near the origin. Morse’s lemma provides the condition:
f does look like Q near a critical point if the Hessian matrix has maximal rank.
That is to say, a local coordinate change in a neighborhood of the critical point will
transform the original function into its Hessian form, in effect removing all higher-order terms in the Taylor expansion of f .
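The failure in the example above can be seen in a quick numeric check (illustrative Python, not from the text): near the origin f(x, y) = x² − y⁴ takes both signs, while its quadratic part Q(x, y) = x² is never negative, so no coordinate change can convert one into the other.

```python
# f(x, y) = x^2 - y^4 versus its quadratic part Q(x, y) = x^2 at the origin.
# Q >= 0 everywhere, but f < 0 whenever |x| < y^2 -- at points arbitrarily
# close to the origin -- so f cannot "look like" Q in any neighborhood.

f = lambda x, y: x ** 2 - y ** 4
Q = lambda x, y: x ** 2

eps = 1e-3
inside = f(0.0, eps)           # on the y-axis, f is negative: -eps^4
outside = f(2 * eps ** 2, eps) # outside the region |x| < y^2, f is positive
```

The sign change of f in every neighborhood of the origin is exactly what the (degenerate) Hessian form x² fails to reproduce, matching the text's point that Morse's lemma needs a maximal-rank Hessian.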
A nondegenerate Hessian therefore has an invariant geometric meaning, but only
at a critical point. At a noncritical point, even concavity, for example, fails to be
preserved under all coordinate changes. More generally, if linear terms are present
and “robust” in the Taylor expansion of f at a point (i.e., they define a linear map
that has maximal rank), quadratic and higher terms have no invariant geometric
meaning. This is the implicit function theorem speaking once again.
By asking whether a map looks like the beginning of its Taylor series, we are
led to see the underlying geometric character of the inverse and implicit function
theorems and Morse’s lemma. The question thus provides a way to organize and
unify much of our subject and, in so doing, to bring out its simple beauty.
Let me now describe the geometric approach this book takes to another of its central
themes: the change of variables formula for integrals.
To fix ideas, suppose we have a double integral, so the change of variables is
an invertible map of (a portion of) the plane. Locally, that map looks linear. Each
linear map has a characteristic factor by which it magnifies areas. To a nonlinear
map we can therefore assign a local area magnification factor at each point, the
area magnification factor of its local linear approximation at that point. This is the
Jacobian.
In the simplest case, the integrand is identically equal to 1, and the value of the
integral is just the area of the domain of integration. A change of variables maps
that domain to a new one with, in general, a different area. If the map is linear, and
has area multiplication factor M, the new area is just M times the old (or the integral
of the constant M over the old domain). However, if the map is nonlinear, then we
need to proceed in steps. First subdivide the old domain into small regions on each
of which the local area magnification factor M (the Jacobian) is essentially constant.
The area of the image of one small region is then approximately the product of its
own area and the local value of M, and the area of the entire image is approximately
the sum of those individual products. To get better approximations, make finer and
finer subdivisions; in the limit, we have the area of the new region as the integral of
the local area multiplication function M over the original domain. For an arbitrary
integrand, transform the integral the same way: multiply the integrand by M. All of
this is easily generalized from two to n variables; areas become n-volumes.
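The limiting process just described can be imitated numerically. In this illustrative Python sketch (not from the text; the region, the map, and the helper names are chosen for the example) the map is the quadratic q(x, y) = (x² − y², 2xy), whose Jacobian is 4(x² + y²). Summing the local magnification factor over a fine subdivision of the square [1, 2] × [1, 2] approximates the area of the image; here the limiting integral can also be evaluated in closed form, giving 56/3.

```python
# Area of the image of D = [1,2] x [1,2] under q(x, y) = (x^2 - y^2, 2xy),
# estimated as a sum over small squares of (Jacobian at center) * (area of square).
# The Jacobian of q is det(2x -2y; 2y 2x) = 4(x^2 + y^2).

def jacobian(x, y):
    return 4 * (x * x + y * y)

def image_area(n=200):
    h = 1.0 / n                      # side length of each small square
    total = 0.0
    for i in range(n):
        for j in range(n):
            x = 1.0 + (i + 0.5) * h  # center of the (i, j) small square
            y = 1.0 + (j + 0.5) * h
            total += jacobian(x, y) * h * h
    return total

estimate = image_area()
exact = 56.0 / 3.0                   # integral of 4(x^2 + y^2) over D
```

Refining the subdivision (larger n) drives the estimate toward the integral of the magnification factor, exactly the limit described above.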
A typical proof of the change of variables formula proceeds one dimension at a
time; this tends to submerge the geometric force and meaning of the Jacobian M. By
contrast, my proof in Chapter 9 follows the geometric argument above. I found it in
an article by Jack Schwartz ([16]), who remarks that his proof appears to be new; he
could not find a similar argument in any of the standard calculus texts of the time.
*   *   *
One way I have chosen to stress the geometric is by concentrating on what happens
in two and three dimensions, where we can construct—with the help of a computer
algebra system as needed—illustrations that help us “see” theorems. And this is not
a bad thing: the words theorem and theatre stem from the same Greek root θέα,
“the act of seeing.” In a literal sense, a theorem is “that which is seen.” But the eye,
and the mind’s eye no less, can play tricks. To be certain a theorem is true, we know
we must test what we see. Here is where proof comes in: to prove means “to test.”
The cognate form to probe makes this more evident; probate tests the validity of a
will. Ordinary language supports this meaning, too: yeast is “proofed” before it is
used to leaven bread dough, “the proof of the pudding is in the eating,” and “the
exception proves the rule” because it tests how widely the rule applies.
In much of mathematical exposition, proving is given more weight than seeing.
Jean Dieudonné’s seminal Foundations of Modern Analysis [4] is a good example. In
the preface he argues for the “necessity of a strict adherence to axiomatic methods,
with no appeal to ‘geometric intuition’, at least in the formal proofs: a necessity
which we have emphasized by deliberately abstaining from introducing any diagram
in the book.” As prevalent as it is, the axiomatic tradition is not the only one. René
Thom, a contemporary of Dieudonné and Bourbaki, followed a distinctly different
geometric tradition in framing the study of map singularities, a study whose outlines
have guided the development of this book. Although proof may be given a different
weight in the geometric tradition, it still has a crucial role. I believe that a student
who sees a theorem more fully has all the more reason to test its validity.
But there is, of course, usually no reason to restrict the proofs themselves to
low dimensions. For example, my proof of the inverse function theorem (Chapter 5,
p. 169ff.) is for maps on Rn . It elaborates upon Serge Lang’s proof for maps on
infinite-dimensional Banach spaces [10, 11]. Incidentally, Lang points out that, in
finite dimensions, the inverse function theorem is often proven using the implicit
function theorem, but that does not work in infinite dimensions. Lang gives the
proofs the other way around, and I do the same. Furthermore, because there is so
much instructive geometry associated with implicit functions, I provide not just a
general proof but a sequence of gradually more complicated ones (Chapter 6) that
fold in the growing geometric complexity that additional variables entail. I think
the student benefits from seeing all this put together. Other important examples of
n-dimensional proofs of theorems that are visualized primarily in R2 are Taylor’s
theorem (Chapter 3), the chain rule (Chapter 4), and Morse’s lemma (Chapter 7).
The definition of the derivative gets the same kind of treatment as the proof of
the implicit function theorem, and for the same reason. Unlike the other topics,
integral proofs are mainly restricted to two dimensions. One reason is that the many
technical details about Jordan content are easiest to see there. Another reason is that
the extension to higher dimensions is straightforward and can be carried out by the
student.
At a couple of points in the text, I provide brief Mathematica commands that
generate certain 3D images. Because programs like Mathematica are always being
updated (and the Mathematica 5 code I have used in the text has already been superseded), details are bound to change. My aim has simply been to indicate how
easy it is to generate useful images. I have also included a simple BASIC program
that calculates a Riemann sum for a particular double integral. Again, it is not my
aim to advocate for a particular computational tool. Nevertheless, I do think it is
important for students to see that programs do have a role—integrals arise out of
computations—and that even a simple program can increase our power to estimate
the value of an integral.
To help keep the focus on geometry, I have excluded proofs of nearly all the
theorems that are associated with introductory real analysis (e.g., those concerning
uniform continuity, convergence of sequences of functions, or equality of mixed partial derivatives). I consider real analysis to be a different course, one that is treated
thoroughly and well in a variety of texts at different levels, including the classics of
Rudin [15] and Protter and Morrey [14]. To be sure, I am recalibrating the balance
here between that which is seen and that which is tested.
This book does not attempt to be an exhaustive treatment of advanced calculus. Even
so, it has plenty of material for a year-long course, and it can be used for a variety
of semester courses. (As I was writing, it occurred to me that a course is like a walk
in the woods—a personal excursion—but a text must be like a map of the whole
woodland, so that others can take walks of their own choosing.) My own course
goes through the basics in Chapters 2–4 and then draws mainly on Chapters 9–11.
A rather different one could go from the basics to inverse and implicit functions
(Chapters 5 and 6), in preparation for a study of differentiable manifolds. The pace
of the book, with its numerous visual examples to introduce new ideas and topics,
is particularly suited for independent study. From start to finish, illustrations carry
the same weight as text and the two are thoroughly interwoven. The eye has an
important role to play.
In addition to the CUPM Proceedings [12] that contain the lectures of Gleason
and Steenrod, I have been strongly influenced by the content and tone of the beautiful three-volume Introduction to Calculus and Analysis [3] by Richard Courant and
Fritz John. In particular, I took their approach to integration via Jordan content. At
a different level of detail, I adopted their phrase order of vanishing as a replacement for the less apt order of magnitude for vanishing quantities. For the theorems
connecting Riemann and Darboux integrals in Chapter 8, I relied on Protter and
Morrey [14]; my own contribution was a number of figures to illustrate their proofs.
It was Gleason who argued that the Morse lemma has a place in the undergraduate
advanced calculus course. I was fully persuaded after my student Stephanie Jakus
(Smith ’05) wrote her senior honors thesis on the subject.
The Feynman Lectures on Physics [6] have had a pervasive influence on this
book. First of all, Feynman’s vision of his subject, and his flair for explanation, is
awe-inspiring. I felt I could find no better introduction to surface integrals than the
context of fluid flux. Because physics works with two-dimensional surfaces in R3 ,
I also felt justified in concentrating my treatment of surface integrals on this case.
I believe students will have learned all they need in order to deal with the integral
of a k-form over a k-dimensional parametrized surface patch in Rn , for arbitrary
k < n. In providing a physical basis for the curl, the Lectures prodded me to try to
understand it geometrically. The result is a discussion of the curl (in Chapter 11)
that—like the discussion of the Morse lemma—has not previously appeared in an
advanced calculus text, as far as I am aware.
I thank my students over the last decade for their curiosity, their perseverance,
their interest in the subject, and their support. I especially thank Anne Watson
(Smith ’09), who worked with me to produce and check exercises. My editor at
Springer, Kaitlin Leach, makes the rough places smooth; I am most fortunate to
have worked with her. I am grateful to Smith College for its generous sabbatical
policy; I wrote much of the book while on sabbatical during the 2005–2006 academic year. My deepest debt is to my teacher, Linus Richard Foy, who stimulated
my interest in both mathematics and teaching. In his advanced calculus course, I
often caught myself trying to follow him along two tracks simultaneously: what he
was saying, and how he was saying it.
Amherst, MA
June 2010
James Callahan
Contents
1   Starting Points . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Substitution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Work and path integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.3 Polar coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2   Geometry of Linear Maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.1 Maps from R2 to R2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.2 Maps from Rn to Rn . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
2.3 Maps from Rn to Rp, n ≠ p . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
3   Approximations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
3.1 Mean-value theorems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
3.2 Taylor polynomials in one variable . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
3.3 Taylor polynomials in several variables . . . . . . . . . . . . . . . . . . . . . . . . 90
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
4   The Derivative . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
4.1 Differentiability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
4.2 Maps of the plane . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
4.3 Parametrized surfaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
4.4 The chain rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
5   Inverses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
5.1 Solving equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
5.2 Coordinate changes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
5.3 The inverse function theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
6   Implicit Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
6.1 A single equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
6.2 A pair of equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
6.3 The general case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
7   Critical Points . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
7.1 Functions of one variable . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
7.2 Functions of two variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224
7.3 Morse’s lemma . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 265
8   Double Integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269
8.1 Example: gravitational attraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269
8.2 Area and Jordan content . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276
8.3 Riemann and Darboux integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 295
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313
9   Evaluating Double Integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 317
9.1 Iterated integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 317
9.2 Improper integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 326
9.3 The change of variables formula . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 337
9.4 Orientation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 353
9.5 Green’s theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 364
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 378
10 Surface Integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 387
10.1 Measuring flux . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 387
10.2 Surface area and scalar integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 405
10.3 Differential forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 423
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 443
11 Stokes’ Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 449
11.1 Divergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 449
11.2 Circulation and vorticity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 460
11.3 Stokes’ theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 482
11.4 Closed and exact forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 492
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 509
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 515
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 517
Chapter 1
Starting Points
Abstract Our goal in this book is to understand and work with integrals of functions
of several variables. As we show, the integrals we already know from the introductory calculus courses give us a basis for the understanding we need. The key idea
for our future work is change of variables. In this chapter, we review how we use
a change of variables to compute many one-variable integrals as well as path integrals and certain double integrals that can be evaluated by making a change from
Cartesian to polar coordinates.
1.1 Substitution
There are two kinds of integral substitutions. As an example of the first kind, consider the familiar integral
\[
\int_0^b \frac{dx}{1+x^2}.
\]
We know that the substitution x = tan s is helpful here because 1 + x2 = 1 + tan2 s =
sec2 s and dx = sec2 s ds. Therefore,
\[
\int \frac{dx}{1+x^2} = \int \frac{\sec^2 s\,ds}{1+\tan^2 s} = \int ds = s = \arctan x,
\]
and we then have
\[
\int_0^b \frac{dx}{1+x^2} = \arctan x\,\Big|_0^b = \arctan b.
\]
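This antiderivative is easy to confirm numerically (illustrative Python, not part of the text): a midpoint Riemann sum for the integral over [0, b] should agree with arctan b.

```python
import math

# Midpoint Riemann sum for the integral of 1/(1 + x^2) over [0, b],
# compared with the antiderivative arctan x evaluated at the endpoints.

def midpoint_sum(f, a, b, n=10_000):
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) for k in range(n)) * h

b = 3.0
numeric = midpoint_sum(lambda x: 1 / (1 + x * x), 0.0, b)
exact = math.atan(b)
```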
As an example of the second kind of substitution, take the apparently similar
integral
\[
\int_0^b \frac{x\,dx}{(1+x^2)^p}, \qquad p \neq 1.
\]
The factor x in the numerator suggests the substitution u = 1 + x2 . Then du = 2x dx
and
\[
\int \frac{x\,dx}{(1+x^2)^p}
= \int \frac{1}{2}\,\frac{du}{u^p}
= \frac{1}{2}\cdot\frac{u^{-p+1}}{-p+1}
= \frac{-1}{2(p-1)(1+x^2)^{p-1}}.
\]
Thus,
\[
\int_0^b \frac{x\,dx}{(1+x^2)^p}
= \frac{1}{2(p-1)}\left(1 - \frac{1}{(1+b^2)^{p-1}}\right).
\]
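The closed form can be checked the same way (illustrative Python, not part of the text). For p = 2 and b = 1 the formula gives (1/2)(1 − 1/2) = 1/4.

```python
# Midpoint Riemann sum for the integral of x/(1 + x^2)^p over [0, b],
# compared with the closed form (1/(2(p-1))) * (1 - 1/(1 + b^2)^(p-1)).

def midpoint_sum(f, a, b, n=10_000):
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) for k in range(n)) * h

p, b = 2, 1.0
numeric = midpoint_sum(lambda x: x / (1 + x * x) ** p, 0.0, b)
exact = (1 - 1 / (1 + b * b) ** (p - 1)) / (2 * (p - 1))
```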
In these examples, integration is done with the fundamental theorem of calculus.
That is, we use the fact that the indefinite integral of a given function f ,
\[
F = \int f(x)\,dx,
\]
is an antiderivative of f : F ′ (x) = f (x). However, the substitutions we used to find
the two antiderivatives are different in important ways.
We call the first an example of a pullback substitution, for reasons that become
clear in a moment. In a pullback, we express the variable x itself as some differentiable function x = ϕ (s) of a new variable s. Then dx = ϕ ′ (s) ds and we get
\[
\int f(x)\,dx = \int f(\varphi(s))\,\varphi'(s)\,ds = \Phi,
\]
where Φ (s) is an antiderivative of f (ϕ (s)) ϕ ′ (s). Here the aim is to choose the
function ϕ so the antiderivative Φ becomes evident. The indefinite integral we want
is then F(x) = Φ (ϕ −1 (x)), where s = ϕ −1 (x) is the inverse of the function x = ϕ (s).
(In our example, ϕ is the tangent function and ϕ −1 is the arctangent function; Φ (s)
is just s.) We also use ϕ −1 to get the upper and lower endpoints of the definite
integral:
\[
\int_a^b f(x)\,dx
= \int_{\varphi^{-1}(a)}^{\varphi^{-1}(b)} f(\varphi(s))\,\varphi'(s)\,ds.
\]
In the second example, a push-forward substitution, we replace some functional expression g(x) involving x with a new variable u. As with ϕ(s), it takes practice and experience to make an effective choice of g(x): the aim is to be able to write
\[
f(x) = G'(g(x)) \cdot g'(x) \quad\text{or}\quad f(x)\,dx = G'(u)\,du
\]
for a suitable function G′(u). That is, du = g′(x) dx and
\[
\int f(x)\,dx = \int G'(g(x))\,g'(x)\,dx = \int G'(u)\,du = G,
\]
and the antiderivative is F(x) = G(g(x)). In our example,
\[
G'(u) = \frac{1}{2u^p} \quad\text{and}\quad G(u) = \frac{-1}{2(p-1)u^{p-1}}.
\]
Note that we use g (and not g⁻¹) to get the endpoints of the transformed definite integral:
\[
\int_a^b f(x)\,dx = \int_{g(a)}^{g(b)} G'(u)\,du.
\]
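The two endpoint rules can be tested side by side. In the sketch below (an illustration under my own choice of integrator and sample values), the pullback integral runs from ϕ⁻¹(0) to ϕ⁻¹(b) = arctan b, while the push-forward integral runs from g(0) = 1 to g(b) = 1 + b²:

```python
import math

def midpoint_rule(f, a, b, n=100_000):
    """Approximate the integral of f over [a, b] by n midpoint rectangles."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

b = 2.0

# Pullback x = tan(s): the endpoints 0 and b become arctan(0) and arctan(b),
# and the transformed integrand f(tan(s)) * sec^2(s) reduces to the constant 1.
lhs = midpoint_rule(lambda x: 1.0 / (1.0 + x**2), 0.0, b)
rhs = midpoint_rule(lambda s: 1.0, math.atan(0.0), math.atan(b))
print(lhs, rhs)

# Push-forward u = g(x) = 1 + x^2: the endpoints become g(0) = 1 and g(b) = 1 + b^2.
p = 3.0
lhs2 = midpoint_rule(lambda x: x / (1.0 + x**2)**p, 0.0, b)
rhs2 = midpoint_rule(lambda u: 1.0 / (2.0 * u**p), 1.0, 1.0 + b**2)
print(lhs2, rhs2)
```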
To see how the substitutions using ϕ and g are different, and also to see how they got their names, let us think of them as maps:
\[
s \xrightarrow{\ \varphi\ } x \xrightarrow{\ g\ } u.
\]
Then we can say g pushes forward information about the value of x to the variable u, and ϕ pulls back that information to s. Note that the pullback needs to be invertible: without a well-defined ϕ⁻¹, a given value of x may pull back to two or more different values of s, or to none at all. This problem does not arise with g, though.
To complete this section, let us review why the differential changes the way it
does in a substitution. For example, in the pullback x = ϕ (s), why is dx = ϕ ′ (s) ds?
The answer might seem obvious: because dx/ds is just another notation for the
derivative—that is, dx/ds = ϕ ′ (s)—we simply multiply by ds to get dx = ϕ ′ (s) ds.
This is a good mnemonic; however, it is not an explanation, because the expressions
dx and ds have no independent meaning, at least as far as derivatives are concerned.
We must look more carefully at the link between differentials and derivatives.
In a linear function, x = ϕ (s) = ms + b, we usually interpret the coefficient m as
the slope of the graph: ∆x/∆s = m. However, if we rewrite the slope equation in the
form ∆x = m ∆s, it becomes natural to interpret m instead as a multiplier. That is, our
linear map ϕ : s → x multiplies lengths by the factor m: an interval of length ∆s on
the s-axis is mapped to an interval of length ∆x = m ∆s on the x-axis. Furthermore,
when m < 0, ∆s and ∆x have opposite orientations, so ϕ also carries out a “flip.”
(The role of the coefficient as a multiplier rather than as a slope suggests why it is
so commonly represented by the letter “m”.)
When x = ϕ(s) is nonlinear, the slope of the graph (or the slope of its tangent line) varies from point to point. Nevertheless, by fixing our attention on a small neighborhood of a particular point s = s₀, we still have a way to interpret the derivative as a multiplier. To see how this happens, recall first that we assume ϕ is differentiable, so
\[
\varphi'(s_0) = \lim_{\Delta s \to 0} \frac{\Delta x}{\Delta s} = \lim_{s \to s_0} \frac{\varphi(s)-\varphi(s_0)}{s-s_0}.
\]
According to the meaning of a limit, we can make ∆x/∆s as close to ϕ′(s₀) as we wish by making ∆s = s − s₀ sufficiently small; in other words,
\[
\Delta x \approx \varphi'(s_0)\,\Delta s \quad\text{when}\quad \Delta s \approx 0.
\]
To see what this means, focus a microscope at the point (s₀, x₀) and use coordinates ∆s = s − s₀ and ∆x = x − x₀ centered in this window. Then, under sufficient magnification (i.e., with ∆s ≈ 0), ϕ looks like ∆x ≈ ϕ′(s₀) ∆s. We call this the microscope
equation for x = ϕ(s) at s₀; it is linear, and defines the linear approximation of the function ϕ at s₀.

[Figure: the graph of x = ϕ(s) near the point (s₀, x₀), together with a microscope view in which the graph is indistinguishable from the line ∆x ≈ ϕ′(s₀) ∆s.]
Finally, we can say that ϕ is locally linear, in the sense that x = ϕ (s) comes as
close as we wish to its linear approximation ∆x ≈ ϕ ′ (s0 ) ∆s when s is restricted to
a sufficiently small neighborhood of s0 . Thus, because the map ϕ : s → x is locally
linear at s0 , it multiplies lengths (approximately) by ϕ ′ (s0 ) in any sufficiently small
neighborhood of s = s0 .
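A small experiment makes the "multiplier" interpretation concrete. Taking ϕ = tan and s₀ = 0.5 (sample choices of mine, not from the text), the error in ∆x ≈ ϕ′(s₀) ∆s shrinks roughly like ∆s², which is exactly what local linearity predicts:

```python
import math

phi = math.tan
s0 = 0.5
multiplier = 1.0 / math.cos(s0)**2   # phi'(s0) = sec^2(s0), the local length multiplier

for ds in (0.1, 0.01, 0.001):
    dx_actual = phi(s0 + ds) - phi(s0)   # true change in x
    dx_linear = multiplier * ds          # microscope-equation estimate
    print(ds, dx_actual, dx_linear, abs(dx_actual - dx_linear))
# The error column shrinks roughly 100-fold each time ds shrinks 10-fold,
# i.e., the error is of order (delta s)^2.
```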
With the microscope equation, we can now see why the differential transforms the way it does when we make a change of variables in an integral. First of all, a definite integral is defined as a limit of Riemann sums. In the simplest case, a left-endpoint Riemann sum with n equal subintervals, we can set ∆x = (b − a)/n and xi = a + (i − 1)∆x and write
\[
\int_a^b f(x)\,dx = \lim_{n\to\infty} \sum_{i=1}^{n} f(x_i)\,\Delta x.
\]
We think of each term in the sum as the area of a rectangle with height f (xi ) and
base ∆x, as in the figure at the left, below.
[Figure: at the left, the rectangles of a left-endpoint Riemann sum for y = f(x), with heights f(xi), on the partition a = x₁, x₂, …, x_{n+1} = b; at the right, the map x = ϕ(s) pulling this partition back to the generally unequal partition s₁, s₂, …, s_{n+1} of the interval from ϕ⁻¹(a) to ϕ⁻¹(b).]
The figure at the right shows how the substitution x = ϕ(s) pulls back our partition of the interval a ≤ x ≤ b to a partition of ϕ⁻¹(a) ≤ s ≤ ϕ⁻¹(b). We set si = ϕ⁻¹(xi) (i = 1, …, n + 1) and ∆si = si+1 − si (i = 1, …, n). Note that the subintervals ∆si are generally unequal when ϕ is nonlinear. In fact, ∆si ≈ ∆x/ϕ′(si), by the microscope equation. The pullback allows us to write
\[
\sum_{i=1}^{n} f(x_i)\,\Delta x \approx \sum_{i=1}^{n} f(\varphi(s_i))\,\varphi'(s_i)\,\Delta s_i.
\]
By choosing n sufficiently large, we can make every ∆si arbitrarily small and thus
can make these two sums arbitrarily close. Notice that the right-hand side is also a
Riemann sum, in this case for the function f (ϕ (s)) ϕ ′ (s). Therefore, in the limit as
n → ∞, the Riemann sums become integrals and we have the equality
\[
\int_a^b f(x)\,dx = \int_{\varphi^{-1}(a)}^{\varphi^{-1}(b)} f(\varphi(s))\,\varphi'(s)\,ds.
\]
Thus we see that the justification for the transformation dx = ϕ ′ (s) ds of differentials
in integration lies in the transformation ∆x ≈ ϕ ′ (si ) ∆si that the microscope equation
provides for the Riemann sums.
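The near-equality of the two Riemann sums can be observed directly. This sketch (with my own sample f, ϕ, and n, not the text's) builds an equal partition in x, pulls it back through ϕ⁻¹ to an unequal partition in s, and compares the two sums:

```python
import math

# Pullback x = tan(s) applied to the integral of f(x) = 1/(1 + x^2) over [0, 2].
f = lambda x: 1.0 / (1.0 + x**2)
phi, phi_prime, phi_inv = math.tan, lambda s: 1.0 / math.cos(s)**2, math.atan

a, b, n = 0.0, 2.0, 1000
dx = (b - a) / n
x = [a + i * dx for i in range(n + 1)]   # equal subintervals on the x-axis
s = [phi_inv(xi) for xi in x]            # pulled-back, generally unequal partition in s

sum_x = sum(f(x[i]) * dx for i in range(n))                                   # sum of f(x_i) dx
sum_s = sum(f(phi(s[i])) * phi_prime(s[i]) * (s[i + 1] - s[i]) for i in range(n))
print(sum_x, sum_s, math.atan(b))  # all three values are close
```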
The microscope equation ∆x ≈ ϕ ′ (si ) ∆si has one further geometric consequence.
In our Riemann sum for the second integral, the standard way to think about each
term is as the area of a rectangle with height f (ϕ (si )) ϕ ′ (si ) and base ∆si . However,
if we change the proportions and make the height f (ϕ (si )) and the base ϕ ′ (si ) ∆si ,
then we have a rectangle that matches (as closely as we wish) the shape of the
original rectangle with height f (xi ) and base ∆x, because f (xi ) = f (ϕ (si )) and ∆x ≈
ϕ ′ (si ) ∆si .
[Figure: three rectangles compared: one with height f(ϕ(si)) ϕ′(si) and base ∆si, the reshaped one with height f(ϕ(si)) and base ϕ′(si) ∆si, and the original one with height f(xi) and base ∆x; the last two have nearly the same shape.]
To understand why differentials transform the way they do, we worked with a pullback substitution. We get the same result with a push-forward, though the details are different. Our work has led us to several questions that we ask again when we turn to more general integrals that involve functions of several variables: What different kinds of substitutions occur? What role do inverses play? What is the form of a linear approximation? What is the analogue of the local length multiplier? What are differentials, and how do they transform? What is the geometric interpretation of that transformation?
1.2 Work and path integrals
Path integrals are one of the centerpieces of the first multivariable calculus course,
and they are often treated, as we do here, in the context of work.
By definition, a force moving a body from one place to another produces work, and the work done is proportional both to the force applied and to the displacement caused. The simplest formula that captures this idea is
\[
\text{work} = \text{force} \times \text{displacement}.
\]
Although work is a scalar quantity, force and displacement are actually both vectors, and the force is a field, that is, a variable function of position. We must elaborate our simple formula to reflect these facts. Consider a straight-line displacement along some vector ∆x and a constant force field F that acts the same way at every point along ∆x. Only the component of the force that lies in the direction of the displacement does any work; this is the effective force F_eff. We can take all this into account in the new formula
\[
\text{work} = F_{\text{eff}}\,\|\Delta x\|.
\]
The scalar F_eff is the length of the perpendicular projection of F on ∆x. Now, in general, for arbitrary vectors A and B ≠ 0,
\[
\text{length of projection of } A \text{ onto } B = \frac{A \cdot B}{\|B\|}.
\]
Rewriting the length F_eff this way, we see work is still a product; it is the dot (or scalar) product of force F and displacement ∆x, now regarded as vectors:
\[
\text{work} = W = \frac{F \cdot \Delta x}{\|\Delta x\|}\,\|\Delta x\| = F \cdot \Delta x.
\]
In our new formula, W can take negative values (e.g., if F makes an obtuse angle
with ∆x). To see why “negative work” must arise, consider a constant force F that
displaces an object along a path consisting of two straight segments ∆x1 and ∆x2 ,
one immediately followed by the other. We want the total work to be the sum of the
work done on the separate segments:
total work = F · ∆x1 + F · ∆x2.
We say that work is additive on displacements. In particular, if ∆x2 = −∆x1 , then
the total work done is 0. Consequently, the work done by F along −∆x must be the
negative of the work done by the same F along +∆x. Orientation matters: reversing
the displacement reverses the sign of the work done.
Let us introduce coordinates into the plane containing the vectors F and ∆x and write F = (P, Q) and ∆x = (∆x, ∆y). Then
\[
W = F \cdot \Delta x = P\,\Delta x + Q\,\Delta y.
\]
This formula gives the coordinate components of work. It says that, in the
x-direction, there is a force of size P acting along a displacement of size ∆x, doing
work Wx = P ∆x. Similarly, in the y-direction the work done is Wy = Q ∆y. We call
Wx and Wy the components of W in the x- and y-directions. The following definition
summarizes our observations to this point.
Definition 1.1 The work done by the constant force F = (P, Q) in displacing an
object along the line segment ∆x = (∆x, ∆y) is
W = F · ∆x = P ∆x + Q ∆y = Wx + Wy .
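Definition 1.1 is a one-line computation. The following sketch (with illustrative values of mine) also exhibits the sign reversal discussed above: reversing the displacement negates the work.

```python
def work(F, d):
    """Work done by the constant force F = (P, Q) along the displacement d = (dx, dy)."""
    P, Q = F
    dx, dy = d
    return P * dx + Q * dy   # W = P dx + Q dy

F = (3.0, 4.0)
d = (2.0, -1.0)
W = work(F, d)
print(W)                          # 3*2 + 4*(-1) = 2.0
print(work(F, (-d[0], -d[1])))    # reversing the displacement flips the sign: -2.0
```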
Ultimately, we need to deal with variable forces and displacements along curved paths. The prototype is a smooth simple curve C in the plane. We say C is smooth if it is the image of a map (an example of a vector-valued function)
\[
x : [a,b] \to \mathbf{R}^2 : t \mapsto (x(t), y(t))
\]
(a parametrization) whose coordinate functions x(t) and y(t) have continuous derivatives on a ≤ t ≤ b. We call t the parameter. In addition, C is simple if it has no self-intersections, that is, if x is 1–1. The parametrization orders the points on C in the following sense: x(t₁) precedes x(t₂) if t₁ < t₂ (i.e., t₁ precedes t₂ in [a, b]). The ordering gives C an orientation; we write C⃗ to indicate C is oriented. At any point on C where the tangent vector x′(t) is nonzero, it points in the direction of increasing t, and thus also indicates the orientation of C⃗. We can immediately extend these ideas to paths in Rⁿ.
[Figure: a smooth oriented curve C⃗ with a point x(t) and its tangent vector x′(t).]

Definition 1.2 A smooth, simple, oriented curve C⃗ in Rⁿ is the image of a smooth 1–1 map,
\[
x : [a,b] \to \mathbf{R}^n : t \mapsto x(t),
\]
where x′(t) ≠ 0 for all a < t < b. The point x(a) is the start of C⃗ and x(b) is its end.

The simple formula W = F · ∆x for work assumes that the force F is constant, so the location of the base point a of the displacement ∆x is irrelevant. However, if F varies, then the work done will depend on a. We must, in fact, treat a linear displacement as we would any displacement, and provide it with a parametrization. A natural one is
\[
x(t) = a + t\,\Delta x, \qquad 0 \le t \le 1.
\]

[Figure: the parametrization x(t) = a + t ∆x carrying the interval 0 ≤ t ≤ 1 to the displacement ∆x based at a.]

We are now in a position to estimate the work done by a variable force as it displaces an object along a smooth, simple, oriented curve C⃗ in R³. Force is now a (continuous) vector field, that is, a vector-valued function F(x) that varies (continuously) with position x. To estimate the work done, chop the curve into small pieces. When a piece is small enough, it is essentially straight and the force is essentially constant along it. On this piece, the linear formula for work (Definition 1.1) gives a good approximation. By additivity, the sum of these contributions will approximate
the total work done along the whole curve. To get a better estimate, chop the curve
into even smaller pieces.
In more detail, let x₁, x₂, …, x_{k+1} be an ordered sequence of points on C⃗, with x₁ at the start of C⃗ and x_{k+1} at the end. We say {xi} is a partition that respects the orientation of C⃗. Let the oriented curve C⃗i be the portion of C⃗ from xi to xi+1, and let Wi be the work done by F along C⃗i; then, by the additivity of work,
\[
\text{total work done by } F = \sum_{i=1}^{k} W_i.
\]
[Figure: the oriented curve C⃗, partitioned by the points x₁, x₂, …, x_{k+1}; the portion C⃗i from xi to xi+1 is approximated by the displacement ∆xi, along which the force is approximately F(xi).]
Let ∆xi = xi+1 − xi be the linear displacement with base point xi . When ∆xi is sufficiently small, ∆xi will be as close to the curved segment Ci as we wish, because Ci
is smooth. Moreover, F will be nearly constant along ∆xi , because F is continuous.
In particular, F(x) will differ by an arbitrarily small amount from its value F(xi ) at
the base point of ∆xi . Therefore, Wi ≈ F(xi ) · ∆xi . If we choose k large enough and
make each ∆xi sufficiently small, then the sum
\[
\sum_{i=1}^{k} F(x_i) \cdot \Delta x_i
\]
will approximate the total work as closely as we wish. In fact, this last expression
is a Riemann sum for a new kind of integral, called a path, or line, integral that
we now define quite generally for smooth, simple, oriented paths in any dimension.
Note that the definition does not depend on the parametrization of the path.

Definition 1.3 (Smooth path integral) The integral of the continuous vector-valued function F(x) over the smooth, simple, oriented curve C⃗ in Rⁿ is
\[
\int_{\vec{C}} F \cdot dx = \lim_{\substack{k \to \infty \\ \text{mesh} \to 0}} \sum_{i=1}^{k} F(x_i) \cdot \Delta x_i,
\]
if the limit exists when taken over all ordered partitions x₁, x₂, …, x_{k+1} of C⃗ with mesh = maxᵢ ‖∆xᵢ‖ and ∆xi = xi+1 − xi, i = 1, …, k.
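The limit in Definition 1.3 can be approximated directly from an ordered partition. In the sketch below, the field F(x, y) = (−y, x) and the semicircular path are sample choices of mine; for this field on the upper half of the unit circle the exact value of the integral is π, and the partition sum lands close to it:

```python
import math

def F(x, y):
    return (-y, x)   # a sample continuous vector field

def point(t):
    return (math.cos(t), math.sin(t))   # upper half of the unit circle, 0 <= t <= pi

k = 10_000
pts = [point(math.pi * i / k) for i in range(k + 1)]   # an ordered partition of the curve

total = 0.0
for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
    P, Q = F(x0, y0)                        # force at the base point x_i of the segment
    total += P * (x1 - x0) + Q * (y1 - y0)  # F(x_i) . delta x_i

print(total, math.pi)  # the partition sum is close to the exact value, pi
```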
We can now define a more general collection of integration paths. If we allow the
start and end of C to coincide (the tangent directions need not agree) and there are