5
Detection of Regions of Interest
Although a physician or a radiologist, of necessity, will carefully examine
an image on hand in its entirety, more often than not, diagnostic features of
interest manifest themselves in local regions. It is uncommon that a condition
or disease will alter an image over its entire spatial extent. In a screening
situation, the radiologist scans the entire image and searches for features
that could be associated with disease. In a diagnostic situation, the medical
expert concentrates on the region of suspected abnormality, and examines its
characteristics to decide if the region exhibits signs related to a particular
disease.
In the CAD environment, one of the roles of image processing would be to
detect the region of interest (ROI) for a given, speci c, screening or diagnostic
application. Once the ROIs have been detected, the subsequent tasks would
relate to the characterization of the regions and their classi cation into one
of several categories. A few examples of ROIs in di erent biomedical imaging
and image analysis applications are listed below.
Cells in cervical-smear test images (Papanicolaou or Pap-smear test)
272, 273].
Calci cations in mammograms 274].
Tumors and masses in mammograms 275, 276, 277].
The pectoral muscle in mammograms 278].
The breast outline or skin-air boundary in mammograms 279].
The broglandular disc in mammograms 280].
The air-way tree in lungs.
The arterial tree in lungs.
The arterial tree of the left ventricle, and constricted parts of the same
due to plaque development.
Segmentation is the process that divides an image into its constituent parts,
objects, or ROIs. Segmentation is an essential step before the description,
recognition, or classi cation of an image or its constituents. Two major approaches to image segmentation are based on the detection of the following
characteristics:
© 2005 by CRC Press LLC
363
364
Biomedical Image Analysis
Discontinuity | Abrupt changes in gray level (corresponding to edges)
are detected.
Similarity | Homogeneous parts are detected, based on gray-level
thresholding, region growing, and region splitting/merging.
Depending upon the nature of the images and the ROIs, we may attempt to
detect the edges of the ROIs (if distinct edges are present), or we may attempt
to grow regions to approximate the ROIs. It should be borne in mind that,
in some cases, an ROI may be composed of several disjoint component areas
(for example, a tumor that has metastasized into neighboring regions and
calci cations in a cluster). Edges that are detected may include disconnected
parts that may have to be matched and joined. We shall explore several
techniques of this nature in the present chapter.
Notwithstanding the stated interest in local regions as above, applications
do exist where entire images need to be analyzed for global changes in patterns: for example, changes in the orientational structure of collagen bers in
ligaments (see Figure 1.8), and bilateral asymmetry in mammograms (see Section 8.9). Furthermore, in the case of clustered calci cations in mammograms,
cells in cervical smears, and other examples of images with multicomponent
ROIs, analysis may commence with the detection of single units of the pattern
of interest, but several such units present in a given image may need to be
analyzed, separately and together, in order to reach a decision regarding the
case.
5.1 Thresholding and Binarization
If the gray levels of the objects of interest in an image are known from prior
knowledge, or can be determined from the histogram of the given image, the
image may be thresholded to detect the features of interest and reject other
details. For example, if it is known that the objects of interest in the image
have gray-level values greater than L1 , we could create a binary image for
display as
n) L1
(5.1)
g(m n) = 0255 ifif ff ((m
m n) L
1
where f (m n) is the original image g(m n) is the thresholded image to be
displayed and the display range is 0 255]. See also Section 4.4.1.
Methods for the derivation of optimal thresholds are described in Sections
5.4.1, 8.3.2, and 8.7.2.
Example: Figure 5.1 (a) shows a TEM image of a ligament sample demonstrating collagen bers in cross-section see Section 1.4. Inspection of the histogram of the image (shown in Figure 2.12) shows that the sections of the
© 2005 by CRC Press LLC
Detection of Regions of Interest
365
collagen bers in the image have gray-level values less than about 180 values
greater than this level represent the brighter background in the image. The
histogram also indicates the fact that the gray-level ranges of the collagenber regions and the background overlap signi cantly. Figure 5.1 (b) shows a
thresholded version of the image in (a), with all pixels less than 180 appearing
in black, and all pixels above this level appearing in white. This operation
is the same as the thresholding operation given by Equation 5.1, but in the
opposite sense. Most of the collagen ber sections have been detected by the
thresholding operation. However, some of the segmented regions are incomplete or contain holes, whereas some parts that appear to be separate and
distinct in the original image have been merged in the result. An optimal
threshold derived using the methods described in Sections 5.4.1, 8.3.2, and
8.7.2 could lead to better results.
5.2 Detection of Isolated Points and Lines
Isolated points may exist in images due to noise or due to the presence of
small particles in the image. The detection of isolated points is useful in noise
removal and the analysis of particles. The following convolution mask may
be used to detect isolated points 8]:
2
;1
4 ;1
;1
3
;1 ;1
8 ;1 5
;1 ;1
:
(5.2)
The operation computes the di erence between the current pixel at the center
of the mask and the average of its 8-connected neighbors. (The mask could
also be seen as a generalized version of the Laplacian mask in Equation 2.83.)
The result of the mask operation could be thresholded to detect isolated pixels
where the di erence computed would be large.
Straight lines or line segments oriented at 0o 45o 90o , and 135o may be
detected by using the following 3 3 convolution masks 8]:
2
;1
4 2
3 2
;1
2 5 4 ;1
;1 ;1
2
;1 ;1 ;1
2
;1
4 ;1
3 2
;1
3
2
2 ;1 5
2 ;1 ;1
3
2 ;1
2 ;1 ;1
2 ;1 5 4 ;1 2 ;1 5 :
(5.3)
;1
2 ;1
;1 ;1
2
A line may be said to exist in the direction for which the corresponding mask
provides the largest response.
© 2005 by CRC Press LLC
366
Biomedical Image Analysis
(a)
FIGURE 5.1
(b)
(a) TEM image of collagen bers in a scar-tissue sample from a rabbit ligament
at a magni cation of approximately 30 000. See also Figure 1.5. Image
courtesy of C.B. Frank, Department of Surgery, University of Calgary. See
Figure 2.12 for the histogram of the image. (b) Image in (a) thresholded at
the gray level of 180.
© 2005 by CRC Press LLC
Detection of Regions of Interest
367
5.3 Edge Detection
One of the approaches to the detection of an ROI is to detect its edges. The
HVS is particularly sensitive to edges and gradients, and some theories and
experiments indicate that the detection of edges plays an important role in
the detection of objects and analysis of scenes 122, 281, 282].
In Section 2.11.1 on the properties of the Fourier transform, we saw that
the rst-order derivatives and the Laplacian relate to the edges in the image. Furthermore, we saw that these space-domain operators have equivalent
formulations in the frequency domain as highpass lters with gain that is
proportional to frequency in a linear or quadratic manner. The enhancement
techniques described in Sections 4.6 and 4.7 further strengthen the relationship between edges, gradients, and high-frequency spectral components. We
shall now explore how these approaches may be extended to detect the edges
or contours of objects or regions.
(Note: Some authors consider edge extraction to be a type of image enhancement.)
5.3.1 Convolution mask operators for edge detection
An edge is characterized by a large change in the gray level from one side
to the other, in a particular direction dependent upon the orientation of the
edge. Gradients or derivatives measure the rate of change, and hence could
serve as the basis for the development of methods for edge detection.
The rst derivatives in the x and y directions, approximated by the rst
di erences, are given by (using matrix notation)
fyb (m n) f (m n) ; f (m ; 1 n)
fxb (m n) f (m n) ; f (m n ; 1)
0
0
(5.4)
where the additional subscript b indicates a backward-di erence operation.
Because causality is usually not a matter of concern in image processing, the
di erences may also be de ned as
fyf (m n) f (m + 1 n) ; f (m n)
fxf (m n) f (m n + 1) ; f (m n)
0
0
(5.5)
where the additional subscript f indicates a forward-di erence operation. A
limitation of the operators as above is that they are based upon the values
of only two pixels this makes the operators susceptible to noise or spurious
pixel values. A simple approach to design robust operators and reduce the
sensitivity to noise is to incorporate averaging over multiple measurements.
© 2005 by CRC Press LLC
368
Biomedical Image Analysis
Averaging the two de nitions of the derivatives in Equations 5.4 and 5.5, we
get
fya (m n) 0:5 f (m + 1 n) ; f (m ; 1 n)]
fxa (m n) 0:5 f (m n + 1) ; f (m n ; 1)]
0
0
(5.6)
where the additional subscript a indicates the inclusion of averaging.
In image processing, it is also desirable to express operators in terms of
odd-sized masks that may be centered upon the pixel being processed. The
Prewitt operators take these considerations into account with the following
3 3 masks for the horizontal and vertical derivatives Gx and Gy , respectively:
2
3
;1 0 1
Gx : 4 ;1 0 1 5 :
;1
0 1
(5.7)
2
3
;1 ;1 ;1
Gy : 4 0 0 0 5 :
(5.8)
1 1 1
The Prewitt operators use three di erences across pairs of pixels in three
rows or columns around the pixel being processed. Due to this fact, and due
to the scale factor of 0:5 in Equation 5.6, in order to derive the exact gradient,
the results of the Prewitt operators should be divided by 3 2 , where
is the sampling interval in x and y however, this step could be ignored if the
result is scaled for display or thresholded to detect edges.
In order to accommodate the orientation of the edge, a vectorial form of
the gradient could be composed as
Gf (m n) = Gfx (m n) + j Gfy (m n)
kGf (m
6
where
and
q
n)k = G2fx (m n) + G2fy (m n)
n)
Gf (m n) = tan;1 GGfy ((m
fx m n)
Gfx (m n) = (f Gx )(m n)
Gfy (m n) = (f Gy )(m n):
(5.9)
(5.10)
(5.11)
If the magnitude is to be scaled for display or thresholded for the detection of edges, the square-root operation may be dropped, or the magnitude
approximated as jGfx j + jGfy j in order to save computation.
© 2005 by CRC Press LLC
Detection of Regions of Interest
369
The Sobel operators are similar to the Prewitt operators, but include larger
weights for the pixels in the row or column of the pixel being processed as
2
3
;1 0 1
Gx : 4 ;2 0 2 5
(5.12)
;1 0 1
2
3
;1 ;2 ;1
Gy : 4 0 0 0 5 :
(5.13)
1 2 1
Edges oriented at 45o and 135o may be detected by using rotated versions
of the masks as above. The Prewitt operators for the detection of diagonal
edges are
2
3
0 ;1 ;1
G45o : 4 1 0 ;1 5
(5.14)
1 1 0
and
2
3
;1 ;1
0
G135o : 4 ;1 0 1 5 :
(5.15)
0 1 1
Similar masks may be derived for the Sobel operator.
(Note: The positive and negative signs of the elements in the masks above
may be interchanged to obtain operators that detect gradients in the opposite
directions. This step is not necessary if directions are considered in the range
0o ; 180o only, or if only the magnitudes of the gradients are required.)
Observe that the sum of all of the weights in the masks above is zero.
This indicates that the operation being performed is a derivative or gradient
operation, which leads to zero output values in areas of constant gray level,
and the loss of intensity information.
The Roberts operator uses 2 2 neighborhoods to compute cross-di erences
as
;1 0
0 ;1 :
(5.16)
0 1 and
1 0
The masks are positioned with the upper-left element placed on the pixel
being processed. The absolute values of the results of the two operators are
added to obtain the net gradient:
g(m n) = jf (m + 1 n + 1) ; f (m n)j + jf (m + 1 n) ; f (m n + 1)j (5.17)
with the indices in matrix-indexing notation. The individual di erences may
also be squared, and the square root of their sum taken to be the net gradient.
The advantage of the Roberts operator is that it is a forward-looking operator,
as a result of which the result may be written in the same array as the input
image. This was advantageous when computer memory was expensive and in
short supply.
© 2005 by CRC Press LLC
370
Biomedical Image Analysis
Examples: Figure 5.2 (a) shows the Shapes text image. Part (b) of the
gure shows the gradient magnitude image, obtained by combining, as in
Equation 5.9, the horizontal and vertical derivatives, shown in parts (c) and
(d) of the gure, respectively. The image in part (c) presents high values (positive or negative) at vertical edges only horizontally oriented edges have been
deleted by the horizontal derivative operator. The image in part (d) shows
high output at horizontal edges, with the vertically oriented edges having been
removed by the vertical derivative operator. The test image has strong edges
for most of the objects present, which are clearly depicted in the derivative
images however, the derivative images show the edges of a few objects that
are not readily apparent in the original image as well. Parts (e) and (f) of the
gure show the derivatives at 45o and 135o , respectively the images indicate
the diagonal edges present in the image.
Figures 5.3, 5.4, and 5.5 show similar sets of results for the clock, the
knee MR, and the chest X-ray test images, respectively. In the derivatives
of the clock image, observe that the numeral \1" has been obliterated by
the vertical derivative operator Figure 5.3 (d)], but gives rise to high output
values for the horizontal derivative Figure 5.3 (c)]. The clock image has the
minute hand oriented at approximately 135o with respect to the horizontal
this feature has been completely removed by the 135o derivative operator, as
shown in Figure 5.3 (f), but has been enhanced by the 45o derivative operator,
as shown in Figure 5.3 (e). The knee MR image contains sharp boundaries
that are depicted well in the derivative images in Figure 5.4. The derivative
images of the chest X-ray image in Figure 5.5 indicate large values at the
boundaries of the image, but depict the internal details with weak derivative
values, indicative of the smooth nature of the image.
5.3.2 The Laplacian of Gaussian
Although the Laplacian is a gradient operator, it should be recognized that it
is a second-order di erence operator. As we observed in Sections 2.11.1 and
4.6, this leads to double-edged outputs with positive and negative values at
each edge this property is demonstrated further by the example in Figure 5.6
(see also Figure 4.26). The Laplacian has the advantage of being omnidirectional, that is, being sensitive to edges in all directions however, it is not
possible to derive the angle of an edge from the result. The operator is also
sensitive to noise because there is no averaging included in the operator the
gain in the frequency domain increases quadratically with frequency, causing signi cant ampli cation of high-frequency noise components. For these
reasons, the Laplacian is not directly useful in edge detection.
The double-edged output of the Laplacian indicates an important property
of the operator: the result possesses a zero-crossing in between the positive
and negative outputs across an edge the property holds even when the edge
in the original image is signi cantly blurred. This property is useful in the
development of robust edge detectors. The noise sensitivity of the Laplacian
© 2005 by CRC Press LLC
Detection of Regions of Interest
371
(a)
(b)
(c)
(d)
(e)
(f)
FIGURE 5.2
(a) Shapes test image. (b) Gradient magnitude, display range 0 400] out of
0 765]. (c) Horizontal derivative, display range ;200 200] out of ;765 765].
(d) Vertical derivative, display range ;200 200] out of ;765 765]. (e) 45o
derivative, display range ;200 200] out of ;765 765]. (f) 135o derivative,
display range ;200 200] out of ;765 765].
© 2005 by CRC Press LLC
372
Biomedical Image Analysis
(a)
(b)
(c)
(d)
(e)
(f)
FIGURE 5.3
(a) Clock test image. (b) Gradient magnitude, display range 0 100] out of
0 545]. (c) Horizontal derivative, display range ;100 100] out of ;538 519].
(d) Vertical derivative, display range ;100 100] out of ;446 545]. (e) 45o
derivative, display range ;100 100] out of ;514 440]. (f) 135o derivative,
display range ;100 100] out of ;431 535].
© 2005 by CRC Press LLC
Detection of Regions of Interest
373
(a)
(b)
(c)
(d)
(e)
(f)
FIGURE 5.4
(a) Knee MR image. (b) Gradient magnitude, display range 0 400] out of
0 698]. (c) Horizontal derivative, display range ;200 200] out of ;596 496].
(d) Vertical derivative, display range ;200 200] out of ;617 698]. (e) 45o
derivative, display range ;200 200] out of ;562 503]. (f) 135o derivative,
display range ;200 200] out of ;432 528].
© 2005 by CRC Press LLC
374
Biomedical Image Analysis
(a)
(b)
(c)
(d)
(e)
(f)
FIGURE 5.5
(a) Part of a chest X-ray image. (b) Gradient magnitude, display range
0 50] out of 0 699]. (c) Horizontal derivative, display range ;50 50] out of
;286 573]. (d) Vertical derivative, display range ;50 50] out of ;699 661].
(e) 45o derivative, display range ;50 50] out of ;452 466]. (f) 135o derivative, display range ;50 50] out of ;466 442].
© 2005 by CRC Press LLC
Detection of Regions of Interest
375
profile of edge
first derivative
second derivative
FIGURE 5.6
Top to bottom: A pro le of a blurred object showing two edges, the rst
derivative, and the second derivative (see also Figure 4.26).
© 2005 by CRC Press LLC
376
Biomedical Image Analysis
may be reduced by including a smoothing operator. A scalable smoothing operator could be de ned in terms of a 2D Gaussian function, with the variance
controlling the spatial extent or width of the smoothing function. Combining
the Laplacian and the Gaussian, we obtain the popular Laplacian-of-Gaussian
or LoG operator 8, 122, 281, 282].
Consider the Gaussian speci ed by the function
2
2
(5.18)
g(x y) = ; exp ; x 2 + 2y :
The usual normalizing scale factor has been left out. Taking partial derivatives
with respect to x and y, we obtain
@ 2 g = ; x2 ; 2 exp
4
@x2
2
@ g = ; y2 ; 2 exp
4
@y2
which leads to
r2 g (x
y) = LoG(r) = ; r
; x2 +y2
2
2
; x2 +y2
2
2;2 2
4
2
r2
(5.20)
2 2
p
where r = x2 + y2 . The LoG function is isotropic and has positive and
negative values. Due to its shape, it is often referred to as the Mexican hat
or sombrero.
Figure 5.7 shows the LoG operator in image and mesh-plot formats the
basic Gaussian used to derive the LoG function is also shown for reference.
The Fourier magnitude spectra of the Gaussian and LoG functions are also
shown in the gure. It should be observed that, whereas the Gaussian is a
lowpass lter (which is also a 2D Gaussian in the frequency domain), the LoG
function is a bandpass lter. The width of the lters is controlled by the
parameter of the Gaussian.
Figure 5.8 shows pro les of the LoG and the related Gaussian for two values
of . Figure 5.9 shows the pro les of the Fourier transforms of the functions
in Figure 5.8. The pro les clearly demonstrate the nature of the functions
and their ltering characteristics.
An approximation to the LoG operator is provided by taking the di erence
between two Gaussians of appropriate variances: this operator is known as
the di erence-of-Gaussians or DoG operator 282].
Examples: Figure 5.10 shows the Shapes test image, the LoG of the image
with = 1 pixel, and the locations of the zero-crossings of the LoG of the
image with = 1 pixel and = 2 pixels. The zero-crossings indicate the
locations of the edges in the image. The use of a large value for reduces the
e ect of noise, but also causes smoothing of the edges and corners, as well as
the loss of the minor details present in the image.
© 2005 by CRC Press LLC
exp
(5.19)
;
Detection of Regions of Interest
377
(a)
(b)
(c)
(d)
(e)
(f)
FIGURE 5.7
The Laplacian of Gaussian in (b) image format and (d) as a mesh plot. The
related Gaussian functions are shown in (a) and (c). The size of the arrays
is 51 51 pixels standard deviation = 4 pixels. The Fourier magnitude
spectra of the functions are shown in (e) and (f).
© 2005 by CRC Press LLC
378
Biomedical Image Analysis
0.9
normalized LoG(r) and Gaussian(r)
0.8
0.7
0.6
0.5
0.4
0.3
0.2
0.1
0
−0.1
−25
−20
−15
−10
−5
0
r(x, y)
5
10
15
20
25
5
10
15
20
25
(a)
0.9
normalized LoG(r) and Gaussian(r)
0.8
0.7
0.6
0.5
0.4
0.3
0.2
0.1
0
−0.1
−25
−20
−15
−10
FIGURE 5.8
−5
0
r(x, y)
(b)
Pro les of the Laplacian of Gaussian (solid line) and the related Gaussian
(dashed line) in Figure 5.7. The functions have been normalized to a maximum value of unity. The unit of r is pixels. (a) = 4 pixels. (b) = 2 pixels.
© 2005 by CRC Press LLC
Detection of Regions of Interest
379
(a)
FIGURE 5.9
(b)
Pro les of the Fourier magnitude spectra of the Laplacian of Gaussian (solid
line) and the related Gaussian (dashed line) in Figure 5.7. Both functions
have been normalized to have a maximum value equal to unity. (a) = 4
pixels. (b) = 2 pixels. The zero-frequency point is at the center of the
horizontal axis.
© 2005 by CRC Press LLC
380
Biomedical Image Analysis
Figure 5.11 shows the clock image, its LoG, and the zero-crossings of the
LoG with = 1 pixel and = 2 pixels. The results illustrate the performance
of the LoG operator in the presence of noise.
Figures 5.12, 5.13, and 5.14 show similar sets of results for the myocyte
image, the knee MR image, and the chest X-ray test images. Comparative
analysis of the scales of the details present in the images and the zero-crossings
of the LoG for di erent values of indicates the importance of selecting values
of the parameter in accordance with the scale of the details to be detected.
5.3.3 Scale-space methods for multiscale edge detection
Marr and Hildreth 281, 282] suggested that physical phenomena may be detected simultaneously over several channels tuned to di erent spatial sizes
or scales, with an approach known as the spatial coincidence. An intensity
change that is due to a single physical phenomenon is indicated by zerocrossing segments present in independent channels over a certain range of
scales, with the segments having the same position and orientation in each
channel. A signi cant intensity change indicates the presence of a major event
that is registered as a physical boundary, and is recognized as a single physical phenomenon. The boundaries of a signi cant physical pattern should be
present over several channels, suggesting that the use of techniques based on
zero-crossings generated from lters of di erent scales could be more e ective than the conventional (single-scale) methods for edge detection see, for
example, Figure 5.13.
Zero-crossings and scale-space: The multichannel model for the HVS
283] and the Marr-Hildreth spatial coincidence assumption 281] led to the
development of methods for the detection of edges based upon multiscale
analysis performed with lters of di erent scales. Marr and Hildreth proposed heuristic rules to combine information from the di erent channels in a
multichannel vision model they suggested the use of a bank of LoG lters
with several values of , which may be represented as fr2 g(x y )g, with
> 0.
A method for obtaining information in images across a continuum of scales
was suggested by Witkin 284], who introduced the concept of scale-space. The
method rapidly gained considerable interest, and has been explored further by
several researchers in image processing and analysis 285, 286, 287, 288, 289].
The scale-space (x y ) of an image f (x y) is de ned as the set of all zerocrossings of its LoG:
f (x y )g = f(x y )g j (x y ) = 0
@ 2
@ 2
@x + @y 6= 0
where
(x y ) = fr2 g(x y )
© 2005 by CRC Press LLC
>0
f (x y)g:
(5.21)
(5.22)
Detection of Regions of Interest
381
(a)
(b)
(c)
(d)
FIGURE 5.10
(a) The Shapes test image. (b) The LoG of the image in (a) with = 1 pixel.
(c) Locations of the zero-crossings in the LoG in (b). (d) Locations of the
zero-crossings in the LoG with = 2 pixels.
© 2005 by CRC Press LLC
382
Biomedical Image Analysis
(a)
(b)
(c)
(d)
FIGURE 5.11
(a) The clock test image. (b) The LoG of the image in (a) with = 1 pixel.
(c) Locations of the zero-crossings in the LoG in (b). (d) Locations of the
zero-crossings in the LoG with = 2 pixels.
© 2005 by CRC Press LLC
Detection of Regions of Interest
383
(a)
(b)
(c)
(d)
FIGURE 5.12
(a) Image of a myocyte. (b) The LoG of the image in (a) with = 2 pixels.
(c) Locations of the zero-crossings in the LoG in (b). (d) Locations of the
zero-crossings in the LoG with = 4 pixels.
© 2005 by CRC Press LLC
384
Biomedical Image Analysis
(a)
(b)
(c)
(d)
(e)
(f)
FIGURE 5.13
(a) MR image of a knee. (b) The LoG of the image in (a) with = 2 pixels.
(c) { (f) Locations of the zero-crossings in the LoG with = 1 2 3 and 4
pixels.
© 2005 by CRC Press LLC
Detection of Regions of Interest
385
(a)
(b)
(c)
(d)
FIGURE 5.14
(a) Part of a chest X-ray image. (b) The LoG of the image in (a) with = 2
pixels display range ;150 150] out of ;231 956]. (c) Locations of the zerocrossings in the LoG in (b). (d) Locations of the zero-crossings in the LoG
with = 4 pixels.
© 2005 by CRC Press LLC
386
Biomedical Image Analysis
As the scale varies from 0 to 1, the set f (x y )g forms continuous
surfaces in the (x y ) scale-space.
Several important scale-space concepts apply to 1D and 2D signals. It has
been shown that the scale-space of almost all signals ltered by a Gaussian
determines the signal uniquely up to a scaling constant 285] (except for noisecontaminated signals and some special functions 290]). The importance of
this property lies in the fact that, theoretically, for almost all signals, no
information is lost by working in the scale-space instead of the image domain.
This property plays an important role in image understanding 291], image
reconstruction from zero-crossings 285, 292], and image analysis using the
scale-space approach 288]. Furthermore, it has also been shown that the
Gaussian does not create additional zero-crossings as the scale increases
beyond a certain limit, and that the Gaussian is the only lter with this
desirable scaling behavior 285].
Based on the spatial-coincidence assumption, Witkin 284] proposed a 1D
stability analysis method for the extraction of primitive events that occur over
a large range of scales. The primitive events were organized into a qualitative
signal description representing the major events in the signal. Assuming that
zero-crossing curves do not cross one another (which was later proven to be
incorrect by Katz 293]), Witkin de ned the stability of a signal interval as
the scale range over which the signal interval exists major events could then
be captured via stability analysis. However, due to the complex topological
nature of spatial zero-crossings, it is often di cult to directly extend Witkin's
1D stability analysis method to 2D image analysis. The following problems
a ect Witkin's method for stability analysis:
It has been shown that zero-crossing curves do cross one another 293].
It has been shown that real (authentic) zero-crossings could turn into
false (phantom) zero-crossings as the scale increases 294]. Use of
the complete scale-space (with ranging from 0 to 1) may introduce
errors in certain applications an appropriate scale-space using only a
nite range of scales could be more e ective.
For 2D signals (images), the scale-space consists of zero-crossing surfaces
that are more complex than the zero-crossing curves for 1D signals. The
zero-crossing surfaces may split and merge as the scale varies (decreases
or increases, respectively).
As a consequence of the above, there is no simple topological region associated with a zero-crossing surface, and tracing a zero-crossing surface
across scales becomes computationally di cult.
Liu et al. 295] proposed an alternative de nition of zero-crossing surface
stability in terms of important spatial boundaries. In this approach, a spatial
boundary is de ned as a region of steep gradient and high contrast, and is
well-de ned if it has no neighboring boundaries within a given range. This
© 2005 by CRC Press LLC
Detection of Regions of Interest
387
de nition of spatial boundaries is consistent with the Marr-Hildreth spatialcoincidence assumption. Furthermore, stability maps 288] associated with
the scale-space are used. A relaxation algorithm is included in the process to
generate zero-crossing maps.
In the method of Liu et al. 295], the discrete scale-space approach is used to
construct a representation of a given image in terms of a stability map, which
is a measure of pattern boundary persistence over a range of lter scales. For
a given image f (x y), a set of zero-crossing maps is generated by convolving
the image with the set of isotropic functions r2 g(x y i ), 1 i N . It
was indicated that N = 8 sampled i values ranging from 1 to 8 pixels were
adequate for the application considered. Ideally, one would expect a pattern
boundary to be accurately located over all of the scales. However, it has been
shown 296, 297] that the accuracy of zero-crossing localization dependspupon
the width of the central excitatory region of the lter (de ned as wi = 2 2 i
298]). Chen and Medioni 299] proposed a 1D method for localization of
zero-crossings that works well for ideal step edges and image patterns with
sharp contrast however, the method may not be e ective for the construction
of the spatial scale-space for real-life images with poor and variable contrast.
Instead of directly matching all the zero-crossing locations at a point (x y)
over the zero-crossing maps, Liu et al. proposed a criterion C ( i ) that is
a function of the scale i to de ne a neighborhood in which the matching
procedure is performed at a particular scale:
C ( i) = f(x0 y0 )g j x ; i x0 x + i
y ; i y0 y + i
1
(5.23)
0
0
where (x y ) are the actual locations of the zero-crossings, (x y) is the pixel
location at which the lters are being applied, and is a constant to be
determined experimentally ( = 1 was used by Liu et al.). Therefore, if a
zero-crossing (x y i ) is found in the neighborhood de ned by C ( i ), an
arbitrary constant is assigned to a function Si (x y), which otherwise is
assigned a zero, that is,
if (x y i ) 2 C ( i )
Si (x y) = 0 otherwise
(5.24)
where the subscript i corresponds to the ith scale i .
Applying Equations 5.23 and 5.24 to the set of zero-crossings detected, a set
of adjusted zero-crossing maps fS1 (x y) S2 (x y) : : : SN (x y)g is obtained,
where N is the number of scales. The adjusted zero-crossing maps are used
to construct the zero-crossing stability map (x y) as
(x y) =
N
X
i=1
Si (x y):
(5.25)
The values of (x y) are, in principle, a measure of boundary stability through
the lter scales. Marr and Hildreth 281] and Marr and Poggio 300] suggested
© 2005 by CRC Press LLC