Tải bản đầy đủ (.pdf) (8 trang)

Tài liệu tiếng anh chuyên ngành

Bạn đang xem bản rút gọn của tài liệu. Xem và tải ngay bản đầy đủ của tài liệu tại đây (823.99 KB, 8 trang )

Visualizing the Loss of Diversity in Genetic Programming
Jason M. Daida, David J. Ward, Adam M. Hilss, Stephen L. Long, Mark R. Hodges, and
Jason T. Kriesel
Center for the Study of Complex Systems and Space Physics Research Laboratory
2455 Hayward Avenue
Ann Arbor, Michigan 48109-2143

Abstract- This paper introduces visualization
techniques that allow for a multivariate approach in
understanding the dynamics that underlie genetic
programming (GP). Emphasis is given toward
understanding the relationship between problem
difficulty and the loss of diversity. The visualizations
raise questions about diversity and problem solving
efficacy, as well as the role of the initial population in
determining solution outcomes.
I. INTRODUCTION
There has been a significant amount of evolutionary
computation on the matter of diversity. Part of this interest
extends well into some of the earliest investigations (e.g.,
[1-4]). Work such as these has helped to shape our current
understanding of the need for diversity in genetic and
evolutionary computation: namely, that diversity helps to
escape premature convergence. It suggests that increasing
diversity could lead to more robust problem solving.
However, there are a few instances that argue for a
nuanced understanding of the role of diversity in problem
solving with GP. These include the following:
• Burke et al. [5] systematically examined various measures
of diversity, some of which have been devised in support
of a particular diversity-promoting technique. They


concluded that the relationship between diversity and run
performance is not straightforward. More robust problem
solving is not necessarily a result of more diversity,
depending which definition is used.
• Luke et al. [6] demonstrated that dynamically decreasing
the size of a population over the course of a run might be
a computationally effective method. Their idea and
subsequent method run counter to the assumption that a
large, presumably diverse population needs to be
maintained at all times. It could be argued that Luke et
al.’s work shows that less diversity could lead to more
robust diversity.
Unfortunately, a nuanced understanding of diversity has
been difficult to achieve. This difficulty exists partly
because such studies require large amounts of data for
examination. Typical diversity studies require capturing data
on each individual that is created over the course of a GP
run. Even modest studies that involve relatively small
populations of 500 individuals can generate on the order of
one terabyte of data, if only because multiple trials need to
be run. This difficulty exists also because the dynamics
associated with GP also tends to be multivariate and not
well understood. There are few standard methods for
conducting multivariate analyses over large data sets in
which some of the key variables and interactions are
unknown.
In situations like these, it is subsequently common to
employ data exploration methods—quantitative multivariate
visualization techniques—to glean possible linkages
between possible key variables and interactions.

Unfortunately, such visualization techniques are non-
standard for more than a handful of variables and have not
been available for GP.
This paper builds upon the methods and results of two
key papers [5, 7] that have been published on the subject of
diversity. We do so by incorporating newly developed
techniques for data visualizing in GP. Some of these
techniques have been borrowed from other work (i.e., [8])
for visualizing populations of trees. Other techniques have
been developed specifically by us and are reported here. In
using these visualization techniques, we corroborate what
these two papers have found and reveal for the first time
some of the temporal dynamics that underlie population
histories in GP.
The organization of this paper is as follows. Section 2
details our motivation and our subsequent definitions.
Section 3 summarizes the procedure that was used to
generate our data. Section 4 addresses methods and results
for visualizing data from single runs of GP, while Section 5
addresses methods and results for multiple runs. Section 6
discusses implications of the data visualized. Section 7
concludes this paper.
II. BACKGROUND
A. Motivation
This paper builds from the findings and suggestions of
two studies [5, 7] that focused on understanding diversity in
the context of population dynamics in GP. Our motivations
are described here.
Both works indirectly suggest the possibility for a
multivariate approach to understanding diversity. McPhee

and Hopper’s work suggested that population dynamics
would be of interest, but their analysis featured one-variable
statistics and did not figure in time as a variable. Burke et
al.’s work was almost exclusively based on two-variable
statistics, but they concluded that such an approach did not
0-7803-8515-2/04/$20.00 ©2004 IEEE 1225
allow for them to understand the dynamics of a population.
Consequently, this paper describes quantitative multivariate
visualization techniques to identify possible linkages
between multiple variables. It would serve as a precursor to
a multivariate statistical analysis.
McPhee and Hopper described a rigorous method, but
replication and extension of their method has been difficult.
In comparison to the other measures of diversity reported in
Burke et al.’s work, McPhee and Hopper’s method is
arguably the most rigorous. One could derive other
measures of diversity from a data set generated using
McPhee and Hopper’s method, which is not possible to do
so the other way around. Unfortunately, the analysis
conducted by McPhee and Hopper was based on 20-trial
statistics—it was perhaps unwieldy at the time to do more.
It did not help that their experiments used non-standard GP
settings (i.e., steady-state and the use of a size
normalization threshold). Consequently, this paper used a
form of McPhee and Hopper’s method on a substantially
larger study that allows for 200-trial statistics.
Burke et al. partially replicated McPhee and Hopper’s
work, but not the part that tracked the history of initial
population individuals. The omission is noteable, because
they were looking for early indicators for run performance

and because McPhee and Hopper indicated that only a few
initial population individuals contribute to a final
population. The omission is also understandable, because
the amount of work required to do the bookkeeping of
nodes to initial population individuals is nontrivial.
Nevertheless, we focused on the initial population
individuals because of their potential importance to
understanding GP dynamics.
B. Definitions
There are several terms that are used throughout this
paper. They are all nonstandard, given that much of the
previous literature has been influenced by a Markovian type
of analysis (i.e., focus on the transition from time t
i
to t
i+1
).
In contrast, our analysis has been motivated with the intent
of using the initial population as a predictor of solution
outcomes. Our terms and their associated definitions are
subsequently reflective of this particular intent. These
definitions presume that standard GP is being discussed,
although extensions to other methods in evolutionary
computation are also possible.
• V
0
. Given an initial population P
0
, let every node / vertex
in this population be uniquely identified and labeled to

form a set V
0
. For example, an initial population P
0
of
500 individuals could consists of well over 14,000 nodes,
depending on the type of population initalization scheme
that was used. The membership of V
0
presumes that each
of these 14,000+ nodes is considered as uniquely labeled.
• Ancestor. An ancestor A
i
specifically refers to an
individual from the initial population P
0
. Since all nodes
in the initial population are unique, it follows that an
ancestor A
i
consists of a unique set of nodes A
i
⊇ V
0
that
is mutually exclusive from the set of nodes A
j
, which
characterizes an ancestor A
j

, where i ≠ j. For example, a
three-node tree in P
0
is presumed to be uniquely identified
by a set A = {a
1
, a
2
, a
3
}, given that A ⊇ V
0
and the
elements a
1
, a
2
, and a
3
are not a part of any other tree in
P
0
,.
• Lineage. Lineage L
B
for an individual B at some
generation t
i
, i ≠ 0, specifically refers to the set of
ancestors A that contribute to that individual.

Consequently,
L
B
≡∀A ∈ P
0
: ∃a ∈ A ← A and a ∈ B ← B
{}
.(1)
For example, an individual B can have a lineage that
consists of {A
1
, A
2
, A
3
} if there is some node from each
of these ancestors that exists in B.
• Frequency of Ancestor Occurrence. Frequency of ancestor
occurrence n
A
specifically refers to the number of
individuals in a given population that have ancestor A as
part of their lineage. For example, in the initial
population P
0
, each individual A
i
has a frequency of
ancestor occurance n
A

= 1.
• Surviving Number of Ancestors. Surviving number of
ancestors n
s
for a given population refers to the number of
ancestors that contribute at least one node to that
population. For example, in the initial population P
0
that
is of population size M, n
s
= M.
III. EXPERIMENTAL PROCEDURE
We analyzed GP on a particular, well documented,
tunably difficult test problem (i.e., binomial-3). The
problem been designed as a probe for understanding GP
dynamics and represents an instance of data modeling.
The binomial-3 is discussed in detail in [9, 10]. In brief,
the problem is an instance taken from symbolic regression
and involves solving for the function f(x) = 1 + 3x + 3x
2
+
x
3
. Fitness cases are 50 equidistant points generated from
f(x) over the interval [-1, 0). The function set is {+, –, ×,
÷}, which corresponds to arithmetic operators of addition,
subtraction, multiplication, and protected division. A
terminal set was {
X, R}, where X is the symbolic variable

and R is the set of ephemeral random constants that are
distributed uniformly over the interval [-
α
,
α
]. The tuning
parameter is
α
, which is a real number. The binomial-3 can
be tuned from a relatively easy problem (e.g.,
α
= 1) to a
difficult one (e.g.,
α
= 1000).
We used a modified version of lilgp [11] similar to that
used in [9]. Most of the modifications were for bug fixes
and for the replacement of the random number generator
with the Mersenne Twister [12]. Other significant
modifications included augmenting the data structure
associated with each node to include an integer ID that
serves as that node’s serial number. Each ID is unique to a
node and is generated once during population initialization.
We configured lilgp to run as a single thread.
The ID labeling scheme was a result of [9], but is similar
to that desccribed in [7]. McPhee and Hopper’s scheme
called for tagging each node in the initial population with
1226
integer label pairs (ID:memID). The ID part of their label is
assigned just once at population initialization and consists

of an integer that is unique to a node relative to the set of
nodes that make up the initial population (i.e., V
0
). ID is
used as a serial number that can be used to track individual
nodes. memID is used for providing an audit trail for
subtree memberships. For our work, we implemented what
amounts to just the ID portion of their integer pair.
Table 1 lists the parameter settings considered in this
paper. Most of the GP parameters were similar to those
mentioned in [13], Chapter 7.
TABLE 1. PARAMETER SETTINGS
Parameter Setting
Selection Tournament q=7 or Proportionate
Population Size M 500
Initialization Method Ramped Half-and-Half
Initialization Depths 2–6 Levels
Max Generations G 200
Maximum Depth 26
Internal Node Bias 90% internal, 10% terminals
Termination Criteria Run reaches G
Binomial-3
α
1 or 1000
Number of Trials 200
We used four different experimental configurations, given
that we considered two different selection methods and two
different difficulty settings. There were 200 trials taken per
configuration for a total of 800 trials.
Although the number of trials is modest, what was

unusual were the data requirements specified for each trial.
Each trial resulted in recording each individual in every
population that was generated during a trial for a total of
201 generations (i.e., 200 specified generations and the
initial population). The total amount of data that was
generated by these four configurations was about 0.5 TB.
Post-processing of this population data was done in two
stages: the first stage reduced individuals to their
appropriate lineages and structures, while the second stage
visualized the reduced data. Both stages were custom-coded:
the first stage was coded in PERL; the second stage, in
Mathematica. There were 80,400,000 trees that were parsed
and analyzed in this manner (i.e., 4 configurations, 200
trials per configuration, 201 generations per trial, 500
individual trees per generation). Post-processing time was
about one CPU-month per configuration.
GP trials and data reduction were run on a Linux
workstation. Visualization was done on a Power Mac.
IV. SINGLE-TRIAL METHODS AND RESULTS
The visualization principles employed in this work were
those described by Tufte in [14-16]. Tufte’s work has been
influential in the design of data graphics. He is known for a
particular, minimalist style of visualization that we use in
the construction of nonstandard graphics. This particular
style is not commonly used in the evolutionary
computation community. For that reason, it is worthwhile
quoting several of his design principles from [14] (pp. 105,
121):
• Above all else show the data
• Maximize the data-ink ratio

• Erase non-data-ink
• Erase redundant data-ink
We have used these principles to generate visualizations
of single- and multiple-trial dynamics.
In comparison to the one- and two-variable techniques in
[5, 7], our technique is a ten-variable visualization:
population structure (three variables), lineage mapping (i.e.,
ancestors to individuals in a current population), frequency
of ancestor occurrence n
A
, rank ordering of individuals in the
current generation, rank ordering of ancestors, surviving
number of ancestors n
s
, time, and problem difficulty.
Of these ten variables, visualizing the three variables
associated with population structure is based on a relatively
new technique that was introduced in [8]. The technique
calls for each tree to be layed out on a circular grid, as
shown in Figures 1 and 2.
1
12 13
14 15
3
6
7
89
10 11
2
4

5
1
7
3
2
6
5
4
11
10
9
8
15
14
13
12
(a) (b)
Figure 1. Mapping a full binary tree to a circular gird. (a) Full binary tree
of depth 3. (b) Corresponding circular grid.
1
3
89
10 11
2
4
5
1
3
2
5

4
11
10
9
8
(a)
(b)
Figure 2. Mapping a partial binary tree to a circular grid. (a) Binary tree
of depth 3. (b) Corresponding mapping of this tree on a circular grid.
Cumulative distributions for a population were then
computed for each grid point. This step is analogous to
overlaying tree structures one on top of the other. As a
result, darker lines correspond to links that are more
frequently used in a population. The variables that this
visualization would encode for are structure (two
dimensions) and cumulative distribution. One typically uses
this method to see which structures are more frequently used
in a population. The method is useful inasmuch as structure
has been an integral part of GP theory (e.g., [17-19]) Figure
3a shows an example of visualization of a population of
500.
1227
A visualization that shows every occupied grid point for
a population can detract away from understanding which
structures are most typically used. This happens because
visible lines can be made only so thin and unintentional
occlusion takes place. Consequently, we displayed only
those structures that are common to a majority of a
population. Figure 3b shows the same example as given in
Figure 3a, except that only the structures that are in the

majority is shown.
(a) (b)
Figure 3. Visualization of Population Structure. The data is from
generation 200 for the configuration
α
= 1000 and tournament selection.
(a) Visualization of a population of 500 trees. Darker lines indicate
greater frequencies of those structures that appear in that population. The
gray circle is for reference only and corresponds to depth level 26. (b)
Majority-only view. This view shows the same data as in (a), but only for
those structures in which 50% or more of population uses are shown.
Visualizing the next five variables was accomplished by
using a modified form of a histogram. The five variables
that were considered in this part of the visualization
included the following: lineage mapping (i.e., ancestors to
individuals in a current population), n
A
, rank ordering of
individuals in the current generation, rank ordering of
ancestors, and n
s
. These variables address the issue of
content use at the level of individual and population. We
note that structure is implicit at these levels of examination,
since individuals are identified by their nodes and nodes are
organized by their associated tree structure.
Figure 4 gives an example of the modified histogram.
This modified histogram consists of two parts: the upper
part, which shows the n
A

; and the lower part, which shows
the associated lineage mapping from ancestors to currrent
generation individuals. Concerning the upper part, the x-axis
of the frequency-of-ancestor-occurrence histogram is rank
ordered. This means that ancestors are positionally
identified from the least fit (left) to the most fit (right). The
ordering of ancestors is determined just once at generation 0
and does not change during subsequent generations. The y-
axis of the histogram is normalized according to population
size M, since that is the maximum value that is attainable
by any one ancestor. The frequency data that is displayed is
specific to the given current generation. An absence of a bar
in the histogram part of this visualization means that an
ancestor has not survived. This absense is indicative of the
complement of the variable that describes the surviving
number of ancestors n
s
.
The lower part of this modified histogram shows the
associated lineage mapping from ancestors to the current
generation. As in the upper part, the ancestors are
positionally identified from the least fit (left) to the most fit
(right). The current-generation individuals are ordered
likewise. Figure 4 shows an instance where an ancestor at
rank position a has a lineage that maps to multiple
individuals in the current generation. Arrows at b and c
identify a few of those mappings The lines that link an
ancestor to a current-generation individual are darker as the
rank of the current-generation individual increases.
Both frequency-of-ancestor-occurrence histogram and

lineage mapping are correlated. For example, the ancestor at
position a in Figure 4 shows a frequency of occurance that
is low. Likewise, the number of mappings from that
ancestor to the current-generation individuals is sparse.
Likewise, if the frequency of ancestor occurrence were at
maximum (i.e., M ), it would mean that parts of that
ancestor are distributed throughout the entire population.
Figure 5 shows the complete ten-variable visualization
for an “easy” problem (which featured tournament selection
and
α
= 1). To show the first eight variables, the
visualizations for population structure and the modified
histogram for content were shown in tandem for a given
generation.
To show the variable of time, we used the technique of
small multiples (see [14, 15]). The number that is
embedded in the frequency-of-ancestor-occurence part of the
modified histogram corresponds to the current generation.
Consequently, Figure 5 shows the dynamics of population
structure and content for the first 22 generations, which were
Lineage Mapping
Frequency of Ancestor Occurrence (Histogram)
Ancestors A
Current Generation Individuals B
Rank
a
bc
Figure 4. Visualization of Content at the Level of Individual and Population. There are two main parts to this visualization: frequency of ancestor
occurrence histogram and lineage mapping. The visualization that is shown is from generation 3 of

α
= 1, fitness proportionate selection.
1228
sampled for visualization every two generations.
To show the variable of problem difficulty, we used a
simple thermometer icon at the bottom of the figure. The
particular metric that is displayed is the percentage of
successful trials that produced a “perfect” solution.
Consequently, the easier a problem is to solve, the greater
the number of trials out of the total number of trials that
end up producing a “perfect” solution. The icon that is
shown in Figure 5 corresponds to a relatively easy problem:
about 7 out of every 10 trials produced a “perfect” solution.
We note that the data that was used to produce the
thermometer icon was based off of results from [10]. This
was done in part because the metrics associated with
difficult problems correspond to low probabilities (i.e., <
1%). The results given in were based on 600 trials, instead
of the 200 trials given here. However, the statistics
concerning this metric between data sets were comparable.
Figure 6 shows contrasting results for a “difficult”
problem.
0
2
4
6
8
10
12
14

16
18
20
22
70%
Figure 5. Ten-Variable Visualization of Structure and Content for an “Easy” Problem (i.e., tournament selection and
α
= 1). The data that is shown is fo
r
the first 22 generations of a successful trial.
1229
V. MULTI-TRIAL METHODS AND RESULTS
There is a trade-off in visualizing single versus multiple
trials. Although single-trial results provide insight on
nuances in GP dynamics that occur during the course of a
run, it is difficult to place those nuances in an appropriate
statistical context. Unfortunately, the ten-variable
visualization of Section IV does not scale easily to account
for multiple trials.
For that reason, we reduced the number of variables that
can be examined at once. In doing so, we chose to
emphasize just a handful of variables so that more of the
data can be shown. We used visualizations that were
generated using methods from Section IV to identify
possible variables for further study.
We settled on a five-variable visualization: surviving
number of ancestors n
s
, time, problem difficulty, selection
method, and cumulative distribution. The visualization is

constructed from common methods in data visualization,
0
2
4
6
8
10
12
14
16
18
20
22
1%
Figure 6. Ten-Variable Visualization of Structure and Content for a “Very Difficult” Problem (i.e., fitness proportionate selection and
α
= 1000). The
data that is shown is for the first 22 generations of a successful trial.
1230
whereby the each of the four major elements of this
visualization correspond to a density plot. The x-axis is
time (in generations) and the y-axis is the surviving number
of ancestors. The maximum surviving number of ancestors
is M, which occurs at generation 0.
Oftentimes in evolutionary computation, the plot most
commonly employed to display quantities that vary as a
function of time is a line graph. In this case, the cumulative
distributions are inferred by the number of lines that occur
per square unit in a plot. Instead of resorting to this
technique, however, cumulative distributions were

computed for every nonzero tuple of surviving number of
ancestors and time. A density plot was then used to display
this data instead of line graphs. The convention used in this
paper has darker tones indicating higher cumulative
distributions. Consequently, each density plot describes
three variables (i.e., surviving number of ancestors n
s
, time,
cumulative distribution).
The remaining two variables—selection method and
problem difficulty—were displayed using the method of
small multiples. Each density plot corresponds to a
variation of one of these variables. Density plots are
subsequently arranged as a two-dimensional matrix.
However, because problem difficulty resulted in values that
were not binary quantities, we used the thermometer icon
(as was described in Section IV).
Figure 7 shows the results of the data described in
Section III. This visualization is a complete (non-sampled)
summary of approximately 80 million trees that were parsed
into lineages of 400,000 ancestors. It represents a view of
approximately 0.5 TB of data. (Some data is not visualized
since only the range [0, 150] for the surviving number of
ancestors is shown. The complete range is [0, M], where
M = 500.)
VI. DISCUSSION
Our results do corroborate many of the findings and
speculations in [5, 7]. For example, the results bear out the
speculations in [5], which suggested a nuanced approach in
understanding diversity. Not only do the results show

signficant temporal behaviors as was shown in [5], but also
show that other previously neglected factors—like problem
difficulty and selection method—can signficantly influence
GP dynamics (i.e., Figure 7). Furthermore, the results also
corroborate some of the earliest finding concerning diversity
in GP [7], even though that work was based on a limited
statistical sample and used nonstandard GP settings. Even
though the argument in [7] for an “Eve” is perhaps
overstated, the results in Figure 7 do show that for certain
configurations, the amount of individuals that contribute to
a population is but a fraction of the original population.
Figures 5 and 6 also corroborate what is suggested in [7]
0
50
100
150
200
0
20
40
60
80
100
120
140
0
50
100
150
200

0
20
40
60
80
100
120
140
0
50
100
150
200
0
20
40
60
80
100
120
140
0
50
100
150
200
0
20
40
60

80
100
120
140
Time (Generations) Time (Generations)
Surviving Number of Ancestors
Surviving Number of Ancestors
Time (Generations) Time (Generations)
Surviving Number of Ancestors
Surviving Number of Ancestors
70%
34%
37%
1%
(a)
(b)
(c)
(d)
Figure 7. Five-Variable Visualization of the Loss in Diversity by Problem Difficulty and Selection Method. Each plot corresponds to a particular
selection method and tuning parameter setting. (a) Fitness proportionate,
α
= 1. (b) Tournament,
α
= 1. (c) Fitness proportionate,
α
= 1000. (d)
Tournament,
α
= 1000.
1231

concerning structure—i.e., that there is a convergence of
both structure and content.
The visualizations shown in Figures 5, 6, and 7 raise
signficant issues about the dynamics that occur during the
course of a GP run. Although it is beyond the scope of this
paper to describe these issues at length, we summarize two
of them here.
• What is the relationship between diversity and problem
solving ability? Fitness proportionate selection was one
of the earliest methods that were used to enhance diversity
for the purpose of robust problem solving [2, 3]. On one
hand, the results in Figure 7 do support the contention
that fitness proportionate selection does enhance diversity,
since there is much more of V
0
that was retained.
However, the results indicated that simply enhancing
diversity during the course of a GP run was not
intrinsically helpful. In both cases, “hard” selection
resulted in significantly better performances (i.e.,
problems became significantly easier to solve under
tournament selection). This occurred in spite of large
losses in diversity. This finding supports previous, albeit
controversial work in [20, 21].
• What are the dynamics between population size and
diversity? On one hand, the theory that describes the
dynamics under fitness proportionate selection is
complex, which makes it difficult to make comparisons
of this work with existing theory. On the other hand,
there have been some investigations in tournament

selection (e.g., [22, 23]) that point to dynamics that are
independent of fitness distributions for a given
population. Given an analysis of ancestors and lineage,
their work would only apply to predicting the number of
ancestors in generation 1. However, Figures 5 and 6
indicate that mixing between ancestors eventually
becomes so thorough that many of the individuals in a
current population can trace its lineage to most of the
surviving ancestors. For that reason, it could be that the
loss of diversity measures in [22, 23] might also apply to
the steady state that occurs in Figures 7b and 7d. The
existence of this steady state under tournament selection
would also indicate why a population implosion method,
as described in [6] would work. In particular, if only a
fraction of ancestors are used to form a solution, then it
might make sense to dynamically decrease the population
size over the course of a GP run.
VII. CONCLUSIONS
This paper introduced several visualization techniques
that enabled a multivariate exploration of GP dynamics and
diversity. The resulting visualizations produced detailed
patterns of temporal behaviors that occurred in what
amounted to one of of the largest studies of its kind to date.
This study required the tracking the lineage of 80,000,000
to some 400,000 individuals in initial populations. The
visualizations summarize approximately 0.5 TB of data.
The results corroborate previous theoretical and empirical
studies that have been published on the subject of diversity
in GP. The visualizations also raise further questions on the
role of diversity in GP.

REFERENCES
[1] D. B. Fogel, Evolutionary Computation: The Fossil Record,
Piscataway: IEEE Press, 1998, pp. 641.
[2] M. Pincus, “An Evolutionary Strategy,” Journal of Theoretical
Biology, vol. 28, pp. 483–488, 1970.
[3] R. Galar, “Handicapped Individua in Evolutionary Processes,”
Biological Cybernetics, vol. 53, pp. 1–9, 1985.
[4] J. H. Holland, Adaptation in Natural and Artificial Systems. Ann
Arbor: University of Michigan Press, 1975.
[5] E. Burke, et al., “A Survey and Analysis of Diversity Measures in
Genetic Programming,” in GECCO-2002, W. B. Langdon, et al.,
Eds. San Francisco: Morgan Kaufmann Publishers, 2002, pp.
716–723.
[6] S. Luke, et al., “Population Implosion in Genetic Programming,” in
GECCO 2003, E. Cantú-Paz, et al., Eds. Berlin: Springer-Verlag,
2003, pp. 1729–1739.
[7] N. F. McPhee and N. J. Hopper, “Analysis of Genetic Diversity
through Population History,” in GECCO ’99, W. Banzhaf, et al., Eds.
San Francisco: Morgan Kaufmann Publishers, 1999, pp. 1112 –
1120.
[8] J. M. Daida, et al., “Visualizing Tree Structures in Genetic
Programming,” in GECCO ’03, E. Cantú-Paz, et al., Eds. Berlin:
Springer-Verlag, 2003, pp. 1652–1664.
[9] J. M. Daida, et al., “Analysis of Single-Node (Building) Blocks in
Genetic Programming,” in Advances in Genetic Programming 3, L.
Spector, et al., Eds. Cambridge: The MIT Press, 1999, pp. 217–241.
[10] J. M. Daida, et al., “What Makes a Problem GP-Hard? Analysis of a
Tunably Difficult Problem in Genetic Programming,” Genetic
Programming and Evolvable Machines, vol. 2, pp. 165–191, 2001.
[11] D. Zongker and W. Punch, “lilgp,” v. 1.02 ed. Lansing: Michigan

State University Genetic Algorithms Research and Applications
Group, 1995.
[12] M. Matsumoto and T. Nishimura, “Mersenne Twister: A 623-
Dimensionally Equidistributed Uniform Pseudorandom Number
Generator,” ACM Transactions on Modeling and Computer
Simulation, vol. 8, pp. 3–30, 1998.
[13] J. R. Koza, Genetic Programming: On the Programming of
Computers by Means of Natural Selection. Cambridge: The MIT
Press, 1992.
[14] E. R. Tufte, The Visual Display of Quantitative Information.
Cheshire, CT: Graphics Press, 1983.
[15] E. R. Tufte, Envisioning Information. Cheshire, CT: Graphics Press,
1990.
[16] E. R. Tufte, Visual Explanations: Images and Quantities, Evidence
and Narrative. Cheshire, CT: Graphics Press, 1997.
[17] J. P. Rosca and D. H. Ballard, “Rooted-Tree Schemata in Genetic
Programming,” in Advances in Genetic Programming 3, L. Spector,
et al., Eds. Cambridge: The MIT Press, 1999, pp. 243–271.
[18] W. B. Langdon and R. Poli, Foundations of Genetic Programming.
Berlin: Springer-Verlag, 2002.
[19] J. M. Daida and A. Hilss, “Identifying Structural Mechanisms in
Standard Genetic Programming,” in GECCO ’03, E. Cantú-Paz, et
al., Eds. Berlin: Springer-Verlag, 2003, pp. 1639–1651.
[20] J J. Kim and B Y. Zhang, “Effects of Selection Schemes in
Genetic Programming for Time Series Prediction,” in CEC ’99, vol.
1. Piscataway: IEEE Press, 1999, pp. 252–258.
[21] B Y. Zhang and J J. Kim, “Comparison of Selection Methods for
Evolutionary Computation,” Evolutionary Optimization, vol. 2, pp.
55–70, 2000.
[22] T. Bickle and L. Thiele, “A Mathematical Analysis of Tournament

Selection,” in ICGA ’95, L. J. Eshelman, Ed. San Francisco: Morgan
Kaufmann Publishers, 1995, pp. 9–16.
[23] T. Motoki, “Calculating the Expected Loss of Diversity of Selection
Schemes,” Evolutionary Computation, vol. 10, pp. 397–422, 2002.
1232

×