From: rlr@mole.merit.edu (Rick Riolo)
Newsgroups: comp.ai.genetic
Subject: Holland comments on role of schemata in GA performance
Date: Wed, 13 Jul 94 13:07:35 MET DST
Organization: Merit Network, Inc.  Ann Arbor, MI

John Holland asked me to post this for him;  I believe he has also 
posted a copy to GA-List.
  - r

---------------------------------------------------------------------
 
     Recently, there have been several commentaries, and even 
papers, that misconstrue the role of schemata in GA performance. 
Four facts should help clear the confusion: 
(1)  When a GA is properly implemented (see comment below), the 
number of instances of a schema can be expected (probabilistic 
sense) to increase at a rate proportional to the difference between 
the observed schema (marginal) average and the observed population 
average (with suitable allowances for the effects of genetic 
operators).  THE SCHEMA'S RATE OF INCREASE GOES TO ZERO over 
successive generations, EVEN IF THE SCHEMA HAS AN OBSERVED 
ABOVE-AVERAGE VALUE THAT REMAINS CLOSE TO A CONSTANT, and 
even if no better schemata are discovered.
     This happens because the population contains progressively more
instances of the schema, so that the population average approaches 
the schema average.  This is an obvious characteristic, known and 
commented upon in the earliest papers and books, but it has been 
taken as a newly discovered "fault" in some recent commentaries.
(2)  IF the instances of a schema have an observed average value 
that (i) is above the population average (after subtracting operator
losses from the schema average) and (ii) remains close to constant
(over successive generations), THEN that schema will exhibit a
nearly constant EXPONENTIAL INCREASE during the generations after 
its discovery UNTIL IT OCCUPIES A SUBSTANTIAL PROPORTION OF 
THE POPULATION, if other schemata are not increasing rapidly. 
     The population average remains close to constant (if other 
schemata are not increasing rapidly) until the instances of the new 
schema occupy a substantial portion of the population.  If the
original difference in averages is d and the proportion of the schema
is p, then the exponent is proportional to (1-p)d in this simple case. 
Again, this is well-known, in both theory and observation, from the 
earliest studies.
     (As an aside, a population is a sample, in the rigorous sense, 
drawn from the set of all possible strings.  As such, any marginal 
average calculated from that sample is a legitimate ESTIMATOR --
this holds a fortiori for instances of a given schema.  That the 
underlying distribution is non-uniform, or changing, has no bearing 
on this statement.  Such considerations only bear on the ease of 
USING the estimator.  There have been several recent statements 
that seem unaware of these elementary facts.)
(3)  Recombination of schemata that reside on different strings 
becomes quite effective once the instances of those schemata come to occupy,
say, 10% or more of a reasonably large population (say 
1000 individuals or more). 
     At 10%, one string in every 10 is an instance of the schema.  Hence,
about one recombination in 10 involves a new context for the 
schema (if operator losses are not large).  With 2-point crossover, 
string lengths of 200 or more, and schemata that have lengths no 
more than 10% of the total string length, the conditions are 
satisfied and the number of candidate combinations is astronomical.
     (4) GAs are at their best in complex situations where 
improvements, NOT GLOBAL OPTIMA, are the objective. 
     Global optima are often far removed from current practice, as in 
the design of complex strategies or apparatus, or even undefinable, 
as in the case of ecosystems.  GAs are good at finding new favorable 
combinations, but they slow down precipitously when near to optima
(where combinations are usually no longer effective).  If the optima 
are local and plentiful, then combination of above-average schemata
will often lead to the vicinity of new and better optima, but this is 
not an option at a global optimum. 
 
     These four facts bear directly on what is OBSERVED with a
well-designed GA.  ("Well-designed" is another topic, but a critical 
matter is a low scaled fitness for the current best string so that
d<0.2, with coordinated supression of hitchhiking.)*: 
     Newly discovered schemata that remain consistently above-
average quickly come to occupy a significant fraction of the 
population.  With a population of 1000, a size that works well with
'real' problems, even a schema with initial difference d=0.1
will occupy 10% of the population in about 60 generations (if it is
not lost by random effects in the first generation or two.)
      The exploitation of schemata along these lines can easily be 
observed if one runs a suitably designed GA against one of the Royal 
Road test functions (where the observer knows the important 
schemata in advance, but the GA does not).  If your problem is one of 
improvement, with 'natural' building blocks that can be represented 
by "compact" schemata in your string encoding, then you should make
sure your version of the GA operates efficiently on Royal Road 
functions.
     If performance is measured as best value vs. number of 
generations (or total trials) elapsed, rather than as the proportion 
of runs reaching some pre-determined criterion, you will get a 
better reading of the GA's performance.  It is particularly helpful to 
trace the life history of some of the schemata that attain large numbers of
instances in the population.
     John Holland

*  There's an all too common practice of reporting experiments with
a particular version of the GA, WITHOUT REPORTING PARAMETERS, with an explicit,
or implicit, implication that the  results apply to 
all GAs. 


From Germany.EU.net!EU.net!Austria.EU.net!newsfeed.ACO.net!edvz.sbg.ac.at!news Thu Jul 14 22:00:24 1994
Article: 3337 of comp.ai.genetic
Newsgroups: comp.ai.genetic
Path: Germany.EU.net!EU.net!Austria.EU.net!newsfeed.ACO.net!edvz.sbg.ac.at!news
From: helmut@cosy.sbg.ac.at (Helmut Mayer)
Subject: Re: Holland comments on role of schemata in GA performance
Message-ID: <CsvwM8.I32@wst.edvz.sbg.ac.at>
Sender: news@wst.edvz.sbg.ac.at (USENET News System)
Reply-To: helmut@cosy.sbg.ac.at
Organization: University of Salzburg / Austria
References: <300htn$5s0@lastactionhero.rs.itd.umich.edu>
Date: Wed, 13 Jul 94 17:06:07 MET DST
Lines: 31

Rick Riolo writes
# John Holland asked me to post this for him;  I believe he has also 
# posted a copy to GA-List.

# (1)  When a GA is properly implemented (see comment below), the 
# number of instances of a schema can be expected (probabilistic 
# sense) to increase at a rate proportional to the difference between 
# the observed schema (marginal) average and the observed population 
# average (with suitable allowances for the effects of genetic 
# operators).

Isn t it the ratio (not the difference) of average schema fitness to average  
population fitness?

# It is particularly helpful to 
# trace the life history of some of the schemata that attain large numbers of
# instances in the population.

How could this be done, if you don t know the (or a good) solution? One could  
do that by surveying all possible schemata, but this is equivalent to estimate  
all possible solutions. One thought I had is to concentrate on schemata with  
short defining lengths (building blocks) and count the occurrences of a state  
at a certain bit position through n generations. If the same state (0 or 1)  
occurs in 0.9n generations or more, it s assumed to be a defined bit of the  
schema. If it s under the (arbitrarily defined) threshold, it s an undefined  
bit. Sure, this is a rather heuristic approach... 

--
Helmut A. Mayer
University of Salzburg
Department of Computer Science