Re: Design, complexity, etc.

From: Iain Strachan (iain.strachan.asa@ntlworld.com)
Date: Tue May 13 2003 - 18:13:51 EDT

  • Next message: Iain Strachan: "Re: Evolutionary rate"

    Hi, David,

    I'll try and answer your point below, which is the one I feel best qualified
    to answer:

    > >(Irreducibly complex, in my definition, means: "requiring several
    coordinated changes to make any improvement")<
    >
    > I appreciate the definition; it is necessary for a meaningful discussion
    of irreducible complexity. However, it does raise some further questions:
    > How do you define improvement? How coordinated must the changes be? How
    many? What counts as separate changes?
    >

    I must make clear that I am talking specifically here about the use of
    Genetic Algorithms as optimizers of some fitness function. I realise at the
    outset that in nature in general it's a matter of differential survival,
    which is not the same as optimising a specific function. However, all this
    is in the context of genetic algorithms being used to justify evolution in
    nature, and in particular the Lenski paper on the Evolutionary Origin of
    Complex Features, which does indeed optimize a specifically crafted function
    (as indeed does the "WEASEL", or the Schneider Ev simulation).

    Hence I would define "improvement" in terms of increasing the value of the
    fitness function (actually, minimising a cost function is mathematically the
    same -- in Schneider's paper, a "mistake count" is to be minimized & hence
    "cost function" would be a more appropriate term). In any kind of Genetic
    Algorithm, of course, a reduction in the fitness (increase in cost) can come
    about (as indeed it does in the Lenski simulation), but the overall
    objective is to improve the fitness. One of the "selling points" of GAs,
    which attracted my colleagues and me to study them a long time ago at
    work, was their ability to escape from "local optima" of the fitness surface
    by temporarily reducing the fitness via mutation and then homing in on a
    hopefully superior optimum. They are regarded as "global optimization"
    techniques. It's my opinion that this only really works for low-dimensional
    problems; otherwise, if you're stuck in a local optimum in a
    high-dimensional space, then the probability of a random mutation or crossover
    putting you in the basin of attraction of a different optimum becomes
    vanishingly small, due to a well-known mathematical phenomenon known as the
    "curse of dimensionality" (many hits can be found by typing this phrase
    into Google).
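    The effect is easy to see numerically. The sketch below (my illustration,
    not from the GA literature -- the cone width of 0.9 is an arbitrary choice)
    estimates the fraction of random mutation directions that point roughly
    toward some fixed "other basin": it collapses as the dimension grows.

```python
import random, math

def frac_aligned(dim, cos_thresh=0.9, trials=20000):
    """Fraction of random directions lying within a narrow cone
    around a fixed target direction (the 'other basin')."""
    hits = 0
    for _ in range(trials):
        v = [random.gauss(0, 1) for _ in range(dim)]
        norm = math.sqrt(sum(x * x for x in v))
        # cosine similarity with the target direction e1 = (1, 0, ..., 0)
        if v[0] / norm > cos_thresh:
            hits += 1
    return hits / trials

for d in (2, 5, 10, 20):
    print(d, frac_aligned(d))
```

    In 2 dimensions roughly one random step in seven points the right way; by
    20 dimensions essentially none do.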

    As to what I mean by "how coordinated", I'll give an example of an
    experiment I ran in the early days. (Note that at this time I was very keen
    to exploit GAs, reasoning that evolution designed us, and so a biologically
    inspired algorithm like this must be able to do pretty remarkable things.
    I'd worked for some time in the biologically inspired field of Neural
    Networks, & knew that they are very successful, & hence it seemed logical
    that if we can exploit brain-architecture, we should also be able to
    exploit evolution.)

    We used a standard software package, now available on the web as GeNeSyS
    4.5: a set of C programs that runs a GA and allows you to define your own
    fitness functions. It came with a standard set of test functions regarded
    as challenges for optimizers. One such function was called "Shekel's
    Foxholes" (again, Google provides many hits if you want to visualize this
    function). It is a hard problem for an optimizer because it has many local
    minima. It is a function of two variables, x and y, and consists of a
    rectangular grid of sharp spikes (hence the "foxholes"), each rising to a
    different height & so one of them is the global optimum, and the others are
    the local optima. Outside the "spikes" it is pretty flat. The Genetic
    Algorithm performs spectacularly well on such a function, invariably finding
    the global optimum. However, despite the fact I wanted to be impressed by
    it, it occurred to me that with the rectangular array of "spikes", this
    problem was (unintentionally) tailor-made for both the crossover and the
    mutation operators. The genome is a string of 20 bits, of which the first
    10 give the x coordinate and the last 10 give the y coordinate. Since the
    mutation rates have to be low, it is clear that a mutation can only change
    either the x or the y value. But since the optima are arranged on a
    rectangular grid, with the grid lines parallel to the x and y axes, this
    allows it to hop from one hole to another quite easily given a _single_
    mutation. Equally, the "crossover" is very helpful; you only have to have
    one member of the population in the right row and one in the right column,
    and a single-point crossover at the halfway point will produce one that is
    in the right row and column. I thought that this might be the reason the GA
    worked so well. There was a simple test to perform to see if my conjecture
    was true. That was to arrange the "spikes" in the function so they were
    placed randomly on the surface instead of neatly in rows and columns. In
    this case, to hop from one spike to another would require simultaneous
    changes to the x and y values, or two coincident mutations. Equally, the
    "crossover" could not be expected to work. The outcome of the experiment
    was, alas, as I had expected; the GA now performed little better than a
    random search.
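    The crossover shortcut described above can be shown in a few lines. (This
    is my toy reconstruction, not code from GeNeSyS; the spike coordinates 300
    and 700, and the other values, are made up for illustration.)

```python
# A 20-bit genome: the first 10 bits encode x, the last 10 encode y.
def decode(genome):
    return int(genome[:10], 2), int(genome[10:], 2)

def crossover_midpoint(a, b):
    # Single-point crossover at the halfway point: x from a, y from b.
    return a[:10] + b[10:]

# Suppose the global spike sits at (x*, y*) = (300, 700).
parent_a = format(300, '010b') + format(123, '010b')  # right column, wrong row
parent_b = format(456, '010b') + format(700, '010b')  # wrong column, right row
child = crossover_midpoint(parent_a, parent_b)
print(decode(child))  # (300, 700): lands on the global spike in one step
```

    With the grid-aligned spikes this one-step jump is all crossover needs;
    with randomly placed spikes, no single crossover point separates the "row"
    information from the "column" information, so the trick no longer works.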

    Now, obviously that example is simplistic and unrealistic. And furthermore,
    it could easily be pointed out that in nature one does not necessarily
    expect to reach the global optimum; any of the local ones would do.
    However, this illustration is simply to point out what I mean by coordinated
    changes. The problem also arises in many more realistic and interesting
    optimization problems (such as training a neural network). Many such
    problems are characterized by what are described as "ridges" in the fitness
    surface (or the cost function surface). In order to stay on the ridge, you
    have to change several of the parameters under optimization simultaneously, &
    this is precisely what a genetic algorithm can't do.
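    A minimal caricature of such a ridge (my own made-up function, chosen only
    to make the point): a narrow ridge along the line y = x, rising gently as
    you move along it. A mutation to x alone falls off the ridge; only a
    coordinated change to both x and y improves the fitness.

```python
def fitness(x, y):
    # Narrow ridge along y = x (the -100 term punishes leaving it),
    # with a gentle slope upward along the ridge (the x + y term).
    return -100.0 * (x - y) ** 2 + (x + y)

x, y = 3.0, 3.0
base = fitness(x, y)
print(fitness(x + 0.5, y) - base)        # -24.5: single change falls off the ridge
print(fitness(x + 0.5, y + 0.5) - base)  # 1.0: coordinated change improves
```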

    I wrote & you replied:
    >The problem is that any mutation is just as likely to be UN-done as it is
    to occur, while it is waiting for all the others to occur. Imagine that you
    were collecting coupon cards (like you used to get in packets of tea or
    cigarettes). But suppose you had the rule that if you got a duplicate of
    one you already had, that you had to destroy both of them & wait for another
    one. Then it's relatively straightforward to show that the expected time
    you have to wait to get the whole collection rises exponentially with the
    size of the collection.<

    However, in biological systems, getting a duplicate usually means you can
    start a new collection. The high level of parallelism also means there are
    plenty of backup copies.

    My reply:

    I think we're confusing terms here (Probably my fault). I was not referring
    to gene duplication here; just to the possibility of a mutation getting
    undone. Perhaps the card collection was a bit misleading. What you'd
    normally do with a duplicate card is use it for swaps with other collectors
    :-). The situation I was trying to describe was where, to get any benefit,
    you have to get the whole collection, but you are just as likely to lose a
    card as to get a new one (one such collection I had as a kid was cards with
    vintage cars on them. If you got the whole collection, you got a toy model
    car for free; no reward for five out of six cards; you needed all six to get
    the toy). I used the "duplicate card cancelling out the existing one"
    purely as an analogy to model the neutral mutations. The situation being
    modelled is perhaps better described as you having a row of, say 20 coins,
    and you want to make them all heads. To do this, you continually pick up a
    coin at random and toss it, irrespective of whether it is heads or not & you
    carry on till you get all 20 to be heads. Clearly the closer you get to
    your target, the more likely you are to pick a head and turn it into a tail.
    In this case, the expected time taken to get all heads grows exponentially
    with the number of coins. The "mutations" are neutral in the sense that a
    collection of 19 heads has no selective advantage over a collection of 1
    head. Just as with the coupon cards, there is no advantage till you have
    the whole lot.
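    The coin model is simple enough to simulate directly (a sketch of my own,
    not part of the original argument; the run counts are arbitrary), and the
    exponential blow-up shows up immediately:

```python
import random

def time_to_all_heads(n):
    """Repeatedly pick one of n coins at random and re-toss it,
    counting tosses until all n coins show heads at once."""
    coins = [False] * n   # start with all tails
    heads = 0
    tosses = 0
    while heads < n:
        i = random.randrange(n)
        was_head = coins[i]
        coins[i] = random.random() < 0.5
        heads += coins[i] - was_head
        tosses += 1
    return tosses

random.seed(1)
for n in (4, 8, 12):
    avg = sum(time_to_all_heads(n) for _ in range(100)) / 100
    print(n, round(avg))
```

    Each extra block of four coins multiplies the average waiting time by
    roughly sixteen, which is the exponential behaviour described above.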

    Regards,
    Iain.



    This archive was generated by hypermail 2.1.4 : Tue May 13 2003 - 18:14:01 EDT