Re: Protein Evolution

From: Peter Ruest <>
Date: Mon Sep 20 2004 - 11:04:20 EDT

Josh Bembeneck posted (on 7 Sep 2004) a prepublication abstract of Behe
M.J. & Snoke D.W., "Simulating evolution by gene duplication of protein
features that require multiple amino acid residues", Protein Science 13
(2004). Josh asks whether this paper might indicate that "ID is
beginning to produce" and might be "ground-breaking".

George Murphy responded that, on the basis of the abstract, "the claim
is that a certain mechanism won't produce mutations at a rate adequate
for evolution. If that's the case it's a purely negative result... There
is of course a big jump from that to the positive claim that therefore
there must be an Intelligent Designer."

David Campbell said (on 9 Sep 2004): "From the abstract, it is not
obvious that the calculation takes either of two factors into account:
a) there are generally numerous ways to achieve a particular protein
function; b) there are lots of genes to work with as possible starting
points for mutation". David points to in vitro evolution which works
rather rapidly. Also, "a significant novel function in a protein may
promote splitting of a population... thus strongly affecting the gene
frequencies in the populations", and "it would be unreasonable to then
claim that this calculation shows that things cannot change under a
situation with strong selective pressure."

Meanwhile, I have read Behe & Snoke's full paper. However, I must
emphasize that I didn't try to check their mathematics, only the
evolutionary biology.

They clearly state that their simulations test only one of several
possible mechanisms of evolution, namely a fresh gene duplicate
accumulating point mutations. It is generally considered a major route
to evolutionary novelty. But it also seems to be the only mechanism
which can be tested in a more or less rigorous way, without too many
arbitrary assumptions. This simulation is not dependent on any other
conceivable evolutionary mechanisms, so the results may indicate where
other paths would have to be investigated.

They consider a "multiresidue" feature defined by the minimum number
/lambda/ of specific nucleotide occupations required before a
minimal function is achieved - on which natural selection could then
first take hold to produce further improvement. They then determine
the numbers of generations (times) to fixation and/or population sizes
required as a function of /lambda/. This procedure contrasts sharply
with what is usually done in evolutionary simulations which _assume_
that, in _all_ situations, natural selection operates on _any_ novel
function emerging _from the start_. This assumption is unreasonable
because of the obvious interdependences between many different parts of
a functional sequence.

They don't start with any particular full-length coding sequence to be
mutated in search of a new function, but consider only the /lambda/
specific positions where a particular nucleotide is required, wherever
they might be in the sequence. They require these /lambda/ specific
nucleotides to be acquired _before_ any "null mutation" appears anywhere
in the coding sequence for a typical protein length. A "null allele" is
defined as precluding the particular novel function envisaged. The
probability of null mutations is estimated from literature findings
which indicate that in functional proteins, on average 6 different amino
acid replacements per position are tolerated without eliminating the
function and 14 (including stop), i.e. about 2/3, are not (this doesn't
imply lethality for the organism, just inactivation of the particular
protein). The simulation replaces null alleles with copies of the
original set of /lambda/+1 nucleotides; this replacement reflects the
fact that gene duplication events occur about as often as point
mutations per nucleotide site in coding regions. All other mutations
(neutral ones, or non-destructive of the novel function) are accounted
for in the probabilities but treated as irrelevant.
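
To illustrate why the requirement that all /lambda/ changes arrive
before the first null mutation is so demanding, here is a toy
competing-risk sketch in Python. It is not Behe & Snoke's simulation:
the function name, the parameter rho (an assumed ratio of the
null-mutation rate to the rate of gaining one required site) and the
value 100 are purely illustrative, and back-mutation and selection are
ignored.

```python
from fractions import Fraction

def prob_feature_before_null(lam, rho):
    """Chance that all `lam` required sites are gained before a null mutation.

    Toy model (an assumption, not the published one): with k required
    sites still missing, the next relevant event is a gain with
    probability k / (k + rho); otherwise it is a null mutation.
    """
    p = Fraction(1)
    for k in range(lam, 0, -1):
        p *= Fraction(k) / (Fraction(k) + Fraction(rho))
    return p

# rho = 100 is a hypothetical value chosen only to show the trend.
for lam in range(1, 7):
    print(lam, float(prob_feature_before_null(lam, 100)))
```

Even in this crude picture the success probability drops steeply with
each additional required site, which is the qualitative behaviour
described above.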

Starting with one gene duplicate per individual in the population is a
particularly conservative feature of the simulation, which however may
compensate for duplicates spreading by random drift or founder effects.

Ignoring the whole sequence except the /lambda/ required nucleotides
disposes of the difficulty that a novel feature might be derived from
various different pseudogenes: the entire sequence material available
can be dealt with at once, as _any_ duplicated sequence will do as a
starter. On the other hand, the simulation looks for a particular
function. If _any_ of a number of novel functions would be useful in a
given context, the probabilities would have to be multiplied by the
number of these different novel functions (which would presumably be
much smaller than the number of functions already available).
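
The "multiply by the number of functions" rule can be checked with a
few lines of Python. This is only a sketch: it assumes the m candidate
functions are independent and equally likely, and the values of p and m
are hypothetical, not taken from the paper.

```python
def prob_any_of(m, p):
    """Chance that at least one of m independent targets, each with chance p, is hit."""
    return 1.0 - (1.0 - p) ** m

p = 1e-9  # hypothetical per-function probability
for m in (1, 10, 100):
    # Exact value next to the simple product m * p.
    print(m, prob_any_of(m, p), m * p)
```

For small p the exact value and the simple product m*p are practically
the same, so multiplying is a fair approximation as long as m*p stays
well below 1.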

After the "time of first appearance" of the multiresidue feature
(simulated without selection), the simulation is continued with
selection coefficients of 0.003 to 0.3 until the feature is fixed in the
population (by their definition: present in >50% of the population).
During this phase, the multiresidue feature found is kept unchanged. The
"time to fixation" is the sum of both phases. But the time for fixation
alone is negligible compared with the time required to produce the
multiresidue feature. Therefore this sum "varies linearly with 1/N" at
smaller population sizes. An improved success rate is possible, but only
for /lambda/ <= 5, if the population is pre-equilibrated by the
non-selecting mutation regime before the actual start of the
experiment, as this allows for the occurrence of serendipitous
"exaptations". For 3x10^8 pre-equilibration generations and /lambda/=5,
4, 3, 2, the time to first appearance is decreased by factors of about
1.3, 3, 7, and 25, respectively - less for shorter pre-equilibrations.
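
The 1/N behaviour of the total time can be made plausible with a small
Python sketch. It assumes, for illustration only, that each of N
individuals independently produces the feature with a tiny chance r per
generation and that fixation then takes a fixed number of generations;
r, the fixation time and the population sizes are hypothetical values.

```python
import math

def generations_to_first_appearance(n_individuals, r):
    """Mean geometric waiting time until any of n lineages succeeds."""
    # Chance that at least one lineage succeeds in a generation,
    # computed with expm1/log1p to stay accurate for very small r.
    p_any = -math.expm1(n_individuals * math.log1p(-r))
    return 1.0 / p_any

def total_time(n_individuals, r, fixation_generations):
    """Sum of the two phases: first appearance plus fixation."""
    return generations_to_first_appearance(n_individuals, r) + fixation_generations

for n in (10**3, 10**4, 10**5):
    print(n, total_time(n, 1e-12, 1000.0))
```

As long as the first-appearance term dominates, a tenfold larger
population gives a roughly tenfold shorter total time.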

The model is least sensitive to the values of the mutation rate and the
selection coefficient, much more to the fraction of null mutations, and
particularly sensitive to /lambda/.

Generally, with /lambda/ values beyond 2-3, success seems hopeless.

Mike and his collaborator might want to comment themselves.

I agree with David's general conclusion, "This is a reasonable
scientific paper written by people who favor ID and probably motivated
by their ID views. However, it does not demonstrate ID, any more than
providing an evolutionary explanation for a particular phenomenon
refutes ID. It does raise some questions about the accusations of
overwhelming bias among conventional scientists."

In my lecture at the Conference "Sources of Information Content in DNA"
in Tacoma, WA (1988), I proposed a probability estimate for the
evolutionary emergence of cytochrome c activity, reaching very similar
conclusions to those of Behe & Snoke in their present paper, although in
a much less sophisticated way.

Today, there is very convincing evidence, on the basis of sequence
analyses, that all species, including humans, share common ancestors -
i.e. the "fact of evolution". And there exists convincing evidence for
the mechanism of evolution involving mutation and natural selection. But
it remains a virtually undocumented assumption that the scope of this
mechanism (and variations thereof, like recombination, indels, etc.) is
sufficient to account for novel functions. And virtually no evolutionary
biologist seems to care about this distinction. Apparently,
ontologically naturalistic axioms are shaping their belief that there is
no basic distinction between the microevolutionary mechanism of
improvement of an existing function and its extrapolation to the
emergence of novel functions. Thus, we have metaphysical predilections,
coloring what is conceived as significant, on both sides.

The reference to in vitro selection experiments is rather misleading, as
there are huge differences between artificial mutation/selection of
genotypes and natural selection of phenotypes naturally produced in
populations of entire organisms.

Although I doubt (for theological reasons) that there can be a
scientific conclusion to intelligent design in biology, I consider Behe
& Snoke's paper to be very productive.


Dr. Peter Ruest, CH-3148 Lanzenhaeusern, Switzerland
<> - Biochemistry - Creation and evolution
"..the work which God created to evolve it" (Genesis 2:3)
Received on Mon Sep 20 11:23:08 2004