RE: No one believes proteins are the first form of life.

From: Peter Ruest
Date: Mon May 27 2002 - 11:29:03 EDT


    Glenn Morton wrote (21 May 2002 21:07:58 -0700):
    > Of course it is rigging the roulette wheel when one says that experimental
    > evidence of multifunctionality in biopolymers is ruled out. And this is what
    > you do when you deny that there is any applicability of the RNA data to
    > biopolymers found in living systems.

    I am not ruling out any evidence, I just say that some of the evidence
    you adduce is not applicable to the question about which our discussion
    started, and I say why. Regarding the RNA data, I have responded
    repeatedly, and about the multifunctionality, I have done so in my
    response to your mail of 20 May 2002 21:59:02 -0700, under the subject
    "Polyphyly and the origin of life" (sent on Thu, 23 May 2002 18:41:09
    +0200 - which therefore you hadn't yet received when you wrote this).

    > You are the one who wrote on Sat 5/18/02 8:55 AM:
    > >This is in vitro RNA chemistry using some biochemical molecules. It may
    > >not have much to do with biology.
    > I don't know another way to understand that. You seem to rule it out as
    > irrelevant to biology. I am sure that some biochemists would be interested
    > that RNA chemistry is irrelevant to biology. And what they are doing with
    > those chemicals IS relevant to the RNA world thesis. So, you are rigging the
    > roulette wheel whether you want to hear it or not.

    If you read my sentence above, you see that I'm not saying RNA chemistry
    is irrelevant to biology. The in vitro RNA chemistry you were talking
    about was about artificial evolution of (biological or newly
    synthesized) RNA in vitro (using biological enzymes). The
    mutagenization, selection, fractionation were all artificial and very
    different from any conceivable natural conditions. You are right, they
    are trying to find out something about the RNA world thesis, and this is
    interesting in itself.

    But I dispute its relevance for the question of estimating the amount of
    information II (which I defined several times already), because
    information II is dependent on a natural environment. We don't know how
    RNA first emerged, we don't know how it would have replicated, we don't
    know whether an RNA world would be viable at all, and we don't know
    whether and how it could change into a cell-based system (which we know
    can evolve), so we don't know anything about its possible natural
    environment and possible evolution. So we can't deduce any indications
    about the amount of information II possibly stored in it and about the
    relevance of this for the information II on which today's biosphere is
    based.

    Furthermore, any artificial selection in RNA or protein evolution, as
    well as natural selection in vivo, unavoidably introduces functional
    information into the molecules selected. But what interests me is how
    much information is required for the first minimal functionality of a
    molecule, i.e. just before natural selection is able to set in: this is
    what I call information II. What Yockey measures as "algorithmic
    information content" or "mutual entropy" in an orthologous protein
    family we may call "Yockey information". It is the sum of
    information II and information introduced by (natural, in his case)
    selection. But it is difficult to sort out how much of this sum is
    information II - which would provide a basis for estimating

    The RNA results do seem to indicate that some simple ribozyme functions
    are present in the RNA space at frequencies of perhaps 10^(-14)
    (depending on the particular function). So we can expect to find such a
    ribozyme function with a non-negligible probability - IF we have a
    suitable system of RNA production, RNA selection, RNA isolation. And
    outside the lab this is a very big IF. We just don't have all this apart
    from our artificial chemistry systems and apart from enzymes produced in
    today's biosphere. I do think (as the researchers working on RNA
    artificial selection systems probably do) that biochemical functions are
    far easier to find among ribozymes than among protein enzymes. On the
    other hand, the variety of ribozyme functions is probably much more
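
    As a rough illustration of the frequency argument above (a sketch, not
    part of the original exchange): if a function occurs at frequency p
    among random sequences and N independent random sequences are screened,
    the chance of at least one hit is 1 - (1 - p)^N. The pool sizes and the
    independence assumption are hypothetical; only the 10^(-14) figure
    comes from the text.

```python
import math

def prob_at_least_one(p: float, n: float) -> float:
    """Chance that at least one of n independent random sequences
    has a function occurring at frequency p in sequence space."""
    # 1 - (1 - p)**n, computed stably for tiny p and huge n
    return -math.expm1(n * math.log1p(-p))

p = 1e-14  # frequency quoted in the text for simple ribozyme functions
for n in (1e10, 1e14, 1e16):  # hypothetical pool sizes
    print(f"pool {n:.0e}: P(at least one hit) = {prob_at_least_one(p, n):.4g}")
```

    The calculation only covers the screening step. It says nothing about
    whether such a pool, or any selection and isolation mechanism, exists
    outside the lab, which is the "big IF" above.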

    > >Last time, I paid no attention to your adding "of smaller length". The
    > >small length of such "miniature proteins" is not at all the critical
    > >point, either with Lombardi or with me. What is important, instead, is
    > >the minimal number of specified amino acids (what I called the
    > >"invariant" above) - just as with Yockey's work. The protein may be
    > >longer, if the identity of the other amino acids adds nothing to the
    > >functionality. The question is: at how many positions in a protein do I
    > >need a particular amino acid (or for less stringent positions, any one
    > >in a given restricted set of amino acids), in order to get the function
    > >looked for? This is the "amount of specification" required for the
    > >function.
    > Not entirely. If one can then play the Yockey game and substitute
    > hydrophilic for hydrophilic amino acid etc, then the specification of
    > functionality becomes even less. And then one must also be sure that other
    > sequences of similar length don't perform the function. Shortening the
    > sequence doesn't tell you anything about other sequences.

    Yockey's substitutions are already taken into consideration in the
    "invariant". Other, independently evolved sequences (what I called
    "synonymous families"), on the other hand, are not, and they are a
    really interesting point, but, unfortunately, no one has been able to
    point out to me such an example, documented in the literature, having
    convincing evidence of independent evolution of the same function.

    Finding a minimal invariant has nothing to do with "shortening the
    sequence". Modern cytochromes c have an invariant of ~30, but a length
    of slightly over 100. Decreasing the invariant, while keeping the length
    constant, means decreasing features like activity, functionality,
    specificity, until you reach a point where all cytochrome c activity is
    lost. How large is the invariant just before that point? Such a
    minimally active protein would have had to emerge by a non-selected
    mutational random walk, but from then on it could improve by normal
    microevolution, i.e. natural selection of its functionality step by
    step.
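
    To make the notion of an "amount of specification" concrete, here is a
    minimal sketch of how an invariant could be turned into bits and into a
    random-hit probability. It assumes that positions are independent and
    that all 20 amino acids are equally likely, which is a simplification
    of my own for illustration; the function names and the "relaxed"
    example are hypothetical. Only the invariant of ~30 is taken from the
    text.

```python
import math

AMINO_ACIDS = 20  # alphabet size for proteins, as stated in the thread

def specification_bits(allowed_per_position):
    """Bits of specification for an invariant, given how many amino
    acids are acceptable at each constrained position."""
    return sum(math.log2(AMINO_ACIDS / allowed) for allowed in allowed_per_position)

def random_hit_probability(allowed_per_position):
    """Chance that one random sequence satisfies every constraint."""
    return 2.0 ** -specification_bits(allowed_per_position)

strict = [1] * 30              # 30 positions, one amino acid each (upper bound)
relaxed = [1] * 10 + [4] * 20  # hypothetical: 20 positions accept 4 residues
for name, spec in (("strict", strict), ("relaxed", relaxed)):
    print(name, round(specification_bits(spec), 1), "bits,",
          f"p = {random_hit_probability(spec):.2e}")
```

    Unconstrained positions contribute nothing, so the result does not
    depend on the total length of the protein, only on the invariant.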

    > >In contrast to Yockey, I add the theoretical requirement that this
    > >sequence is not derivable, by evolution with natural selection, from a
    > >different one with _less_ specification, but having nevertheless some of
    > >the function.
    > What you add as a requirement is meaningless. The work of Gerald Joyce does
    > show that molecules with tiny amounts of functionality can be evolved to
    > catalytic race horses. See Peter Radetsky, "Speeding Through Evolution,"
    > Discover, May, 1994, p. 85

    Joyce works with artificially selected RNAs in vitro, and I agree that
    "catalytic race horses" (slight overstatement? ;-) may be evolved. But
    that's something completely different from biological natural evolution
    of proteins. I am frustrated that you still don't understand what I am
    talking about, after I explained it again and again. I also explained
    again and again why the RNA work is irrelevant for the question I'm
    pursuing, but you just ignore what I say.

    > >This is another way of saying that it had to be formed by
    > >means of a strictly random-walk mutational path. In this way, I hope to
    > >arrive at an estimate of the amount of information II. I would call such
    > >a protein a "minimal-functionality protein". This additional
    > >requirement, by the way, is not necessarily beyond experimental testing.
    > Well, you have to include the things I mentioned above as well as the fact
    > that tiny amounts of functionality can arise and Joyce showed it. Go read
    > that article. Yes it uses RNA but I contend that it is very relevant to
    > biology.

    See above.

    > >> And to assume that the original proteins performed precisely
    > >> the same task as evolved proteins do today is quite a leap.
    > >
    > >I have never said they did. All I assume for the estimate I am looking
    > >for is that it is a "minimal-functionality" protein.
    > You must define minimum functionality. Is it minimally functional if a
    > protein enzyme is able to catalyze X amount of reactions in 10 hours? in 20
    > hours? in 100 hours? One can't define minimal functionality based solely
    > upon what minimal length gives the enzyme the ability to catalyze X amount
    > of reactants in 2 minutes. Thus, your methodology is flawed.

    Please read the definition I gave many times before! And read it in
    context, without interrupting after every two sentences. What you are
    writing here shows that you haven't understood it, probably because you
    never read the whole definition in context.

    > > Subsequently, it
    > >may have evolved further by means of a normal evolutionary pathway with
    > >non-negative natural selection at each step. During this evolutionary
    > >path, its function may have been modified, as well as increased. But
    > >this further path is no longer easily tractable for a determination of
    > >information II.
    > The only way to define minimal functionality is to somehow define that if a
    > molecule catalyzes X amount of reaction in 1000 hours and 1 minute it is
    > nonfunctional but if it does it in 1000 hours it is functional. And that is
    > arbitrary.

    Apparently you assume that, for any functionality, an arbitrarily small
    amount of activity may be selectable in a biological system. How do you
    know that? Can you start your car with only one molecule of oxygen in
    your gas mixture in the engine - or with a spark providing 300 K in a
    cold motor, for that matter? Your refusal to accept the concept of a
    minimal functionality of an enzyme is unrealistic. I agree that when
    your chance of finding an RNase activity among random RNAs of the proper
    length is 10^(-14), and you artificially select for it, an RNase
    ribozyme will "emerge out of nothing", and there is no information II in
    it. That's exactly why I am not looking for it there. But up to now, you
    have never given me any reason for believing there is no information II
    among proteins emerging in an evolving population of living organisms
    (e.g. bacteria), or in hypothetical "primitive" prokaryotes, or in a
    hypothetical RNA world situation, or in a hypothetical fully prebiotic
    geochemical system. I suspect the most promising way to find it is in
    artificial "minimal-selectable-functionality" protoenzymes, as I defined
    them. I don't see a way in which information II could be determined in
    an RNA artificial selection set-up.

    (MRCA=most recent common ancestor, cyt.c=cytochrome c)

    > >But what happened before the time of that coalescence in the MRCA? The
    > >MRCA must have evolved from earlier forms, which probably were simpler
    > >and less active, back to the minimal-selectable-functionality (MSF)
    > >cyt.c (about which I wrote in the other post, "Polyphyly and the origin
    > >of life").
    > >
    > >And before the time of this MSF cyt.c? The emergence of this MSF cyt.c
    > >is the only process not under natural selection (by definition), so it
    > >was a random mutational walk through sequence space, whose probability
    > >can be estimated if the size of the specification for the MSF stage can
    > >be determined.
    > >
    > >As for the RNA case c, you probably wanted to say that the improbability
    > >(rather than probability) is much less. This is correct for artificial
    > >selection systems. And it may even be correct for an initial natural RNA
    > >world - although we don't know this. But so what? I already explained
    > >that the RNA world probably cannot be used to estimate functional
    > >information content (II).
    > >
    > >I dealt in my last post with the irrelevance of the multifunctional
    > >proteins.
    > It seems that everything that goes against your position is 'irrelevant'.

    It seems that every explanation I gave you, even repeatedly, for
    something that goes against your position is forgotten by you.

    > >Please don't link me with Sydney Fox. Of course, proteinoids are no
    > >model for proteins, for many reasons, but primarily because of the
    > >problem of sequence information (II), which cannot emerge without
    > >reproduction and evolvability, as Joyce says. It's now 40 years ago that
    > >I became aware of Fox's proteinoids and immediately started criticizing
    > >them as completely useless for helping to explain the origin of life -
    > >at a time when everybody was celebrating him. So you are wrong in
    > >calling me "way behind the times".
    > No I am not wrong. You are linking yourself with Fox by holding that
    > proteins are the original informational carrying molecules. No one believes
    > proteins are the origin of information yet you keep arguing that it is this
    > way. It isn't I who am saying what you are saying.

    You keep distorting what I said. I never said proteins were the original
    information carrying molecules. I only said they were the first
    molecules for which we might be able to detect genuine information II.
    And the reason for trying to do so is that _all_ current speculations
    about _natural_ origins of RNA, of replication, of protein, of DNA, of
    life, of information, of complexity, etc. are just that, speculations,
    and very incomplete and unrealistic ones at that, just-so stories, no

    > >Of course, I know the advantages of the RNA world hypothesis, as far as
    > >ribozyme functionality and the potential elimination of the
    > >protein-or-nucleic-acid- first chicken-and-egg problem are concerned.
    > >But we still don't know of any feasible prebiotic emergence of RNA and
    > >of replication.
    > There are self-replicating RNA's known now. So we do have knowledge of
    > their existence. Have we found a pathway from there to life? Not yet, but we
    > most likely will if given enough time.

    What are the known "self-replicating RNAs"? Please give me the
    reference. That would be sensational - even though we don't know yet how
    they could emerge prebiotically.

    > >Apparently, it's YOU who are the strawman builder! And you can only do
    > >it because you either haven't read what I wrote or because you ignored
    > >it or because you forgot it. I hope the last is the case. You are
    > >constantly upbraiding me for not having read the most recent papers you
    > >think were relevant, but you are not even up-to-date on what those you
    > >criticize wrote!
    > In all the above cases, I point you to your own words.

    I have answered you on all these points. None holds water.

    > And I will cite your
    > rejection of the RNA world as evidence that I am correct. You wrote Sat
    > 5/18/02 8:55 AM:
    > >In my post, I was discussing the evolution of functional proteins in a
    > >DNA-RNA-protein world, not evolution in an RNA world.

    How is this a rejection of the possibility of an RNA world? I was just
    saying that, in that context, I was not discussing it.

    > >I never claimed proteins were the first to evolve. It's just that I am
    > >almost as skeptical about current pet speculations about
    > >self-organization as I was about those of 40 years ago.
    > So in other words, I am correct that you don't hold the present views. That
    > makes you behind the times. Sorry, but that is just the way it is.

    So, everyone who is critical of some "present views" is behind the
    times?! Glenn, I can't believe you are serious about that. Consider that
    there even might be false hopes, illusions, and worse among the present
    views. If I remember correctly, you have often criticized "present
    views" in your own field.

    > > Some serious
    > >thinking about the emergence of biological information (II) is sorely
    > >needed - both with respect to the origin of life 3.9 billion years ago
    > >and with respect to the origin of novel molecular functionalities ever
    > >since.
    > As I have pointed out information II is a mirage. You can't recognize it in
    > a sequence of letters.

    Your first sentence is your belief, unsupported by sufficient evidence.
    The second one shows you have not understood what information II is.

    > >I'll repeat again what I wrote x times already: A string of any symbols
    > >has a computable Shannon information, and it has a computable maximum
    > >information carrying capacity (I). But it is unknown, without any
    > >further knowledge, whether it contains any semantic or functional
    > >information (II).
    > Yes, I know you wrote that and that is exactly my point. You can't even
    > define information II. You can't recognize it, it is a mirage, a figment of
    > antievolutionary imaginations.

    You keep picking out some small part of my definition of information II,
    ignoring all the rest, and jumping to some, usually wrong, conclusion on the
    basis of that one expression or sentence. So we keep turning in circles.

    > > Unless you know the appropriate language, you can't
    > >read it. You may, by statistical analysis, find that, with a certain
    > >probability, it does contain some information.
    > What statistical analysis? Be specific. The plain fact is that a sequence
    > with lots of information has high Shannon information and a random sequence
    > equally has high Shannon information. You show here that you don't really
    > understand Shannon information.

    In his "Information theory and molecular biology", Yockey defined the
    information content of a family of orthologs. My definition of
    information II is a defined subset of this Yockey information content.
    As for the connection with Shannon information, which is not at all the
    same thing, look it up in Yockey's book! I have said before that Shannon
    information depends on the alphabet size and the sequence length
    (although, in my reference to information carrying capacity (I) quoted
    above, I just assumed the alphabet size is known, 4 for DNA, 20 for
    protein). That implies that a sequence which looks random may or may not
    contain semantic or functional information, and that corresponds about
    to what you write above. I never questioned that "a sequence with lots
    of information has high Shannon information and a random sequence
    equally has high Shannon information".
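
    The distinction between carrying capacity (I) and the symbol statistics
    of a particular sequence can be sketched as follows. This is an
    illustration under simple assumptions (independent symbols, a known
    alphabet); the function names are hypothetical and nothing here is
    taken from Yockey's book.

```python
import math
from collections import Counter

def capacity_bits(length: int, alphabet_size: int) -> float:
    """Maximum information carrying capacity (I) of a sequence."""
    return length * math.log2(alphabet_size)

def entropy_per_symbol(seq: str) -> float:
    """Empirical Shannon entropy of the symbol frequencies, in bits."""
    counts = Counter(seq)
    total = len(seq)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

print(capacity_bits(100, 4))    # DNA, 100 bases
print(capacity_bits(100, 20))   # protein, 100 residues
print(entropy_per_symbol("ACGTACGTACGT"))
```

    Both quantities depend only on length, alphabet and symbol counts.
    Neither tells us whether the sequence carries semantic or functional
    information (II), which is the point made above.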

    > > And if the probability is
    > >high enough and the text is long enough, you may even be able to learn
    > >the appropriate language (including its grammar, syntax,...), e.g.
    > >Sumerian. ...
    > snip
    > > But 6 words
    > >are definitely insufficient to deduce an unknown syntax, which is a
    > >requirement for selecting the legal word placement you ask me for. Try
    > >the same with Latin! There, you may find many possible word sequences in
    > >a sentence to be legal, particularly with poetry.
    > Then I would point you to the Beale Papers discussed on pages 82-93 of Simon
    > Singh's The Code Book. No one knows if it is a language or not. These papers
    > are purported to tell where great treasure is, but no one has been able to
    > decipher them. No one knows if they are a language or random. How would you
    > propose to go about determining that they are or are not a language. There
    > are 23 pages. For over 100 years no one has been able to decipher these
    > pages. How do you propose to simply tell us that there is a message there?

    What I wrote here referred to the 7 word permutations of the Doric
    phrase you asked me to "read". I never claimed to be able to discern
    whether _any_ sequence of letters or other symbols is a language. Since
    I don't know any Doric, I just trusted you that it is a valid human
    language. Starting from this point, I said that it clearly does look
    like a language to me, and even ventured to guess it was an Indo-European
    one. Knowing German and English, I also might guess the meaning of some
    of the words, but without certainty. After saying this, may I risk
    saying your paragraph above is irrelevant for our discussion?

    > >> (GM:) And no one believes, like you seem to, that proteins were the first
    > >> biopolymers. They weren't, and you are arguing for a 50-year-old rejected
    > >> idea. At least stay with the program and the current thinking on the
    > >> topic. Most researchers believe that life evolved through the RNA.
    > >
    > >You are constantly misrepresenting what I wrote. See above. I'll just
    > >add some short remarks about the homonyms, which I skipped last time.
    > Then why on earth on Sat 5/18/02 8:55 AM did you write:
    > >In my post, I was discussing the evolution of functional proteins in a
    > >DNA-RNA-protein world, not evolution in an RNA world.
    > And in this message above you wrote:
    > >
    > >I never claimed proteins were the first to evolve. It's just that I am
    > >almost as skeptical about current pet speculations about
    > >self-organization as I was about those of 40 years ago.
    > How can I take you seriously when you say you don't believe that proteins
    > were involved in the origin of information, when you turn around and say
    > that they were? It is very confusing to discuss these things with you.

    I did say (or imply) that I believe that:
    (1) novel proteins evolved in a functioning DNA-RNA-protein world many
    times after the origin of life;
    (2) de novo emergence (in a prebiotic world) of RNAs with some (although
    little) function useful for life is much more probable than de novo
    emergence of proteins with some such function;
    (3) both such de novo emergences are fully speculative, with virtually
    no experimental support (none for proteins; some for RNAs, but only
    under artificial, not natural selection conditions);
    (4) the claims that self-replication of RNA emerged, that an RNA world
    is viable, and that an RNA world can be transformed into a
    DNA-RNA-protein world are fully speculative, with no experimental and
    virtually no theoretical model support;
    (5) functional molecules (RNA or protein), in a system functioning as a
    replicating individual, are subject to selection (artificial or
    natural), which may improve on this function;
    (6) whether "minimal-selectable-functionality" protoenzymes (or
    -ribozymes) of an invariant greater than 2 (for proteins, greater but
    unknown for RNA) exist depends on the frequency in sequence space of the
    particular activity requested, the detectability level of this
    functionality, the sensitivity and selectivity of the selection
    environment, and the efficiency of the fixation mechanism;
    (7) if such a protoenzyme does (potentially, not necessarily in reality)
    exist for a given functionality, an information II content can be
    estimated, and the probability of its spontaneous emergence is
    non-trivial; this is the main case of interest;
    (8) if the "minimal-selectable-functionality" invariant is smaller, its
    emergence is only a matter of microevolution, i.e. it will emerge, given
    enough resources in molecules, selection and time, so that no
    information II can be determined;
    (9) all (artificial or natural) selection introduces some Yockey
    information from the environment (but not information II) into the
    (10) the sum of the information II and all information introduced by
    selection corresponds to the Yockey information.

    I never claimed any of the following (which you apparently claim I did):
    (1) proteins were the first biopolymers;
    (2) there never was an RNA world;
    (3) spontaneous emergence of a self-replicating RNA is impossible;
    (4) emergence or further evolution of an RNA world is impossible;
    (5) evolution of an RNA world species into a DNA-RNA-protein one is
    impossible;
    (6) emergence of information II was restricted to the origin of life;
    (7) after the origin of life all further evolution occurred under
    (8) identity of information II or Yockey information with Shannon
    information;
    (9) derivability of information II from Shannon information;
    (10) derivability of information II from a given symbol sequence.

    I never claimed that the emergence of life and of genuine novelty during
    its further evolution could _not_ occur spontaneously (much less that
    such a claim could be proven), but I seriously doubt it, given what we
    know today. On the other hand, there is, today, no support whatsoever
    for the claim that life _did_ emerge _spontaneously_ and that genuine
    novelty did during its further evolution.

    I am sorry if this confuses you. I think it's self-consistent and it's
    consistent with what is known today. It just means that you can't just
    pick out one statement, or one fact, out of the whole context of the
    origin and evolution of life.

    > >> "The fundamental problem of communication is that of reproducing
    > >> at one point either exactly or approximately a message selected
    > >> at another point. Frequently the messages have _meaning_, that
    > >> is they refer to or are correlated according to some system with
    > >> certain physical or conceptual entities. These semantic aspects
    > >> of communication are irrelevant to the engineering problem." C.
    > >> E. Shannon, " A Mathematical theory of Communication" The Bell
    > >> System Technical Journal, 27(1948):3:379-423, p. 379
    > >>
    > >> What part of the term 'irrelevant' do you not understand?
    > >> glenn
    > >
    > >I understand that you completely misapply this quotation from Shannon to
    > >our discussion. We may perhaps apply it in a _partially_ meaningful way
    > >by saying that there is a communication channel from DNA to protein, and
    > >that the semantic aspects of the biological information transmitted are
    > >irrelevant to the engineering problem of transcription and translation.
    > >That is, the irrelevance applies only apart from the fact that the
    > >transcription and translation machineries themselves are also specified
    > >by the semantic aspects of the biological information. Thus, unlike
    > >information technology, the biological system is self-referential.
    > This clearly shows you have no understanding of Shannon. It matters not
    > whether the transmission of information is by knotted rope as the Incas did
    > it, or by bits, or 8bit bytes, hexadecimal or by letters or by cellular
    > machinery. The same phenomenon is occurring--the replication of sequence A
    > at site B. This can take place with a translation (which is what computer
    > compression is as well as translation to proteins) except that with
    > proteins, information is actually lost in going from DNA to proteins.
    > You got the last word.
    > glenn

    I concede that your knowledge of Shannon information is greater than
    mine. But still you have not shown me any case where I have
    misunderstood or misapplied it, including your last paragraph about it.
    You did not read what I said carefully enough. Your criticism doesn't
    apply at all.

    But thank you for discussing with me, anyway!


    Dr. Peter Ruest, CH-3148 Lanzenhaeusern, Switzerland
    Biochemistry - Creation and evolution
    "..the work which God created to evolve it" (Genesis 2:3)
