RE: Emergence of information out of nothing?

From: Peter Ruest
Date: Mon May 13 2002 - 11:29:21 EDT

    Hi Glenn, you wrote (8 May 2002 21:45:23 -0700):
    > Hi Peter, you wrote:
    > >-----Original Message-----
    > >From: Peter Ruest
    > >Sent: Wednesday, May 08, 2002 9:21 AM
    > >Of course, a function can only be executed if the appropriate substrate
    > >is available. The operation on this substrate is what defines the
    > >particular functionality (or capacity for functioning). But this
    > >functionality requires a certain specific structure. A different
    > >structure will perform a different function - or none at all. Now this
    > >structure is inherently present in this protein molecule, whether or not
    > >its substrate is around. The function just is not being executed.
    > This fits exactly my point as the substrate is merely another molecule, and
    > functionality is only seen in relation to a given molecule. Bring another
    > one in and you will have a different functionality or none at all. Function
    > is relative not absolute.

    Biological function always is defined with respect to its natural
    reacting components. Enzymatic function is defined with respect to its
    natural substrate(s). You may talk of "relative" if you artificially
    introduce unnatural reagents (which, of course, is often done in
    experiments), but that is beside the point in this discussion.

    > >Therefore, I consider some kind of functional, structural, or semantic
    > >information (II) to be contained in the structure of this protein, and
    > >this structure is a consequence of its amino acid sequence, which in
    > >turn is a consequence of the DNA sequence coding for it, and this
    > >information (II) is a certain (usually unknown) fraction of the
    > >information capacity (I) of this length of DNA.
    > You use the example of Yockey's work with cytochrome c below. He found there
    > were 10^93 different cytochrome c's which would perform the work of that
    > molecule. What he didn't prove was the possibility of huge numbers of other
    > FAMILIES of proteins which will also do that job.

    Agreed - in principle. Yet, if there are any other families (let alone
    huge numbers) which will perform the same function (in the same
    organismal environment), I find it strange that no such example has been
    found to date, as far as I know. You may say this is because just one
    family happened to be present in the universal common ancestor and was
    inherited by the whole biosphere. But such an argument would just push
    the problem back to the origin of life: if there were such huge numbers
    of unrelated possibilities, why was there a universal common ancestor
    at all, rather than a huge number of unrelated ones?

    > Given the work of Joyce
    > and others (which you didn't mention in your reply) they have found that if
    > you choose a function and search for it with random molecules and random
    > mutation, you can find any given function with a probability of 1 in 10^14
    > to 1 in 10^18. I cite this:
    > Andrew Ellington and Jack W. Szostak "used small organic dyes as the
    > target. They screened 10^13 random-sequence RNAs and found molecules
    > that bound tightly and specifically to each of the dyes.
    > "Recently they repeated this experiment using random-sequence
    > DNAs and arrived at an entirely different set of dye-binding
    > molecules. ...
    > "That observation reveals an important truth about directed
    > evolution (and indeed, about evolution in general): the forms
    > selected are not necessarily the best answers to a problem in some
    > ideal sense, only the best answers to arise in the evolutionary
    > history of a particular macromolecule."~Gerald F. Joyce, "Directed
    > Evolution," Scientific American, Dec. 1992, p. 94-95.

    p.48 not 94

    > And I cite this:
    > "We designed a pool of random sequence RNAs, using the minimal ATP
    > aptamer as a core structure. By creating a pool that was
    > predisposed to bind ATP specifically and with high affinity we hoped
    > to increase the likelihood of generating molecules with ATP-dependent
    > kinase activity. The ATP aptamer core was surrounded by three
    > regions of random sequence of 40, 30 and 30 nucleotides in length,
    > respectively. The ATP-binding domain itself was mutagenized such
    > that each base had a 15% chance of being non-wild-type, to allow for
    > changes in the aptamer sequence that might be required for optimal
    > activity. To increase the likelihood of finding active molecules, we
    > attempted to create a pool containing as many different molecules as
    > possible. Because it is difficult to obtain an acceptable yield from
    > the synthesis of a single oligonucleotide of this length (174
    > nucleotides), we made two smaller DNA templates and linked them
    > together to generate the full-length DNA pool. Transcription of this
    > DNA yielded between 5 x 10^15 and 2 x 10^16 different RNA
    > molecules."~Jon R. Lorsch and Jack W. Szostak, "In Vitro Evolution of
    > New Ribozymes with Polynucleotide Kinase Activity," Nature, 371,
    > Sept. 1994, p. 31
    > We can act as if the probability is very low to find a given functionality,
    > like YECs act as if the earth is young, but acting like it isn't going to
    > change the fact that functionality is found much more readily than
    > anti-evolutionary activists want to believe.

    Glenn, you know very well that I am neither a YEC nor an
    anti-evolutionary activist. All I insist on
    is that an adequate mechanism for producing evolutionary novelty is as
    yet elusive.

    > So, given that I am mentioning this work for a second time, will you respond
    > to its import now?

    You have not mentioned these papers (if I remember correctly), but
    similar ones, and I responded in detail. But I may do it again, giving
    you a new example if you insist. A. Lombardi, et al., "Miniaturized
    metalloproteins: Application to iron-sulfur proteins", PNAS 97 (2000),
    11922, attempted to design a minimal redox enzyme, but haven't achieved
    their goal as yet. Their dimeric undecapeptide can hold an iron atom,
    but is unstable, being too small to shield off the environmental water.
    The invariant of their (intelligently designed) construct amounts to at
    least 5 specific amino acid occupations, which is too much to be
    attainable by an evolutionary process without selection.

    > >This only works because you first give me the book, which contains all
    > >the relevant semantic information. With the signal, you just send me
    > >ln(3) bits of information, not lots.
    > I believe that is exactly what I said in my note. I haven't sent you lots of
    > shannon information, but I have sent you lots of colloquial information.

    [Sorry, I should have written log2(3), instead of ln(3).] How can you
    transmit colloquial information (in the book) without any Shannon
    information? Whatever you transmit through whatever medium can be
    measured by Shannon information (which, however, also includes all the
    uninteresting noise and the irrelevant part of the colloquial
    information).

    > You want to keep the signal small
    > >in order to transmit it fast, therefore it cannot carry all the semantic
    > >information you want me to have for executing your plan, so you transmit
    > >the large amount of information beforehand and make the signal nothing
    > >but a pointer to one of the 3 large texts you transmitted beforehand.
    > You miss my point. You had stated that semantic information is related to
    > Shannon information. I gave you a case where that wasn't the case. Shannon
    > information isn't related to semantic information.

    It _is_ related, see above, just not 1-to-1. I didn't miss your point,
    but your example doesn't work. Without the book the transmitted pointer
    is of no use at all. Its Shannon information remains the same, but its
    semantic information is zero. With the book and the pointer, the total
    Shannon information transmitted is huge, the semantic information just
    equivalent to what you wanted to have me know at the end, namely about
    one third of the semantic information in the book.
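The size difference between the pointer and the book can be made concrete with a rough calculation. The following is only a minimal sketch: the function names, the book length of 300,000 characters and the 27-symbol alphabet are illustrative assumptions, not figures from this exchange, and the book estimate is an upper bound that treats every character as independent and equally likely.

```python
import math


def pointer_bits(num_choices: int) -> float:
    """Shannon information, in bits, of picking one of num_choices
    equally likely alternatives."""
    return math.log2(num_choices)


def text_capacity_bits(num_chars: int, alphabet_size: int) -> float:
    """Upper bound on the Shannon information of a text, assuming
    independent, equiprobable symbols (real text carries less)."""
    return num_chars * math.log2(alphabet_size)


# The signal only selects one of the 3 texts sent beforehand.
signal = pointer_bits(3)

# Hypothetical book: 300,000 characters over a 27-symbol alphabet
# (26 letters plus space). Both numbers are assumptions.
book = text_capacity_bits(num_chars=300_000, alphabet_size=27)

print(f"pointer: {signal:.3f} bits (log2(3))")
print(f"pointer: {math.log(3):.3f} nats (ln(3), the figure first written)")
print(f"book:    at most {book:.0f} bits")
```

The pointer comes to about 1.585 bits regardless of what the three texts say, which is the sense in which its Shannon information stays the same with or without the book.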

    > >> Because of this private agreement for meaning, one can't quantify it. And
    > >> unless one can quantify it, he can't quantify your 'biologically relevant
    > >> information'.
    > >
    > >I agree that any semantic information or meaning depends on language
    > >conventions agreed upon beforehand. But the only reason you cannot
    > >easily quantify it is linguistic ambiguity (synonymous words, phrases,
    > >sentences, paragraphs,... errors, imprecision, errors,...).
    > I disagree strongly with this assertion. The reason you can't quantify
    > semantic information is because you can't quantify the agreement.

    You are saying exactly what I did, using different words.

    > You know
    > that gift doesn't mean the same in German. Poke doesn't mean the same in
    > American english as it does in English english.

    Homonyms may be difficult to find in biology! They occasionally occur in
    our languages, even within the same language.

    > And American english doesn't
    > have terms like 'jobworthy', or 'puckle' or 'bobbies' as English english and
    > Doric english do. How do you quantify the clear and obvious (to me) semantic
    > meaning when you don't know the semantic meaning. And because of this,
    > semantic information becomes SUBJECTIVE not OBJECTIVE. It has nothing
    > whatsoever to do with ambiguity. Puckle is a clearly defined word with no
    > imprecision.
    > Hearing German means nothing to me because I don't know the language. I
    > can't even tell if someone using a guttural language is really speaking
    > German. I can have an idea that they are, but that doesn't mean that they
    > are. Thus I can't OBJECTIVELY determine meaning without being in on the
    > private agreement about what sounds mean what.

    It's the same with biological functions we don't understand yet. I never
    claimed to understand all biological functionality, even of a single
    enzyme. But I claim that biological molecules _do_ have precise
    functions - and therefore semantic information -, just as linguistic
    words do. Meaning is relative to a specific language, as you maintain,
    and it's the same with biological "words", but this doesn't eliminate
    information for the system that "knows" the appropriate language. And
    that's what counts in biology.

    > >> It is the same problem as trying to determine which of the following
    > >> sequences has meaning.
    > >> ni ru gua wo shou bu de bu dui jiao wo hao hao?
    > >[I skip some of your long "message"]
    > >> If you can tell which has meaning, then you can determine biological
    > >> functionality.
    > >
    > >Which meaning? Which functionality? What language or code? I.e. I agree
    > >that meaning or biological functionality is not derivable from the
    > >sequence alone, but must be found by the knowledge of the language or
    > >biological observations.
    > The very fact that you have to ask what meaning, what language what code
    > admits of the fact that meaning isn't objectively determinable.

    I have never claimed functional meaning of a protein can be read
    directly from its DNA code. We need to do a lot of proteomics before we
    can really use the information in the sequenced human genome.

    > I have
    > presented this type of test to everyone who has made the claim you do, that
    > semantic information is a scientific concept

    There are lots of things in biology which we are not (yet?) able to
    precisely formulate mathematically. You are asking the impossible. You
    can't tell me how many bits of semantic information 'puckle' contains.
    But do you claim linguistics is not a science? Or that words don't
    convey (and therefore: contain) information to one knowing the
    language?

    > and not a single one of you has
    > had the guts to even guess which sequence.

    I don't think you seriously require knowledge of Chinese for anyone who
    wants to think about biology... ;-)

    > The sequence which had meaning was:
    > 47bt7qb29bwy97be3bg7be78bh8bu8q9b29byq9bg7byq9P
    > Which was a Caesar cipher for a Mandarin message in pinyin (lacking the
    > tonations) which said:
    > ru gua wo shou de bu dui ni jiao wo, hao bu hao?
    > If Adrian Teo works hard, I bet he can tell you what it means. But you still
    > don't know if I am telling you the truth that it has meaning. Meaning is not
    > a scientific concept. It can't be quantified and it isn't objective to an
    > outsider. You all keep saying that it is 'information' but you never, ever
    > even guess which sequence has it much less tell me HOW MUCH semantic
    > information there is in the sequence.

    Read again what I said about quantifying semantic information.

    > >> Information is related to entropy and must have a p log(p) format. Can you
    > >> derive such a thing for what you define as information II? If you can't,
    > >> then it isn't any form of information. It may be something else, but just
    > >> not information.
    > >
    > >It is not (unqualified) Shannon information. It is information in
    > >the same way as a written paragraph contains information. It tells us
    > >something. Therefore I call it semantic information.
    > But you can't tell me the first thing about this supposed 'semantic
    > information' other than that it tells us something. How much 'tells us
    > something' was in the Chinese expression I gave you? How much 'tells us
    > something' is in this sentence? What is the difference in 'tells us
    > something' between the German 'gift' and the English 'gift'? Is there any
    > difference between 'ma(1)' in Chinese and 'ma' in English? Or how much
    > difference in the 'tells us something' is there between pa(4) in Chinese and
    > 'pa' in English? Is there a difference in 'tells us something' between 'nai
    > nai' in Chinese and 'Grandmother' in English?
    > The concept is useless, empty and misleading. It does nothing for us other
    > than make us feel like we are really being scientific when in fact we
    > aren't.

    Maybe we'd better talk about this again after you have had a look at Yockey's
    book. Otherwise, we may not get any productive discussion.

    > >H.P. Yockey, "Information theory and molecular biology" (Cambridge
    > >University Press, 1992, ISBN 0-521-35005-0) has defined the information
    > >content of iso-1-cytochrome c family of enzymes with respect to their
    > >common functionality (excluding the unknown additional information due
    > >to species-specific requirements). He took into account genetic code
    > >degeneracy and mutational transition probabilities between amino acids
    > >equivalent with respect to enzyme function in this protein family. By
    > >defining sequence similarity in terms of "mutual entropy" in p log(p)
    > >format, he linked semantic information with information potential for
    > >the given sequence length.
    > I believe this is a total misreading of Yockey. He does not connect semantic
    > information with information potential. But I will admit that my copy of his
    > book is in storage in Houston while I live here.

    If you think I'm misrepresenting Yockey, we may have to wait with
    discussing this until you find a copy of his book.
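For readers unfamiliar with the "p log(p) format" mentioned above, the basic quantity is the Shannon entropy of a probability distribution. The sketch below is not Yockey's calculation, which, as described in the quoted text, also accounts for code degeneracy and mutational transition probabilities. It only illustrates, under assumed numbers, how restricting the residues tolerated at one site lowers the entropy relative to the full capacity of that site. The four-residue site is a hypothetical example.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy H = -sum(p * log2(p)) in bits.
    Zero-probability outcomes contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Capacity of one site if all 20 amino acids are equally likely.
h_max = shannon_entropy([1 / 20] * 20)

# Hypothetical site where only 4 residues are tolerated, equally often.
h_site = shannon_entropy([0.25] * 4)

print(f"unconstrained site: {h_max:.2f} bits")
print(f"constrained site:   {h_site:.2f} bits")
print(f"difference:         {h_max - h_site:.2f} bits")
```

The difference between the two values is one simple way to express how strongly function constrains a site. Whether that difference deserves the name "semantic information" is exactly what is disputed in this thread.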

    > >> As to multiple pathways. I recall back in the early 1960s, the argument
    > >> against the chance formation of a peptide was based upon the chance of
    > >> finding a single sequence out of all sequence space. So for oxytocin,
    > >> an 8 amino acid peptide, the chance of making human oxytocin was 1 in
    > >> 8^20 or 1 in 10^18.
    > >
    > >We better write 1 in 20^8 or 4 in 10^11.
    > You are correct, thank you. Maybe one too many toddies that night.
    > >> Of course they would use a 100 unit long peptide and have 20^100
    > >> or 1 in 10^130, and give an indignant conclusion as to how could anyone
    > >> believe such odds. But as we have learned things over the past 30 years,
    > >> we have brought those numbers down to 1 in 10^40 because we know now
    > >> that more than one sequence can perform the same task.
    > >
    > >1 in 10^40 is still much too small to be of any use.
    > >> And experiments by Gerald
    > >> Joyce, Jack Szostak and others, show that functionality in a test tube
    > >> full of RNA is found at a rate of 1 in 10^14 to 1 in 10^18. That is
    > >
    > >We have discussed this before, without coming to an agreement. RNA is
    > >not protein. And artificial selection in vitro is not natural selection
    > >in vivo - or even random mutational walks with no selection.
    > Can you cite an experiment which shows that the same is not applicable to
    > proteins? I mean experimental data not merely someone's opinion. After all RNA
    > is related to DNA and DNA makes proteins.

    I did that last time we discussed this, if I remember correctly. Of
    course, you realize that positive results (which are feasible with
    artificial selection of RNA in vitro) are published, but this is usually
    not done (or not possible!) with negative results (with natural
    selection of proteins in vivo). Thus, we can at most expect to find
    partial results, such as the one by Lombardi I cited above.
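The sequence-space figures traded in the quoted exchange above (20^8 for an 8-residue peptide, 20^100 for a 100-residue one) are easy to check. This is a minimal sketch. The function names are illustrative, and the only assumption is the usual alphabet of 20 amino acids with every sequence counted as equally likely.

```python
import math

AMINO_ACIDS = 20  # size of the standard amino acid alphabet


def sequence_space(length: int, alphabet: int = AMINO_ACIDS) -> int:
    """Number of distinct sequences of the given length."""
    return alphabet ** length


def log10_sequence_space(length: int, alphabet: int = AMINO_ACIDS) -> float:
    """Base-10 logarithm of the sequence space, for very long peptides."""
    return length * math.log10(alphabet)


octapeptide = sequence_space(8)
print(f"20^8   = {octapeptide:.3g}")                 # 2.56e+10
print(f"1/20^8 = {1 / octapeptide:.3g}")             # 3.91e-11, about 4 in 10^11
print(f"20^100 = 10^{log10_sequence_space(100):.1f}")  # 10^130.1
```

These numbers describe only the chance of hitting one specific sequence. They say nothing by themselves about how many sequences share a function, which is the point under dispute.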

    > >It will be fascinating to watch possible progress towards interesting
    > >results.
    > Agreed.
    > glenn


    Dr. Peter Ruest, CH-3148 Lanzenhaeusern, Switzerland
    Biochemistry - Creation and evolution
    "..the work which God created to evolve it" (Genesis 2:3)
