I had written:
> >I call this a fundamental distinction between the following two concepts:
> >(I) Maximum information carrying capacity;
> >(II) Functional information relevant for biological systems.
> I am delighted to see this split because too many people who get into this
> area totally confuse the two. I think I would prefer not to call II
> 'information' but would call it functional probability between two
> molecules. Consider an isolated protein which is in a universe of its own
> with no other object. No matter how long its sequence is, it does nothing
> useful. It just sits there letting quantum fluctuations wiggle it. It has
> no 'information relevant for biological systems.'
> Now bring in a second protein into this universe and let it bounce around
> bumping into the original one. At some of the collisions, the two molecules
> stick together for a bit. At others, one of the proteins splits--i.e. it has
> been catalyzed by the other one. What we have is not information but a
> probability of a particular functional interaction. This can only happen as
> two molecules interact. Lacking a second protein in the universe,
> there is no function and thus no biologically relevant information in your
Of course, a function can only be executed if the appropriate substrate
is available. The operation on this substrate is what defines the
particular functionality (or capacity for functioning). But this
functionality requires a certain specific structure. A different
structure will perform a different function - or none at all. Now this
structure is inherently present in this protein molecule, whether or not
its substrate is around. The function just is not being executed.
Therefore, I consider some kind of functional, structural, or semantic
information (II) to be contained in the structure of this protein, and
this structure is a consequence of its amino acid sequence, which in
turn is a consequence of the DNA sequence coding for it, and this
information (II) is a certain (usually unknown) fraction of the
information capacity (I) of this length of DNA.
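
To make concept (I) concrete, here is a minimal sketch of the capacity
calculation, assuming four equiprobable nucleotides (2 bits per position).
The function name and the 300-nucleotide length are hypothetical, chosen
only for illustration. The functional fraction (II) is not computed,
because it is usually unknown.

```python
import math


def dna_capacity_bits(n_nucleotides: int, alphabet_size: int = 4) -> float:
    """Maximum information carrying capacity (I) of a sequence.

    Assumes every position can hold any of `alphabet_size` symbols with
    equal probability, i.e. log2(alphabet_size) bits per position.
    """
    return n_nucleotides * math.log2(alphabet_size)


# Hypothetical example: a 300-nucleotide coding region (100 codons).
capacity = dna_capacity_bits(300)
print(capacity)  # 600.0 bits; information (II) is some fraction of this
```

Whatever amount of information (II) the encoded protein carries, it
cannot exceed this number for the given length of DNA.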
> In some sense this is identical with language where meaning is a private
> agreement between various peoples that certain sounds 'mean' certain things.
> Biological functionality is only possible when two molecules combine to
> perform some chemical action.
I am happy you agree with the comparison between the biological code and
> >For at least 20 years, I have drawn this distinction between (I) and
> >(II) clearly, in both talks and articles. Glenn, thank you for your fine
> >summary of Shannon information, which I knew in principle, but could not
> >have formulated as well as you did.
> Thank you for the kind words.
> >It [Shannon information] corresponds to (I), and to what
> >C.J. Hogan discussed in the paper I cited. I agree that, in itself, this
> >has nothing to do with semantics, meaning or function denoted by concept
> >(II). But it denotes an absolute upper limit of the amount of semantic
> >information that can be transmitted or stored in a given system. And
> >this is one of the points I wanted to make in my post.
> I could not agree that (I) places a limit on the transmission of semantic
> information. Semantic information simply isn't related to (I). Suppose you are
> a commander in my army and I give you a book with detailed plans on what to
> do and where to go under 3 different conditions, plan A, plan B and plan C.
> And I tell you that I will 3 days from now send you on your computer a % for
> plan A, a * for plan B and an R for plan C. Each symbol stands for or 'means'
> the entire relevant plan. When I send you that signal, I have sent you lots
> and lots of semantic information but very little Shannon information.
This only works because you first give me the book, which contains all
the relevant semantic information. With the signal, you just send me
log2(3) bits (about 1.58) of information, not lots. You want to keep the signal small
in order to transmit it fast, therefore it cannot carry all the semantic
information you want me to have for executing your plan, so you transmit
the large amount of information beforehand and make the signal nothing
but a pointer to one of the 3 large texts you transmitted beforehand.
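
The size of such a pointer signal can be checked with a short sketch,
assuming the three plans are equally likely. The function name is
hypothetical, and the book size used for comparison is an invented
figure, not something stated in this exchange.

```python
import math


def signal_bits(n_choices: int) -> float:
    """Shannon information of selecting one of n equally likely messages."""
    return math.log2(n_choices)


# One symbol chosen from three (plan A, B or C).
pointer = signal_bits(3)
print(round(pointer, 3))  # 1.585

# Hypothetical book of 200,000 characters at 8 bits each: an upper bound
# on what had to be transmitted beforehand for the pointer to be useful.
book_capacity = 200_000 * 8
print(book_capacity)  # 1600000
```

The signal selects among texts already delivered; it does not carry them.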
> Because of this private agreement for meaning, one can't quantify it. And
> unless one can quantify it, he can't quantify your 'biologically relevant
I agree that any semantic information or meaning depends on language
conventions agreed upon beforehand. But the only reason you cannot
easily quantify it is linguistic ambiguity (synonymous words, phrases,
sentences, paragraphs, ... errors, imprecision, ...). And with a
biological function/molecule, you have the additional uncertainties
about when, where, how much natural selection helped or hindered its
evolution. The amount it helped is in fact the amount of information
transmitted from the environment. The only part of its evolutionary
trajectory that is somewhat easier to judge is the beginning, where the
function under consideration had not yet developed at all, so that
natural selection with respect to this function was impossible.
> It is the same problem as trying to determine which of the following
> sequences has meaning.
> ni ru gua wo shou bu de bu dui jiao wo hao hao?
[I skip some of your long "message"]
> If you can tell which has meaning, then you can determine biological
Which meaning? Which functionality? What language or code? I.e. I agree
that meaning or biological functionality is not derivable from the
sequence alone, but must be found by the knowledge of the language or
> >The semantic, meaningful, or functional information (II) is extremely
> >difficult to define properly for natural systems, as Howard correctly
> >points out.
> >-- The term "semantic" indicates that it is coded in DNA, in analogy to
> >a language.
> >-- The term "functional" indicates that it provides a specification for
> >a function, or what a biological macromolecule, complex, or other system
> >part will _DO_, as Howard emphasizes.
> >-- The term "meaningful" indicates a teleological view which designates
> >the effect of this function in the context of the whole organism.
> Function is only between 2 or more molecules, not necessarily within a
> whole system. A protein catalyzes a certain reaction regardless of whether
> it is in an organism or not.
In simple cases, yes. But not always. Some proteins and functions no
longer work in vitro. But, yes, the functional information is in the
molecule, regardless of whether it is active or not.
> >While it is easy to compute the amount of "information" (I), as Glenn
> >has shown, different factors make it difficult to estimate an amount of
> >"information" (II).
> >(1) Synonymy: different molecular structures or molecules may have the
> >same effect, such that it doesn't matter which one is used.
> >(2) Redundancy: different operational pathways may salvage a system in
> >case one of them is damaged.
> >(3) Ecology: depending on the current environment, a given function may
> >or may not be needed, or may have different selective values.
> >(4) Population dynamics: population size and time may determine the
> >survival of a given feature.
> >(5) Microevolutionary accessibility: different sequence configurations
> >may be more or less easily reached by a mutational random walk.
> >(6) Robustness: depending on its location in sequence space, a
> >macromolecule may be more or less apt to survive during evolution.
> >These factors happen to come to mind at present ... there may be more.
> >I'm sure biologists will be able to point out others.
> >Now, is this information (II) perhaps equal to zero, such that it can be
> >neglected entirely? Then a virtual infinity of viable evolutionary paths
> >would be possible, and it would be certain that life evolved wherever
> >the conditions are not extremely inimical. In this case, E.L. Shock,
> >whom I also quoted, probably wouldn't consider it to be "a major
> >challenge" to find feasible ways leading from the ubiquitous small
> >organic "building-blocks of life" to living systems. This is the second
> >point I wanted to make in my last post.
> Information is related to entropy and must have a p log(p) format. Can you
> derive such a thing for what you define as information II? If you can't,
> then it isn't any form of information. It may be something else, but just
> not information.
It is not (unqualified) Shannon information. It is information in
the same way as a written paragraph contains information. It tells us
something. Therefore I call it semantic information.
H.P. Yockey, "Information theory and molecular biology" (Cambridge
University Press, 1992, ISBN 0-521-35005-0) has defined the information
content of the iso-1-cytochrome c family of enzymes with respect to their
common functionality (excluding the unknown additional information due
to species-specific requirements). He took into account genetic code
degeneracy and mutational transition probabilities between amino acids
equivalent with respect to enzyme function in this protein family. By
defining sequence similarity in terms of "mutual entropy" in p log(p)
format, he linked semantic information with information potential for
the given sequence length.
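
For readers who want to see the p log(p) form in use, here is a minimal
sketch of the per-site Shannon entropy. It is not Yockey's calculation,
which also handles code degeneracy and transition probabilities. The
function name, the site tolerating four residues equally, and the use of
"log2(20) minus entropy" as a per-site information estimate are all
assumptions made only for illustration.

```python
import math


def shannon_entropy(probs):
    """H = -sum(p * log2(p)) over the non-zero probabilities, in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Unconstrained site: any of 20 amino acids, equally likely.
h_max = shannon_entropy([1 / 20] * 20)
print(round(h_max, 2))  # 4.32

# Hypothetical constrained site: only 4 residues are functionally
# equivalent, each equally likely.
h_site = shannon_entropy([0.25] * 4)
print(h_site)  # 2.0

# Assumed estimate of information at this site: the reduction in entropy.
print(round(h_max - h_site, 2))  # 2.32
```

Under these assumptions a fully unconstrained site contributes nothing,
and a site that tolerates only one residue contributes the full
log2(20) bits.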
> As to multiple pathways. I recall back in the early 1960s, the argument
> against the chance formation of a peptide was based upon the chance of
> finding a single sequence out of all sequence space. So for oxytocin, an 8
> amino acid peptide, the chance of making human oxytocin was 1 in 8^20 or 1
> in 10^18.
We had better write 1 in 20^8, or about 4 in 10^11.
> Of course they would use a 100 unit long peptide and have 20^100
> or 1 in 10^130. and give an indignant conclusion as to how could anyone
> believe such odds. But as we have learned things over the past 30 years, we
> have brought those numbers down to 1 in 10^40 because we know now that more
> than one sequence can perform the same task.
1 in 10^40 is still much too small to be of any use.
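
The sequence-space figures quoted above are easy to check. A minimal
sketch, assuming 20 possible amino acids at each position; the function
name is hypothetical.

```python
import math


def log10_sequence_space(alphabet_size: int, length: int) -> float:
    """Base-10 logarithm of alphabet_size ** length."""
    return length * math.log10(alphabet_size)


# 8 positions, 20 amino acids each: 20^8, not 8^20.
print(20 ** 8)                                  # 25600000000
print(1 / 20 ** 8)                              # 3.90625e-11, about 4 in 10^11
print(round(log10_sequence_space(20, 8), 2))    # 10.41

# The transposed figure 8^20 is what gives roughly 10^18.
print(round(log10_sequence_space(8, 20), 2))    # 18.06

# A 100-residue peptide: 20^100 is about 10^130.
print(round(log10_sequence_space(20, 100), 1))  # 130.1
```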
> And experiments by Gerald
> Joyce, Jack Szostak and others, show that functionality in a test tube full
> of RNA is found at a rate of 1 in 10^14 to 1 in 10^18. That is observation.
We have discussed this before, without coming to an agreement. RNA is
not protein. And artificial selection in vitro is not natural selection
in vivo - or even random mutational walks with no selection.
> So the problem isn't nearly as bad as apologists have been saying for years
> and years.
It will be fascinating to watch possible progress towards interesting
-- Dr. Peter Ruest, CH-3148 Lanzenhaeusern, Switzerland <firstname.lastname@example.org> - Biochemistry - Creation and evolution "..the work which God created to evolve it" (Genesis 2:3)