# Re: Information: Brad's reply (was Information: a very

Glenn R. Morton (grmorton@waymark.net)
Sat, 27 Jun 1998 17:04:22 -0500

At 05:06 PM 6/27/98 +0800, Brad Jones wrote:
>>>------------------------------------------------------------
>>>Glenn,
>>>A DNA sequence of AAAAATAAAA will output this each and every
>>>time eg: AAAAATAAAA AAAAATAAAA AAAAATAAAA
>>>AAAAATAAAA
>>>
>>>A zero memory source with the probabilities given would produce
>>>something like: AATAAAAAAA AAAAATAAAA AAAAAAATAA
>>>ATAAATAAAA
>>
>>

>>I believe that you are mixing the way memory works in such systems. Zero
>>memory usually applies to a Markov chain which doesn't use the previous
>>character to determine the next.
>
>A Markov source is defined as a source where the next symbol is dependent
>on one or more previous symbols. A zero memory source is the opposite of
>this. It is therefore impossible by definition to have a markov source where
>the next symbol is not dependent on the previous.
>

This is not correct. It is indeed possible to have a Markov chain which does
not depend upon the previous symbol. It is a special case of Markov
matrices. A Markov chain is a probability matrix in which the state of some
system is sequentially followed by other states with a given probability.
For instance in English the letter q is followed by u with a probability of
100%. The matrix representing the change from state q to the next letter
would look like:

from \ To state
state \ a b c ....t u v...z

q 0 0 0.....0 1 0....0

To simplify things, consider the 4 nucleotides, A, T, C, G. In the DNA
sequence one can set up a probability matrix

from \ To next symbol in sequence
state \ A T C G
A | a1 a2 a3 a4
T | t1 t2 t3 t4
C | c1 c2 c3 c4
G | g1 g2 g3 g4

Where a1+a2+a3+a4 =1; t1+t2+t3+t4=1; etc. This is because A must be
followed by some letter so the sum of the probabilities must be 1.
If you have a transition probability matrix (Markov matrix) with values
(here in percent) like:

from \ To next symbol in sequence
state \ A T C G
A | 3 7 55 35
T | 15 23 12 50
C | 67 10 3 20
G | 44 32 5 18

The presence of the state C is highly correlated with the next letter being
A. Conversely, in this case the presence of A is highly correlated with the
next letter being C. Such a transition matrix should give lots of
ACACACAC's. The system has memory and remembers what the last character
was and behaves accordingly.
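
A rough sketch of how such a matrix behaves, assuming the percentages in the
table above are used as sampling weights. The names `TRANSITIONS`, `generate`
and `pair_fraction` are made up for this illustration, and the starting
letter, sequence length and seed are arbitrary choices:

```python
import random

SYMBOLS = "ATCG"

# Transition percentages copied from the matrix above (rows: current symbol,
# columns: next symbol in the order A, T, C, G). The G row sums to 99 as
# given; random.choices treats the values as relative weights, so it still works.
TRANSITIONS = {
    "A": [3, 7, 55, 35],
    "T": [15, 23, 12, 50],
    "C": [67, 10, 3, 20],
    "G": [44, 32, 5, 18],
}

def generate(length, start="A", seed=0):
    """Generate a sequence where each symbol depends only on the previous one."""
    rng = random.Random(seed)
    seq = [start]
    while len(seq) < length:
        weights = TRANSITIONS[seq[-1]]
        seq.append(rng.choices(SYMBOLS, weights=weights)[0])
    return "".join(seq)

def pair_fraction(seq, first, second):
    """Fraction of the occurrences of `first` that are followed by `second`."""
    followers = [b for a, b in zip(seq, seq[1:]) if a == first]
    return followers.count(second) / len(followers) if followers else 0.0

sequence = generate(20000)
print(sequence[:40])
print("fraction of A followed by C:", round(pair_fraction(sequence, "A", "C"), 2))
print("fraction of C followed by A:", round(pair_fraction(sequence, "C", "A"), 2))
```

The observed fractions should land close to the 55% and 67% entries in the
table, which is the "memory" of the previous character showing up in the
output.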

But by definition "The persistence of memory is a function of the
determinism of the system, because a completely deterministic system
'remembers' forever, whereas a random system has no memory at all." John C.
Davis, Statistics and Data Analysis in Geology (New York: John Wiley,
1973), p. 285

What is a random system referred to by Davis above? It is a Markov
probability matrix with the following components:

from \ To next symbol in sequence
state \ A T C G
A | 25 25 25 25
T | 25 25 25 25
C | 25 25 25 25
G | 25 25 25 25

Any letter is equally likely to be followed by any other letter. That is
what Yockey was saying in Brian Harper's post. The above transition
probability matrix is a special Markov matrix with absolutely no memory of
the previous state because it doesn't matter what the previous letter is,
the next letter is randomly selected from the 4 possibilities. It is
mathematically a Markov matrix.
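
One way to check the "no memory" claim numerically, offered only as a sketch:
compute how much knowing one letter tells you about the next (the mutual
information between adjacent symbols) for a given transition matrix. The
helper names below are hypothetical, and the stationary distribution is
approximated by simple repeated multiplication:

```python
from math import log2

def normalize(rows):
    """Turn rows of percentages (or any weights) into probabilities."""
    return [[v / sum(row) for v in row] for row in rows]

def stationary(matrix, iterations=200):
    """Approximate the long-run symbol frequencies by power iteration."""
    n = len(matrix)
    dist = [1.0 / n] * n
    for _ in range(iterations):
        dist = [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]
    return dist

def adjacent_mutual_information(matrix):
    """Bits of information the current symbol carries about the next one."""
    n = len(matrix)
    pi = stationary(matrix)
    nxt = [sum(pi[i] * matrix[i][j] for i in range(n)) for j in range(n)]
    mi = 0.0
    for i in range(n):
        for j in range(n):
            joint = pi[i] * matrix[i][j]
            if joint > 0:
                mi += joint * log2(joint / (pi[i] * nxt[j]))
    return mi

# The equal-probability matrix and the correlated matrix from this post.
uniform = normalize([[25, 25, 25, 25]] * 4)
correlated = normalize([
    [3, 7, 55, 35],
    [15, 23, 12, 50],
    [67, 10, 3, 20],
    [44, 32, 5, 18],
])

print("uniform matrix:   ", adjacent_mutual_information(uniform), "bits")
print("correlated matrix:", adjacent_mutual_information(correlated), "bits")
```

For the equal-probability matrix the result is zero: the previous letter
tells you nothing about the next one, even though the object is still a
perfectly ordinary transition matrix.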

When you stated above in your original post:
>>>A DNA sequence of AAAAATAAAA will output this each and every
>>>time eg: AAAAATAAAA AAAAATAAAA AAAAATAAAA
>>>AAAAATAAAA
>>>
>>>A zero memory source with the probabilities given would produce
>>>something like: AATAAAAAAA AAAAATAAAA AAAAAAATAA
>>>ATAAATAAAA

I was puzzled. I think I now know what you are mixing up. A DNA sequence
in a reproductive system has 2 dimensions. There is the dimension of the
sequence itself and on the other axis is the dimension of what is passed on
to the offspring in the next generation. It is the generation axis.

Generation
Axis
5
4
3
2
1
0 AAAAATAAAA
Sequence axis

When you state that the DNA sequence will output this each and every time
you are referring to the generational axis. What is passed from parent to
offspring. It would look like:

Generation
Axis
5 AAAAATAAAA
4 AAAAATAAAA
3 AAAAATAAAA
2 AAAAATAAAA
1 AAAAATAAAA
0 AAAAATAAAA
Sequence axis

But in our original notes on information theory, both Brian and I were
talking about the Sequence axis. Information is measured along the
sequence axis, not per se the generational axis.
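
As a sketch of what "measured along the sequence axis" can mean in practice,
the snippet below estimates Shannon entropy per symbol from the letter
frequencies within a single sequence. This is a simplifying assumption (a
zero-order estimate on a very short string), and `entropy_per_symbol` is a
name invented for this example:

```python
from collections import Counter
from math import log2

def entropy_per_symbol(seq):
    """Shannon entropy in bits per symbol, from single-letter frequencies."""
    total = len(seq)
    return sum(-(c / total) * log2(c / total) for c in Counter(seq).values())

for seq in ("AAAAAAAAAA", "AAAAATAAAA"):
    print(seq, round(entropy_per_symbol(seq), 3), "bits per symbol")
```

Under this estimate the all-A sequence scores 0 bits per symbol and the
sequence with a single T scores about 0.469 bits per symbol.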

Thus when I pointed out that the sequence AAAAAAAAAA had zero information
content, and that the mutation to AAAAATAAAA represented an increase in
information, that holds because we are talking about the sequence axis, not
the generation axis. But even putting it into your terminology, the output
(generational axis) of the DNA sequence AAAAAAAAAA is not always AAAAAAAAAA
but occasionally is AACAAAAAAA or AAAAATAAAA. There is a generational Markov
matrix that is something like:

from \ To next symbol in next generation
state \ A T C G
A | .9999999997 .0000000001 .0000000001 .0000000001
T | .0000000001 .9999999997 .0000000001 .0000000001
C | .0000000001 .0000000001 .9999999997 .0000000001
G | .0000000001 .0000000001 .0000000001 .9999999997

This generational transition matrix allows the next generation to avoid
very many mutations. But along the sequence axis, not the generational
axis, the Markov matrix is such that each letter is independent of the
previous choice, or:

from \ To next symbol in sequence
state \ A T C G
A | 25 25 25 25
T | 25 25 25 25
C | 25 25 25 25
G | 25 25 25 25
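
To put rough numbers on the generational axis, here is a small sketch that
assumes each letter is copied independently and that a letter changes to each
of the three other letters with probability .0000000001 per generation, as in
the illustrative generational matrix above. The function name and parameters
are hypothetical:

```python
from math import expm1, log1p

# Assumed per-letter chance of any change in one generation:
# three possible substitutions at 1e-10 each.
P_CHANGE_PER_LETTER = 3e-10

def prob_sequence_changed(length, generations=1):
    """Chance that at least one letter differs after copying `generations` times.

    Counts any copying event that alters a letter, assuming independent letters.
    Uses log1p/expm1 to stay accurate for such tiny probabilities.
    """
    copies = length * generations
    return -expm1(copies * log1p(-P_CHANGE_PER_LETTER))

print("10 letters, 1 generation:     ", prob_sequence_changed(10))
print("10 letters, 1e6 generations:  ", prob_sequence_changed(10, 10**6))
```

The point is only that the output along the generational axis is nearly, but
not exactly, a perfect copy, which is a separate question from how the letters
are distributed along the sequence axis.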

>>>
>>You are using a 10th degree Markov chain and that is not what DNA is. Brian
>>would you care to comment on this?
>
>I just proved that it is at least a 10th order and you claim it isn't with
>no supporting evidence (let alone mathematical analysis!) ?!?
>
Actually you mixed up the generational axis with the sequence axis. You
haven't proved it yet.

>>A zero memory Markov chain is what a random sequence is.
>
>There is no such thing as a zero memory markov chain. But a random sequence
>is a zero memory source, this has no relevance to the topic though.
>

See above. Are you trying to say that equal probabilities for all choices
is not a matrix? Or are you saying that the name of the matrix isn't
Markov? If the latter then you are really playing a semantic game.

>>Maybe you should tell this to Hubert Yockey. But I don't think he would
>>agree with you. By the way, DNA is not like a CD. There are mutations in
>>DNA and they can add information.
>
>I don't know who Yockey is but this is what Engineers are taught the world
>over in relation to information theory since Shannon and Wiener invented it
>in 1948.
>

Yockey is only one of the leading figures of information theory as applied
to biology. Your lack of familiarity with him shows that you haven't
looked at the biological problem very closely. Here are some of Yockey's
publications:

1956 "An application of information theory to the physics of tissue damage,

1958 "A Study of aging, thermal killing and radiation damage by information
theory," Symposium on Information Theory in Biology ed. H. P. Yockey, R.
Platzman and H. Quastler, pp 297-316

1974 "An application of information theory to the Central Dogma and the
sequence hypothesis" Journal of Theoretical Biology 46:369-406--a must read

1977 "A prescription which predicts functionally equivalent residues at
given sites in protein sequences," Journal of Theoretical Biology 67:337-343

1977 "On the Information Content of Cytochrome C" Journal of Theoretical
Biology 67:345-376

1977 "A calculation of the probability of spontaneous biogenesis by
information theory" Journal of Theoretical Biology 67:377-398--This is one
that will fit into your father's preconceptions.

1978 "Can the Central Dogma be derived from Information Theory?" Journal of
Theoretical Biology 74:149-152

1979 "Do overlapping genes violate molecular biology and the theory of
evolution?" Journal of Theoretical Biology 80:21-6

1981 "Self Organization origin of life scenarios and information theory"
Journal of Theoretical Biology 91:13-31

Information Theory and Molecular Biology, (New York: Cambridge University
Press, 1992)

When you have familiarized yourself with Yockey's work, then you will be
ready to discuss the issue.

>reasons why DNA is similar to a CD in terms of info theory:
>
>1. is a channel for encoded information.
>
>2. outputs a set sequence of codes repeatedly.
>
>3. can be replicated.
>
>4. random errors/mutations can occur in replication process.
>
>on what grounds do you object to this comparison?

You are using the generational axis and we were using the sequence axis in
our calculation of information. That is why your analogy is flawed.

>
>If it is only the fact that mutations (in your opinion) add information
>where errors in CD replication does not. Well, this is precisely the point
>being debated and so it is obviously not a valid argument.

Actually if you mutate the digits on a CD some of the mutations will add
information and some will remove information. Both those that add and
those that subtract may remove the message, but the informational content
of a sequence is not the same as the message content.
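
A small sketch of that point, using single-symbol frequency entropy as the
measure of informational content (an assumption made for illustration; the
bit strings and helper names are invented):

```python
from collections import Counter
from math import log2

def entropy_per_symbol(bits):
    """Shannon entropy in bits per symbol, from the frequencies of 0 and 1."""
    total = len(bits)
    return sum(-(c / total) * log2(c / total) for c in Counter(bits).values())

def flip(bits, index):
    """Return a copy of `bits` with the digit at `index` inverted."""
    flipped = "1" if bits[index] == "0" else "0"
    return bits[:index] + flipped + bits[index + 1:]

uniform = "0000000000"   # all digits the same
balanced = "0101010101"  # equal numbers of 0 and 1

gain = entropy_per_symbol(flip(uniform, 5)) - entropy_per_symbol(uniform)
loss = entropy_per_symbol(flip(balanced, 1)) - entropy_per_symbol(balanced)

print("change after mutating the uniform string: ", round(gain, 3))
print("change after mutating the balanced string:", round(loss, 3))
```

By this measure the same kind of single-digit mutation raises the value for
one string and lowers it for the other, regardless of whether either string
carried a usable message.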

>
>Do you have any objections actually based on why they are different in
>relation to information theory?

See my last paragraph. Familiarize yourself with Yockey's work!!!!!

>
>Is anyone interested what a realistic analysis of the problem would show if
>done correctly as an information channel?
>
>>>The mutations of DNA seem analogous to the errors encountered
>>>when copying a CD which is quite easily modeled by a correct
>>>application of information theory.
>>>
>>>By doing it this way it is possible to model the random mutations and
>>>the effect they have on the information, ie the difference they make to
>>>the information content as opposed to the actual information content.
>>>The measure of this is called the mutual information of a channel.
>>>
>>>I hope this clears it up somewhat, it is quite difficult to explain this in
>>>easy terms and I would recommend finding a good textbook if you
>>>really want to pursue this.
>>
>>
>>I was about to make the same recommendation to you.
>
>Sorry Glenn but I know what I am talking about here.

And you haven't heard of Yockey????????

>Further more you have not refuted my calculations with anything but your
>personal opinion which does not seem to be based on any knowledge of the
>material you are discussing.
>
>You misunderstand the basics of information theory if you think that random
>noise consists of or can create information in any form whatsoever.

You are equivocating on the word 'information' as meaning semantics. My son
is an EE and you guys talk about fidelity of the MESSAGE being conveyed.
But that is not the informational content of a sequence. Yockey states:

"One must know the language in which a word is being used: 'O singe
fort' may be read in French or German with entirely different meanings. The
reader may find it amusing to list all the words in languages that he knows
that are spelled the same but have different meanings. For example a
German-speaking visitor to the United States might have all his suspicions
about America confirmed when he finds that there is a Gift shop in every
airport, hotel and shopping center." The message 'mayday, mayday' may mean
a distress signal, a Bolshevik holiday or a party for children in the
spring, all depending on the context.

...

"The examples cited above show that the meaning of a sequence of symbols in
natural languages is subject to the arbitrary agreement between source and
receiver. This question of meaning is best left to philosophers, linguists
and semanticists. The communications engineer, in designing his equipment,
need not concern himself with the meaning of the sequence of symbols.
Indeed, no humans may be involved. The message may be, for example, a
computer communicating data to another machine or a spacecraft sending data
from which pictures of other planets will be made."
"A great deal of arbitrariness is also found in the sequences that carry
specificity in a protein, as we shall see in Chapter 6. In fact, as I
shall show in Chapter 9, there are 9.737 x 10^93 iso-1-cytochrome c
sequences that differ in at least one amino acid, each carrying the same
specificity. Like a good communications system, the genetic information
storage and transmission apparatus is independent of the specificity of the
genetic messages. It deals with specificity of the genetic messages only
through the information those messages carry." Yockey, Information Theory
and Molecular Biology, p. 59

And

"Information theory shows that it is fundamentally undecidable whether a
given sequence has been generated by a stochastic process or by a highly
organized process. This is in contrast with the classical law of the
excluded middle (tertium non datur), that is, the doctrine that a statement
or theorem must be either true or false. Algorithmic information theory
shows that truth or validity may also be indeterminate or fundamentally
undecidable."~Hubert Yockey, Information Theory and Molecular Biology,
(Cambridge: Cambridge University Press, 1992), p. 81-82.

This last means that you cannot possibly tell whether a given sequence has
MEANING created by a HUMAN or whether it is random gibberish. Show me the
algorithm that will tell these two apart. You can't and no one else can.
This is not my opinion but Yockey's.

>The fact that the next symbol is random does not imply in any way that it
>is random values that are being produced. You should look into source
>coding to see what is actually meant by the symbol probabilities of a source.
>
>Information theory uses words such as random in a very different way than
>what a layman means. For example If you are sending english down your modem
>then I would model that as a random source, but it is NOT random noise that
>is being sent, it is english text that any randomness would corrupt not
>enhance. In fact, as common sense suggests, any random modification of the
>signal will ALWAYS reduce the information.

I absolutely agree that randomness would korupt thu missage won wunts ta
sind. But by making new arbitrary definitions about the meanings of a word,
the language evolves. Meaning is not information by the definition of
information theory. Why do you think we of English descent don't still
speak Latin? Mutations to the spoken language were given new arbitrary
definitions. The language mutated but it wasn't destroyed. Meaning is not
the same as information.

>
>Additional information on Markov sources can be found in any mathematics
>book that deals with random processes (this is quite heavy going if you are
>not up on your probability theory).

Let's not start the ad hominem attacks again.

>--------------------------------------------