Re: [asa] writing DNA...

From: Preston Garrison <>
Date: Mon Sep 29 2008 - 16:45:24 EDT

>Are there two different things going on here? One is encoding names
>in synthetic DNA sequences like Venter did. The other is
>discovering a code in natural proteins that is the identical
>sequence as their names, like Preston's team did?
>We've occasionally heard the ID analogy of pouring a bowl of
>alphabet cereal and finding one's name. Looks like it happened in
>the case of DNA?!? Now we know that the intelligent agent
>responsible for DNA knew English. No evidence of a name in Hebrew,
>is there? Or Akkadian?


Our assignment of letters to the amino acids is completely arbitrary
as far as chemistry is concerned, so you could find just about any
short name in any language by assigning the letters from the language
to the 20 amino acids in the right way. There is a huge amount of
protein sequence in the database now - I haven't checked lately, but
the nucleotide total is perhaps in excess of a trillion nucleotides.
Somewhere in billions of amino acids of sequence you'll find the name
just by chance (whatever chance really is). Of course if a long name
in some exotic language had more than 20 different letters in it, you
couldn't code it. And above a certain length (about 9-10 amino acids
if I've done the arithmetic right in my head) the odds start to be
against any particular random sequence being present in the database.
(There are 20^10 ~= 10^13 possible 10 a.a. sequences.) Because
protein sequences are not random and there are many closely related
sequences in the database, the number of different 10 a.a. sequences
in the database is actually much less than these simple calculations
would suggest.
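
For anyone who wants to check the head arithmetic, here is a rough Python
sketch. It assumes a uniformly random, independent sequence model (which,
as noted above, real proteins do not follow) and uses a Poisson
approximation for the chance of at least one match. The function names and
the database size are made up for illustration; the size just takes the
"trillion nucleotides" guess and divides by three, as if every nucleotide
were coding, so it is a generous upper bound.

```python
import math

AMINO_ACIDS = 20  # size of the standard amino acid alphabet


def expected_hits(k, n_positions):
    """Expected count of one specific k-residue sequence among
    n_positions start positions, assuming uniform random residues."""
    return n_positions / AMINO_ACIDS ** k


def prob_present(k, n_positions):
    """Poisson approximation to P(at least one occurrence).
    Treats overlapping start positions as independent."""
    return -math.expm1(-expected_hits(k, n_positions))


def break_even_length(n_positions):
    """Smallest k for which the expected number of hits drops below 1."""
    k = 1
    while AMINO_ACIDS ** k <= n_positions:
        k += 1
    return k


if __name__ == "__main__":
    # Hypothetical database size: 10^12 nucleotides / 3 per residue.
    n = 10**12 // 3
    print("break-even length:", break_even_length(n))
    for k in range(6, 12):
        print(k, f"{expected_hits(k, n):.3g}", f"{prob_present(k, n):.3g}")
```

Because of the redundancy in the database, the effective number of
positions is smaller than this, so the real break-even length would be
somewhat shorter than the sketch reports.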

This actually has made me think of a question that I hadn't
considered before. Are there short sequences that statistically
should occur in proteins, but don't because their structure would be
so inimical to all protein functions that they would be universally
strongly selected against? I doubt that any short sequence could
have that strong an effect, but it would be interesting if there were
any such sequences.
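
One way to start looking would be to list every possible k-residue
sequence and subtract the ones actually observed. The sketch below does
that for a toy list of sequences; the function names and the example input
are hypothetical, and reading a real database file is left out. Brute-force
enumeration is only practical for small k (20^5 is already 3.2 million
candidates), and an absent sequence only means something if the random
model says it should have turned up many times.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes
_VALID = set(AMINO_ACIDS)


def observed_kmers(sequences, k):
    """Collect every k-residue window seen in the given sequences,
    skipping windows that contain non-standard or ambiguous codes."""
    seen = set()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if all(c in _VALID for c in kmer):
                seen.add(kmer)
    return seen


def absent_kmers(sequences, k):
    """Return all possible k-residue sequences that never occur."""
    seen = observed_kmers(sequences, k)
    candidates = ("".join(p) for p in product(AMINO_ACIDS, repeat=k))
    return [kmer for kmer in candidates if kmer not in seen]


if __name__ == "__main__":
    toy = ["ACDA", "MKVLA"]  # made-up sequences, not real proteins
    missing = absent_kmers(toy, 2)
    print(len(missing), "of", 20 ** 2, "dipeptides absent from the toy set")
```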


