Re: Human Genome May be Longer than Expected

Stephen E. Jones (
Tue, 10 Aug 1999 21:43:34 +0800


On Mon, 02 Aug 1999 20:18:29 -0400, Francis Maloney wrote:

FM>I am curious about this. How could they use sampling techniques? If they
>used them on correspondence of elements, we would be one hundred percent
>equal with chimps, if they used it on frequencies of nucleic acids, again
>probably one hundred percent correspondence. If they randomly sampled
>actual genes, the correspondence would be very low because the frequency of
>any individual gene is very low, perhaps one in 3 billion. If they
>compared sequences in known genes there is an element of bias in the
>selection process; and then they would need to claim one to one
>correspondence, gene for gene, human to chimps to extrapolate the results.
>This fact or myth that we share 98 percent of the genetic make-up of chimps
>comes up often and is a powerful argument, at least on the surface, for
>evolution. I'd like to know more about it.

I have just read King & Wilson's 1975 paper in SCIENCE where the
"humans share 98% (or 99%) of their genes" claim apparently started. Here
is an excerpt from my post to another list I am on regarding what I found.
(Apologies for any overlap with what I have already posted on the

Briefly, King and Wilson did not actually say that we share 98% (or 99%)
of our *genes* with chimps. They said that, in samples of 19 and 44
*proteins*, the human and chimp versions are on average 99% identical.
They cited two earlier studies indicating that we differ by 1.1% (and
therefore share 98.9%) of our DNA with chimps. But King and Wilson seem
to actually doubt that we *do* differ by 1.1% (share 98.9%) of our DNA
with chimps.

It is entirely possible, in view of the experience of recent genomes
sequenced, and the major physical and mental differences between humans
and chimps, that when the whole human and chimp genomes are fully
sequenced that humans will have many novel genes.


First, it has been pointed out to me privately...that my
"billion more genes" was a mistaken interpretation. The article said
"chemical units", by which is meant *nucleotide base-pairs*, not genes.
The number of extra genes was not actually given in the article, but it
does say that "The human genome may be larger than
previously thought..." and that "the number of human genes is likely to lie
at the higher end of the range usually quoted -- 60,000 to 100,000".

The usual previous estimate for the human genome was 100,000 genes:
"The human genome contains very roughly 100 000 genes..." (White A.,
"The greatest apes", New Scientist, 15 May 1999), so an
additional "one-third" could mean an extra 33,000 genes!

Second, I want to make it quite clear that I have no problem in principle
with humans and chimps sharing a common ancestor, and therefore having
99% of their DNA in common. My problem is whether it is a *factual* claim
that humans and chimps share 99% (or 98%) of their DNA.

I have read both King & Wilson's original 1975 study (King M.-C. &
Wilson A.C., "Evolution at two levels in humans and chimpanzees",
Science, 188: 107-116, 1975), and what Gribbin & Cherfas say in their
book "The Monkey Puzzle" about King & Wilson's use of the DNA
hybridisation/annealing method to arrive at their 99% shared DNA with
chimps claim.

Gribbin and Cherfas explain that the process used for comparing the
percentage difference between different species' DNA was DNA
hybridisation (aka DNA annealing):

"It would still be nice to make that final step and come down from the
amino acid sequence to the DNA itself, but just as the fingerprint technique
is intermediate between immune response and amino acid sequence, so too
there is a technique that is intermediate between amino acid sequence and
nucleotide sequence. It is called DNA hybridisation.

The idea behind DNA hybridization is simplicity itself. We separate the
double helix of one species' DNA into its component single complementary
strands. We do the same to the DNA of the species we wish to compare it
with. Then we mix all the separated strands together. Where the two
species have identical sequences along their DNA, the complementary
bases from the two different species will be able to come together and
join...we would also find in our sample an equal amount of DNA in which
one strand had come from one sample and the other from the other, but
unless we had previously labelled one of the lots of DNA we would not be
able to tell which double-stranded molecule was which. If one sample had
been made highly radioactive, and if we could count the radioactivity of
each newly-formed double helix we would find that a quarter of the
molecules were as radioactive as the original sample, and another quarter
were not at all radioactive. If the two samples were very similar, so that
long stretches of each did indeed contain almost the same sequence of
bases, then they would still be able to form so-called hybrid molecules...

What we then want to know is the extent of the similarity, the percentage
of the two DNAs that is in fact identical. To do this we make use of the
fact that it is only the binding between individual pairs of complementary
bases that holds the two strands together. The more bonds there are, the
harder it will be to separate the strands of the hybrid DNA, and to measure
the strength of the bonding between the strands we need only put in energy
to overcome the binding energy and see when the strands drift apart. What
we are really doing is finding the melting point of the DNA. Solids are held
together by bonds between their component atoms. Inject energy into the
solid, in the form of heat, and you break the bonds so that the solid is free
to become a liquid. The energy needed to break the molecular bonds is the
melting point of the substance. To get back to DNA, if the two strands are
totally complementary they will be held together with a certain strength and
it will take a certain amount of energy, a temperature of around 85 degrees
Celsius, to separate them. If the strands are only partially complementary
the force holding them together will be weaker and it will take less energy,
a lower temperature, to make them drift apart. Impure DNA, in the sense
of being made of non-identical strands, has a lower melting point than pure
DNA, just as impure water has a lower melting point (freezes at a lower
temperature) than pure water...

The first step, as we've said, is to purify the DNA from the two species.
Then, one species' DNA is labelled with a radioactive tag, usually the
radioactive isotope of iodine; this does not affect the way the strands work,
but simply acts as a tracer that enables researchers to follow the single
strands from that species. With the labelling done, the two sets of DNA are
mixed and slowly heated. At around 85 degrees Celsius the bonds between
opposite bases, which normally hold the strands together, are broken, and
the strands drift apart. Now the mixture is allowed to cool slowly so that
heteroduplex molecules can form from the two species of DNA. Once the
mixture is cooled the few remaining single strands are removed and the
business of measuring the melting point begins. The temperature is raised
by about one degree and the dissociated DNA removed and assayed for
radioactivity with an improved version of the old-fashioned geiger counter.
Then the temperature is raised another notch and the next lot of single
strands removed and counted. A repeated series of counts at steadily
increasing temperatures produces a so-called dissociation curve, the peak
of which represents the melting point of the hybrid DNA. This can then be
compared directly with the melting point of a pure hybrid, that is DNA
from the target species heated and allowed to recombine, so that any quirks
due to the heating processes and so on are evened out. The size of the
difference in melting points between heteroduplex and normal DNA is
directly related to the dissimilarity between the two strands. A difference of
one degree Celsius is roughly equivalent to a difference in one per cent of
the DNA; one in a hundred of the nucleotides are not identical in two
species that show a melting point depression of one degree Celsius....

Heteroduplex DNA made from Pan troglodytes and Homo sapiens melts at
a temperature just one degree Celsius below that of pure Homo sapiens
DNA. Ninety-nine out of a hundred bases are identical in man and the
chimp, which are not put in the same genus, or even the same family."

(Gribbin J. & Cherfas J., "The Monkey Puzzle", 1982, pp96-99)
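
As a rough illustration of the procedure Gribbin & Cherfas describe, the sketch below takes radioactivity counts recorded at each one-degree step, treats the peak of the dissociation curve as the melting point, and converts the melting point depression into a percentage difference using their rule of thumb of about one per cent per degree Celsius. The counts are invented purely for illustration (they are not data from any study), and the function names are my own.

```python
def melting_point(temps_c, counts):
    """Return the temperature at which the dissociation curve peaks.

    temps_c: temperatures (degrees C) at which dissociated strands were removed.
    counts:  radioactivity counted in the strands removed at each temperature.
    """
    if not temps_c or len(temps_c) != len(counts):
        raise ValueError("temps_c and counts must be non-empty and equal in length")
    peak_index = max(range(len(counts)), key=lambda i: counts[i])
    return temps_c[peak_index]


def percent_difference(hybrid_tm, reference_tm, percent_per_degree=1.0):
    """Convert a melting point depression into a percent sequence difference.

    percent_per_degree=1.0 is the rough rule quoted by Gribbin & Cherfas.
    """
    return (reference_tm - hybrid_tm) * percent_per_degree


# Hypothetical counts, made up for this example only.
temps = [80, 81, 82, 83, 84, 85, 86, 87, 88]
reference_counts = [5, 12, 30, 70, 140, 210, 120, 40, 10]  # reannealed single-species DNA
hybrid_counts = [8, 20, 55, 120, 205, 130, 50, 15, 4]      # two-species heteroduplex DNA

tm_reference = melting_point(temps, reference_counts)
tm_hybrid = melting_point(temps, hybrid_counts)
difference = percent_difference(tm_hybrid, tm_reference)

print(f"reference melts at {tm_reference} C, hybrid at {tm_hybrid} C")
print(f"estimated sequence difference: {difference:.1f}%")
```

Note that only strands that formed duplexes in the first place contribute to the curve, so the estimate says nothing about DNA that failed to pair at all.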

But Gribbin & Cherfas admit that DNA annealing is not as accurate as
actually counting the DNA nucleotide sequences themselves:

"DNA annealing is a very powerful technique, but like protein
fingerprinting it is a step away from the information we are really after. The
ultimate truth resides not in the melting point of mixed DNA molecules but
in the sequence of bases along the DNA itself. These are the dice that
evolution rolls, and here we will find the successful moves that set one
species off from another. Technology, again, is everything. If you've
managed to extract the particular bit of DNA you are interested in from a
variety of species, pure enough and in sufficient amounts to work with,
actually uncovering the sequence of nucleotides is probably the easiest part
of the whole business. That doesn't mean that it is simple, but such
advances have been made in the genetic engineer's toolkit that one can
think realistically of knowing the entire nucleic acid sequence of an
organism. Indeed this has already been achieved, notably for the virus
phiX174 whose complete DNA sequence occupied several pages of the
scientific journal Nature in 1980...

The entire DNA sequence of a virus is much shorter than a human genetic
blueprint, and molecular biologists are still a long way from being able to
write down the DNA sequence for a man, or any other mammal. But the
technique is essentially a refinement of the molecule chopping and
reconstruction techniques used to sequence proteins. DNA molecules are
cut into pieces, labelled with radioactive tracer atoms, the fragments
separated from one another by electrophoresis, and the sequence
reconstructed. Of course the work is more difficult than protein
sequencing, operating on a smaller molecular scale. It is rather as if we had
described how to make a grandfather clock and then said that making a
Cartier watch involved the same physics. It does, but our statement doesn't
do justice to the watchmaker's craft, just as we cannot do justice here to
the work of Wally Gilbert and Fred Sanger. For our purposes all that
matters is that once the complete sequence of nucleotides along the DNA
of different species is known, the evolutionary history of those species can
be explored right down at the very level at which evolutionary changes take
place."

(Gribbin J. & Cherfas J., 1982, pp100-101).

It seems to me there could be a major error here, in that the DNA
hybridisation/annealing method could go wrong because it assumes
a close correspondence between chimp and human gene loci. But
if humans had a large number of novel genes, these would presumably
be regarded as anomalous strands of non-matching fragments of DNA
and just discarded. Note what Gribbin & Cherfas say above:

"Once the mixture is cooled the few remaining single strands are removed and
the business of measuring the melting point begins."

How do they know that these "few remaining single strands" do not contain
novel human genes? King & Wilson's hybridisation/annealing method of removing
all DNA except that which is a close match between the human and chimp genome
seems *guaranteed* to produce such a close match!


King & Wilson's 1975 study itself has some noteworthy points:

1. The 2% molecular difference between humans and chimps is regarded as
being *too small*:

"The intriguing result, documented in this article, is that all the biochemical
methods agree in showing that the genetic distance between humans and
the chimpanzee is probably too small to account for their substantial
organismal differences...the genes of the human and the chimpanzee are as
similar as those of sibling species of other organisms...the paradox remains.
In order to explain how species which have such similar genes can differ so
substantially in anatomy and way of life..." (King M.-C. & Wilson A.C.,
"Evolution at Two Levels in Humans and Chimpanzees", Science, 11 April
1975, 188 (4184), 107-116, 107).

"However, with respect to genetic distances between species, the human-
chimpanzee D value is extraordinarily small, corresponding to the genetic
distance between sibling species of Drosophila or mammals (Fig. 4).
Nonsibling species within a genus (referred to in the figure as congeneric
species) generally differ more from each other, by electrophoretic criteria,
than humans and chimpanzees. The genetic distances among species from
different genera are considerably larger than the human-chimpanzee genetic
distance. The genetic distance between two species measured by DNA
hybridization also indicates that human beings and chimpanzees are as
similar as sibling species of other
organisms...In summary, the genetic distance between humans and
chimpanzees is well within the range found for sibling species of other
organisms. The molecular similarity between chimpanzees and humans is
extraordinary because they differ far more than sibling species in anatomy
and way of life. Although humans and chimpanzees are rather similar in the
structure of the thorax and arms, they differ substantially not only in brain
size but also in the anatomy of the pelvis, foot, and jaws, as well as in
relative lengths of limbs and digits. Humans and chimpanzees also differ
significantly in many other anatomical respects, to the extent that nearly
every bone in the body of a chimpanzee is readily distinguishable in shape
or size from its human counterpart. Associated with these anatomical
differences there are, of course, major differences in posture (see cover
picture), mode of locomotion, methods of procuring food, and means of
communication. Because of these major differences in anatomy and way of
life, biologists place the two species not just in separate genera but in
separate families. So it appears that molecular and organismal methods of
evaluating the chimpanzee-human difference yield quite different
conclusions." (King M.-C. & Wilson A.C., 1975, 113).

2. The 99% similarity between humans and chimps was based on *amino
acid* sequences in *proteins*, not on nucleic acid sequences in DNA:

"Sequence and immunological comparisons of proteins ... By applying the
microcomplement fixation method to large proteins, however, one can
obtain an approximate measure of the degree of amino acid sequence
difference between related proteins ... Based on the proteins listed in Table
1, the average degree of difference between human and chimpanzee
proteins is... 7.2 amino acid substitutions per 1000 sites. That is, the
sequences of human and chimpanzee polypeptides examined to date are, on
the average, more than 99 percent identical." (King M.-C. & Wilson A.C.,
1975, 107-108).

"Agreement between electrophoresis and protein sequencing...Therefore
the expected degree of amino acid difference between human and
chimpanzee is ... 8.2 substitutions per 1000 sites, with a range (within one
standard error) of 7.5 to 9.1 differences per 1000 amino acids. The
estimate based on amino acid sequencing and immunological comparisons
... agrees well with this estimate. Both estimates indicate that the average
human protein is more than 99 percent identical in amino acid sequence to
its chimpanzee homolog." (King M.-C. & Wilson A.C., 1975, 109,112).

"Amino acid sequencing, immunological, and electrophoretic methods of
protein comparison yield concordant estimates of genetic resemblance.
These approaches all indicate that the average human polypeptide is more
than 99 percent identical to its chimpanzee counterpart." (King M.-C.
& Wilson A.C., 1975, 114-115).
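
The step from "substitutions per 1000 sites" to "more than 99 percent identical" is simple arithmetic, sketched below. The figures are the ones quoted above from King & Wilson; the function name is my own, and I am assuming the 7.2 figure is meant as substitutions per 1000 amino acid sites, as the 8.2 figure explicitly is.

```python
def percent_identical(substitutions_per_1000_sites):
    """Convert amino acid substitutions per 1000 sites into percent identity."""
    return 100.0 * (1.0 - substitutions_per_1000_sites / 1000.0)


# Figures quoted from King & Wilson (1975); the labels are mine.
quoted = {
    "sequencing and immunological comparisons": 7.2,
    "electrophoresis, central estimate": 8.2,
    "electrophoresis, low end of range": 7.5,
    "electrophoresis, high end of range": 9.1,
}

for label, substitutions in quoted.items():
    identity = percent_identical(substitutions)
    print(f"{label}: {substitutions} per 1000 -> {identity:.2f}% identical")
```

Every one of these figures comes out above 99 percent, but they all describe amino acid sequences in the sampled proteins, not nucleotide sequences in DNA.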

3. The number of proteins compared was only 19 in the first
microcomplement fixation method and 44 in the second electrophoresis
method (King M.-C. & Wilson A.C., 1975, 108).

4. Comparison of DNA. King & Wilson are a bit vague here, referring to
studies of 1970 and 1972 by the same workers. The actual length of DNA
compared is not given. The difference of 1.1% is obtained by an arbitrary
selection from widely differing estimates of hybridisation dissociation
temperatures and calibration factors:

"Working with "nonrepeated" DNA sequences, Kohne [1970] has
estimated that human-chimpanzee hybrid DNA dissociates at a temperature
(/\T) 1.5 degrees C lower than the dissociation temperature of reannealed
human DNA. Hoyer et al., [1972] on the other hand, have estimated that
/\T equals 0.7 degrees C for human-chimpanzee hybrid DNA. If /\T is the
difference in dissociation temperature of reannealed human DNA and
human-chimpanzee hybrid DNA prepared in vitro, then the percentage of
nucleic acid sequence difference is k X /\T where the calibration factor k
has been variously estimated as 1.5, 1.0, 0.9, or 0.45. Based on k being 1.0
and /\T being 1.1 degrees C, the nucleic acid sequence difference of human
and chimpanzee DNA is about 1.1 percent." (King M.-C. & Wilson A.C.,
1975, 113).
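
To see how much the result depends on which estimates are picked, the sketch below applies the k X /\T formula to every combination of the values King & Wilson quote. Only the numbers come from the excerpt above; the variable and function names are my own.

```python
from itertools import product

# Values quoted by King & Wilson (1975, 113).
DELTA_T_ESTIMATES_C = [1.5, 0.7]      # Kohne [1970]; Hoyer et al. [1972]
K_ESTIMATES = [1.5, 1.0, 0.9, 0.45]   # calibration factor k


def sequence_difference(k, delta_t):
    """Percent nucleic acid sequence difference, computed as k x delta T."""
    return k * delta_t


# Every pairing of a quoted k with a quoted delta T, sorted by the result.
all_estimates = sorted(
    (sequence_difference(k, dt), k, dt)
    for k, dt in product(K_ESTIMATES, DELTA_T_ESTIMATES_C)
)
lowest = all_estimates[0]
highest = all_estimates[-1]

# The combination King & Wilson actually use in the excerpt.
chosen = sequence_difference(1.0, 1.1)

for diff, k, dt in all_estimates:
    print(f"k={k:<4} dT={dt:<3} -> {diff:.2f}% difference")
print(f"range: {lowest[0]:.2f}% to {highest[0]:.2f}%; chosen values give {chosen:.2f}%")
```

With the quoted values the result runs from about 0.3% to 2.25%. The /\T of 1.1 used for the headline figure is the midpoint of the two quoted measurements, which is an arithmetic observation rather than something the excerpt states.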

5. Indeed, King & Wilson say that the difference cannot be this small,
because the genetic code is redundant (the same proteins can be produced
by differing genes) and because there are noncoding sections of DNA that
do not produce functional proteins:

"The evidence from the DNA annealing experiments indicates that there
may be more difference at the nucleic acid level than at the protein level in
human and chimpanzee genomes. For every amino acid sequence difference
observed, about four base differences are observed in the DNA... There are
a number of probable reasons for this discrepancy. First, more changes may
appear in DNA than in proteins because of the redundancy of the code and
consequently the existence of third-position nucleotide changes which do
not lead to amino acid substitutions...In addition, many of the nucleic acid
substitutions may have occurred in regions of the DNA that are not
transcribed and are therefore not conserved during evolution. Proteins
analyzed by electrophoresis, sequencing, or microcomplement fixation
techniques, on the other hand, all have definite cellular functions and may
therefore have been conserved to a greater extent during evolution." (King
M.-C. & Wilson A.C., 1975, 113).

6. King and Wilson conclude by trying to explain away the result! They
point out that there are major differences in the arrangement of genes on
the human and chimp chromosomes:

"Second, the order of genes on a chromosome may change owing to
inversion, translocation, addition or deletion of genes, as well as fusion or
fission of chromosomes. These gene rearrangements may have important
effects on gene expression, though the biochemical mechanisms involved
are obscure...Although humans and chimpanzees have rather similar
chromosome numbers, 46 and 48, respectively, the arrangement of genes
on chimpanzee chromosomes differs from that on human chromosomes.
Only a small proportion of the chromosomes have identical banding
patterns in the two species. The banding studies indicate that at least 10
large inversions and translocations and one chromosomal fusion have
occurred since the two lineages diverged. Further evidence for the
possibility that chimpanzees and humans differ considerably in gene
arrangement is provided by annealing studies with a purified DNA fraction.
An RNA which is complementary in sequence to this DNA apparently
anneals predominantly at a cluster of sites on a single human chromosome,
but at widely dispersed sites on several chimpanzee chromosomes. The
arrangement of chromosomal sites at which ribosomal RNA anneals may
also differ between the two species." (King M.-C. & Wilson A.C., 1975,

And they again draw attention to the discrepancy between the genotypic
and phenotypic differences between humans and chimps:

"...The genetic distance between humans and chimpanzees, based on
electrophoretic comparison of proteins encoded by 44 loci is very small,
corresponding to the genetic distance between sibling species of fruit flies
or mammals. Results obtained with other biochemical methods are
consistent with this conclusion. However, the substantial anatomical and
behavioral differences between humans and chimpanzees have led to their
classification in separate families. This indicates that macromolecules and
anatomical or behavioral features of organisms can evolve at independent
rates...A relatively small number of genetic changes in systems controlling
the expression of genes may account for the major organismal differences
between humans and chimpanzees. Some of these changes may result from
the rearrangement of genes on chromosomes rather than from point
mutations." (King M.-C. & Wilson A.C., 1975, 115).

I don't know if King & Wilson's study has been updated, but based on my
reading of their original 1975 paper, it seems to be both incorrect and
misleading to say that `humans and chimps share 98% (or 99%) of their
genes'.

In view of discoveries of novel genes in those few genomes which have
been mapped to date and the news that the human genome may contain
tens of thousands more genes than originally thought, it may well be that a
complete sequencing of the human and chimpanzee genome will reveal a
large number of novel genes in the human genome compared to the chimp

The larger the number and the more novel these genes are the greater the
problem for Darwinism and the stronger the argument for supernatural
intervention in the creation of humans.

"Indeed nothing remains except a tactic that ill-befits a grand
master...namely to blow thick pipe tobacco-smoke into our faces. The
tactic is to argue that although the chance of arriving at the biochemical
system of life as we know it is admitted to be utterly minuscule, there is in
Nature such an enormous number of other chemical systems which could
also support life that any old planet like the Earth would inevitably arrive
sooner or later at one or another of them. This argument is the veriest
nonsense, and if it is to be imbibed at all it must be swallowed with a jorum
of strong ale...So far from there being very many indistinguishable
chemical possibilities, it seems that we have an exceedingly distinguishable
system, the best." (Hoyle F. & Wickramasinghe C., "Evolution from
Space", [1981], Paladin: London UK, 1983, reprint, p25)