# RE: Design detection and minimum description length

From: Iain Strachan (iain.strachan@eudoramail.com)
Date: Fri Dec 01 2000 - 22:12:44 EST

Thanks, Adrian, for that thoughtful summary of the discussion, which
gives much to chew on.

I'd like, if I may, to respond briefly to one of your points; the
other points need more careful thought & things are busy at the
moment. I've a short trip up to Edinburgh this week to (finally!)
collect my PhD (seven years, plus doing a full-time job; not a
recommended procedure for staying sane :-), so I will not be able to
respond promptly to any further points that may follow on this week.

>
>Iain offers the example of a "perfect deal" in a game of bridge.
>Anyone who has sees this situation would become highly suspicious,
>BUT, only if that person has some knowledge of card games and
>numbers. And that is precisely Glenn's point. For a person who has
>never seen a pack of cards, and don't know anything about written
>numbers, the "perfect deal" is unintelligible and random.
>

Yes, this is what I understand as well to be Glenn's problem with the
whole thing. You need to have "side knowledge" in order to detect a
design (or a pattern, to use a less loaded word). Your maths teacher
told you about prime numbers, and that is the side knowledge you need
to detect a pattern in a sufficiently long sequence of coin-tosses
that is based on prime numbers (e.g. "The Nth toss is heads if N is
the product of exactly two primes, otherwise it is tails"). Of
course you need to know about prime numbers to get this pattern. But
I don't see that as equivalent to someone telling you that this
sequence was designed. You were told about prime numbers, and then
you used your own intelligence to detect the pattern based on primes.
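
To make the coin-toss rule concrete, here is a minimal Python sketch of such a sequence. It assumes "the product of exactly two primes" counts repeated factors (so 4 = 2 x 2 qualifies); the function names are illustrative only.

```python
def prime_factor_count(n: int) -> int:
    """Count the prime factors of n, with multiplicity (12 = 2*2*3 gives 3)."""
    count = 0
    d = 2
    while d * d <= n:
        while n % d == 0:
            n //= d
            count += 1
        d += 1
    if n > 1:  # whatever is left over is itself a prime factor
        count += 1
    return count


def toss_sequence(length: int) -> str:
    """Toss N (1-based) is 'H' when N is a product of exactly two primes."""
    return "".join(
        "H" if prime_factor_count(n) == 2 else "T"
        for n in range(1, length + 1)
    )


print(toss_sequence(40))
```

Without the rule, the output looks like an irregular run of heads and tails; with it, the whole sequence is reproduced by a few lines.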

Dembski also addresses this point in "No Free Lunch" (p63):

"The SETI researchers [in the film Contact] who inferred that an
extraterrestrial intelligence was communicating with them needed to
know about prime numbers before they could ascertain that an incoming
sequence of radio signals representing an ascending sequence of prime
numbers was in fact just that and therefore due to another
activity of another intelligence by the judicious exploitation of
background knowledge".

I really can't (sorry, Glenn) see any problem with this, or how one
can go from this to saying that the method has failed because the
researchers had to be told it was designed.

>Iain's examples of detecting mathematical relationships/correlations
>seems irrelevant to me. What would perhaps be a better analogy would
>be the detection of causality. As in the case of attempting to
>detect design, one may be able to say that an event (no pattern) is
>so improbable that we have to reject the null and conclude that
>there is a pattern, just as in detecting a correlation, one
>concludes that the null hypothesis is so improbable that we reject
>it and therefore conclude that there is a relationship. But it is an
>entirely different matter to go from pattern to design, which would
>be analogous to going from relationship (correlation) to causation.
>

Perhaps I can explain why I think it is relevant. I think the step of
going from correlation to causation comes down to how complex the
specification is, and this is close to the heart of how I currently
understand the Dembski Complexity/Specificity criterion, and the
"Explanatory Filter". For design, according to this, we must first
have a sufficiently long sequence that its probability is very low.
This constitutes what he calls "complexity". However, that is not
enough; one must also be able to describe the data in a much shorter
length than the string itself. "This is a sequence of primes" for
example. This gets one around the retort "OK, you've found an
interesting pattern, but how many other interesting patterns could
you find? I bet you could make any pattern 'interesting' with a
little ingenuity". The reply to this is the counting argument I've
used earlier. If the specifier string is M bits long, then there are
at most 2^M patterns of similar interest (by 'interesting' I now
mean, "can be described in a compact way"). Then
if the string itself is N bits long and N >> M, we have a low
probability (at most about 2^(M-N)) of that pattern (or any like it)
occurring.
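
The counting argument can be put into numbers with a short sketch. It assumes, as the null hypothesis, that all 2^N strings are equally likely; the function name and the sample values of N and M are illustrative only.

```python
from fractions import Fraction


def describable_fraction_bound(n_bits: int, m_bits: int) -> Fraction:
    """Upper bound on the fraction of n_bits-long strings that an
    m_bits-long specifier could pick out.

    There are at most 2**m_bits distinct specifiers and 2**n_bits
    possible strings, so the bound is 2**(m_bits - n_bits), capped at 1.
    """
    return min(Fraction(1), Fraction(2 ** m_bits, 2 ** n_bits))


# Illustrative values: a 100-bit string with a 20-bit description.
bound = describable_fraction_bound(n_bits=100, m_bits=20)
print(bound == Fraction(1, 2 ** 80))
```

The bound depends only on the difference N - M, which is why a long string with a short description is the case of interest.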

But there is a third box to the filter, called "contingency". The
argument here is "but what if it has to be that way via a natural
cause". This is covered by the case where the specifier string
itself is so short that it could be attributed easily to a natural
cause.

Such a string might be a third degree polynomial that fits nicely
through a set of 20 (x,y) points. One has found correlation there,
but no one would suspect design to be at work.

But suppose the following happened. I sent you a dataset of 10,000
(x,y) points, with x monotonically increasing, and y wiggling about
all over the place, apparently at random. Then you decided to
analyse it by fitting higher and higher degree polynomials, and you
used my minimum description length test to choose the best model.
(i.e. you calculate the bit-length of the model, proportional to the
model order, plus the number of bits you need to transmit the
residuals). Suppose you find that actually the best fit comes when
you try a degree 1000 polynomial. Suppose the numbers are all
digitised at 16 bits, and the residuals you get from the model can
all be expressed in 8 bits or less (i.e. the model always gets it
right to within around 0.5 percent). So the description length will
be:

16 x 1001 = 16016 bits (for the coefficients of a degree 1000 polynomial),

+ 10000 x 8 = 80000 bits (for the residuals)

This comes to 96016 bits.
The original data (the y points) would be 160000 bits long, so the
difference in length is 63984 bits. So the probability that you
could do as well as this given the null hypothesis (no
correlation/causation) is about 2^-63984.
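
The arithmetic can be checked with a few lines of Python. This is only a sketch of the bookkeeping, using the figures from the example (16-bit coefficients, 8-bit residuals, 10,000 points); the function and variable names are illustrative.

```python
def description_length_bits(degree: int, n_points: int,
                            coeff_bits: int, residual_bits: int) -> int:
    """Bits for the model (degree + 1 coefficients) plus bits for the residuals."""
    model_bits = coeff_bits * (degree + 1)
    residuals_bits = residual_bits * n_points
    return model_bits + residuals_bits


raw_bits = 16 * 10_000  # the y values sent as they are, 16 bits each
mdl_bits = description_length_bits(degree=1000, n_points=10_000,
                                   coeff_bits=16, residual_bits=8)
saving_bits = raw_bits - mdl_bits

print(mdl_bits)     # model plus residuals
print(saving_bits)  # bits saved relative to the raw data
```

By the counting argument, a saving of S bits corresponds to a probability of about 2^-S under the null hypothesis, here 2^-63984.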

Now, how do we get to the idea that this is design rather than
correlation? The IDer would argue (I think) that it is because the
description string is so long itself. How many physical processes do
we know that are governed by degree 1000 polynomials? I think if you
received such a dataset from me, you'd suspect that I deliberately
arranged it by generating the data from a polynomial of degree 1000.
Would the conclusion be different if you didn't get it from me, but
it arrived from outer space?

Here's another example that I kind of like. We know that the
description "all heads" is much simpler than the description of a
sequence of coin tosses governed by some relationship involving prime
numbers.

There is a party trick that the mathematician John H. Conway (of
"game of life" fame) performs. I imagine it has won him many a drink
in a bar. He describes it as being able to "cheat probability". He
takes around 20 American 1-cent coins, and balances them all
carefully on their edges on a table. What are the odds you can make
them all come down heads? Roughly 1 in a million. He taps the
underside of the table with just sufficient force to get them all to
topple, and they all come down heads! (I guess this trick requires a
large amount of practice to bring off). Now, for such a simple
description (all heads), there could well be a simple naturalistic
explanation. There is. It is all down to the slight asymmetry in
the milling of the coins, which biases the way each one will fall if
just toppled. But now supposing you performed a different trick.
Suppose you stuck labels on the "heads" sides of all the coins, and
wrote the numbers 1 to 20 on them. Then you tapped the
table and at the end, the only numbers showing were primes.

I think anyone who saw this done, if it could be done repeatedly,
would suspect a cheat; such a feat might be performed by a magician
who cheated (i.e. designed it that way), but it couldn't be down to
natural causes. The difference is down to the complexity of
describing the pattern of heads and tails.
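
For the labelled-coin trick, a short sketch shows the target pattern and the odds against it. It assumes that exactly the prime-labelled coins land label-up and, for the odds, that each coin is fair and independent (which, as noted above, real coins toppled from their edges are not); the names are illustrative.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test, adequate for small labels."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


labels = range(1, 21)  # the numbers written on the heads sides
target = "".join("H" if is_prime(n) else "T" for n in labels)
primes_showing = [n for n in labels if is_prime(n)]
odds_against = 2 ** len(labels)  # equally likely outcomes for 20 fair coins

print(target)
print(primes_showing)
print(odds_against)
```

Both "all heads" and "primes only" are single outcomes out of the same 2^20, so what separates them is the length of the rule needed to state the pattern, not the raw odds.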

Hope this offers some more food for thought.