The Origin Codes
But Seriously
The Origin Codes webpage was written tongue-in-cheek. I expect it
was cheesy enough to be obvious. It was created to show just how
easily one can find skip codes, and tight-fitting clusters of
related words that seem to predict events that occurred subsequent
to the publication of the original text. Provided you have a
computer to do the grunt work for you, and a large enough document
to process, you can find just about anything if you use a little
care. In fact, though it may look as though I put a great deal of
effort into the Origin Codes, it was remarkably simple and required
very little work.
Not only was the effort on my part minimal, but the computer program
was written in such a way that it barely scratched the surface in
terms of examining the superabundance of hidden skip codes in the
text. All of the skip codes and clusters in the Origin Codes were
found by a computer program that limited itself to skip distances
between 2 and 50,000 letters (in order that the program would run
faster). In a text over a million characters in length, this
constraint means that only a small fraction of the available letter
sequences were even brought into consideration by the computer in
order to find the codes and clusters I've presented. Had I made full
use of the resources, it's quite likely that better codes and tighter
clusters would have been found.
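For readers who want to see what such a search involves, here is a minimal Python sketch of a skip-code counter. It is not the program used for the Origin Codes. The function names, the letters-only normalization, and the forward-only matching are all assumptions made for illustration.

```python
def normalize(text):
    """Keep only the letters A-Z, uppercased (an assumed preprocessing step)."""
    return "".join(ch for ch in text.upper() if "A" <= ch <= "Z")


def count_skip_codes(text, word, min_skip=2, max_skip=50):
    """Count forward occurrences of `word` spelled out at a fixed skip distance."""
    letters = normalize(text)
    target = normalize(word)
    if not target:
        return 0
    total = 0
    for skip in range(min_skip, max_skip + 1):
        span = (len(target) - 1) * skip  # distance from first to last letter
        for start in range(len(letters) - span):
            if all(letters[start + i * skip] == target[i]
                   for i in range(len(target))):
                total += 1
    return total


# "Do no arm" becomes DONOARM, where D, N, A sit two letters apart.
print(count_skip_codes("Do no arm", "DNA", min_skip=2, max_skip=2))  # 1
```

The work grows with the length of the text multiplied by the number of skip distances tried, which is why capping the skip distance makes a search like this run faster.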
There are several important pieces of information that were
deliberately left out of the Origin Codes in order to exaggerate
the "effect" of the codes.
1)
A great many of the skip code words shown in the grid images occur
an astronomical number of times in the text, so that finding one
instance that is clustered tightly
with the other words in a cluster is trivial. This is especially
true for words that are very short, and for words containing
letters that are very common in the English language such as R, S,
T, L, N, and E. Consider the following: the text from Darwin's book
is 1,009,229 characters long. Using an even harsher skip distance
constraint than the one mentioned above (2 to 5,000 letters instead
of 2 to 50,000), the word "LEE" can be found 7,309,040 times
- that's more than seven times the number of letters in the bare
text! The number of ways to find skip codes is truly mind-boggling,
and makes the task far easier than one might first presume. A reader
who fails to grasp the sheer vastness of this search space will
likely overestimate the significance of any
words and clusters discovered. As a final illustration of this point,
here are two images of the Human Genome Project grid from the Origin
Codes. The first is the cluster I presented in the link off the main
page. It contains a single instance of "DNA" highlighted
in bright red. The second image shows the same grid, but all instances
of "DNA" which fit inside have been highlighted. It should
be clear from this example that the placement of "DNA" in
the original image is more akin to an artistic choice than anything
with some deeper meaning. The same is true of longer words, although
there are fewer occurrences of them. If one picks a few words, the
computer can find the closest-fitting group. Given the number of
possibilities, that grouping is often very tight.
The Human Genome Project Cluster from the Origin Codes
The Human Genome Project Cluster, with all instances of the
skip code "DNA" highlighted. If you can't piece them all
together, it's because of extensive letter overlap.
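To get a rough feel for why a count in the millions is unsurprising, the sketch below counts how many (start, skip) positions are available to a three-letter word and multiplies by the chance that random letters would spell it. The letter frequencies are assumed, approximate values for English in general (not measured from Darwin's text), the 1,009,229 figure is treated as a count of letters, letters are treated as independent, and only forward matches are considered. The function names are hypothetical.

```python
def candidate_positions(text_len, word_len, min_skip, max_skip):
    """Number of (start, skip) pairs where a word of word_len letters could fit."""
    total = 0
    for skip in range(min_skip, max_skip + 1):
        span = (word_len - 1) * skip  # distance from first to last letter
        total += max(0, text_len - span)
    return total


def expected_hits(word, letter_freq, text_len, min_skip, max_skip):
    """Expected forward hits if letters were independent with these frequencies."""
    p = 1.0
    for ch in word:
        p *= letter_freq[ch]
    return p * candidate_positions(text_len, len(word), min_skip, max_skip)


# Assumed, approximate English letter frequencies (not taken from the text).
ASSUMED_FREQ = {"L": 0.040, "E": 0.127}

positions = candidate_positions(1_009_229, 3, 2, 5_000)
estimate = expected_hits("LEE", ASSUMED_FREQ, 1_009_229, 2, 5_000)
print(positions, round(estimate))
```

Under these assumptions there are roughly five billion candidate positions for a three-letter word, and the forward-only estimate for "LEE" lands in the low millions, the same order of magnitude as the count quoted above.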
2)
The second important piece of information that has been left out is that
I tell the computer exactly what to look for. I tell it precisely which
words to search out, and it finds the smallest clustering of those words
that it can. Darwin's text, through the computer program, is not telling
you or me anything about events of historical significance that have
transpired or will transpire in the outside world (well, the skip codes
aren't telling us anything - the actual text is certainly telling us a lot
about biology!). The only information the computer is providing is
the size and location of the cluster that has been requested, if
there is one to be found. Everything else comes from me.
It's up to me to determine not only what constitutes an interesting
group of words, but exactly what those words are. It's as if I were
to scan the first billion digits of pi to find my own birthday. Sure,
I might find it, but no significance can be attached to that discovery.
For example: when searching for the Yitzhak Rabin clusters, I had
originally tried to cluster the words "Rabin" and "Yitzhak",
but the computer found no instances of the latter. This wasn't a
serious problem; I simply compensated by finding both "Yigal"
and "Amir", and all the other related words. The artificially
inflated impression of significance was not diminished, especially
since I didn't mention that "Yitzhak" is nowhere to be found. It's also
completely up to me to decide which words are "related" to
a historical event. Instead of looking at a cluster and
declaring "Gee, look what the computer found!", a more
appropriate comment would be "Gee, look what this guy told the
computer to search for!". Sure, it's a lot less exciting, but
that's the point.
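The division of labor described above can be sketched in a few lines of Python. This is a simplified, one-dimensional stand-in, not the program behind the Origin Codes: the real clusters are measured on a two-dimensional grid, whereas this sketch just looks for the shortest stretch of text containing one hit for each requested word. The function names are hypothetical, the text is assumed to be uppercase letters only, and the brute-force search is only practical for small inputs.

```python
from itertools import product


def find_hits(text, word, min_skip=2, max_skip=50):
    """Return (first_index, last_index) for every forward skip-code hit of word."""
    hits = []
    for skip in range(min_skip, max_skip + 1):
        span = (len(word) - 1) * skip
        for start in range(len(text) - span):
            if all(text[start + i * skip] == word[i] for i in range(len(word))):
                hits.append((start, start + span))
    return hits


def tightest_cluster(text, words, min_skip=2, max_skip=50):
    """Pick one hit per word so the covered stretch of text is as short as possible.

    Returns (stretch_length, chosen_hits), or None if some word has no hits.
    """
    all_hits = [find_hits(text, w, min_skip, max_skip) for w in words]
    if any(not hits for hits in all_hits):
        return None  # the human would now simply pick a different word
    best = None
    for combo in product(*all_hits):  # brute force over every combination
        lo = min(h[0] for h in combo)
        hi = max(h[1] for h in combo)
        length = hi - lo + 1
        if best is None or length < best[0]:
            best = (length, combo)
    return best


print(tightest_cluster("CXAXTXDXOXG", ["CAT", "DOG"]))
```

Note that the word list is an input. The program only reports where the requested words sit closest together, and returns nothing at all when a word (like "Yitzhak") cannot be found.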
3)
The clusters that have been presented for specific historical
events are by no means the only ones present in the text. If I
were to use different words to "describe" an event, I'd
find different clusters of different sizes, and in different
locations. Even if I were to stick with the same group of words,
there are other clusters to be found - the computer simply tries
to find the smallest one. The fact that a cluster can be found for
an event is not so much a matter of luck as it is a matter of
perseverance on my part - if no cluster is found for a particular
set of words, I need only try another set of words. Eventually a
set can be found that will
cluster nicely. Take, for example, the assassination of Yitzhak
Rabin. I presented the following two clusters in the Origin Codes...
Amir, Assassin, Rabin, Dead, Mid, East [24x23]
Yigal, Amir, Assassin, Rabin, Tel, Aviv, Middle, East [32x97]
Those are almost certainly not the best clusters I could have
found (the second one is quite large). Had I chosen different
words, however, I could have presented other clusters. For example,
by adding a few more words, I could have built upon the following one...
Rabin, To Be Shot
Or maybe this one...
Yigal, Amir, To Kill, Rabin
Or there's this one...
Rabin, Gunned, Down
And this one...
Rabin, Killed, By Amir
And this one too...
Rabin, Shot, By Amir, It Might Well Happen, Extinction Of Life
I could go on indefinitely here, continually producing
variations on this theme, which reminds me...
Rabin, Death, Gunshot, Continually Recurring, Variations, Can Go On Indefinitely
Another approach would be to find clusters that seem to
contradict historical events. Like this one...
Rabin, Lives, To Age, Ninety, And Then Pass Away, No Doubt This Has Occurred
Although what I've presented here is by no means a mathematical disproof of the Bible Codes and Torah
Codes, I think I've shown that, at the very least, they're a lot less impressive than they first
seem. These clusters are everywhere in a large enough text, and with the right computer program, they're
easy to find. By picking and choosing which words to cluster, one can find "predictions" of
just about anything one likes. Given the enormous number of possible sets of words that one might use to
"describe" an event, the different tenses of verbs, plural vs. singular forms, first and last
names, nicknames, acronyms, etc., it shouldn't be surprising that these tight-fitting groups can be
found. For a much better treatment of the issues and problems with Bible and Torah codes, and for a more
technical and mathematical debunking, visit Brendan McKay's page on the subject or the page
from the New Mexicans for Science and Reason. There are two good articles on the CSICOP website
regarding this issue as well.
I guess I'll end this here, with a few more
clusters I found in the text - just for fun ;)
France, Halts, Nuke, Tests, Oh-Six -- a prediction maybe? :P
Bush, Elected, Twice, Too, Bad
Bible, Codes, Pretty, Silly
James, Randi, Prize, Safe, For, Long, Time
That last one refers to this :)
NOTE: For my fellow computer geeks out there: here is the Origin of Species
text file that contains all these clusters. If anyone would like the code-hunting
computer program I wrote, just email me (lee AT stellaralchemy DOT com). I spent
very little time on it, so it's not especially user-friendly (otherwise I'd just
post it up on the web).