The Origin Codes
"Nine Trees"
The Nine Trees Cluster
On March 4th of 2006 I found the following very tight
cluster of nine types of trees in the Origin of Species...
Spruce, Peach, Birch, Apple, Cherry, Maple, Pecan, Orange, Beech, They Are Rendered, Much More Close
[89x40 = 3,560]
The exact location of each tree name is given in the table below.
| Tree | Start index (base 0) | Skip |
|---|---|---|
| Spruce | 685,769 | 8,200 |
| Peach | 866,146 | 16,368 |
| Birch | 636,617 | 24,599 |
| Apple | 702,177 | 24,585 |
| Cherry | 620,239 | 10 |
| Maple | 612,050 | 24,596 |
| Pecan | 620,290 | 8,184 |
| Orange | 816,969 | 16,385 |
| Beech | 612,070 | 8,197 |
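A start index and a skip are enough to read a name back out of the text: take the letter at the start index, then every skip-th letter after it. Here is a minimal sketch of that lookup. The function name `read_els` and the toy string are made up for illustration, and I'm assuming the indexes refer to a text that has already been reduced to a plain run of letters.

```python
def read_els(text: str, start: int, skip: int, length: int) -> str:
    """Return the letters at start, start + skip, start + 2*skip, ...

    `start` is a base-0 index, matching the "Start index (base 0)" column.
    A negative skip reads backwards through the text.
    """
    last = start + (length - 1) * skip
    if skip == 0 or start < 0 or last < 0 or last >= len(text):
        raise ValueError("sequence falls outside the text")
    return "".join(text[start + i * skip] for i in range(length))


# Toy text, not Darwin's: "OAK" is hidden at index 2 with a skip of 3.
toy = "xxOxxAxxKxx"
print(read_els(toy, start=2, skip=3, length=3))  # OAK
```

With the real text loaded as one string, a call such as `read_els(origin_text, 620239, 10, 6)` would be the check for the Cherry row, provided the text was prepared the same way it was for the search.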
The cluster is 89 characters wide and 40 characters high. In it is written "THEY ARE RENDERED MUCH MORE CLOSE".
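For the curious, the width and height of a cluster can be worked out from the start indexes and skips once the text is wrapped into rows of a fixed width. The sketch below shows one way to do that. The row width used for the display above isn't stated here, so `row_width` is a hypothetical parameter, `bounding_box` is a name I made up, and the numbers in the example are toy values rather than the real ones.

```python
def bounding_box(sequences, row_width):
    """Return (width, height) of the smallest rectangle holding every letter.

    `sequences` is an iterable of (start, skip, length) tuples.
    `row_width` is the number of characters per row of the wrapped text
    (an assumption: the page above does not state the value it used).
    """
    rows, cols = [], []
    for start, skip, length in sequences:
        for i in range(length):
            # Position of the i-th letter when the text is wrapped into rows.
            row, col = divmod(start + i * skip, row_width)
            rows.append(row)
            cols.append(col)
    return max(cols) - min(cols) + 1, max(rows) - min(rows) + 1


# Toy layout with 10 characters per row: one diagonal word, one horizontal.
print(bounding_box([(12, 11, 3), (25, 1, 4)], row_width=10))  # (7, 3)
```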
I decided to see if that's truly the case. I checked other
large texts of equal or greater length for the same cluster
of trees. In each case I trimmed the text down to its first
1,009,229 characters so that it matched the length of
The Origin of Species exactly, and found the best cluster of
the nine trees listed above, according to the formula below.
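The trimming step might look something like the sketch below. One loud assumption: I'm treating "characters" as letters only, with case, spaces and punctuation thrown away first, which is the usual way texts are prepared for skip searches but isn't spelled out above. The names `prepare` and `ORIGIN_LENGTH` are illustrative.

```python
ORIGIN_LENGTH = 1_009_229  # length of The Origin of Species, as given above


def prepare(raw: str, length: int = ORIGIN_LENGTH) -> str:
    """Reduce `raw` to upper-case ASCII letters and keep the first `length`."""
    # Assumption: only the letters A-Z count as characters.
    letters = "".join(ch for ch in raw.upper() if "A" <= ch <= "Z")
    if len(letters) < length:
        raise ValueError("text is shorter than the requested length")
    return letters[:length]


print(prepare("It is, 1 tree!", length=6))  # ITISTR
```

Texts shorter than the target length are rejected rather than padded, since only texts of equal or greater length were compared.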
In order to compare the results from different texts, I
devised the following score formula for each cluster:
score = W * H * max(W/(2H), 2H/W)
W * H is the area of the cluster (width times height) and
max(W/(2H), 2H/W) is a factor that reflects the degree to which
the cluster deviates from an aesthetic 2:1 aspect ratio. The
further the enclosing rectangle's aspect ratio deviates from 1:1,
the more likely it becomes to contain a given cluster anyway, but
I'll post more on that in the near future. This extra factor makes
the score more closely approximate what one might intuitively
consider a "tight" cluster than the raw area alone would. It
penalizes clusters with aspect ratios that are far from 2:1. As
you'd expect, a lower score means a better cluster. I didn't break
ties based on area, so when I ran a search on The Origin of Species
for the cluster, it found one slightly larger in area, but with the
same score as the one shown above.
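Here is the score formula as a short sketch. Multiplying it out, W * H * W/(2H) is W^2/2 and W * H * 2H/W is 2H^2, so the score is simply the larger of those two numbers. The code uses that simplified form to keep the arithmetic exact. The function name `score` is mine.

```python
def score(w: int, h: int) -> float:
    """Cluster score W * H * max(W/(2H), 2H/W); lower is better.

    Algebraically the same as max(W^2 / 2, 2 * H^2), which is what is
    computed here so that no rounding error creeps in.
    """
    return float(max(w * w / 2, 2 * h * h))


print(score(89, 40))  # 3960.5  (the cluster shown above)
print(score(89, 42))  # 3960.5  (taller, larger in area, same score)
print(score(91, 49))  # 4802.0  (here the height term dominates)
```

The first two lines show why the tie can happen: once the width is more than twice the height, only the width affects the score, so an 89 x 40 cluster and an 89 x 42 cluster come out equal.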
Here are the results of that experiment...
| Text | Author(s) | Cluster dimensions (W x H) | Score |
|---|---|---|---|
| The Origin of Species | Charles Darwin | 89 x 42 | 3,960.5 |
| The Descent of Man | Charles Darwin | 92 x 42 | 4,232.0 |
| The History of England | David Hume | 97 x 42 | 4,704.5 |
| A Treatise of Human Nature | David Hume | 91 x 49 | 4,802.0 |
| War and Peace | Leo Tolstoy (trans: Louise & Aylmer Maude) | 102 x 47 | 5,202.0 |
| Ulysses | James Joyce | 103 x 49 | 5,304.5 |
| Wealth of Nations | Adam Smith | 103 x 51 | 5,304.5 |
| The Count of Monte Cristo | Alexandre Dumas | 105 x 51 | 5,512.5 |
| Summa Theologica Part I-II | Thomas Aquinas | 98 x 53 | 5,618.0 |
| Shakespeare's First Folio/35 Plays | William Shakespeare | 109 x 54 | 5,940.5 |
| The KJV Bible | Unknown | 109 x 56 | 6,272.0 |
Not too bad for a cooked list.. er.. um, I mean a cluster I found
near the words "They Are Rendered Much More Close". It's
interesting (and meaningful?) to note that the runner-up is also a
book by Darwin, and that the book that came in dead last is the
KJV Bible.
These results have prompted me to work a bit harder at evaluating
the "goodness" of a cluster, so that I can come up with
experiments potentially much more impressive than this one. I will
post more on that front in the future.