The Nine Trees Cluster

On March 4th of 2006 I found the following very tight cluster of nine types of trees in the Origin of Species...

The exact location of each tree name is given in the table below.
The cluster is 89 characters wide and 40 characters high. In it is written "THEY ARE RENDERED MUCH MORE CLOSE". I decided to see if that's truly the case. I checked other large texts of equal or greater length for the same cluster of trees. In each case I trimmed the text down to its first 1,009,229 characters so that it matched the length of The Origin of Species exactly, and found the best cluster of the nine trees listed above, according to the formula below.

In order to compare the results from different texts, I devised the following score formula for each cluster:

score = W * H * max(W/(2H), 2H/W)

W * H is the area of the cluster (width times height), and max(W/(2H), 2H/W) is a factor that reflects the degree to which the cluster deviates from an aesthetic 2:1 aspect ratio. The factor equals 1 at exactly 2:1 and grows as the ratio moves away in either direction. The further the enclosing rectangle's aspect ratio deviates from 1:1, the more likely it becomes to contain a given cluster anyway, but I'll post more on that in the near future. This extra factor makes the score approximate what one might intuitively consider a "tight" cluster more closely than the raw area alone does. It penalizes clusters with aspect ratios that are far from 2:1. As you'd expect, a lower score means a better cluster.

I didn't break ties based on area, so when I ran a search on The Origin of Species for the cluster, it found one slightly larger in area, but with the same score as the one shown above. Here are the results of that experiment...
Not too bad for a cooked list.. er.. um, I mean a cluster I found near the words "They Are Rendered Much More Close". It's interesting (and meaningful?) to note that the runner-up is also a book by Darwin, and that the book that came in dead last is the KJV Bible. These results have prompted me to work a bit harder at evaluating the "goodness" of a cluster, so that I can come up with experiments potentially much more impressive than this one. I will post more on that front in the future.
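
For anyone who wants to play with the score formula given earlier, here is a minimal Python sketch of it. This is not the program I used for the search; the function name `cluster_score` is made up for illustration, and the 89 x 41 rectangle is a hypothetical example, not the actual cluster the search found. I've assumed exact fractions rather than floating point so that tied scores compare as exactly equal.

```python
from fractions import Fraction


def cluster_score(width: int, height: int) -> Fraction:
    """Score = W * H * max(W/(2H), 2H/W). Lower means a tighter cluster."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    area = width * height
    # The penalty is exactly 1 at a 2:1 aspect ratio and grows as the
    # rectangle gets either wider or taller than that.
    penalty = max(Fraction(width, 2 * height), Fraction(2 * height, width))
    return area * penalty


# The 89 x 40 cluster from this page.
print(float(cluster_score(89, 40)))  # 3960.5

# A hypothetical 89 x 41 rectangle: larger area, identical score.
print(cluster_score(89, 41) == cluster_score(89, 40))  # True
```

Multiplying the formula out shows why ties like this happen: when W is at least 2H the score reduces to W*W/2, and otherwise it reduces to 2*H*H, so only one of the two dimensions matters at a time. Any rectangle 89 characters wide and no more than 44 characters high scores the same 3960.5, which is consistent with the search turning up a cluster slightly larger in area with the same score.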