A physicist loose among the liberal arts

Month: April 2016

I’m in!

The NY Tolkien Conference 2016 has accepted my paper.  They’ve posted the first iteration of the program, and I’m on it.

They’ve extended the deadline for submissions to June 1st. This has happened with depressing frequency at aviation conferences in the last year – the paper I’m presenting Tuesday at I-CNS is one I submitted when I found out about their extension.

Surname Mapping

An annotated bibliography of surname mapping. Research by James Cheshire and his collaborators underlies this ardagraphic study.  Dr. Cheshire has a blog in addition to his university site linked above.

Oliver O’Brien, Suprageography

O’Brien’s data visualization blog post got this project started.  The public-access web portal provided the qualitative data for classification of family names.  If you don’t have an English name, the latter site hosts a world-wide version (at a much lower resolution).

Cheshire, James A., Paul A. Longley, and Alex D. Singleton. “The surname regions of Great Britain.” Journal of Maps 6.1 (2010): 401-409.

The map of surname regions in Great Britain shows that distributions of names track well with the administrative regions.  The map itself, available for download at the link (17 MB), is gorgeous.

Paul A. Longley, James A. Cheshire, Pablo Mateos, “Creating a regional geography of Britain through the spatial analysis of surnames”, Geoforum, 42, 4, July 2011, Pages 506-516.

Mapping names in the 21st century is valid for this practice because Longley, Cheshire, and Mateos’s techniques make it possible to identify “combinations of location specific surnames that date back 700 or more years”.   Figure 5 shows that the  “Lasker distances” between Census Area Statistics Wards in a region cluster into a tight grouping, and each region is unlike other regions in England.  In fact, some geographic resemblances are visible through the multidimensional clustering:  wards in the West Midlands look like they feel a gravitational pull from Wales, as names originating in the Welsh language diffuse across the English border.

The Lasker Distance is elegantly simple. If we write the fraction of people in a small area i who have the name n as p(i,n), then the distance between areas i and j is -ln(Σ p(i,n)×p(j,n)) where the summation is over all names.  Names that don’t exist in one of the areas don’t contribute to the sum.
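For anyone who wants to see the arithmetic, here is a minimal Python sketch of the formula exactly as written in the paragraph above. The function names and the three little lists of residents are made up for illustration; none of it comes from the papers or from real census data.

```python
from collections import Counter
from math import log


def name_fractions(surnames):
    """Return p(i, n): the fraction of people in one area bearing each name."""
    counts = Counter(surnames)
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


def lasker_distance(p_i, p_j):
    """-ln( sum over names of p(i,n) * p(j,n) ), as stated in the text above."""
    # Only names present in both areas contribute to the sum.
    shared = p_i.keys() & p_j.keys()
    overlap = sum(p_i[n] * p_j[n] for n in shared)
    if overlap == 0:
        # No names in common: the logarithm diverges, so report infinity.
        return float("inf")
    return -log(overlap)


# Hypothetical areas, invented for this example only.
hobbiton = name_fractions(["Baggins", "Baggins", "Gamgee", "Sandyman"])
bywater = name_fractions(["Baggins", "Gamgee", "Gamgee", "Cotton"])
bree = name_fractions(["Underhill", "Butterbur", "Ferny", "Ferny"])

near = lasker_distance(hobbiton, bywater)  # overlap 0.25, so ln(4)
far = lasker_distance(hobbiton, bree)      # no shared names
```

Two areas that share common names land close together; two areas with no names in common are infinitely far apart, which is why the sketch handles that case separately.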

Once the distance in “name-space” between population points is established, the next step is to cluster the points in that space, and set the cluster sizes so that the result is interpretable in geographic terms.  The method used here is “k-means” clustering, and I hope I’m not being uncharitable if I describe it as “try every possibility and keep those that work”.  That’s unfair, of course — independent consistency checks are applied at each step; the choice isn’t arbitrary.
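To make "k-means" concrete, here is a bare-bones sketch of the textbook version (Lloyd's algorithm). It is not the authors' pipeline: I am assuming the points have already been given plain two-dimensional coordinates, and the points, the starting centroids, and the choice of two clusters are all invented for the example.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, centroids, max_iter=100):
    """Textbook k-means: assign each point to its nearest centroid,
    move each centroid to the mean of its members, repeat until stable."""
    centroids = list(centroids)
    labels = []
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for every point.
        labels = [
            min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                mean = tuple(sum(c) / len(members) for c in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(old)  # keep an empty cluster where it was
        if new_centroids == centroids:
            break  # nothing moved, so the assignment is final
        centroids = new_centroids
    return labels, centroids


# Hypothetical points: two obvious clumps, with hand-picked starting centroids.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels, centroids = kmeans(points, centroids=[(0, 0), (6, 5)])
```

The "try every possibility" part lives outside this function: you rerun it with different numbers of clusters and different starting centroids, and keep the results that make geographic sense.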

Cheshire, James, Pablo Mateos, and Paul A. Longley. “Delineating Europe’s cultural regions: Population structure and surname clustering.” Human Biology 83.5 (2011): 573-598.

Figure 7 in this paper shows the relationship between physical distance and Lasker distance for the countries they studied in Europe.  Culturally homogeneous places like Poland and Luxembourg show a tight cluster of points, lying on a line that’s almost horizontal.  That is, you find the same names, no matter where in the country you go.

Scatter plot of Lasker and Geographical distance

Some aspects of cultural history are visible in this figure copied from Cheshire, et al.

Countries unified by language, such as France, Italy, and Germany, show a slanted line (on a log-log plot), with a moderate upward slope.  The farther apart two villages are, the more likely you are to find different names in them.  (France has a small Alsatian tail.) Norway and Denmark are fascinating exceptions:  the line slopes downward! I’m just guessing here, but it could be because until recently you didn’t get from one place to another by land.  By sea, travel times depend on wind and currents as well, so genetics and patronymics can have a more complicated relationship with distance.  (There might be a follow-on project there, if I could only find family names in the Sagas.)

Spain has two distinct parts:  one for the mainland and one for the islands.  They’re identical with respect to names.  The mainland isn’t a long, thin shape, it’s an incoherent blob, caused by mixed Catalan, Spanish, Arabic, and possibly a Basque scattering off to the side.

The United Kingdom is a dense horizontal sprawl of English, with oddly-shaped protuberances of Welsh, Scottish, and Irish that make drawing a best-fit line through the points an exercise in graphical uniformity, not statistical rigor.


Beyond Good and Evil

The criticism of The Lord of the Rings that annoys me the most, and I think I share this opinion with most fans, is when people say the characters are black and white; bad guys are wholly bad and good guys are wholly good, and never the twain shall meet.  These criticisms are made by people who’ve never tried to classify the characters into those groups.

In the post that started this ardagraphic quest, I used the term “bad guys” because I was joking.  Now that I’m seriously trying to make something of that work, I need to replace it with something relevant to the text.  The utility of the good guy/bad guy distinction fell apart for me when I tried to classify Lobelia.  She’s built up as a villain all through the first three chapters, but you have to love an elderly lady, two feet tall, attacking a six-foot oppressor with her umbrella. In any case her repentance at the end, which leads her to give Bag End back to Frodo, ought to disqualify her from the “bad guy” label. “Bad guy” is only useful when talking about Uruk-hai or Bill Ferny.  I hereby abandon it.

A better classification comes from my own experience living in Virginia.[1]  It’s not so much good and bad people, as there are the people you keep close to you, and those about whom you always find yourself saying, “Bless his heart,” (if you’re a woman) or “That’s just Joe” (if you’re a man). They’re not bad guys, per se, but they frequently seem to act in a way that interferes with other folks getting on with their lives. It’s good practice to keep them at arm’s length.

Separation, then, is the classification I’ll use.  The hobbits themselves talk in those terms, and the narrator reinforces it.  I’ll use the terms “close” and “arms-length” to describe the two types of characters above.  Bagginses, Tooks, and Gamgees are “close”; Sandymans and Sackville-Bagginses are “arms-length”.

I see two other kinds of hobbits, besides these Hobbiton types.  First are the fringe elements, who are perceived as being a bit strange and often uncanny. “They still had many peculiar names and strange words not found elsewhere,” the narrator says about the Brandybucks.  (Prologue, i) Likewise the Hornblowers, from ’way off in the Southfarthing, who “had hardly ever been in Hobbiton before”. (I, i) Since this is a scholarly work, I won’t call them “fringe”.  I’ll say “liminal”.

The last category is the “distant” hobbit-names.  Hobbits in Bree have them.  Frodo uses the name “Underhill” when he’s in Bree because Gandalf (who’s been everywhere) knows that someone who hears it won’t think of the bearer as living anywhere near the Shire.

graph of names and importance

Fig. 1. Classification of Hobbit family names

The counts work out to 13 Close families, 5 Arms-length families, 4 Liminal, and 5 Distant.  These numbers are big enough to be just at the threshold where it doesn’t look silly to put them on a histogram. In Figure 1, the blue bars are the counts of families, and the red line is the sum of the importance of each family in the group.  The distribution of importance is also reasonable; distant characters are less important to a hobbit (and to a story), and the weighting shows that effect.  Families in the Liminal category are slightly more important to the story than those in the Arms-length category because of the presence in the former of Merry Brandybuck.


[1] This is not crazy, as the original post hints.  My (English) ancestors, like other long-time Virginians, originated in the West Midlands near Tolkien’s boyhood home, but they left to come here between 1619 and 1750.  They missed out on the birth of the Industrial Revolution in Birmingham, and kept an agrarian lifestyle until recently.  It’s reasonable to conclude that they’re exactly the kind of peasants JRRT had in mind when he imagined the Shire.

Brandy is too a Panacea

For anybody who’s laughing at the doctors in Bram Stoker’s Dracula who prescribe brandy for any illness: Remember that Elrond gave Gandalf a flask containing a cordial that was a sure-fire cure for hypothermia and squid attacks.
