A physicist loose among the liberal arts

Month: May 2016

Persistence of Family Names

If I still had any doubts about how much we can rely on the persistence of family names, Matt Yglesias just put them to rest.  His article reports the results of a study of wealthy people in Florence, Italy in 1427 and 2011.  The richest people today have the same family names as the richest people in the fifteenth century.

The original paper is by Barone & Mocetti of the Bank of Italy.  More information is in a column from the Centre for Economic Policy Research.

Proving a Thesis and its Limits

Prof. Olsen’s Dracula Lecture 8 includes a special bonus rant on the wrong way to write papers about literature. It matches up marvelously with the next section of my paper. The issue, in a nutshell, is that if students think up a thesis and then look for evidence to support it, they can usually find some.  Which is a good first step, but it doesn’t go far enough. Stopping there lets the writer get away with a thesis that’s not necessarily true. Ideally, the writer should also collect all the evidence that the thesis is wrong, and then decide which set is more convincing.

This is one of those cases where being a scientist helps.  Standard methods for data analysis take contrary evidence into account on an equal footing with supporting evidence, so the subject of Prof. Olsen’s rant is one of “the blunders we didn’t quite commit” (in Piet Hein’s words).

Which brings us to the core of the paper:  how do the regions of England that supply the hobbits’ family names relate to those families’ roles in the story?

Hypothesis: Family names from Birmingham or the West Midlands are close to the Narrator; names from other parts of England indicate families to be kept at arm’s length; and names that aren’t found in England indicate families that are liminal or distant from the Shire.

[Figure: Administrative Regions of England]

I’ve previously defined the categories of families. The regions of England are from Wikipedia.  Birmingham, where J.R.R. Tolkien grew up, stretches from the “W” to the “a” in “West Midlands” now; it was much smaller a hundred years ago.

These are administrative regions, but I’ve checked with an English colleague, who confirms that the regions have cultural significance as well as political.  If they were both in London, for example, a person from Warwickshire and a person from Shropshire would agree that they are almost neighbors, as if  they came from the same place.  (An example of the opposite case would be a Virginian and a Marylander. We don’t feel like we’re from the same place, even when we’re both in California.) So it makes sense to include everyone from the West Midlands in a single category, which is essential to this project because the heat-maps are only that precise.

[Figure: Hobbit families, by region and role]

When we count the number of hobbit families in each group and region, the relationship looks like this figure.  Birmingham names are dominant among the “close” group and rare among the others.  Names from other parts of England are almost as common among the close group, dominate the “arms-length” group, and drop off in the other groups.  Names that do not appear commonly in England are steady across the four groups.  Of the three clauses in the hypothesis, the first seems likely true, but the second and third are dubious.  Not so good.

[Figure: Hobbit families by group and region, weighted by importance]

Not all names are equally important, though.  When the importance of each family to the story is included, the graph looks very different.  Important characters with Birmingham names are overwhelmingly close to the narrator.  Other English names dominate the “arms-length” group, as we expect.  The high value of the red line in the “close” group is almost entirely due to Sam Gamgee, as we noted ‘way back at the beginning of this project.  (If Sam were counted as “Birmingham”, the red line would drop to 15 at “close” and the purple line would jump up above 35. More on that later.) The big spike of important, non-English names in the “liminal” category is mostly due to Merry Brandybuck.  “Distant” families aren’t important at all.

So, to take us back to the top of this post, the preponderance of the evidence supports the hypothesis. The “Birmingham” line slopes sharply downward, the “Middle-Earth” line of names that sound strange slopes upward, and the “England” line of names that should sound like they’re from far away is in between the two.  The implication runs only one way: if we’d tried to prove that families close to the Narrator were from the West Midlands, the first graph wouldn’t agree.  (Only about half of the “close” families are from there.)  Using a scientific approach tells more than one side of the story, and sets limits on the strength of the conclusion.  With that I shall close, and amuse myself by imagining the looks on the faces of my high-school English teachers if I’d ever turned in a paper with graphs in it.

Seward’s Folly

Dr. Seward, the narrator of a large part of Dracula, sometimes seems like he’s there to make the reader feel relatively intelligent.  His inability or unwillingness to comprehend things outside his experience makes him, despite his self-avowed erudition, the last person to understand what’s going on.

Dr. Seward refers to himself as a “sceptic” four times over the course of the novel. Old Pyrrho being unavailable, I’ll step in to say that’s not really what he is. Skeptics don’t believe absolute knowledge is possible, and that includes their own preconceptions. Seward has a solid base of things he knows, and anything contradicting it gets disregarded.  Skeptics doubt their own working assumptions and even the framework in which they reason, the same as new information they receive.  Dr. Seward isn’t doing that at all.  In terms of Bayesian logic, he’s reasserting strong prior probabilities in the face of evidence to the contrary. There’s a word for that: the economist Noah Smith calls it “derp”.

Wait – what’s a “prior probability”? Bayes’s theorem is one of those amazing mathematical results that sits there for centuries before anyone really gets its significance. The basic idea (and you can look to Dr. Smith’s blogpost for a better explanation than mine) is that every thinker has a certain prior base of knowledge that she uses to interpret new information.  As new information comes in, it modifies the odds of each thing in the base, leaving the thinker with an updated distribution that serves as the new “prior distribution” of (in this case) the likelihood that each possible cause gives rise to future observed effects.  The mathematical operation that makes that happen is multiplication: the prior for each possible cause gets multiplied by how well that cause explains the new information. One immediate result, therefore, is that if your prior distribution says the likelihood that thing X caused event Y is exactly zero, then the new information gets multiplied by zero.  There’s no amount of new data that can make you think X is really going on.  Dr. Seward has a prior distribution with zeroes assigned to everything he didn’t learn in school.  A skeptic uses a prior distribution with no zeroes in it at all (like a bell curve), because those zeroes are awesomely powerful things, and they’re not to be trusted.
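For anyone who wants to watch the multiplication happen, here is a minimal Python sketch. Everything in it is my own assumption for illustration: the function names bayes_update and update_many are hypothetical, the two candidate causes are a stand-in for what Seward faces, and the numbers are invented, not taken from the novel or from Dr. Smith’s post.

```python
def bayes_update(prior, likelihood):
    """One Bayesian update over a set of candidate causes.

    prior:      dict mapping cause -> P(cause) before the new observation
    likelihood: dict mapping cause -> P(observation | cause)
    Returns a dict mapping cause -> P(cause | observation).
    """
    # The key step is plain multiplication: prior times likelihood.
    unnormalized = {cause: prior[cause] * likelihood[cause] for cause in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("observation is impossible under every allowed cause")
    # Rescale so the updated probabilities add up to one again.
    return {cause: p / total for cause, p in unnormalized.items()}


def update_many(prior, likelihood, n_observations):
    """Apply the same kind of observation n_observations times in a row."""
    belief = dict(prior)
    for _ in range(n_observations):
        belief = bayes_update(belief, likelihood)
    return belief


# Invented numbers: how well each cause explains one more strange symptom.
likelihood = {"ordinary illness": 0.01, "vampire": 0.99}

# Seward's prior has a hard zero; the skeptic's prior is merely very small.
seward_prior = {"ordinary illness": 1.0, "vampire": 0.0}
skeptic_prior = {"ordinary illness": 0.999, "vampire": 0.001}

seward_after = update_many(seward_prior, likelihood, 5)
skeptic_after = update_many(skeptic_prior, likelihood, 5)

print("Seward :", seward_after)
print("Skeptic:", skeptic_after)
```

With these made-up inputs, five observations leave Seward’s “vampire” entry at exactly zero, while the skeptic’s tiny starting value climbs to near certainty. The only difference between the two thinkers is that one zero.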

Perhaps I’m being too hard on the good Doctor, but I feel justified because there are examples right next to him of better ways to reason. Characters who use prior probabilities derived from literature seem to work much better. Mina Harker, for example; at times she seems to be the brains of the whole outfit. Why do I say her priors come from literature? Mina may have taken Corey Olsen’s Faërie and Fantasy class (or the 19th-century equivalent).  She knows how to compose an oath so it doesn’t later cause trouble in an entirely-predictable way.  Maybe this is why Prof. Olsen says she’s “awesome”.  Mina reasons from stories. Here’s how she swears never to read her husband’s diary: “I would never open it unless it were for his own dear sake or for the sake of some stern duty.” (Chapter IX)  When I read that, I said, “Brava!” [1]   She drew the crucial lesson from Arthurian romances – be really careful how you swear oaths.  That “unless” clause made the happy ending possible.  (Oops – spoiler!) The sons of Fëanor should have been so wise.

Jonathan Harker has a similar skill at hedging his oaths, though I’m sure his prior probabilities come from law school.  On the expedition to Dracula’s castle, Mina tries to make him understand that her life is secondary in importance to ridding the world of vampires. “‘Jonathan, I want you to promise me something on your word of honour. A promise made to me, but made holily in God’s hearing, and not to be broken though I should go down on my knees and implore you with bitter tears. Quick, you must make it to me at once.’
“‘Mina,’ I said, ‘a promise like that, I cannot make at once. I may have no right to make it.’” (Chapter XXIV)  You have to love that “may have” – he won’t even commit to that, without consulting his books of precedents.  Any knight of the Round Table would have sworn instantly and suffered for it for the rest of the poem.

The similarity of Mina’s and Jonathan’s thought processes raises a question.  Instead of school, did Mina learn this mode of thought after meeting Jonathan, to be a better wife?  Doing such a thing would be consistent with her character, since it’s not much more difficult than memorizing Transylvanian railroad schedules without speaking Romanian. But I prefer to think that it’s the way she was educated. Victorian girls were taught by literary example (I admit it: my own prior probability distribution is influenced most heavily by Alice’s Adventures in Wonderland). I’m sure that a mode of thinking so consistent with the British legal system was one of the salutary qualities that attracted Jonathan to her in the first place.

[1] Some people write in the margins of books.  I talk to them.
