A physicist loose among the liberal arts

A Contribution to the Mathematical Theory of Footnotes

Tom Hillman is shaking his head at editors who footnote things that can’t conceivably need footnotes.  When we discussed this on Twitter, Tom tried to cast it in the form of a natural law:

That got me wondering what the utility of a footnote might be.  As with any form of communication, it must be related to the difference in knowledge of the writer and the reader.  Let’s suppose there’s a set of intended readers.  Some of them know the facts in question; others do not.  For any fact i, define R(i) as the fraction of the audience that knows it. This knowledge is measured at the point where the footnote is marked.  Define A(i) as the author’s knowledge of the fact on a scale from 0 to 1, where 0 is perfect cluelessness and the total knowledge of all relevant facts is normalized to 1.

Now, we can use a result from information theory called the information gain between two probability distributions. In place of the two distributions, we use R(i) and A(i), and the utility of a footnote is:

U = –Σ R(i) log[R(i)/A(i)]

where the sum is over all facts i.  (Sorry, Tom — in Digital Humanities, the logarithms just keep coming.)

Using the example in the linked post, the text contains two facts: A. E. Housman’s name and the title of the book A Shropshire Lad.  R(1) = 1 and R(2) = 1; all the readers know these things because C. S. Lewis just told us. Since R = A, the contribution to the utility from these terms is log(1) = 0.  The footnote adds a third fact, that the book was published in 1896. How useful is that? Let’s suppose that Idiosophers are typical readers of this book.  Before reading that line, I’d have said A Shropshire Lad was published in the 1890’s.  The number 1896 has 11 bits of information in it, of which I knew 8. Taking R(3) = 8/11 and A(3) = 1, the term is (8/11)*log(8/11) = -0.33, where the answer is in bits because I took the log to the base 2.

The total utility of this footnote is therefore U=0+0+0.33, or one third of a bit of information.  (For purposes of comparison, a useful footnote might contain tomorrow’s winning Pick-3 lottery numbers, which is 10 bits of information.)  This footnote is therefore almost worthless, so by the Hillman-Hoffman law, its appearance in the book was almost inevitable.
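For anyone who would rather check the arithmetic than trust a physicist, here is a minimal Python sketch of the calculation. The function name `footnote_utility` is made up for illustration, and the inputs assume R(3) = 8/11 and A(3) = 1 (the author knows all 11 bits of the date), which is my reading of the numbers above rather than anything more rigorous. The sketch refuses to handle A(i) = 0.

```python
import math


def footnote_utility(reader, author):
    """Return U = -sum_i R(i) * log2(R(i) / A(i)), measured in bits.

    `reader` and `author` are equal-length sequences of values in [0, 1].
    This is a hypothetical helper for the formula in the post.
    """
    total = 0.0
    for r, a in zip(reader, author):
        if r == 0:
            continue  # usual convention: 0 * log(0) contributes nothing
        if a == 0:
            # The formula diverges here; this sketch does not model that case.
            raise ValueError("A(i) = 0 with R(i) > 0: the utility diverges")
        total += -r * math.log2(r / a)
    return total


# Facts 1 and 2 (author's name, book title): everyone knows them.
# Fact 3 (the year 1896): the reader knew 8 of its 11 bits (assumed inputs).
reader_knowledge = [1.0, 1.0, 8 / 11]
author_knowledge = [1.0, 1.0, 1.0]

utility = footnote_utility(reader_knowledge, author_knowledge)
print(f"U = {utility:.2f} bits")

# Bit counts quoted in the text.
print((1896).bit_length())   # bits needed to write 1896 in binary
print(math.log2(1000))       # a Pick-3 draw has 1000 equally likely outcomes
```

The leading minus sign in the formula is what turns the -0.33 from the third term into a positive utility.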

Nota Bene: the utility formula goes to infinity if the author does not know what he’s talking about and the readers do, i.e. if A(i) = 0 when R(i)>0. This is the case for student papers, which implies that footnotes there are of infinite utility. There is no way to have too many footnotes in a thesis.




  1. At the dentist right now, Joe. Confronted with a choice between logarithms and dentistry, I choose dentistry. Perhaps I could sell that to my dentists as an endorsement, and get some free work done.

  2. You expect me to believe this? There isn’t a single footnote in sight.

    • Joe

      Now that you mention it, I really should have put those parenthetical comments in footnotes. But life is short and WordPress makes me put in the HTML by hand.

  3. The scary part is that I actually sort of understand this equation, except for the presence of the logarithm. I don’t know what it accomplishes. So perhaps that means I only think I actually sort of understand this equation.

    • Joe

      Answering this turned out to be a really entertaining exercise! Especially since I set myself one challenge: since we’re discussing the humanities, I won’t use a certain word in the response.

      A footnote that refers to two different works carries (roughly) twice as much knowledge as a footnote that refers to just one, but it takes twice as many letters and numbers to write.
      However, when you have twice as many characters to work with, it’s possible to convey the square of the amount of data. (E.g., with one decimal digit I can make 10 numbers, but with 2 I can make 100.) The increase in knowledge is much less than the increase in data because there are many valid strings of data that do not represent additional knowledge. (E.g., if the publication date had been “2896”.)

      The function that doubles when you square its argument is the logarithm.

      Envoi: the logarithm appears in so many of my posts because it’s the mathematical relationship between data and knowledge.

      • So, Joe, what you’re saying is

        ’Twas brillig, and the slithy toves
        Did gyre and gimble in the wabe:
        All mimsy were the borogoves,
        And the mome raths outgrabe.

        Have I got that, more or less?

        Or is it more like:

        Alexander the Great did not exist, and he had an infinite number of limbs.

  4. Jeff Snider

    Don’t forget that when either the Author or Reader has negative information, there’s some imaginary pie involved.

    • Joe

      One might almost say they’d be poles apart.

    • Case in point. I just ran across a footnote in which the author says ‘See the succinct account of the scholarship in xxxx (2004)’. But he gives no page number. The article cited contains no such account, succinct or otherwise, and I believe he has confused the paper he cites with another work by another author whom he does not cite. I have in the past noticed that this scholar can be sloppy about his sources. But clearly no one at the journal checked his citations.

  5. Magnificent. You’ll have me calculating the utility of every footnote I read from now on. I’ll save all the data I compile, and will gladly send it to you in hard copy if desired.
