Expert Political Judgement: How Good Is It? How Can We Know?

One of the unsung benefits of stable employment in a place with a relatively rigid union contract is that you are expected and required to take your full lunch break. No subtle hints are dropped that we could really use your help, Jacques, we’re under a bit of pressure here.

Whatever is one to do with such a lunch hour? Some folk, not understanding the game theory precept that you should anticipate everyone else’s move, use lunch hour to run errands only to discover that every other bastard had the same idea. Some people head out for lunch — same problem.

I however found that lunch hour was a perfect opportunity for reading, especially in combination with my Kindle DX. I’ve ploughed through dozens of books an hour at a time.

Wrapping up employment and moving to Perth threw a spanner in these idyllic works. It so happens that I now don’t have a particular hour each day when I get reading done. So the process of reading Philip Tetlock’s Expert Political Judgement: How Good Is It? How Can We Know? has been greatly drawn out over the course of several months.

Which is a pity. Because it’s a smashing good read.

What’s it about?

Over the course of twenty years, Tetlock and his team performed a uniquely interesting study. They interviewed hundreds of political experts and pundits. These interviews, including carefully structured questions, were used to harvest thousands of testable predictions about how different political situations would play out.

Tetlock doesn’t just summarise the findings (in short: humans suck at this task, but some suck less than others). He also goes into great detail about the hypotheses he was testing and how they were tested. He also fastidiously gathered theoretical objections and built them into his study design. So for example he has a number of “adjustment factors” which can be applied to predictions to make the scoring “fairer”, if this is desired.

The result is a nuanced, introspective book. Tetlock has successfully married the self-deprecating honesty of the academic common room with a good eye for fun, flowing prose.

Yes, yes … but what were the findings?

Humans Suck At Predicting Things

The first finding is that humans are pretty dreadful at predicting the behaviour of complex systems.

This shouldn’t, one supposes, come as a total shock. In his book The Logic Of Failure: Recognizing And Avoiding Error In Complex Situations (unreviewed), Dietrich Dörner discussed experiments in which he tortured otherwise lovely people by asking them to govern simple systems. Astute readers may remember that in my review of Drift into Failure I mentioned a simple system with a dial for controlling the admission of cold air into a room and an adjacent thermometer readout.

I based that example on an experiment Dörner performed, asking subjects to reach and maintain a goal temperature. This is one of the simplest possible control tasks you could ask and yet most subjects simply could not achieve it. The mere presence of lag in feedback was enough to stymie most attempts at the kind of smooth temperature control that can be achieved by 5 cents of electronics.
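To get a feel for why lag alone causes so much trouble, here is a minimal sketch of a toy room. It is not Dörner's actual setup: the `simulate` function, the gain values and the three-step delay are all assumptions I made up for illustration. The "controller" nudges the temperature in proportion to the error it sees, but the thermometer shows a reading that is a few steps stale.

```python
def simulate(gain, lag, goal=20.0, start=30.0, steps=200):
    """Toy room (hypothetical model, not Dörner's experiment).

    Each step the dial shifts the temperature by -gain * (observed - goal),
    where `observed` is the temperature from `lag` steps ago.
    """
    temps = [start]
    for t in range(steps):
        # The thermometer lags: we only see an old reading.
        observed = temps[max(0, t - lag)]
        correction = -gain * (observed - goal)
        temps.append(temps[-1] + correction)
    return temps


# An impatient operator who corrects hard against stale readings.
impatient = simulate(gain=0.8, lag=3)
# A patient operator who makes small corrections and waits.
gentle = simulate(gain=0.2, lag=3)

print("impatient, worst miss:", max(abs(x - 20.0) for x in impatient))
print("gentle, final miss:  ", abs(gentle[-1] - 20.0))
```

In this toy model the hard corrections overshoot further on every swing, while the small ones settle on the goal. With no lag at all, the same aggressive gain of 0.8 settles without fuss, which suggests the delay is what does the damage.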

Subjects would invent elaborate hypotheses to explain their observations, apparently the most popular being that the whole situation was “rigged”. Game programmers will recognise this phenomenon: protests about the mathematical properties of pseudo-random number generators will fall on deaf ears because it is a dead certainty that the 7 of spades always follows the jack of hearts when two queens are on the table and the third player to the left of the dealer has called and …

So I suppose that learning that humans are just terribly bad at making predictions about the unfolding of complex systems isn’t surprising. Indeed there’s a whole gaggle of sciences built around trying to characterise systems which are nominally deterministic but which defy human comprehension and prediction. Tetlock says that he deliberately chose the messiest, noisiest domain he could find: politics.

Some Humans Suck Less Than Others

Yet while the average of all predictions is basically terrible, some classes of predictors are better than others.

For example, experts perform better than students. But don’t get too excited, because it doesn’t much matter what the experts were expert in. Experts with a deep and ongoing engagement with some geopolitical subject (old Russia hands, Middle East buffs, South Africa watchers) struggled to outperform dilettantes who were expert in some different field. Basically any reader of The Economist or The New York Times, possessed of a decent education and ebullient sense of self-importance, can come along and make predictions that are about as likely to pass as those of the leading experts in the field.

But the most important distinction amongst experts is between “foxes” and “hedgehogs”. Foxes are intellectual bowerbirds, content to assemble concepts and theories from anywhere they are found into a jumbled stew of intellectual juices. Hedgehogs instead have a theory or family of theories, which they face outwards to govern their comprehension of the world. Hedgehogs know “one big thing”, foxes know “lots of little things”.

The bottom line is that foxes perform better than hedgehogs in every fashion that Tetlock tested. Without making generous allowances, hedgehogs are left far behind in terms of accuracy. They tend to predict outcomes which were further from the baseline, to predict them with higher certainty and to argue more vociferously that they still deserved credit even when circumstances did not so turn out.

Where foxes can get trapped is in scenario thinking. Encourage a fox to think of multiple scenarios and he or she can quickly get tied up chasing tails and burrowing down rabbit holes; coming up with narratively coherent but otherwise unlikely predictions. Hedgehogs, who always have a north star in the form of their “one big thing”, were less prone to this weakness.

But Humans Still Suck

Don’t start preening just yet, you sly fox, you. Foxes performed terribly when compared to even embarrassingly simple linear extrapolations. More formal statistical models of how a system’s trajectory might unfold absolutely trounced all humans of every stripe — foxes, hedgehogs, experts, dilettantes, students, regardless of education or experience or access to information.

Similarly, humans are poor at updating their theory of the world. Folk love John Maynard Keynes’s quip: “When the facts change, I change my mind. What do you do, sir?” But Tetlock’s evidence suggests that Keynes, whether he was a fox or a hedgehog, would probably not have changed his mind all that much. Humans update their beliefs far more begrudgingly than various logical frameworks (particularly Bayesian probability) would require us to.
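For a sense of how much updating Bayes' rule actually demands, here is a minimal worked sketch. The numbers are invented for illustration and do not come from Tetlock's data; `bayes_update` is just a hypothetical helper name.

```python
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Return P(theory | evidence) using Bayes' rule."""
    numerator = prior * p_evidence_if_true
    denominator = numerator + (1.0 - prior) * p_evidence_if_false
    return numerator / denominator


# Made-up example: a pundit is 80% sure of a pet theory, then sees an
# outcome that is three times as likely if the theory is false (0.6)
# as it is if the theory is true (0.2).
posterior = bayes_update(prior=0.8, p_evidence_if_true=0.2, p_evidence_if_false=0.6)
print(round(posterior, 3))  # 0.571
```

On these assumed numbers, a single awkward fact should drag an 80% belief down to about 57%. The begrudging updater described above would move a good deal less than that.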


As I was sketching out this review, this time capsule of predictions by science fiction writers was published. An entire profession of people who think about the future: and they got almost everything wrong.

It’s tempting to point and laugh, except — we all suck. I suck, for example. Several years ago I predicted the inevitable demise of shared hosting. So far, no luck. I think I also guessed that Twitter would go the way of Second Life — a long, slow and irreversible descent into total and well-deserved obscurity. Sadly I was wrong.

It would appear that we need, from time to time, to place our faith in the numbers. This is definitely the credo of engineer-y types. “In god we trust”, goes the remark by Deming, “all others must bring data”. Of course, the models are wrong. Every model is wrong. But statistical models are going to be sufficiently less-wrong that, for many important classes of problem, they are more useful than human intuition.

Wrapping It Up

In a little Socratic dialogue in the back of the book, Tetlock channels four different characters, including a strict positivist and a rampant postmodernist. It’s a lovely and literary self-indulgence. It doesn’t add much to the findings, but it did add greatly to my sense of satisfaction as I was coming to the end of the book.

I won’t lie: I found myself cheering for the positivist. In that respect, I suppose I am a hedgehog. And here’s a big thing that I know: this is a good read.



4 Responses to Expert Political Judgement: How Good Is It? How Can We Know?

  1. Add it to your list, LE.

  2. Pingback: The Essence of Hayek (Part 1) | Journal de Jacques

  3. Eli says:

    Of course, the big, massive problem here is that formal models have damn huge tractability issues. Given a human being who can bullshit a model in enough time to make an actual prediction *before* a fact arrives that requires updating beliefs, versus a formal model that will compute for a long time before eventually spitting out an answer that agrees retrospectively with what was observed, which is more useful? How can we quantify the balance between the two?
