Legacy of Anecdotes
Paul R. Pillar, a long-time intelligence professional, has written an article in the current issue of Foreign Affairs to defend the record of the CIA as a prognosticator of future events against the assertion in the book Legacy of Ashes that the CIA is pretty much worthless in this capacity. It seems to me that the whole debate is being conducted – at least in public – without the data that would be required to draw any meaningful conclusions.
Legacy of Ashes describes a whole series of Agency prediction failures. Pillar responds, somewhat sensibly, that prediction in this field is inherently hard, and then lists correct CIA predictions for the Six-Day War in 1967, the Tet offensive in 1968 and the social and political aftermath of the invasion of Iraq in 2003.
While Pillar is certainly right that the correct standard to apply to CIA predictions is not “correct 100% of the time, or else the CIA is incompetent and worthless”, it is also not true that the right standard is “correct more than 0% of the time, and therefore the CIA is worth any amount of money we spend on it”. If the CIA made 3,000 predictions between 1950 and today, and got right exactly the three that Pillar cites, that 0.1% rate of correct predictions doesn’t come close to beating a coin flip. What we need to know is the CIA’s predictive batting average versus the best available alternative. Once we have that, we could try to evaluate the benefits of any increased accuracy and compare them to the costs, broadly considered, of the CIA.
Of course, one of the reasons people so often cite batting averages is that baseball has lots of discrete episodes within a given game, and therefore lends itself to statistical evaluation. Intelligence is surely more like soccer, with fluid play and episodes whose beginnings, endings, and results are ambiguous. Any analysis of CIA predictive accuracy would therefore necessarily incorporate many elements of judgment. But without it, all we have is dueling anecdotes.
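To make that concrete, here is a minimal sketch of the comparison in Python, using the hypothetical numbers above (3,000 predictions, three correct) and a coin flip as the baseline. Everything in it is an illustrative assumption, not real data:

```python
from fractions import Fraction
import math

def prob_at_least(k: int, n: int, p: Fraction) -> Fraction:
    """Exact P(X >= k) for X ~ Binomial(n, p): the chance that a
    forecaster no better than the baseline gets at least k of n
    calls right by luck. Computed via the complement, so only the
    k terms below the threshold are summed."""
    below = sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k))
    return 1 - below

# Hypothetical record from the post: three correct calls out of 3,000.
correct, total = 3, 3000
p_luck = prob_at_least(correct, total, Fraction(1, 2))

print(f"Hypothetical hit rate: {correct / total:.2%}")              # 0.10%
print(f"P(coin flip does at least as well): {float(p_luck):.6f}")   # ~1.000000
```

With those inputs, a coin-flipper is all but certain to match or beat the record, which is the point: a count of correct calls means nothing without a denominator and a baseline.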
(cross-posted at The Corner)
Legacy of Ashes is a book I think everyone should read (but I would say that, wouldn’t I!). I think that Pillar has a point. The thing is, though, that the incompetence on display in Legacy of Ashes isn’t just predictive; it’s functional. Time and again, the CIA mangled its operations and failed to achieve the ends it was trying to accomplish. What makes this particularly galling is that these failures occurred so often in tandem with truly lamentable behavior toward the people of other countries. As I’ve mentioned here before, an interesting aspect of being a member of the non-interventionist left (as I am) is watching how denialism tends to give way to justification. As more evidence is assembled, and as the Freedom of Information Act does its work, arguments shift from “No, of course the United States didn’t arm Turkey to aid in its assaults on the Kurds” to “Yes, the United States armed the Turks and facilitated their assault on the Kurds; but doing so made the United States safer, and was thus sadly necessary.” But what Legacy of Ashes demonstrates is that the Agency failed even to accomplish its basic goals, even while committing so many crimes against foreign populations. It makes the lamentable actions of our country’s foreign policy tragic, in that we often did despicable things to no constructive end.
— Freddie · Mar 6, 11:05 PM · #
I think you’re right about what sort of test would be meaningful when applied to social/political/historical predictions, but I wonder if such a test has ever been successfully applied to any such prediction-generating enterprise. I imagine there’d be a lot of pretty tricky methodological difficulties involved, viz. the two prediction-makers you’re comparing would have to have made a lot of predictions that are mutually exclusive with each other’s: either the type where one says “X will happen” and the other says “not-X will happen”, or the type where one says “if A, then X” and the other says “if A, then not-X”, and then A does, in fact, occur.
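One workaround I can imagine, sketched below with made-up numbers, is to have both prediction-makers put probabilities on the same list of events and then compare something like a Brier score, which doesn’t require the predictions to be mutually exclusive at all:

```python
# Brier score: mean squared error between probability forecasts and
# binary outcomes (0.0 = perfect, 0.25 = always saying 50/50).
# All of the numbers below are made up purely for illustration.

def brier_score(forecasts: list[float], outcomes: list[int]) -> float:
    """Average of (forecast - outcome)^2 over a shared list of events."""
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(outcomes)

outcomes     = [1, 0, 1, 1, 0]               # did each event happen?
forecaster_a = [0.9, 0.2, 0.6, 0.8, 0.1]     # a somewhat-informed analyst
coin_flipper = [0.5, 0.5, 0.5, 0.5, 0.5]     # the no-information baseline

print(f"Forecaster A: {brier_score(forecaster_a, outcomes):.3f}")   # 0.052
print(f"Coin flipper: {brier_score(coin_flipper, outcomes):.3f}")   # 0.250
```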
— Bryan · Mar 6, 11:18 PM · #
It seems like your assessment’s a little unfair. I mean, if you really wanted a good assessment of US intelligence, it’s not as simple as a percentage of correct predictions. You’d weight those by how much they mattered. So if the CIA predicts the outcome of the latest Russian presidential election, big whoop. If they predict India’s nuclear tests, way good, man, that’s one we’d want to note. And we’re hardly without means there, even without access to classified information, because it’s certainly within our capacity to make a list of big BIG events, in terms of US interest, that people tried to keep secret from us before they happened: India’s nuclear tests, Iraq’s nuclear program ca. 1991, Iraq’s WMD ca. 2002, the outbreak of the Six-Day War. And then you go looking for a percent hit on those things at least, on the basis that, well, screw all the small stuff. That’s what Pillar’s “anecdotes” are doing. It ain’t totally quantitative, but surely it’s not just scatter either!
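Something like this back-of-the-envelope sketch, say (the events, weights, and hit/miss flags are all placeholders, not an actual scoring of the record):

```python
# Importance-weighted hit rate. The weights and the hit/miss flags
# below are placeholders, not a real scoring of the CIA's record.

scored = [
    # (event, importance weight, called correctly?)
    ("India nuclear tests, 1998",      10, False),
    ("Iraq nuclear program, ca. 1991", 10, False),
    ("Iraq WMD, ca. 2002",             10, False),
    ("Six-Day War outbreak, 1967",     10, True),
    ("Some minor election somewhere",   1, True),
]

weighted_hits = sum(w for _, w, hit in scored if hit)
total_weight  = sum(w for _, w, _ in scored)

print(f"Raw hit rate:      {sum(hit for *_, hit in scored) / len(scored):.0%}")  # 40%
print(f"Weighted hit rate: {weighted_hits / total_weight:.0%}")                  # 27%
```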
— Sanjay · Mar 7, 12:42 AM · #
Well, since the CIA’s accuracy is not only unknown but, if they’re doing their job well, downright unknowable, we really need to look at it from the opposite angle: Given the limits in confirming the CIA’s accuracy, what level of state actions can we justify based on classified CIA reporting? Clearly, full-scale invasions are out of the question for the future, while occasional UAV potshots at al-Qaeda leaders would still be considered ‘actionable’. But that leaves us with a whole big middle ground full of interesting cases. Could we ‘Bomb, bomb, bomb Iran’, as one presidential candidate so eloquently put it, based on nothing more than a CIA analysis?
— Bo · Mar 7, 04:03 AM · #
Freddie:
Consider an analogy. Some conservative tells a bunch of stories about waste and stupidity in some government welfare program. All the stories are true, they’re bad, and there are a lot of them. Should we terminate the program? Wouldn’t you first want to understand whether “a lot” was 0.001% of the cases the program handled or 25%, and wouldn’t you want to answer the question “if we shut this program down, what is our next best alternative?” before deciding what to do?
Bryan:
Check. Oracles normally try their best to make non-falsifiable predictions as one of many strategies for avoiding accountability; it’s just human nature (e.g., look at your horoscope from last month – was it “right”?). It is an important function of management to establish some predictive competition and measurement.
Sanjay:
Check. This is why I said you would need a separate exercise to value gains in predictive accuracy, which would clearly have to weight the importance of accuracy differently for “X will set off a nuke in Manhattan next week” than for “Y will win the regional nominating contest for election to the deputy minister of cultural affairs of Malaysia”. I think the problem with Pillar and Legacy is that we don’t have any view of a relevant batting average (even if this is a batting average only for “important” predictions), or a comparison to an alternative method of prediction to tell us whether that batting average is good or bad.
Bo:
The CIA’s accuracy is unknown to us, but I think that is different from saying it is unknowable. Relative to the stakes of global strategy decisions, I think it would be cheap to get that information before relying on CIA predictions, or else to assume some low accuracy level.
— Jim Manzi · Mar 7, 05:30 AM · #