Is Economics a Science?

According to Andrew Sullivan, I stand alone against an array of bloggers in arguing that economics is not a science. At the level of semantics, I couldn’t care less what label is applied to economics. I think that the operational issue in front of us is what degree of rational deference we should give to propositions put forward by the economics profession. I believe that this question is not academic navel-gazing, but crucially important for our political economy. It goes to the heart of the case for broad-based freedom.

Ryan Avent makes the following argument in support of economics as a predictive science (and Matt Yglesias associates himself with it):

Economics is quite often effectively predictive. If the supply of one good is disrupted, economists can tell you with great certainty what will happen to demand for complementary goods and substitutes. If supply levels are known and research establishing elasticities has been done, they can tell you even more about what will happen. **Their predictions will nearly always be right.** And this is true for many aspects of economics. [Bold added]

I’ve done a lot of this. Suppose, to take a down-to-earth example of what Avent is describing, you wanted to predict the effect of “disrupting the supply” of Snickers bars on the sale of other candy in your chain of convenience stores. The “elasticities” here are how much more candy of what other types you would sell if you stopped selling Snickers. Once you know that, plus the costs of various kinds of candy, you can easily do the math to figure out how much more or less money you will make if you get rid of Snickers. The elasticities are the can opener – once you have them, the rest is just arithmetic.
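To make that “just arithmetic” step concrete, here is a minimal sketch. Every number in it is hypothetical, and the substitution shares stand in for the measured elasticities:

```python
# A hedged sketch of the "just arithmetic" step. All numbers are
# hypothetical; the substitution shares play the role of the measured
# elasticities.
weekly_snickers_units = 100      # baseline Snickers sales per store
snickers_margin = 0.40           # profit per bar, in dollars

# Fraction of lost Snickers purchases that shift to each substitute.
substitution_share = {"Milky Way": 0.35, "M&Ms": 0.25, "gum": 0.05}
margin = {"Milky Way": 0.45, "M&Ms": 0.50, "gum": 0.30}

lost = weekly_snickers_units * snickers_margin
recaptured = sum(weekly_snickers_units * share * margin[item]
                 for item, share in substitution_share.items())
print(f"Weekly profit change per store: ${recaptured - lost:+.2f}")
```

The arithmetic really is trivial. Everything interesting is hiding inside those substitution shares.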

The best, and most obvious, way to establish the elasticities is to take a random sample of your stores, stop selling Snickers in them, and measure what happens to other candy sales. This is the so-called scientific gold standard method for measuring them. Even this, of course, does not produce absolute philosophical certainty, but a series of replications of such experiments establishes what we mean by scientific validation. As I’ve gone into in detail elsewhere, even having correctly measured the elasticities for these stores on those dates, generalizing this into a reliable forward prediction rule remains much trickier for the convenience store chain than in physics or even biology, because of the extreme causal density of human social behavior.
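As a sketch of what the analysis of such an experiment might look like, here is a toy version on simulated data; the store counts, sales levels, and the 35-unit lift are all invented for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Simulated weekly other-candy sales: 50 control stores (Snickers kept)
# and 50 treatment stores (Snickers removed). The 35-unit lift is an
# assumption baked into the simulation, not a measurement.
control = rng.normal(loc=1000.0, scale=120.0, size=50)
treatment = rng.normal(loc=1035.0, scale=120.0, size=50)

lift = treatment.mean() - control.mean()
t_stat, p_value = stats.ttest_ind(treatment, control, equal_var=False)
print(f"Estimated lift: {lift:+.1f} units/week "
      f"(Welch t = {t_stat:.2f}, p = {p_value:.3f})")
```

A single test like this still carries sampling error; it is the replications across stores and dates that do the validating.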

But suppose you can’t run an experiment. As implied by Avent and Yglesias, you could go back into historical data for sales by product by store by minute for the past 36 months, and look at what happens to sales of other candy at stores that have stock-outs of Snickers. Unless you are very lucky, and happen to find a true natural experiment in which a near-random group of stores was cut off from Snickers but not other goods, and not in conjunction with some other macro event – that is, unless the experiment as you would have done it occurred already – then you are left with some version of building a regression model to estimate the elasticity between Snickers availability and (to keep it simple) sales of all other candy as a group. You will include lots of “control” (there’s that word again) variables in your model, because there are lots of things – say, very bad weather, or Halloween, or weekend vs. weekday – that might independently affect both Snickers availability and sales of all other candy. After you build an initial version of this model, it will occur to you that some specific other control variable might be important, so you get the data and include it. Even after you do this, you’ll still have some stores for which the apparent effect goes the wrong way; that is, at the same time you see a Snickers stock-out, you see sales of other candy go down. Let’s further say that you’re state-of-the-art about this, and build a Bayesian shrinkage model to try to adjust for this problem. These are just examples; you will find yourself making a long series of such adjustments, inclusions and exclusions of potential control variables, model tuning, and so on.
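To see why the control variables matter, here is a toy version of that regression exercise on simulated data, in which bad weather both causes stock-outs and depresses candy sales; all column names and effect sizes are invented:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n_stores, n_weeks = 40, 104
df = pd.DataFrame({
    "store": np.repeat(np.arange(n_stores), n_weeks),
    "week": np.tile(np.arange(n_weeks), n_stores),
})

# The confound: bad weather makes stock-outs more likely AND lowers sales.
df["bad_weather"] = (rng.random(len(df)) < 0.10).astype(int)
df["halloween"] = ((df["week"] % 52) == 43).astype(int)
stockout_prob = np.where(df["bad_weather"] == 1, 0.30, 0.05)
df["stockout"] = (rng.random(len(df)) < stockout_prob).astype(int)

df["other_candy"] = (
    1000
    + 35 * df["stockout"]        # the "true" effect we want to recover
    - 80 * df["bad_weather"]
    + 150 * df["halloween"]
    + rng.normal(0, 60, len(df))
)

naive = smf.ols("other_candy ~ stockout", data=df).fit()
controlled = smf.ols(
    "other_candy ~ stockout + bad_weather + halloween + C(store)", data=df
).fit()
print(f"naive estimate:      {naive.params['stockout']:+.1f}  (true effect: +35)")
print(f"controlled estimate: {controlled.params['stockout']:+.1f}")
```

The naive estimate is badly biased toward zero. The controlled one recovers the truth here only because the simulation contains no confounder we failed to model – which is exactly the guarantee real data never gives you.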

Once you’re done with all of this, you have an estimate for the elasticity – how do you know it’s right? You had to make all kinds of judgments and assumptions. Suppose you didn’t think of including a specific variable, or of making some adjustment? How do we know whether we’re only slightly off or way off?

The best way is to run a controlled experiment: take Snickers out of a random sample of stores, and see if the prediction made by your model is correct. If you don’t do that (or, at a higher level of abstraction, if you have not already run many experiments validating your modeling method within a tight class of applications – say, elasticity estimates for stock availability in convenience stores – such that re-testing in this case is unnecessary), then you don’t know whether you’re right or not. All you have is a very sophisticated theory. The cash nexus is the experiment that tests the theory.
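Once the experiment exists, the validation step itself is trivial. A sketch, with all three numbers as hypothetical placeholders:

```python
# All three numbers are hypothetical placeholders.
model_prediction = 35.0   # lift predicted by the regression model
experiment_lift = 21.4    # lift measured in the randomized test
ci_halfwidth = 12.0       # 95% confidence half-width from the experiment

if abs(model_prediction - experiment_lift) <= ci_halfwidth:
    print("Prediction consistent with the experiment.")
else:
    print("Prediction refuted; revisit the model's judgments and assumptions.")
```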

This is the fundamental difficulty with economists’ pronouncements about the predicted effects of various government programs – there is typically no reliable evidence at the foundation of the inferential edifice. We usually haven’t run a sufficient number of (or sometimes, any) real randomized experiments to build reliable predictive rules for the effects of the proposed government programs that are the issues of the day.

This is why I think Avent is missing the point when he then says:

It’s important to note that because economists can’t always run their own experiments, there will tend to be more confidence about theories that focus on things which occur very often. Prices shift constantly, and economists consequently know a LOT about prices. Massive, global economic recessions occur about once a century. There is obviously a lot more uncertainty regarding the theories that describe these events.

As is Yglesias, when he says something very similar:

The fact that the economics profession can offer so little in the way of consensus guidance about dramatic, crucially important events like the panic of 2007-2008 is a huge problem and a very legitimate knock on the enterprise, but it doesn’t actually undermine the overall epistemic status of the discipline. The hope is that over time things improve. And, indeed, for all the horrors of the current recession it’s been managed much better than the Great Depression of the 1930s was. Progress is happening. The only way to make more rapid progress on the science of macroeconomic stabilization would be to have many more recessions so as to gather better data.

Having more data points doesn’t, by itself, cure the problem; it only provides the opportunity to do so – not (or more precisely, not only) because we can then build more sophisticated regression models, but because it makes experimentation practical. Lots of sequential price transactions, for example, are repeated events that comprise an intuitive reference class within which we can run experiments to test theories and generalize them into practically useful decision rules.

Both Avent and, by extension, Yglesias associate themselves with Adam Ozimek’s additional, very different comments about the implications of being consistent about this kind of epistemic humility. I broadly agree with what Ozimek has to say on this score.

First, he says that consistency implies skepticism about global climate models. I agree, and have written about precisely this point. (Separately, I think that there is a firm scientific foundation for concern about AGW, not because of climate models, but because we know through replicated experiments and other means that CO2 absorbs and re-emits infrared radiation. Ironically, once you accept that premise, the more uncertain you are about the precision of climate models, the more you ought to worry about AGW.)

Second, he says that this should imply skepticism about John Lott’s claims about gun ownership and crime. I agree. In fact, based on precisely this thinking, I wrote a very critical review of one of Lott’s books in which he used the same method that he employed in the guns-and-crime analysis to argue that abortion legalization increased crime. (I was equally critical in the same review of the opposite conclusion, using similar methods, in Freakonomics.)

I think that Ozimek’s demand for consistency is fair, and as I said at the start of this post, important. My upcoming book is focused on (i) making the case that this kind of epistemic humility is justified on the evidence, and (ii) trying to work out some of the practical implications of this observation.

(Cross-posted at The Corner)