David Frum argued, in part, that:
As a general rule, the more unequal a place is, the more Democratic; the more equal, the more Republican.
He and Andrew Gelman have gone back and forth on this and other points in the article. In the course of this exchange, David Frum has said that he thinks it’s more useful to consider inequality vs. partisan voting at the county level rather than the state level. That is, “place” in the quoted sentence is most usefully defined as “county” rather than “state”. This question seemed to me to be amenable to empirical resolution, so I decided to take a quick look at it.
I was able to find 2007 Gini coefficients, a standard inequality metric, produced by the Census for about 800 large US counties. But it occurred to me that if you read David Frum’s article, he is describing neighborhood effects rather than just individual effects. That is, if you had two equivalently unequal populations, one of which had the same (unequal) distribution of incomes in every neighborhood, and the other of which had a few very rich neighborhoods and lots of poorer neighborhoods, these two hypothetical societies might tend to vote very differently.
So I created my own additional inequality metric. The Census reports the average per capita income by Census Block Group (CBG). A CBG is a small geographic unit containing something like 1,000 – 2,000 people. There are about 211,000 CBGs in the U.S. There are about 3,000 counties or county-like political jurisdictions in the U.S. (using what the Census terms a FIPS number). So there are about 65 or so CBGs per county equivalent. By running a whole lot of spatial queries, I mapped each of the 211,000 CBGs to a FIPS county equivalent, thereby creating an average of about 65 point-estimates for average neighborhood per capita income within each county equivalent. I then calculated the standard deviation of the average 2007 per capita income across the CBGs that approximately make up each county, and used this as an additional metric for inequality by county. This was designed to test for the kind of neighborhood effect that I referenced, but also has the advantage of providing an inequality metric that is uniformly defined for all counties, not just large ones. I calculated this same metric for the year 2000.
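To make the neighborhood inequality index concrete, here is a minimal sketch of the calculation in Python. It assumes the spatial mapping is already done, so each CBG arrives as a (county FIPS, average per capita income) pair. The function name, the toy numbers, and the choice of an unweighted population standard deviation are all assumptions for illustration; the description above does not say whether CBGs were weighted by population or which standard deviation formula was used.

```python
from collections import defaultdict
from statistics import pstdev


def neighborhood_inequality(cbg_rows):
    """Return {county_fips: std dev of CBG per capita income}.

    cbg_rows: iterable of (county_fips, per_capita_income) pairs,
    one per Census Block Group. Assumes an unweighted population
    standard deviation; counties with fewer than two CBGs are skipped.
    """
    incomes_by_county = defaultdict(list)
    for fips, income in cbg_rows:
        incomes_by_county[fips].append(income)
    return {
        fips: pstdev(incomes)
        for fips, incomes in incomes_by_county.items()
        if len(incomes) >= 2
    }


# Made-up illustration data, not Census figures.
cbg_rows = [
    ("01001", 20000), ("01001", 20000), ("01001", 20000),  # uniform county
    ("01003", 10000), ("01003", 30000), ("01003", 50000),  # rich/poor split
]
index = neighborhood_inequality(cbg_rows)
```

In this toy example the county whose neighborhoods all have the same average income scores zero, while the county with a wide spread between neighborhoods scores high, which is the neighborhood effect the metric is meant to pick up.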
I used the percentage of the vote received by Bush in the 2004 election as the base measurement of Republican vs. Democrat, and used the change in the Bush vote between 2000 and 2004 as the measurement for change in Republican vs. Democrat.
In summary, more unequal counties tend to be less Republican in recent elections.
This finding is robust across time periods, samples of counties and definitions of inequality. Examined across the large counties for which I had both Gini coefficients and the neighborhood inequality index, inequality is negatively correlated with Bush’s vote percentage in 2004 and 2000 at a significant level with both indices. This same result holds for the full slate of about 3,000 counties using the neighborhood inequality index. I calculated the within-state relationship for the counties within each of the lower 48 states, and 46 show this same directional finding. (Professor Gelman will not be surprised to learn that Connecticut and New Hampshire are the outliers.)
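The within-state check can be sketched as follows: group counties by state, compute a correlation between inequality and Bush vote share inside each state, and count how many states come out negative. This is only an illustration under assumptions. It uses a plain Pearson correlation (the text above does not say which measure was used), the state labels and numbers are invented, and it does no significance testing.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences.

    Assumes both sequences have nonzero variance.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def within_state_correlations(rows):
    """rows: (state, inequality, bush_vote_pct) per county.

    Returns {state: correlation}, skipping states with under 3 counties.
    """
    by_state = defaultdict(lambda: ([], []))
    for state, inequality, vote in rows:
        by_state[state][0].append(inequality)
        by_state[state][1].append(vote)
    return {
        state: pearson(xs, ys)
        for state, (xs, ys) in by_state.items()
        if len(xs) >= 3
    }


# Invented rows for two hypothetical states, "AA" and "BB".
rows = [
    ("AA", 1.0, 60.0), ("AA", 2.0, 55.0), ("AA", 3.0, 50.0),
    ("BB", 1.0, 40.0), ("BB", 2.0, 45.0), ("BB", 3.0, 47.0),
]
corrs = within_state_correlations(rows)
negative_states = sorted(s for s, r in corrs.items() if r < 0)
```

Here `negative_states` lists the states that show the same directional finding as the national result; with real data, the count of such states would be compared against the total.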
The neighborhood inequality index is somewhat more correlated with voting outcomes across the large county sample than are the Gini coefficients. Interestingly, the correlation between these two inequality metrics is only about 0.6, and when combined each provides incremental information. In order to keep things simple, I created a hybrid index by normalizing and adding the metrics, which performs better than either one in isolation with very high significance. Here is the graphic:
When I examined the change in inequality 2000 – 2007 vs. the change in vote 2000 – 2004, I found no significant relationship between change in inequality and change in vote in any sample.
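Returning to the hybrid index described earlier, "normalizing and adding" could be read in more than one way. The sketch below assumes the simplest reading: convert each metric to z-scores across the county sample, then sum them with equal weight. The function names and input values are hypothetical, and the actual normalization used may have differed.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize values to mean 0 and (population) std dev 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def hybrid_index(gini, neighborhood):
    """Equal-weight sum of the two standardized inequality metrics.

    gini and neighborhood must list the same counties in the same order.
    """
    return [g + n for g, n in zip(zscores(gini), zscores(neighborhood))]


# Invented values for three counties, for illustration only.
gini = [0.40, 0.45, 0.50]
neighborhood = [5000.0, 9000.0, 7000.0]
hybrid = hybrid_index(gini, neighborhood)
```

Standardizing first keeps the dollar-denominated neighborhood index from swamping the Gini coefficient, which lives on a 0 to 1 scale. The same recipe, applied to year-over-year differences rather than levels, would give a change-in-inequality measure of the kind compared against the change in vote.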
Based on this analysis, David Frum is correct that unequal places are less likely to vote for Republicans at the national level. To bound this, however, I should emphasize that this analysis neither supports nor refutes any assertions about causality. Inequality may cause changes in voting behavior, but it is certainly entangled with many other factors.
Trying to build some kind of a cross-sectional model that “holds all other factors equal” is almost certainly a fool’s errand. Interactions between drivers are a central, not a peripheral, component of such a complex social phenomenon. This complexity would overwhelm 3,000 data points pretty quickly. (Professor Gelman, who is one of the best statisticians in America, is acutely aware of this generic issue.)
If anything, the observation that changes in inequality don’t correlate with changes in voting tends to undercut the argument for (simple) causality, though obviously this was a pretty crude analysis – I didn’t even have well-aligned time periods, didn’t consider possible lag or confounding effects and so on.
Ideally we would have some kind of structured experiment to establish causality, but it’s pretty hard to see that happening. Barring that, the next best solution would be natural experiments (e.g., look at location-periods with a sudden immigration spike because of some weird legislative change, etc.), but even there it’s hard to see how you would not have deep confounding. Even so, that seems to me to be the best bet for some way to disentangle effects.