
Why Oh Why Can't We Have Better Pollsters?

Matthew Yglesias says that friends don't let friends read the Gallup tracking poll:

Matthew Yglesias » The Three Day Itch: One thing a week’s vacation from blogging helps you get perspective on is the Gallup tracking poll. On August 1 when I had my last day at The Atlantic it was time for panic as McCain had tied things up. Then Obama started to regain ground, going up to a four point lead. Then the race tightened again, then Obama opened up a five point lead, and now it’s tightening again but with Obama back to a smallish lead having beaten back the strong challenge McCain was mounting around August 1. In short, McCain’s “Celebrity” ad and drilling attacks were working well, but when the McCain campaign went after Obama on the tire gauge thing he came up with effective countermeasures and regained his lead.


Or maybe none of that happened. As everyone knows, there’s sampling error associated with polling. As a result, if you poll 1,000 people on August 1 and then you poll 1,000 different people on August 2 you shouldn’t be surprised to see the results differ by several percentage points even in the absence of any change in the underlying public opinion. Beyond that, doing one poll per day throughout a long campaign would mean that you’d expect to see one or two relatively rare outlier results per month even under circumstances of total stasis. And as Alan Abramowitz points out if you look at the daily results this is actually what you see — incredible volatility with Obama’s lead oscillating violently around an average of 3-4 points. Since it’s not plausible that the public mood is really swinging anywhere near as rapidly as a very naive reading of the Gallup daily results would suggest, people could see that this is basically statistical noise in a stable race.

But Gallup doesn’t report its daily results, they report a multi-day rolling average. Abramowitz notes that if you report a ten day rolling average, you get a chart where nothing happens — Obama maintains a flat lead of 3-4 points. Again, a stable race. But if instead of doing either of those things you do what Gallup actually does and report a three day rolling average, you get these pleasant looking peaks and valleys in the race. The change over time here is large enough in magnitude (unlike on the ten day chart) but also slow enough in pace (unlike on the one day chart) to be plausibly interpreted as public opinion shifting in response to events. And since the human mind is designed to recognize patterns and construct narratives, and since it suits the interests of campaign journalists to write narratives, people interpret the peaks and valleys of the three day average as real shifts in public opinion. But while I have no way of proving that it’s just statistical noise and nothing’s really happening, the “nothing happening” narrative is completely consistent with the data, and it’s telling that the conventional narratives collapse when the data is presented in different ways whereas the “noise” narrative is consistent with multiple ways of displaying the information...
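A quick way to see the smoothing effect Yglesias and Abramowitz describe is to simulate a race in which nothing ever happens and then average the daily results over different windows. The sketch below is only illustrative: the 1,000-person daily sample comes from the passage above, while the 50% true share, the 360-day span, the independence of the daily samples, and the function names are all assumptions made for the example.

```python
import random
import statistics

def simulate_daily_shares(days, n=1000, p=0.5, seed=0):
    """Daily sample share for a candidate whose true support never moves from p."""
    rng = random.Random(seed)
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(days)]

def rolling_mean(xs, window):
    """Trailing moving average; the first window-1 days have no value."""
    return [sum(xs[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(xs))]

daily = simulate_daily_shares(days=360)

# Spread (in percentage points) of the reported series for each window length.
spread = {}
for window in (1, 3, 10):
    spread[window] = 100 * statistics.pstdev(rolling_mean(daily, window))
    print(f"{window:>2}-day average: sd = {spread[window]:.2f} points")
```

The one-day series should be the jumpiest and the ten-day series the flattest, with the three-day series in between, even though the simulated electorate never changes its mind.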

And, indeed, today the Gallup Organization writes:

Gallup Daily: Obama Moves Ahead, 48% to 42%: PRINCETON, NJ -- Democratic candidate Barack Obama has gained ground in the latest Gallup Poll Daily tracking average from Monday, Tuesday, and Wednesday, and now leads Republican John McCain among registered voters by a 48% to 42% margin...

The truly diabolical thing about the Gallup Organization is that all the roots of the MA process used to construct the three-day moving average from the raw daily data are on the unit circle, so there is no way to back out the daily numbers from the averages. And even if you do know what the daily poll numbers were at some two consecutive dates in the past so that you can then deduce what the third day's polls in the next moving average were (the three having to add up to three times the reported average, you see), the rounding errors in each day's reported average propagate undamped over time and so grow without bound. Here, for example, you see what estimates of the daily numbers you get if you assume that the Obama share on July 18, July 19, and July 20 were all equal to 47%:

[Figure: estimated daily Obama share backed out of the three-day moving average, assuming 47% on July 18, 19, and 20]

What is happening is that the rounding errors are being passed through an amplifying filter with a strong spectral peak at the three-day period--and so the three-day cycles in the estimated daily numbers are freaking out.
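The back-out recursion can be sketched in a few lines. This is a toy example, not the spreadsheet behind the chart: it assumes a perfectly flat race at 47%, and it bumps a single reported average by one point to stand in for a rounding error. The function name and the twelve-day span are made up for the illustration.

```python
def back_out_daily(averages, first, second):
    """Recover daily values from three-day averages, given two known consecutive days.

    averages[k] is the mean of days k, k+1, k+2, so
    day k+2 = 3 * averages[k] - day k+1 - day k.
    """
    daily = [first, second]
    for avg in averages:
        daily.append(3 * avg - daily[-1] - daily[-2])
    return daily

# Hypothetical flat race: every true daily reading is exactly 47.
true_daily = [47.0] * 12
exact_avgs = [sum(true_daily[k:k + 3]) / 3 for k in range(len(true_daily) - 2)]

# Perturb one reported average by a single point, as rounding might.
bumped = list(exact_avgs)
bumped[2] += 1.0

recovered = back_out_daily(bumped, 47.0, 47.0)
errors = [r - t for r, t in zip(recovered, true_daily)]
print(errors)
# -> [0.0, 0.0, 0.0, 0.0, 3.0, -3.0, 0.0, 3.0, -3.0, 0.0, 3.0, -3.0]
```

One error in one reported average never dies out: it echoes forever as a +3, -3, 0 pattern with a three-day period. With a fresh rounding error arriving every day, those echoes pile on top of one another.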

One way to get some insight into the data is to notice that the difference between today's moving average and yesterday's moving average is simply equal to one-third times the difference between today's results and the results from three days ago. Given that we have a 900 person daily sample and a vote share near 50%, the standard deviation of each day's sample should be 1.67%, which means:

(1) The standard deviation of the difference between today's sample and the sample of three days ago should be 2.36%--meaning that the daily change in the moving average has a standard deviation of 0.79%. Only a one-day change in the moving average of 2% or more is interesting--smaller changes are likely to be statistical noise from a hypothesis-testing point of view.

(2) The difference between today's moving average and the moving average of two days ago is one-third the difference between the sum of today's and yesterday's sample and the sum of the samples from three and four days ago--meaning that the two-day change in the moving average has a standard deviation of 1.11%. A two-day change in the moving average of 2% is not nearly as interesting as a one-day change of 2%--statistical noise grows from the one to the two-day change.

(3) The difference between today's moving average and the moving average of three or more days ago is one-third the difference between the sum of the most recent three days' samples and the sum of the earlier three days' samples--meaning that the standard deviation of the three-day change in the moving average should be 1.36%--meaning that a three-day change of 2% is truly not very interesting to a hypothesis tester at all: only a 3% move over three or more days is truly statistically interesting (and, of course, the more persistent such a move is at the more than three-day horizon the more interesting it is).
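The three standard deviations above can be checked with a short calculation. The sketch assumes what the text assumes (independent daily samples of 900 people and a true share of 50% that does not move). The function name and its parameters are made up for the illustration.

```python
import math

def ma_change_sd(n=900, p=0.5, window=3, lag=1):
    """SD of the change in a `window`-day moving average over `lag` days,
    assuming independent daily samples of size n and a constant true share p."""
    daily_sd = math.sqrt(p * (1 - p) / n)
    # Days shared by both windows cancel; min(lag, window) days remain on each side.
    distinct_days = 2 * min(lag, window)
    return math.sqrt(distinct_days) * daily_sd / window

for lag in (1, 2, 3):
    print(f"{lag}-day change: sd = {100 * ma_change_sd(lag=lag):.2f}%")
```

Once the lag reaches three days the two windows no longer overlap, so the standard deviation stays at the three-day value for any longer horizon.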

The right way to deal with the Gallup tracking poll is, I think, to compare it to how it was at some benchmark date in the past, and then to ignore all changes in the Obama or McCain share of less than three percentage points--ignore all changes in the spread of less than six percentage points.