LIES, #@%*$# LIES, AND OPINION POLLS
Tuesday, April 29, 2008 at 01:06PM
Perhaps you noticed that in the previous two days (April 27-28), three opinion polls were released showing Hillary Clinton leading John McCain in a hypothetical Presidential election by spreads of one (1) to nine (9) points - - quite a difference. The 9-point poll, of course, got the most play in the news media. So what do these polls mean, if anything?
If you were around and paying attention to such nonsense in 1988, you may recall that at this point in the Presidential election Mike Dukakis was favored over G. H. W. Bush by a substantial margin (15 points or better). Dukakis was clobbered in November; so we must be mindful that things can happen between now and election day. But just for grins . . .
Keep in mind also Samuel Clemens' dictum (which he credited to Disraeli) that there are three kinds of lies: “lies, damned lies, and statistics.” These opinion polls are a variety of statistics and, Clemens might say, the worst form of a lie. Yet there are rational explanations for the differences, and I will parse a couple of them for you.
1. Polling Methodology. These are the three recent polls:
Rasmussen Tracking (LV) - Clinton 46% McCain 45%
AP Ipsos (A) - Clinton 50% McCain 41%
Gallup Tracking (RV) - Clinton 47% McCain 44%
The key is the parenthetical notation following the name of each polling organization. “A” means “Adults;” “RV” means “Registered Voters;” “LV” means “Likely Voters.”
So the AP-Ipsos poll included responses from all of the 1001 adults surveyed by that organization, whether or not they were registered to vote. The actual number of registered voters in that sample was 760, but the Clinton-McCain figures for that sub-sample were not released - - and that is how you make statistics lie. Political scientists normally consider “A”-sample polls to be unreliable. Only about half of the eligible voters actually go to the polls; the participation figures in recent Presidential elections are: 1988 - 50.1%; 1992 - 55.1%; 1996 - 49.1%; 2000 - 51.3%; 2004 - 55.3%. Obviously, it doesn't matter whom non-voters favor when it comes down to election day. The best predictor of future behavior is past behavior, and experience has demonstrated that non-voters simply don't vote, even if they say they will.
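For a rough sense of what the sample sizes alone imply, here is a minimal Python sketch of the textbook 95 per cent margin of error for the two sample sizes mentioned above (1001 adults, 760 registered voters). It assumes a simple random sample and a worst-case 50/50 split; that is my assumption, not necessarily how AP-Ipsos computes its published figure, and the function name is made up for illustration.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error, in percentage points, for a share p
    estimated from a simple random sample of size n (an idealized assumption)."""
    return 100 * z * math.sqrt(p * (1 - p) / n)

# Sample sizes taken from the AP-Ipsos discussion above.
for label, n in [("all adults", 1001), ("registered voters", 760)]:
    print(f"{label}: n={n}, margin of error = +/-{margin_of_error(n):.1f} points")
```

On those assumptions the full adult sample comes out at about 3.1 points and the registered-voter sub-sample at about 3.6, so narrowing the sample to registered voters costs roughly half a point of precision.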
The Gallup Tracking sample used registered voters, the assumption being that they are more likely to vote than unregistered folks. Pretty good assumption. Note that among registered voters Clinton's apparent lead drops to 3 per cent, which is at the edge of the 3-point margin of error (and that margin applies to each candidate's share separately, so the uncertainty in the gap between them is larger still). Finally, the Rasmussen Tracking poll used likely voters, that is, those who claimed to have a history of actually going out to vote; there, Clinton's apparent lead drops to 1 point, a virtual tie. There is no clear preference between registered-voter polls and likely-voter polls. Registered-voter polls tend to include some who are non-voters; on the other hand, likely-voter polls tend to exclude recently registered voters. The best that can be said at this point is that if the election had been held yesterday, Clinton would likely have won by a margin of 1 to 3 points. Note that all three polls had a 9 per cent “undecided” factor, which exceeds the margin of error.
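To see how each lead stacks up against sampling error, here is a hedged Python sketch using the standard formula for the margin of error on the difference between two shares from the same sample. The sample size of 1067 for the two tracking polls is purely an assumption on my part (it is the size that yields roughly a 3-point margin per candidate); only the AP-Ipsos figure of 1001 comes from the poll itself. The function name is hypothetical.

```python
import math

def lead_margin_of_error(p1, p2, n, z=1.96):
    """Approximate 95% margin of error, in points, for the lead p1 - p2 when
    both shares (given as fractions) come from one simple random sample of size n."""
    variance = (p1 + p2 - (p1 - p2) ** 2) / n
    return 100 * z * math.sqrt(variance)

ASSUMED_N = 1067  # assumption: not published in the figures quoted above

polls = [
    ("Rasmussen Tracking (LV)", 46, 45, ASSUMED_N),
    ("AP Ipsos (A)", 50, 41, 1001),  # 1001 adults, per the AP-Ipsos sample
    ("Gallup Tracking (RV)", 47, 44, ASSUMED_N),
]

for name, clinton, mccain, n in polls:
    lead = clinton - mccain
    moe = lead_margin_of_error(clinton / 100, mccain / 100, n)
    verdict = "outside" if lead > moe else "inside"
    print(f"{name}: lead {lead} points, margin on the lead +/-{moe:.1f} ({verdict})")
```

Under these assumptions the margin on the lead is a little under 6 points in each poll, so only the 9-point all-adults figure clears it; the 1-point and 3-point leads sit well inside.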
So, here's some happy news for the Hillary Clinton folks: Hillary is very popular among those who don't vote.
2. Inherent Bias. Now that polling methodology is clear as mud, let's examine the phenomenon of inherent bias. These kinds of opinion polls are taken by evening telephone calls, supposedly on a national, randomized basis. Obviously, if someone is at work at that time of day, that person doesn't get polled. Neither do folks with unlisted telephone numbers, or with no telephones at all. Polling organizations try to account for these variances, but historically even the polls taken the day before an election are off. (Google “President Thomas Dewey” and see what you get.)
Another type of inherent bias is introduced in Presidential elections by means of the Electoral College. As you should know, each state has a number of "electors" determined by adding the number of Congressmen to the number of Senators for that state. The number of Senators is always two (2), no matter how large or how small the relative population of that state. So California has 55 electoral votes (53 Congressmen + 2 Senators) while Montana has three electoral votes (1 Congressman + 2 Senators). Currently, this means that the national polls are skewed in favor of the Democratic Party because of the substantial influence of Democratic-leaning California. Here's how it works: California presently has just over 12 per cent of the country's population; so, in a truly random national survey, 120 of every 1000 poll respondents should be Californians. However, California is under-represented in the Electoral College, with 55 of 538 electoral votes, or 10.22 per cent of the total. So, to be accurate, a survey should count only 102 Californians for every 1000 poll respondents; as far as I know, no polling organization does this.
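The California arithmetic can be checked with a few lines of Python. This is only a sketch of the back-of-the-envelope calculation above, taking the Electoral College total as 538 and rounding California's population share to 12 per cent; the down-weighting factor at the end is my own illustration, not something any polling organization is known to apply.

```python
TOTAL_ELECTORAL_VOTES = 538
CA_ELECTORAL_VOTES = 55        # 53 House seats + 2 Senate seats
CA_POPULATION_SHARE = 0.12     # "just over 12 per cent", rounded down
SAMPLE_SIZE = 1000

ca_electoral_share = CA_ELECTORAL_VOTES / TOTAL_ELECTORAL_VOTES

# Californians per 1000 respondents under each way of counting.
by_population = round(CA_POPULATION_SHARE * SAMPLE_SIZE)
by_electors = round(ca_electoral_share * SAMPLE_SIZE)

# Hypothetical weight that would shrink each Californian's response
# from a population-based share to an elector-based share.
weight = ca_electoral_share / CA_POPULATION_SHARE

print(f"Electoral share: {ca_electoral_share:.2%}")
print(f"Californians per {SAMPLE_SIZE}: {by_population} by population, {by_electors} by electors")
print(f"Illustrative weight per Californian: {weight:.3f}")
```

In other words, each California respondent would count for about 85 per cent of a respondent if the sample were matched to electoral votes instead of to population.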
Now, let's look at the same 3 polls, but for Obama-McCain:
Rasmussen Tracking (LV) - Obama 46% McCain 44%
AP Ipsos (RV) - Obama 46% McCain 44%
Gallup Tracking (RV) - Obama 45% McCain 45%
That's not a typo beside AP Ipsos - - that organization did in fact publish registered-voter numbers for Obama while using the all-adults numbers for Clinton. Go figure.
Anyway, these polling figures show a virtual tie in the hypothetical Obama-McCain contest; which means McCain would win, all other things being equal. Factoring in the pro-Democrat bias from California, New York, and Illinois, and factoring out pro-Republican Texas, the popular votes will not translate into enough electoral votes to overcome the Republican advantage in the Electoral College.
So there you have it - - the polls accurately give us statistically-sound information on what might have happened if the election had been held yesterday.


